1 |
Confidence Distillation for Efficient Action Recognition
Manzuri Shalmani, Shervin, January 2020
Modern neural networks are powerful predictive models. However, when it comes
to recognizing that they may be wrong about their predictions and measuring the
certainty of their beliefs, they perform poorly. For one of the most common activation
functions, the ReLU and its variants, even a well-calibrated model can produce incorrect
but high-confidence predictions. In the related task of action recognition, most
current classification methods are based on clip-level classifiers that densely sample
non-overlapping, same-sized clips from a given video and aggregate the results using an
aggregation function, typically averaging, to obtain video-level predictions. While
this approach has been shown to be effective, it is sub-optimal in recognition accuracy
and has a high computational overhead. To mitigate both of these issues, we propose
the confidence distillation framework, which first teaches a representation of the
teacher's uncertainty to the student and then divides the task of full video prediction
between the student and the teacher models. We conduct extensive experiments
on three action recognition datasets and demonstrate that our framework achieves
state-of-the-art results in action recognition accuracy and computational efficiency. / Thesis / Master of Science (MSc) / We devise a distillation loss function to train an efficient sampler/classifier for video-based action recognition tasks.
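The baseline described in this abstract, averaging clip-level predictions into a video-level prediction, can be sketched as follows. The abstract does not say how work is divided between the student and the teacher, so the confidence-threshold routing in `predict_video` is purely an illustrative assumption, as are all function names and the `threshold` parameter.

```python
from typing import Callable, List, Sequence, Tuple

Probs = List[float]
Classifier = Callable[[object], Probs]  # maps one clip to class probabilities


def average_clip_predictions(clip_probs: Sequence[Sequence[float]]) -> Probs:
    """Average per-clip class probabilities into one video-level distribution."""
    if not clip_probs:
        raise ValueError("need at least one clip")
    n_classes = len(clip_probs[0])
    return [sum(p[c] for p in clip_probs) / len(clip_probs) for c in range(n_classes)]


def predict_video(
    clips: Sequence[object],
    student: Classifier,
    teacher: Classifier,
    threshold: float = 0.8,  # hypothetical confidence cut-off
) -> Tuple[int, int]:
    """Return (predicted class, number of teacher calls).

    Assumed routing rule: keep the cheap student's output for a clip when its
    top probability reaches `threshold`; otherwise ask the expensive teacher.
    """
    probs: List[Probs] = []
    teacher_calls = 0
    for clip in clips:
        student_out = student(clip)
        if max(student_out) >= threshold:
            probs.append(student_out)
        else:
            probs.append(teacher(clip))
            teacher_calls += 1
    video_probs = average_clip_predictions(probs)
    return video_probs.index(max(video_probs)), teacher_calls
```

Under this assumed rule, the teacher only runs on clips where the student is unsure, which is one way a split between the two models could reduce compute relative to running the teacher on every clip.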
|
2 |
Optimizing Accuracy-Efficiency Tradeoffs in Emerging Neural Workloads
Amrit Nagarajan (17593524), 11 December 2023
<p>Deep Neural Networks (DNNs) are constantly evolving, enabling the power of deep learning to be applied to an ever-growing range of applications, such as Natural Language Processing (NLP), recommendation systems, and graph processing. However, these emerging neural workloads present large computational demands for both training and inference. In this dissertation, we propose optimizations that take advantage of the unique characteristics of different emerging workloads to simultaneously improve accuracy and computational efficiency.</p>
<p>First, we consider Language Models (LMs) used in NLP. We observe that the design process of LMs (pre-train a foundation model, and subsequently fine-tune it for different downstream tasks) leads to models that are highly over-parameterized for the downstream tasks. We propose AxFormer, a systematic framework that applies accuracy-driven approximations to create accurate and efficient LMs for a given downstream task. AxFormer eliminates task-irrelevant knowledge, and helps the model focus only on the relevant parts of the input.</p>
<p>Second, we find that during fine-tuning of LMs, the presence of variable-length input sequences necessitates the use of padding tokens when batching sequences, leading to ineffectual computations. It is also well known that LMs over-fit to the small task-specific training datasets used during fine-tuning, despite the use of known regularization techniques. Based on these insights, we present TokenDrop + BucketSampler, a framework that synergistically combines a new regularizer that drops a random subset of insignificant words in each sequence in every epoch, and a length-aware batching method to simultaneously reduce padding and address the overfitting issue.</p>
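<p>To illustrate why length-aware batching reduces padding, here is a minimal sketch. It is not the BucketSampler from the dissertation, whose details the abstract does not give; it simply sorts sequences by length before batching, and every function name is hypothetical.</p>

```python
from typing import List, Sequence


def padding_tokens(batches: Sequence[Sequence[int]]) -> int:
    """Count pad tokens when each batch is padded to its longest sequence.

    Each batch is given as a list of sequence lengths.
    """
    return sum(max(b) * len(b) - sum(b) for b in batches if b)


def naive_batches(lengths: Sequence[int], batch_size: int) -> List[List[int]]:
    """Batch sequences in the order they arrive."""
    return [list(lengths[i:i + batch_size]) for i in range(0, len(lengths), batch_size)]


def length_aware_batches(lengths: Sequence[int], batch_size: int) -> List[List[int]]:
    """Assumed strategy: sort by length so each batch holds similar lengths."""
    return naive_batches(sorted(lengths), batch_size)


lengths = [5, 50, 7, 48, 6, 52]
print(padding_tokens(naive_batches(lengths, 3)))         # 138
print(padding_tokens(length_aware_batches(lengths, 3)))  # 9
```

<p>A practical sampler would likely shuffle the order of the resulting batches each epoch so that training does not always see short sequences first; that step is omitted here.</p>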
<p>Next, we address the computational challenges of Transformers used for processing inputs of several important modalities, such as text, images, audio and videos. We present Input Compression with Positional Consistency (ICPC), a new data augmentation method that applies varying levels of compression to each training sample in every epoch, thereby simultaneously reducing over-fitting and improving training efficiency. ICPC also enables efficient variable-effort inference, where easy samples are inferred at high compression levels and hard samples at low compression levels.</p>
<p>Finally, we focus on optimizing Graph Neural Networks (GNNs), which are commonly used for learning on non-Euclidean data. Few-shot learning with GNNs is an important challenge, since real-world graphical data is often sparsely labeled. Self-training, wherein the GNN is trained in stages by augmenting the training data with a subset of the unlabeled data and their pseudolabels, has emerged as a promising approach. However, self-training significantly increases the computational demands of training. We propose FASTRAIN-GNN, a framework for efficient and accurate self-training of GNNs with few labeled nodes. FASTRAIN-GNN optimizes the GNN architecture, training data, training parameters, and the graph topology during self-training.</p>
<p>At inference time, we find that ensemble GNNs are significantly more accurate and robust than single-model GNNs, but suffer from high latency and storage requirements. To address this challenge, we propose GNN Ensembles through Error Node Isolation (GEENI). The key concept in GEENI is to identify nodes that are likely to be incorrectly classified (error nodes) and suppress their outgoing messages, leading to simultaneous accuracy and efficiency improvements. </p>
|
3 |
Cascaded Ensembling for Resource-Efficient Multivariate Time Series Anomaly Detection
Mapitigama Boththanthrige, Dhanushki Pavithya, January 2024
The rapid evolution of Connected and Autonomous Vehicles (CAVs) has led to a surge in research on efficient anomaly detection methods to ensure their safe and reliable operation. While state-of-the-art deep learning models offer promising results in this domain, their high computational requirements present challenges for deployment in resource-constrained environments, such as Electronic Control Units (ECUs) in vehicles. In this context, we consider ensemble learning, specifically the cascaded modeling approach, for real-time and resource-efficient multivariate time series anomaly detection in CAVs. The study was done in collaboration with SCANIA, a transport solutions provider. The company is transforming itself to provide autonomous and sustainable solutions, and this work contributes to that transformation. Our methodology employs unsupervised learning techniques to construct a cascade of models, comprising a coarse-grained model with lower computational complexity at level one and a more intricate fine-grained model at level two. Furthermore, we incorporate cascaded model training to refine the complex model's ability to make decisions on uncertain and anomalous events, leveraging insights from the simpler model. Through extensive experimentation, we investigate the trade-off between model performance and computational complexity, demonstrating that our proposed cascaded model achieves greater efficiency with no performance degradation. Further, we conduct a comparative analysis of the impact of probabilistic versus deterministic approaches and assess the feasibility of model training in edge environments using Federated Learning.
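The two-level cascade described in this abstract can be sketched roughly as below. The abstract does not specify the models, scores, or decision rules, so the thresholds `pass_below` and `alarm_at`, the scorer interface, and the rule that level one only clears confidently normal samples are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps one multivariate sample to an anomaly score (higher = more anomalous).
Scorer = Callable[[Sequence[float]], float]


def cascade_detect(
    sample: Sequence[float],
    coarse: Scorer,
    fine: Scorer,
    pass_below: float = 0.3,  # hypothetical: coarse scores under this are "clearly normal"
    alarm_at: float = 0.5,    # hypothetical: fine scores at or above this raise an alarm
) -> Tuple[bool, int]:
    """Return (is_anomaly, level that made the decision).

    Assumed rule: the cheap level-one model dismisses clearly normal samples;
    uncertain or anomalous-looking samples are escalated to the costlier
    level-two model, which makes the final call.
    """
    if coarse(sample) < pass_below:
        return False, 1
    return fine(sample) >= alarm_at, 2


# Toy scorers standing in for the real unsupervised models.
def coarse_score(x: Sequence[float]) -> float:
    return max(abs(v) for v in x)


def fine_score(x: Sequence[float]) -> float:
    return sum(abs(v) for v in x) / len(x)


print(cascade_detect([0.1, 0.0, 0.2], coarse_score, fine_score))  # (False, 1)
print(cascade_detect([0.9, 0.8, 0.7], coarse_score, fine_score))  # (True, 2)
```

If most traffic is normal, most samples would stop at level one under this rule, so the expensive model runs only on the small escalated fraction.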
|