Global ETD Search

21	Beyond Disagreement-based Learning for Contextual Bandits Pinaki Ranjan Mohanty (16522407) 26 July 2023 (has links) While instance-dependent contextual bandits have been previously studied, their analysis has been exclusively limited to pure disagreement-based learning. This approach lacks a nuanced understanding of disagreement and treats it in a binary and absolute manner. In our work, we aim to broaden the analysis of instance-dependent contextual bandits by studying them under the framework of disagreement-based learning in sub-regions. This framework allows for a more comprehensive examination of disagreement by considering its varying degrees across different sub-regions. To lay the foundation for our analysis, we introduce key ideas and measures widely studied in the contextual bandit and disagreement-based active learning literature. We then propose a novel, instance-dependent contextual bandit algorithm for the realizable case in a transductive setting. Leveraging the ability to observe contexts in advance, our algorithm employs a sophisticated Linear Programming subroutine to identify and exploit sub-regions effectively. Next, we provide a series of results tying together previously introduced complexity measures and offer some insightful discussion on them. Finally, we enhance the existing regret bounds for contextual bandits by integrating the sub-region disagreement coefficient, thereby showcasing significant improvement in performance against the pure disagreement-based approach. In the concluding section of this thesis, we briefly recap the work done and suggest potential future directions for further improving contextual bandit algorithms within the framework of disagreement-based learning in sub-regions. These directions offer opportunities for further research and development, aiming to refine and enhance the effectiveness of contextual bandit algorithms in practical applications. Planning and decision making Statistical theory Contextual bandits Disagreement based learning Active Learning Interactive Learning Data Driven ML Linear Programming Transductive learning
22	NETWORK-AWARE FEDERATED LEARNING ACROSS HIGHLY HETEROGENEOUS EDGE/FOG NETWORKS Su Wang (17592381) 09 December 2023 (has links) The parallel growth of contemporary machine learning (ML) technologies alongside edge/fog networking has necessitated the development of novel paradigms to effectively manage their intersection. Specifically, the proliferation of edge devices equipped with data generation and ML model training capabilities has given rise to an alternative paradigm called federated learning (FL), moving away from traditional centralized ML common in cloud-based networks. FL involves training ML models directly on edge devices where data are generated. A fundamental challenge of FL lies in the extensive heterogeneity inherent to edge/fog networks, which manifests in various forms such as (i) statistical heterogeneity: edge devices have distinct underlying data distributions, (ii) structural heterogeneity: edge devices have diverse physical hardware, (iii) data quality heterogeneity: edge devices have varying ratios of labeled and unlabeled data, and (iv) adversarial compromise: some edge devices may be compromised by adversarial attacks. This dissertation endeavors to capture and model these intricate relationships at the intersection of FL and highly heterogeneous edge/fog networks. To do so, this dissertation first develops closed-form expressions for the trade-offs between ML performance and resource cost considerations within edge/fog networks. Subsequently, it optimizes the fundamental processes of FL, encompassing aspects such as batch size control for stochastic gradient descent (SGD) and sampling for global aggregations. This optimization is jointly formulated with networking considerations, which include communication resource consumption and device-to-device (D2D) cooperation. In the former half of the dissertation, the emphasis is first on optimizing device sampling for global aggregations in FL, and then on developing a self-sufficient hierarchical meta-learning approach for FL. These methodologies maximize expected ML model performance while addressing common challenges associated with statistical and system heterogeneity. Novel techniques, such as management of D2D data offloading, adaptive CPU clock cycle control, and integration of meta-learning, enable these methodologies. In particular, the proposed hierarchical meta-learning approach enables rapid integration of new devices in large-scale edge/fog networks. The latter half of the dissertation directs its focus towards emerging forms of heterogeneity in FL scenarios, namely (i) heterogeneity in quantity and quality of local labeled and unlabeled data at edge devices and (ii) heterogeneity in terms of adversarially compromised edge devices. To deal with heterogeneous labeled/unlabeled data across edge networks, this dissertation proposes a novel methodology that enables multi-source to multi-target federated domain adaptation. This proposed methodology views edge devices as sources (devices with mostly labeled data that perform ML model training) or targets (devices with mostly unlabeled data that rely on sources’ ML models), and subsequently optimizes the network relationships. In the final chapter, a novel methodology to improve FL robustness is developed in part by viewing adversarial attacks on FL as a form of heterogeneity. Networking and communications
23	MULTI-SPECTRAL FUSION FOR SEMANTIC SEGMENTATION NETWORKS Justin Cody Edwards (14700769) 31 May 2023 (has links) Semantic segmentation is a machine learning task that is seeing increased utilization in multiple fields, from medical imagery to land demarcation and autonomous vehicles. Semantic segmentation performs the pixel-wise classification of images, creating a new, segmented representation of the input that can be useful for detecting various terrain and objects within an image. Recently, convolutional neural networks have been heavily utilized when creating neural networks tackling the semantic segmentation task. This is particularly true in the field of autonomous driving systems. The requirements of automated driver assistance systems (ADAS) drive semantic segmentation models targeted for deployment on ADAS to be lightweight while maintaining accuracy. A commonly used method to increase accuracy in the autonomous vehicle field is to fuse multiple sensory modalities. This research focuses on leveraging the fusion of long wave infrared (LWIR) imagery with visual spectrum imagery to fill in the inherent performance gaps when using visual imagery alone. This comes with a host of benefits, such as increased performance in various lighting conditions and adverse environmental conditions. Utilizing this fusion technique is an effective method of increasing the accuracy of a semantic segmentation model. Being a lightweight architecture is key for successful deployment on ADAS, as these systems often have resource constraints and need to operate in real time. Multi-Spectral Fusion Network (MFNet) [1] meets these requirements by leveraging a sensory fusion approach, and as such was selected as the baseline architecture for this research. Many improvements were made upon the baseline architecture by leveraging a variety of techniques. Such improvements include the proposal of a novel loss function, categorical cross-entropy dice loss, the introduction of squeeze and excitation (SE) blocks, the addition of pyramid pooling, a new fusion technique, and drop input data augmentation. These improvements culminated in the creation of the Fast Thermal Fusion Network (FTFNet). Further improvements were made by introducing depthwise separable convolutional layers, leading to lightweight FTFNet variants, FTFNet Lite 1 & 2. Computer vision Neural networks Semantic Segmentation Convolutional Neural Networks CNN Thermal Imagery Sensory Fusion Data Augmentation Loss Function Multi-Spectral Neural Networks
24	Deep Image Processing with Spatial Adaptation and Boosted Efficiency & Supervision for Accurate Human Keypoint Detection and Movement Dynamics Tracking Chao Yang Dai (14709547) 31 May 2023 (has links) This thesis aims to design and develop a spatial adaptation approach using spatial transformers to improve the accuracy of human keypoint recognition models. We have studied different model types and design choices to gain an accuracy increase over models without spatial transformers and analyzed how spatial transformers increase the accuracy of predictions. A neural network called Widenet has been leveraged as a specialized network for providing the parameters for the spatial transformer. Further, we have evaluated methods to reduce the model parameters, as well as a strategy to enhance the learning supervision for further improving the performance of the model. Our experiments and results have shown that the proposed deep learning framework can effectively detect human keypoints, compared with the baseline methods. Also, we have reduced the model size without significantly impacting the performance, and the enhanced supervision has improved the performance. This study is expected to greatly advance the deep learning of human keypoints and movement dynamics. Computer vision Deep learning computer vision method Artificial intelligence HUMAN POSE ESTIMATION human keypoint estimation Deep Learning (DL) spatial transformers Machine Learning (ML)
25	Efficient Continual Learning in Deep Neural Networks Gobinda Saha (18512919) 07 May 2024 (has links) Humans exhibit remarkable ability in continual adaptation and learning new tasks throughout their lifetime while maintaining the knowledge gained from past experiences. In stark contrast, artificial neural networks (ANNs) under such a continual learning (CL) paradigm forget the information learned in past tasks upon learning new ones. This phenomenon is known as ‘Catastrophic Forgetting’ or ‘Catastrophic Interference’. The objective of this thesis is to enable efficient continual learning in deep neural networks while mitigating this forgetting phenomenon. Towards this, first, a continual learning algorithm (SPACE) is proposed where a subset of network filters or neurons is allocated for each task using Principal Component Analysis (PCA). Such task-specific network isolation not only ensures zero forgetting but also creates structured sparsity in the network, which enables energy-efficient inference. Second, a fast and more efficient training algorithm for CL is proposed by introducing Gradient Projection Memory (GPM). Here, the most important gradient spaces (GPM) for each task are computed using Singular Value Decomposition (SVD), and the new tasks are learned in the orthogonal direction to GPM to minimize forgetting. Third, to improve new learning while minimizing forgetting, a Scaled Gradient Projection (SGP) method is proposed that, in addition to orthogonal gradient updates, allows scaled updates along the important gradient spaces of the past tasks. Next, for continual learning on an online stream of tasks, a memory-efficient experience replay method is proposed. This method utilizes saliency maps explaining the network’s decisions to select memories that are replayed during new tasks to prevent forgetting. Finally, a meta-learning based continual learner, Amphibian, is proposed that achieves fast online continual learning without any experience replay. All the algorithms are evaluated on short and long sequences of tasks from standard image-classification datasets. Overall, the methods proposed in this thesis address critical limitations of DNNs for continual learning and advance the state-of-the-art in this domain. Computer vision Continual learning deep neural networks (DNNs) Machine Learning Optimization
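Note on entry 25: the orthogonal update described for GPM can be illustrated with a minimal sketch. It assumes the stored basis vectors are already orthonormal (the abstract obtains them by SVD; here they are written by hand), and the names `project_out`, `basis`, and `grad` are hypothetical, not taken from the thesis.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_out(grad: Vector, basis: List[Vector]) -> Vector:
    """Remove from grad its components along each basis vector.

    Assumes the basis vectors are orthonormal, so each component can be
    subtracted independently of the others.
    """
    result = list(grad)
    for b in basis:
        coeff = dot(grad, b)
        result = [r - coeff * bi for r, bi in zip(result, b)]
    return result


# Hypothetical "important" directions kept from earlier tasks.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Hypothetical gradient computed on a new task.
grad = [0.5, -2.0, 3.0]

projected = project_out(grad, basis)
print(projected)
```

Stepping along `projected` instead of `grad` leaves the stored directions untouched, which is the intuition behind learning new tasks orthogonally to the memory.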
26	INVESTIGATING DATA ACQUISITION TO IMPROVE FAIRNESS OF MACHINE LEARNING MODELS Ekta (18406989) 23 April 2024 (has links) Machine learning (ML) algorithms are increasingly being used in a variety of applications and are heavily relied upon to make decisions that impact people’s lives. ML models are often praised for their precision, yet they can discriminate against certain groups due to biased data. These biases, rooted in historical inequities, pose significant challenges in developing fair and unbiased models. Central to addressing this issue is the mitigation of biases inherent in the training data, as their presence can yield unfair and unjust outcomes when models are deployed in real-world scenarios. This study investigates the efficacy of data acquisition, i.e., one of the stages of data preparation, akin to the pre-processing bias mitigation technique. Through experimental evaluation, we showcase the effectiveness of data acquisition, where the data is acquired using data valuation techniques to enhance the fairness of machine learning models. Algorithmic Fairness Bias influence functions Data Acquisition Fairness Machine Learning Models Bias mitigation German credit data Adult census dataset COMPAS dataset
27	A Study on the Use of Unsupervised, Supervised, and Semi-supervised Modeling for Jamming Detection and Classification in Unmanned Aerial Vehicles Margaux Camille Marie Catafort--Silva (18477354) 02 May 2024 (has links) In this work, first, unsupervised machine learning is proposed as a study for detecting and classifying jamming attacks targeting unmanned aerial vehicles (UAV) operating at a 2.4 GHz band. Three scenarios are developed with a dataset of samples extracted from meticulous experimental routines using various unsupervised learning algorithms, namely K-means, density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering (AGG) and Gaussian mixture model (GMM). These routines characterize attack scenarios entailing barrage (BA), single-tone (ST), successive-pulse (SP), and protocol-aware (PA) jamming in three different settings. In the first setting, all extracted features from the original dataset are used (i.e., nine in total). In the second setting, Spearman correlation is implemented to reduce the number of these features. In the third setting, principal component analysis (PCA) is utilized to reduce the dimensionality of the dataset to minimize complexity. The metrics used to compare the algorithms are homogeneity, completeness, V-measure, adjusted mutual information (AMI) and adjusted Rand index (ARI). The optimum model scored 1.00, 0.949, 0.791, 0.722, and 0.791, respectively, allowing the detection and classification of these four jamming types with an acceptable degree of confidence. Second, following a different study, supervised learning (i.e., random forest modeling) is developed to achieve a binary classification to ensure accurate clustering of samples into two distinct classes: clean and jamming. Following this supervised-based classification, two-class and three-class unsupervised learning is implemented considering three of the four jamming types: BA, ST, and SP. In this initial step, the four aforementioned algorithms are used. This newly developed study is intended to facilitate the visualization of the performance of each algorithm; for example, AGG achieves a homogeneity of 1.0, a completeness of 0.950, a V-measure of 0.713, an ARI of 0.557 and an AMI of 0.713, and GMM generates 1, 0.771, 0.645, 0.536 and 0.644, respectively. Lastly, to improve the classification of this study, semi-supervised learning is adopted instead of unsupervised learning considering the same algorithms and dataset. In this case, GMM achieves results of 1, 0.688, 0.688, 0.786 and 0.688 whereas DBSCAN achieves 0, 0.036, 0.028, 0.018, 0.028 for homogeneity, completeness, V-measure, ARI and AMI respectively. Overall, this unsupervised learning is approached as a method for jamming classification, addressing the challenge of identifying newly introduced samples. Unmanned aerial Vehicle Machine Learning unsupervised learning algorithm semi-supervised learning Supervised learning Classification Jamming detection
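Note on entry 27: three of the clustering metrics named in the abstract (homogeneity, completeness, V-measure) can be computed from label entropies. The sketch below uses the standard entropy-based definitions with the V-measure as the harmonic mean of the other two. The function names and the toy labels are hypothetical, and nothing here reproduces the thesis data or its reported scores.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple

Labels = Sequence[Hashable]


def entropy(labels: Labels) -> float:
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target: Labels, given: Labels) -> float:
    """Conditional entropy H(target | given), in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def homogeneity_completeness_v(
    classes: Labels, clusters: Labels
) -> Tuple[float, float, float]:
    """Return (homogeneity, completeness, V-measure) for a clustering."""
    h_classes = entropy(classes)
    h_clusters = entropy(clusters)
    # Homogeneity: each cluster contains members of a single class.
    homogeneity = (
        1.0 if h_classes == 0
        else 1.0 - conditional_entropy(classes, clusters) / h_classes
    )
    # Completeness: all members of a class fall in the same cluster.
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(clusters, classes) / h_clusters
    )
    total = homogeneity + completeness
    v_measure = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v_measure


# Toy labels: two true classes, each split across two pure clusters.
true_classes = [0, 0, 1, 1]
found_clusters = [0, 1, 2, 3]
h, c, v = homogeneity_completeness_v(true_classes, found_clusters)
print(h, c, v)
```

In the toy case every cluster is pure, so homogeneity is perfect, while completeness drops because each class is spread over two clusters. This is the same qualitative pattern as a homogeneity of 1.0 paired with a lower completeness.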
28	Machine Learning with Hard Constraints: Physics-Constrained Constitutive Models with Neural ODEs and Diffusion Vahidullah Tac (19138804) 15 July 2024 (has links) Our current constitutive models of material behavior fall short of being able to describe the mechanics of soft tissues. This is because soft tissues like skin and rubber, unlike traditional engineering materials, exhibit extremely nonlinear mechanical behavior and usually undergo large deformations. Developing accurate constitutive models for such materials requires using flexible tools at the forefront of science, such as machine learning methods. However, our past experiences show that it is crucial to incorporate physical knowledge in models of physical phenomena. The past few years have witnessed the rise of physics-informed models where the goal is to impose governing physical laws by incorporating them in the loss function. However, we argue that such "soft" constraints are not enough. This "persuasion" method has no theoretical guarantees on the satisfaction of physics and results in overly complicated loss functions that make training of the models cumbersome. We propose imposing the relevant physical laws as "hard" constraints. In this approach the physics of the problem are "baked into" the structure of the model, preventing it from ever violating them. We demonstrate the power of this paradigm on a number of constitutive models of soft tissue, including hyperelasticity, viscoelasticity and continuum damage models. We also argue that new uncertainty quantification strategies have to be developed to address the rise in dimensionality and the inherent symmetries present in most machine learning models compared to traditional constitutive models. We demonstrate that diffusion models can be used to construct a generative framework for physics-constrained hyperelastic constitutive models. Solid mechanics Biomechanics Neural networks Machine Learning Neural ODEs Constitutive material models Solid Mechanics Continuum mechanics
29	Multi-Agent-Based Collaborative Machine Learning in Distributed Resource Environments Ahmad Esmaeili (19153444) 18 July 2024 (has links) This dissertation presents decentralized and agent-based solutions for organizing machine learning resources, such as datasets and learning models. It aims to democratize the analysis of these resources through a simple yet flexible query structure, automate common ML tasks such as training, testing, model selection, and hyperparameter tuning, and enable privacy-centric building of ML models over distributed datasets. Based on networked multi-agent systems, the proposed approach represents ML resources as autonomous and self-reliant entities. This representation makes the resources easily movable, scalable, and independent of geographical locations, alleviating the need for centralized control and management units. Additionally, as all machine learning and data mining tasks are conducted near their resources, providers can apply customized rules independently of other parts of the system. Autonomous agents and multiagent systems Distributed systems and algorithms Multi-Agent Systems Machine Learning Distributed Machine Learning
30	DEEP ECG MINING FOR ARRHYTHMIA DETECTION TOWARDS PRECISION CARDIAC MEDICINE Shree Patnaik (18831547) 03 September 2024 (has links) Cardiac disease is one of the prominent causes of death worldwide. The timely detection of arrhythmias, one of the highly prevalent cardiac abnormalities, is very important and promising for treatment. Electrocardiography (ECG) is widely used to probe cardiac dynamics; nevertheless, it is still challenging to robustly detect arrhythmia with automatic algorithms, especially when noise contaminates the signal to some extent. In this research study, we have not only built and assessed different neural network models to understand their capability in terms of ECG-based arrhythmia detection, but also comprehensively investigated the detection under different signal-to-noise ratio (SNR) levels. Both a Long Short-Term Memory (LSTM) model and a Multi-Layer Perceptron (MLP) model have been developed in the study. Further, we have studied the necessity of fine-tuning the neural network models, which are pre-trained on other data, and demonstrated that it is very important for boosting performance when the ECG is contaminated by noise. In the experiments, the LSTM model achieves an accuracy of 99.0%, an F1 score of 97.9%, and high precision and recall with the clean ECG signal. Further, in the high-SNR scenario, the LSTM maintains strong performance. In the low-SNR scenario, though there is some performance drop, the fine-tuning approach critically helps improve performance. Overall, this study has built the neural network models and investigated different kinds of signal fidelity, including clean, high-SNR, and low-SNR, towards robust arrhythmia detection. ECG signal processing lstm model ensured neural networks labeled MLP network arrhythmia classification Machine learning applied to healthcare MIT-BIH Arrhythmia database
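Note on entry 30: the abstract evaluates detection at different SNR levels but does not say how the noisy signals were produced. The sketch below shows one common way to contaminate a clean signal at a chosen SNR, assuming additive white Gaussian noise. That noise model, the function names, and the sine wave standing in for an ECG trace are all assumptions for illustration.

```python
import math
import random
from typing import List


def mean_power(x: List[float]) -> float:
    """Average power (mean of squared samples)."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(
    signal: List[float], snr_db: float, rng: random.Random
) -> List[float]:
    """Add zero-mean Gaussian noise scaled to a target SNR in decibels."""
    # SNR_dB = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = mean_power(signal) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean: List[float], noisy: List[float]) -> float:
    """Empirical SNR in decibels, given the clean reference signal."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(noise))


rng = random.Random(0)  # fixed seed so the run is repeatable
# A sine wave stands in for a clean recording (assumption, not ECG data).
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(20000)]
noisy = add_noise_at_snr(clean, snr_db=10.0, rng=rng)
print(round(measured_snr_db(clean, noisy), 1))
```

Lower `snr_db` values give a larger noise variance, which corresponds to the low-SNR setting the abstract describes as the harder case.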

Search results