Global ETD Search

31	Multi-Agent-Based Collaborative Machine Learning in Distributed Resource Environments Ahmad Esmaeili (19153444) 18 July 2024
This dissertation presents decentralized and agent-based solutions for organizing machine learning resources, such as datasets and learning models. It aims to democratize the analysis of these resources through a simple yet flexible query structure, automate common ML tasks such as training, testing, model selection, and hyperparameter tuning, and enable privacy-centric building of ML models over distributed datasets. Based on networked multi-agent systems, the proposed approach represents ML resources as autonomous and self-reliant entities. This representation makes the resources easily movable, scalable, and independent of geographical locations, alleviating the need for centralized control and management units. Additionally, as all machine learning and data mining tasks are conducted near their resources, providers can apply customized rules independently of other parts of the system.
Autonomous agents and multiagent systems Distributed systems and algorithms Multi-Agent Systems Machine Learning Distributed Machine Learning
32	DEEP ECG MINING FOR ARRHYTHMIA DETECTION TOWARDS PRECISION CARDIAC MEDICINE Shree Patnaik (18831547) 03 September 2024
Cardiac disease is one of the leading causes of death worldwide. The timely detection of arrhythmias, one of the most prevalent cardiac abnormalities, is very important and promising for treatment. Electrocardiography (ECG) is widely used to probe cardiac dynamics; nevertheless, it remains challenging to robustly detect arrhythmia with automatic algorithms, especially when noise contaminates the signal. In this research study, we have not only built and assessed different neural network models to understand their capability in ECG-based arrhythmia detection, but also comprehensively investigated detection under different signal-to-noise ratio (SNR) conditions. Both a Long Short-Term Memory (LSTM) model and a Multi-Layer Perceptron (MLP) model have been developed in the study. Further, we have studied the necessity of fine-tuning neural network models that are pre-trained on other data, and demonstrated that fine-tuning is very important for boosting performance when the ECG is contaminated by noise. In the experiments, the LSTM model achieves an accuracy of 99.0%, an F1 score of 97.9%, and high precision and recall with the clean ECG signal. Further, in the high-SNR scenario, the LSTM maintains an attractive performance. In the low-SNR scenario, though there is some performance drop, the fine-tuning approach improves performance critically. Overall, this study has built the neural network models and investigated different kinds of signal fidelity, including clean, high-SNR, and low-SNR, towards robust arrhythmia detection.
ECG signal processing lstm model ensured neural networks labeled MLP network arrhythmia classification Machine learning applied to healthcare MIT-BIH Arrhythmia database
33	Resource-Aware Decentralized Federated Learning over Heterogeneous Networks Shahryar Zehtabi (19833777) 20 November 2024
A recent emphasis of distributed learning research has been on federated learning (FL), in which model training is conducted by the data-collecting devices. In traditional FL algorithms, trained models at the edge are periodically sent to a central server for aggregation, utilizing a star topology as the underlying communication graph. However, assuming access to a central coordinator is not always practical, e.g., in ad hoc wireless network settings, motivating efforts to fully decentralize FL. Consequently, decentralized federated learning (DFL) captures FL settings where both (i) model updates and (ii) model aggregations are exclusively carried out by the clients without a central server. Inherent challenges due to the distributed nature of FL training, i.e., data heterogeneity and resource heterogeneity, become even more prevalent in DFL since it lacks a central server as a coordinator. In this thesis, we present two algorithms for resource-aware DFL, which achieve an overall desired performance across the clients in a shorter amount of time than existing conventional DFL algorithms, which do not factor in the resource availability of clients.
In the first project, we propose EF-HC, a novel methodology for distributed model aggregations via asynchronous, event-triggered consensus iterations over the network graph topology. We consider personalized/heterogeneous communication event thresholds at each device that weigh the change in local model parameters against the available local resources in deciding whether an aggregation would be beneficial enough to incur a communication delay on the system.
In the second project, we propose Decentralized Sporadic Federated Learning (DSpodFL), a DFL methodology built on a generalized notion of sporadicity in both local gradient and aggregation processes. DSpodFL subsumes many existing decentralized optimization methods under a unified algorithmic framework by modeling the per-iteration (i) occurrence of gradient descent at each client and (ii) exchange of models between client pairs as arbitrary indicator random variables, thus capturing heterogeneous and time-varying computation/communication scenarios. We analytically characterize the convergence behavior of both algorithms for strongly convex models using both a constant and a diminishing learning rate, under mild assumptions on the communication graph connectivity, data heterogeneity across clients, and gradient noises. In DSpodFL, we do the same for non-convex models as well. Our numerical experiments demonstrate that both EF-HC and DSpodFL consistently achieve improved training speeds compared with baselines under various system settings.
Distributed systems and algorithms Optimisation Decentralized Federated Learning Federated Learning Distributed Optimization Sporadic Sporadicity Event-Triggered
34	ESTIMATING MODEL FAIRNESS USING DATA CHARACTERISTICS Kevin Varghese Chittilapilly (20234277) 17 November 2024
The pursuit of fairness in machine learning (ML) systems is a critical challenge in today’s world, which relies heavily on AI systems. However, computing and mitigating bias requires substantial computational resources and time when evaluating across entire datasets. This research introduces an innovative approach to estimate fairness in ML systems by leveraging data characteristics and constructing a metafeatures dataframe. Our methodology enables the prediction of fairness with significantly reduced computational cost and shorter analysis times. Furthermore, our approach is scalable to different distributions and requires minimal training to deal with out-of-sample data. This approach not only enhances the efficiency of fairness assessments in ML systems but also provides a scalable framework for future fairness evaluation methodologies. Our findings suggest that using data characteristics to estimate fairness is not only feasible but also effective, offering a promising avenue for developing more equitable ML systems with reduced resource consumption.
Responsible AI Machine Learning Analysis Data Mining Fairness Bias Meta feature learning
35	Interpretable Frameworks for User-Centered Structured AI Models Ying-Chun Lin (20371953) 17 December 2024
User-centered structured models are designed to enhance user experience and assist individuals in making more informed decisions. In these models, user behavior is typically represented through graphs that illustrate the relationships between users (graph models) or through conversations that depict user interactions with AI systems (language models). However, even with their success, these complex models often remain opaque. As a result, there has been a growing focus on the rapid development of interpretable machine learning. Interpretable machine learning takes insights from a model and translates complex model concepts, e.g., node or sentence representations, or decisions, e.g., predicted labels, into concepts that are understandable to humans. The goal of this thesis is to enhance the interpretability of user-centered structured AI models (graph and language models) through the provision of interpretations and explanations, while simultaneously enhancing the performance of these models on downstream tasks.
In the field of graphs, nodes represent real-world entities, and their relationships are depicted by edges. Graph models usually produce node representations, which are meaningful, low-dimensional vectors that encapsulate the characteristics of the nodes. However, existing research on the interpretation of node representations is limited and lacks empirical validation, raising concerns about the reliability of interpretation methods. To solve this problem, we first introduce a novel evaluation method, IME Process, to assess interpretation methods. Subsequently, we propose Node Coherence Rate for Representation Interpretation (NCI), which provides more accurate interpretation results than previous interpretation methods.
After understanding the information captured in node representations, we further introduce Task-Aware Contrastive Learning (Task-aware CL), which aims to enhance downstream task performance for graph models by maximizing the mutual information between the downstream task and node representations with a contrastive learning process. Our experimental results demonstrate that Task-aware CL significantly enhances performance across downstream tasks.
In the context of conversations, user satisfaction estimation (USE) for conversational systems is crucial for ensuring the reliability and safety of the language models involved. We further emphasize that USE should be interpretable to guide continuous improvement during development of these models. Therefore, we propose Supervised Prompting for User satisfaction Rubrics (SPUR) to learn the reasons behind user satisfaction and dissatisfaction with an AI agent and to estimate user satisfaction with Large Language Models. In our experimental results, we demonstrate that SPUR not only offers enhanced interpretability by learning rubrics to understand user satisfaction/dissatisfaction with supervised signals, but also exhibits superior accuracy via domain-specific in-context learning.
graph neural network (GNNs) model interpretability method Large Language Models (LLMs)
36	Efficient Continual Learning in Deep Neural Networks Gobinda Saha (18512919) 07 May 2024
Humans exhibit a remarkable ability to adapt continually and learn new tasks throughout their lifetime while maintaining the knowledge gained from past experiences. In stark contrast, artificial neural networks (ANNs) under such a continual learning (CL) paradigm forget the information learned in past tasks upon learning new ones. This phenomenon is known as ‘Catastrophic Forgetting’ or ‘Catastrophic Interference’. The objective of this thesis is to enable efficient continual learning in deep neural networks while mitigating this forgetting phenomenon. Towards this, first, a continual learning algorithm (SPACE) is proposed where a subset of network filters or neurons is allocated for each task using Principal Component Analysis (PCA). Such task-specific network isolation not only ensures zero forgetting but also creates structured sparsity in the network, which enables energy-efficient inference. Second, a faster and more efficient training algorithm for CL is proposed by introducing Gradient Projection Memory (GPM). Here, the most important gradient spaces (GPM) for each task are computed using Singular Value Decomposition (SVD), and new tasks are learned in the direction orthogonal to GPM to minimize forgetting. Third, to improve new learning while minimizing forgetting, a Scaled Gradient Projection (SGP) method is proposed that, in addition to orthogonal gradient updates, allows scaled updates along the important gradient spaces of past tasks. Next, for continual learning on an online stream of tasks, a memory-efficient experience replay method is proposed. This method uses saliency maps that explain the network’s decisions to select the memories replayed during new tasks to prevent forgetting.
Finally, a meta-learning based continual learner - Amphibian - is proposed that achieves fast online continual learning without any experience replay. All the algorithms are evaluated on short and long sequences of tasks from standard image-classification datasets. Overall, the methods proposed in this thesis address critical limitations of DNNs for continual learning and advance the state-of-the-art in this domain.
Computer vision Continual learning deep neural networks (DNNs) Machine Learning Optimization
37	INVESTIGATING DATA ACQUISITION TO IMPROVE FAIRNESS OF MACHINE LEARNING MODELS Ekta (18406989) 23 April 2024
Machine learning (ML) algorithms are increasingly being used in a variety of applications and are heavily relied upon to make decisions that impact people’s lives. ML models are often praised for their precision, yet they can discriminate against certain groups due to biased data. These biases, rooted in historical inequities, pose significant challenges in developing fair and unbiased models. Central to addressing this issue is the mitigation of biases inherent in the training data, as their presence can yield unfair and unjust outcomes when models are deployed in real-world scenarios. This study investigates the efficacy of data acquisition, i.e., one of the stages of data preparation, akin to the pre-processing bias mitigation technique. Through experimental evaluation, we showcase the effectiveness of data acquisition, where the data is acquired using data valuation techniques to enhance the fairness of machine learning models.
Algorithmic Fairness Bias influence functions Data Acquisition Fairness Machine Learning Models Bias mitigation German credit data Adult census dataset COMPAS dataset
38	A Study on the Use of Unsupervised, Supervised, and Semi-supervised Modeling for Jamming Detection and Classification in Unmanned Aerial Vehicles Margaux Camille Marie Catafort--Silva (18477354) 02 May 2024
In this work, first, unsupervised machine learning is studied for detecting and classifying jamming attacks targeting unmanned aerial vehicles (UAVs) operating in the 2.4 GHz band. Three scenarios are developed with a dataset of samples extracted from meticulous experimental routines, using various unsupervised learning algorithms, namely K-means, density-based spatial clustering of applications with noise (DBSCAN), agglomerative clustering (AGG), and Gaussian mixture model (GMM). These routines characterize attack scenarios entailing barrage (BA), single-tone (ST), successive-pulse (SP), and protocol-aware (PA) jamming in three different settings. In the first setting, all extracted features from the original dataset are used (i.e., nine in total). In the second setting, Spearman correlation is used to reduce the number of these features. In the third setting, principal component analysis (PCA) is used to reduce the dimensionality of the dataset and minimize complexity. The metrics used to compare the algorithms are homogeneity, completeness, V-measure, adjusted mutual information (AMI), and adjusted Rand index (ARI). The optimum model scored 1.00, 0.949, 0.791, 0.722, and 0.791, respectively, allowing the detection and classification of these four jamming types with an acceptable degree of confidence.
Second, in a separate study, supervised learning (i.e., random forest modeling) is developed to perform a binary classification that separates samples into two distinct classes: clean and jamming. Following this supervised classification, two-class and three-class unsupervised learning is implemented considering three of the four jamming types: BA, ST, and SP.
In this initial step, the four aforementioned algorithms are used. This newly developed study is intended to facilitate the visualization of the performance of each algorithm. For example, AGG achieves a homogeneity of 1.0, a completeness of 0.950, a V-measure of 0.713, an ARI of 0.557, and an AMI of 0.713, and GMM generates 1, 0.771, 0.645, 0.536, and 0.644, respectively. Lastly, to improve the classification in this study, semi-supervised learning is adopted instead of unsupervised learning, considering the same algorithms and dataset. In this case, GMM achieves results of 1, 0.688, 0.688, 0.786, and 0.688, whereas DBSCAN achieves 0, 0.036, 0.028, 0.018, and 0.028 for homogeneity, completeness, V-measure, ARI, and AMI, respectively. Overall, this unsupervised learning is approached as a method for jamming classification, addressing the challenge of identifying newly introduced samples.
Unmanned Aerial Vehicle Machine Learning unsupervised learning algorithm semi-supervised learning Supervised learning Classification Jamming detection
39	Facility Assessment of Indoor Air Quality Using Machine Learning Jared A Wright (18387855) 03 June 2024
The goal of this thesis is to develop a method of evaluating the long-term IAQ performance of an industrial facility and to use machine learning to model the relationship between critical air pollutants and the facility’s HVAC systems and processes. The facility under study is an electroplating manufacturer. The air pollutants studied at this facility were particulate matter, total volatile organic compounds, and carbon dioxide. Upon sensor installation, seven “zones” were identified to isolate areas of the plant for measurement and analysis. A statistical review of the long-term data highlighted how this facility performed in terms of compliance. Its gaseous pollutants were well within regulation. Particulate matter, however, was found to be a pressing issue. PM10 was outside of compliance more than 15% of the time in five of the seven zones of study. Some zones were out of compliance for up to 80% of the total collection period. The six pollutants that met these criteria were deemed critical and moved on to machine learning modeling. Our model of best fit for each pollutant was a Gaussian process regression model, which fits non-linear, right-skewed datasets best. The performance of each of our models was deemed significant. Every model had a regression coefficient of at least 0.935 for both validation and testing. The maximum average error was 12.64 µg/m³, which is less than 10% of the average PM10 concentration. Through our modeling, we were able to study how HVAC and production played a role in particulate matter presence in each zone. Exhaust systems on the west side of the plant were found to be insufficient at removing particulates from the facility.
Overall, the methods developed in this thesis project were able to meet the goal of analyzing IAQ compliance, modeling critical pollutants using machine learning, and identifying a relationship between these pollutants and an industrial facility’s HVAC and production systems.
Air pollution modelling and control indoor air pollutant machine learning
40	PHYSICS INSPIRED AI-DRIVEN PHOTONIC INVERSE DESIGN FOR HIGH-PERFORMANCE PHOTONIC DEVICES Omer Yesilurt (19435210) 19 August 2024
This thesis presents novel methods that integrate AI-driven and physics-inspired methodologies into photonic inverse design, setting new benchmarks for high-performance photonic devices in different branches of photonics. By blending advanced computational techniques with the foundational principles of electromagnetism, this research tackles key challenges in optimizing device efficiency, robustness, and functionality. The aim is to propel photonic technology beyond its current capabilities, offering transformative solutions for a range of novel applications.
The first major contribution focuses on adjoint-based topology optimization for on-chip single-photon coupling. We developed an adjoint topology optimization scheme to design high-efficiency couplers between photonic waveguides and single-photon sources (SPSs) in hexagonal boron nitride (hBN). This algorithm addresses fabrication constraints and SPS location uncertainties, achieving a remarkable average coupling efficiency of 78%. A library of designs is generated for different positions of the hBN flake containing an SPS relative to a silicon nitride (SiN) waveguide. These designs are then analyzed using dimensionality reduction techniques to investigate the relationship between device geometry and performance, infusing the design process with deep physical intuition and insight.
The second key advancement is presented through a neural network-based inverse design framework specifically developed for optimizing single-material, variable-index multilayer films. This neural network-driven technique, supported by a differentiable analytical solver, enables the realistic design and fabrication of these multilayer films, achieving high performance under ideal conditions.
The approach also addresses the challenge of bridging the gap between these ideal designs and practical devices, which are subject to growth-related imperfections. By incorporating simulated systematic and random errors, reflecting actual deposition challenges, into the optimization process, we demonstrate that the neural network, initially trained to produce the ideal device, can be reconfigured to create designs that compensate for systematic deposition errors. This method remains effective even when random fabrication inconsistencies are present. The results provide a practical and experimentally viable strategy for developing single-material multilayer film stacks, ensuring reliable performance across a wide range of real-world applications.
The final cornerstone of this research investigates the two-stage inverse design of superchiral dielectric metasurfaces. We propose a two-stage inverse design scheme for dielectric lossless metasurfaces with central superchiral hot spots. By leveraging the excitation of high-quality factor modes with low mode volumes, we achieve up to 19,000-fold enhancements of optical chirality. This method extends the local density of field enhancements for non-chiral fields into the chiral regime and significantly surpasses previous enhancements in superchiral field generation. Our results open new avenues in chiral spectroscopy and chiral quantum photonics, exemplifying the powerful synergy of AI techniques and physics-based design principles in creating highly innovative and functional photonic structures.
Collectively, the methodologies developed in this thesis signify a major advancement in the field of photonic inverse design.
By merging AI-driven techniques with rigorous physics-based optimization frameworks, this research paves the way for the next generation of photonic devices.
Engineering electromagnetics Nanophotonics inverse design methodology photonics Photonic Integrated Circuit (PIC) metasurface deep learning
