<b>A Study of Multimodal AI and Human Feedback Loops in STEM Learning Environments</b> Karen Michelle D'Souza (20284593), 19 November 2024
<p dir="ltr">This study explores the integration of artificial intelligence (AI) systems into a collaborative pedagogical tool, cyber peer-led team learning (cPLTL). The use case is an undergraduate chemistry course that combines an in-person component led by the educator with an online cPLTL workshop format. The proposed solutions integrate multimodal AI, active learning, and large language models.</p>
<b>On Online Unsupervised Domain Adaptation</b> Jihoon Moon (17121610), 10 October 2023
<p dir="ltr">Recent advances in Artificial Intelligence (AI) have been markedly accelerated by the convergence of progress in Machine Learning (ML) and the exponential growth in computational power. Within this landscape, Domain Adaptation (DA) concerns the transfer of knowledge across domains with disparate data distributions. This thesis addresses Online Unsupervised Domain Adaptation (OUDA), where an unlabeled data stream arrives from the target domain incrementally and gradually diverges from the source domain. The thesis presents two complementary approaches -- a manifold-based approach and a time-domain-based approach -- to tackle the OUDA problem.</p><p dir="ltr">The manifold-based approach performs domain alignment incrementally, computing transformation matrices based on the projection of both source and target data onto the Grassmann manifold. This projection aligns the two domains by incrementally minimizing their dissimilarity, effectively ameliorating the divergence between the source and target data. The approach capitalizes on the cumulative temporal information within the data stream through the Incremental Computation of Mean-Subspace (ICMS) technique, which efficiently computes the average of the target subspaces on the Grassmann manifold and thereby captures the evolving dynamics of the data distribution. The alignment is further strengthened by integrating the flow of target subspaces on the manifold as the data stream unfolds over time, yielding robust and adaptive transformation matrices. In addition, the efficient computation of the mean-subspace, closely aligned with the Karcher mean, attests to the computational feasibility of the manifold-based approach, thus enabling real-time feedback computations for the OUDA problem.</p><p dir="ltr">The time-domain-based approach uses cluster-wise information and its flow across time-steps to accurately predict labels for incoming target data, consistently propagate class labels to future incoming target data, and efficiently use the predicted labels together with the source data to incrementally update the learning model. This effectively transforms the OUDA problem into a supervised-learning scenario. We leverage a neural-network-based model to align target features, cluster them class-wise, and extend them linearly from the origin of the latent space as the time-step progresses. This alignment enables accurate predictions and target-label propagation based on the trajectories of the target features. Label propagation is achieved through the novel Flow-based Hierarchical Optimal Transport (FHOT) method, which considers element-wise, cluster-wise, and distribution-wise correspondences of adjacent target features. The learning model is continuously updated with incoming target data and their predicted labels.</p><p dir="ltr">To comprehensively assess the impact and contributions of these two approaches to the OUDA problem, we conducted extensive experiments across diverse datasets. Our analysis covered each stage of the manifold-based approach, comparing its performance with prior methods in terms of classification accuracy and computational efficiency. The time-domain-based approach was validated through linear feature alignment in the latent space, resulting in accurate label predictions. Notably, the flow-based hierarchical optimal transport technique substantially enhanced classification accuracy, particularly with increasing time-steps, and learning-model updates using target data and predicted labels further improved classification accuracy.</p>
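The incremental mean-subspace idea can be sketched in a few lines of linear algebra. The snippet below is an illustrative simplification rather than the thesis's ICMS algorithm: it averages projection matrices and re-orthonormalizes as a crude stand-in for the Karcher mean of subspaces, and all function names are ours.

```python
import numpy as np

def pca_basis(X, d):
    # orthonormal basis (p x d) of the top-d principal subspace of X (n x p)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

def update_mean_basis(B_mean, B_new, t):
    # running average of projection matrices, re-orthonormalized --
    # a crude stand-in for the Karcher mean of subspaces on the Grassmannian
    P = ((t - 1) * (B_mean @ B_mean.T) + B_new @ B_new.T) / t
    _, V = np.linalg.eigh(P)
    return V[:, -B_mean.shape[1]:]          # top-d eigenvectors of the average

def align_source(B_src, B_mean):
    # subspace alignment: move the source basis toward the mean target subspace
    return B_src @ (B_src.T @ B_mean)

rng = np.random.default_rng(0)
B_mean = pca_basis(rng.normal(size=(100, 10)), d=3)   # first target batch
for t in range(2, 6):                                  # streaming target batches
    B_t = pca_basis(rng.normal(size=(100, 10)), d=3)
    B_mean = update_mean_basis(B_mean, B_t, t)

B_src = pca_basis(rng.normal(size=(100, 10)), d=3)
B_src_aligned = align_source(B_src, B_mean)
```

Because each update touches only d x d and p x p matrices, the running mean subspace can be refreshed at every time-step, which is the property that makes real-time alignment feasible.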
<b>Sparse Deep Learning and Stochastic Neural Network</b> Yan Sun (12425889), 13 May 2022
<p>Deep learning has achieved state-of-the-art performance on many machine learning tasks, but deep neural network (DNN) models still suffer from a few issues. Over-parameterized neural networks generally have a better optimization landscape, but they are computationally expensive, hard to interpret, and usually cannot correctly quantify prediction uncertainty. On the other hand, a small DNN model can fall into local traps and be hard to optimize. In this dissertation, we tackle these issues from two directions: sparse deep learning and stochastic neural networks. </p>
<p>For sparse deep learning, we propose a Bayesian neural network (BNN) model with a mixture-of-normals prior. Theoretically, we establish posterior consistency and structure selection consistency, which ensure that the sparse DNN model can be consistently identified. We also demonstrate asymptotic normality of the prediction, which ensures that prediction uncertainty is correctly quantified. Computationally, we propose a prior annealing approach to optimize the posterior of the BNN. The proposed methods have computational complexity similar to standard stochastic gradient descent for training DNNs. Experimental results show that our model performs well on high-dimensional variable selection as well as neural network pruning.</p>
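For intuition, a mixture-of-normals prior over network weights combines a narrow "spike" component that shrinks most weights toward zero with a wide "slab" component that accommodates the few active weights. The sketch below evaluates the log density of such a prior; the hyperparameter values are illustrative, not those used in the dissertation.

```python
import numpy as np

def mixture_normal_logprior(w, lam=1e-4, sigma0=1e-3, sigma1=1.0):
    # log density of w_i ~ (1 - lam) N(0, sigma0^2) + lam N(0, sigma1^2),
    # summed over all weights; sigma0 << sigma1 gives a spike-and-slab shape
    def log_normal(x, s):
        return -0.5 * (x / s) ** 2 - np.log(s) - 0.5 * np.log(2 * np.pi)
    spike = np.log1p(-lam) + log_normal(w, sigma0)
    slab = np.log(lam) + log_normal(w, sigma1)
    return np.logaddexp(spike, slab).sum()

# near-zero weights are far more probable under this prior than large ones,
# which is what drives networks toward sparse structures a posteriori
tiny, large = np.full(5, 1e-4), np.full(5, 0.5)
```

Prior annealing, as described above, would then gradually sharpen such a prior (e.g., shrinking the spike width) over the course of training.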
<p>For stochastic neural networks, we propose the Kernel-Expanded Stochastic Neural Network (K-StoNet) model. We reformulate the DNN as a latent variable model and incorporate support vector regression (SVR) as the first hidden layer. The latent variable formulation breaks training into a series of convex optimization problems, and the model can be easily trained using the imputation-regularized optimization (IRO) algorithm. We provide theoretical guarantees for the convergence of the algorithm and for prediction uncertainty quantification. Experimental results show that the proposed model achieves good prediction performance and provides correct confidence regions for its predictions. </p>
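The key structural idea, a kernel-expanded first layer followed by convex subproblems, can be illustrated compactly. This sketch is not K-StoNet: it substitutes kernel ridge regression for SVR, uses a single convex fit rather than the full latent-variable IRO loop, and all names and hyperparameters are ours.

```python
import numpy as np

def rbf_features(X, centers, gamma=0.1):
    # "kernel expansion" first layer: RBF features against fixed centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_ridge(Phi, y, lam=1e-3):
    # each subproblem is convex, with a closed-form ridge solution
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
centers = X[rng.choice(200, size=20, replace=False)]
Phi = rbf_features(X, centers)      # kernel-expanded hidden layer
w = fit_ridge(Phi, y)               # convex output-layer fit
mse = np.mean((Phi @ w - y) ** 2)
```

In the actual model, the latent layer outputs would be imputed and each layer re-fit in alternation; the point here is only that every per-layer fit is a convex problem.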
<b>ANOMALY DETECTION USING MACHINE LEARNING FOR INTRUSION DETECTION</b> Vaishnavi Rudraraju (18431880), 02 May 2024
<p dir="ltr">This thesis examines machine learning approaches for anomaly detection in network security, focusing on intrusion detection over TCP and UDP traffic. It uses logistic regression models to distinguish between normal and abnormal network actions, demonstrating a strong ability to detect potential security threats. The study uses the UNSW-NB15 dataset for model validation, allowing a thorough evaluation of the models' capacity to detect anomalies in real-world network scenarios. The UNSW-NB15 dataset is a comprehensive network attack dataset frequently used to evaluate intrusion detection systems and anomaly detection algorithms because of its realistic attack scenarios and varied network activities.</p><p dir="ltr">Further investigation is carried out with a multi-task neural network built for both binary and multi-class classification tasks. This approach allows in-depth study of network data, making it easier to identify potential threats. The model is fine-tuned over successive training epochs, with validation metrics monitored to ensure generalizability. The thesis also applies early stopping, which optimizes the training process, reduces the risk of overfitting, and improves the model's performance on new, unseen data.</p><p dir="ltr">This thesis also uses blockchain technology to track model performance indicators, a novel strategy that improves data integrity and reliability. This blockchain-based logging system keeps an immutable record of the models' performance over time, which helps build a transparent and verifiable anomaly detection framework.</p><p dir="ltr">In summary, this research advances machine learning approaches for network anomaly detection, proposing scalable and effective methods for early detection and mitigation of network intrusions, ultimately improving the security posture of network systems.</p>
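A minimal version of the detection pipeline, logistic regression trained with validation-based early stopping, can be sketched as follows. Synthetic features stand in for the UNSW-NB15 flow features; none of this is the thesis's actual code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, Xv, yv, lr=0.5, epochs=500, patience=10):
    # gradient descent on logistic loss with early stopping: keep the
    # weights with the best validation loss and stop after `patience`
    # epochs without improvement
    w, b = np.zeros(X.shape[1]), 0.0
    best_loss, best_w, best_b, wait = np.inf, w.copy(), b, 0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
        pv = np.clip(sigmoid(Xv @ w + b), 1e-9, 1 - 1e-9)
        val = -(yv * np.log(pv) + (1 - yv) * np.log(1 - pv)).mean()
        if val < best_loss:
            best_loss, best_w, best_b, wait = val, w.copy(), b, 0
        else:
            wait += 1
            if wait >= patience:        # early stopping
                break
    return best_w, best_b

# synthetic "normal" (label 0) vs "anomalous" (label 1) flows
rng = np.random.default_rng(0)
X0 = rng.normal(0.0, 1.0, size=(300, 4))
X1 = rng.normal(2.5, 1.0, size=(300, 4))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(300), np.ones(300)]
w, b = train_logreg(X[::2], y[::2], X[1::2], y[1::2])
acc = ((sigmoid(X @ w + b) > 0.5) == y).mean()
```

Restoring the best-validation weights rather than the final ones is what keeps the stopped model from overfitting the training split.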
<b>Differentially Private Federated Learning Algorithms for Sparse Basis Recovery</b> Ajinkya K Mulay (18823252), 14 June 2024
<p dir="ltr">Sparse basis recovery is an important learning problem when the number of model dimensions (<i>p</i>) is much larger than the number of samples (<i>n</i>). However, little work has studied sparse basis recovery in the Federated Learning (FL) setting, where the Differential Privacy (DP) of the client data must be simultaneously protected. Notably, the performance guarantees of existing DP-FL algorithms (such as DP-SGD) degrade significantly when the system is under-determined (i.e., <i>p >> n</i>), so they fail to accurately learn the true underlying sparse model. The goal of my thesis is therefore to develop DP-FL algorithms that provably recover the true underlying sparse basis accurately even when <i>p >> n</i>, while still guaranteeing the differential privacy of the client data.</p><p dir="ltr">During my PhD studies, we developed three DP-FL sparse basis recovery algorithms for this purpose. Our first algorithm, SPriFed-OMP, based on the Orthogonal Matching Pursuit (OMP) algorithm, achieves high accuracy even when <i>n = O(\sqrt{p})</i> under the stronger Restricted Isometry Property (RIP) assumption for least-squares problems. Our second algorithm, Humming-Bird, based on a carefully modified variant of the Forward-Backward Algorithm (FoBA), achieves differentially private sparse recovery in the same setup while requiring the much weaker Restricted Strong Convexity (RSC) condition. We further extend Humming-Bird to support loss functions beyond least squares that satisfy the RSC condition. To the best of our knowledge, these are the first DP-FL results guaranteeing sparse basis recovery in the <i>p >> n</i> setting.</p>
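The non-private core of the first algorithm, classical Orthogonal Matching Pursuit, can be sketched directly; SPriFed-OMP's federated aggregation and calibrated DP noise are omitted here, so this is only the underlying greedy recovery step.

```python
import numpy as np

def omp(A, y, k):
    # Orthogonal Matching Pursuit: greedily pick the column most
    # correlated with the residual, then re-fit on the chosen support
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# p >> n demo: recover a 3-sparse signal from 60 measurements in 200 dims
rng = np.random.default_rng(0)
n, p, k = 60, 200, 3
A = rng.normal(size=(n, p)) / np.sqrt(n)
x_true = np.zeros(p)
x_true[[5, 17, 120]] = [1.0, -2.0, 1.5]
x_hat = omp(A, A @ x_true, k)
```

In the noiseless, well-conditioned regime above, the greedy selection finds the exact support even though the system is heavily under-determined, which is the behavior the DP variants aim to preserve under privacy noise.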
<b>IMPROVING MACHINE LEARNING FAIRNESS BY REPAIRING MISLABELED DATA</b> Shashank A Thandri (20161635), 15 November 2024
<p dir="ltr">As machine learning (ML) and artificial intelligence (AI) become increasingly prevalent in high-stakes decision-making, fairness has emerged as a critical societal issue. Individuals belonging to different groups receive different algorithmic outcomes, largely due to inherent errors and biases in the underlying training data, resulting in violations of group fairness. </p><p dir="ltr">This study investigates the problem of restoring group fairness by detecting mislabeled instances in the training data and flipping their labels. Four solutions are proposed to obtain an ordering in which the labels of training instances should be flipped to reduce the bias in the predictions of a model trained on the modified data. Through experimental evaluation, we showcase the effectiveness of repairing mislabeled data with mislabel detection techniques to improve the fairness of machine learning models.</p>
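The idea of ordering label flips by their effect on a group-fairness metric can be illustrated with a toy scorer. This greedy one-pass ranking is a simplified proxy for the four mislabel-detection orderings studied in the thesis, and the demographic parity gap on the labels is only one of many possible bias measures.

```python
import numpy as np

def parity_gap(y, g):
    # demographic parity gap of the label distribution between two groups
    return abs(y[g == 0].mean() - y[g == 1].mean())

def flip_order(y, g):
    # score each instance by how much flipping its label would shrink
    # the parity gap, and return indices sorted best-first
    base = parity_gap(y, g)
    gains = np.empty(len(y))
    for i in range(len(y)):
        y_flipped = y.copy()
        y_flipped[i] = 1 - y_flipped[i]
        gains[i] = base - parity_gap(y_flipped, g)
    return np.argsort(-gains)

# toy data: group 0 receives positive labels far more often than group 1
g = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([1, 1, 1, 0, 0, 0, 0, 1], dtype=float)
order = flip_order(y, g)
```

Flipping labels in this order and retraining after each prefix is the kind of repair loop the study evaluates; flips that would widen the gap land at the end of the ordering.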
<b>MUTUAL LEARNING ALGORITHMS IN MACHINE LEARNING</b> Sabrina Tarin Chowdhury (14846524), 18 May 2023
<p>Mutual learning is a machine learning approach in which multiple learning agents learn from different sources and then share their knowledge with one another, so that all agents improve their classification and prediction accuracy simultaneously. Mutual learning can be an efficient mechanism for improving machine learning and neural network efficiency in a multi-agent system. In conventional knowledge distillation, a big network plays the role of a static teacher and passes its knowledge to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is shown that two small networks can dynamically and interchangeably play the roles of teacher and student, sharing their knowledge so that the efficiency of both networks improves simultaneously. This type of dynamic learning mechanism can be very useful in mobile environments, where resource constraints limit training with big datasets. Data exchange in a multi-agent, teacher-student network system can lead to efficient learning. </p>
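The role-swapping dynamic can be sketched for the binary case with two small linear models. This is an illustrative simplification, not the thesis's networks: each model sees a different feature view ("different sources"), and each gradient step blends the true label with the peer's current prediction, which for sigmoid outputs is equivalent to a cross-entropy loss plus a mimicry term toward the peer.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mutual_train(X1, X2, y, alpha=0.5, lr=0.5, steps=400):
    # two peer models on different feature views; each step, each model's
    # target blends the true label with the peer's prediction, so the
    # teacher and student roles swap continuously between the two
    w1, w2 = np.zeros(X1.shape[1]), np.zeros(X2.shape[1])
    for _ in range(steps):
        p1, p2 = sigmoid(X1 @ w1), sigmoid(X2 @ w2)
        t1 = (1 - alpha) * y + alpha * p2   # model 1 learns from labels + peer 2
        t2 = (1 - alpha) * y + alpha * p1   # and vice versa
        w1 -= lr * X1.T @ (p1 - t1) / len(y)
        w2 -= lr * X2.T @ (p2 - t2) / len(y)
    return w1, w2

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X.sum(axis=1) > 0).astype(float)
w1, w2 = mutual_train(X[:, :2], X[:, 2:], y)   # each peer sees half the features
acc1 = ((sigmoid(X[:, :2] @ w1) > 0.5) == y).mean()
acc2 = ((sigmoid(X[:, 2:] @ w2) > 0.5) == y).mean()
```

Because each model's target includes the peer's soft prediction, knowledge flows in both directions at every step rather than from a fixed teacher to a fixed student.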
<b>IMAGE CAPTIONING USING TRANSFORMER ARCHITECTURE</b> Wrucha A Nanal (14216009), 06 December 2022
<p>The area of deep learning concerned with generating textual descriptions of images is called ‘Image Captioning.’ The central idea behind image captioning is to identify key features of an image and create meaningful sentences that describe it. Popular current models include Convolution Neural Network - Long Short-Term Memory (CNN-LSTM) based models and attention-based models. This research work first identifies the drawbacks of existing image captioning models, namely their sequential style of execution, the vanishing gradient problem, and the lack of context during training.</p>
<p>This work aims to resolve these problems by creating a Contextually Aware Image Captioning (CATIC) model. The Transformer architecture, which avoids the issues of vanishing gradients and sequential execution, forms the basis of the proposed model. To inject contextualized embeddings of the caption sentences, this work uses Bidirectional Encoder Representations from Transformers (BERT). The experiments use the Remote Sensing Image Captioning Dataset, and the results of the CATIC model are evaluated using BLEU, METEOR, and ROUGE scores. The proposed model outperforms the CNN-LSTM model on all metrics. Compared to the attention-based model, the CATIC model outperforms it on the BLEU2 and ROUGE metrics and gives competitive results on the others.</p>
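For readers unfamiliar with the reported metrics, a simplified sentence-level BLEU-2 can be computed in a few lines. This sketch uses a single reference and no smoothing, so it only approximates the corpus-level BLEU implementations used in practice, and the example sentences are ours.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu2(candidate, reference):
    # simplified sentence-level BLEU-2: geometric mean of clipped
    # unigram and bigram precision, times a brevity penalty (no smoothing)
    precisions = []
    for n in (1, 2):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        clipped = sum(min(count, ref[g]) for g, count in cand.items())
        precisions.append(clipped / max(1, sum(cand.values())))
    if min(precisions) == 0:
        return 0.0
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 2)

ref = "a plane parked at the airport".split()
hyp = "a plane at the airport".split()
score = bleu2(hyp, ref)
```

The brevity penalty is what keeps a very short candidate from scoring well on precision alone; METEOR and ROUGE complement this with recall-oriented matching.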
<b>Neural Network Models For Neurophysiology Data</b> Bryan Jimenez (13979295), 25 October 2022
<p>Over the last decade, measurement technologies that record neural activity, such as ECoG and Utah arrays, have dramatically improved. These advancements have given researchers access to recordings from many neurons simultaneously, and efficient computational and statistical methods are required to analyze this type of data successfully. Time-series models are among the most common approaches for analyzing such data. Unfortunately, even with all the advances made in time-series modeling, these models often need massive amounts of data to achieve good results. This is especially true in neuroscience, where datasets are often limited, imposing constraints on the type and complexity of the models we can use. Moreover, the signal-to-noise ratio tends to be lower than in other machine learning datasets. This thesis introduces different architectures and techniques to overcome the constraints imposed by these small datasets. There are two major experiments: (1) we develop models for participants who have lost the ability to speak, building upon the previous state-of-the-art model for decoding neural activity (ECoG data) into English text; (2) we introduce two new models, RNNF and Neural RoBERTa, which impute missing neural data from neural recordings (Utah arrays) of monkeys performing kinematic tasks. With the help of a novel data augmentation technique (dynamic masking), these new models outperformed state-of-the-art models such as the Neural Data Transformer (NDT) in the Neural Latents Benchmark competition. </p>
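The dynamic-masking augmentation can be sketched independently of any particular model: instead of holding out one fixed set of bins, a fresh random mask is drawn every time a batch is served, multiplying the effective training signal from a small dataset. The function and parameter names below are ours, not the thesis's.

```python
import numpy as np

def dynamic_mask(spikes, mask_frac=0.25, rng=None):
    # sample a fresh random mask each call (rather than a fixed one),
    # zero out the masked (time, channel) bins, and return both the
    # corrupted input and the mask marking which bins to reconstruct
    rng = rng or np.random.default_rng()
    mask = rng.random(spikes.shape) < mask_frac
    corrupted = spikes.copy()
    corrupted[mask] = 0.0
    return corrupted, mask

# toy batch: 8 trials x 50 time bins x 30 channels of spike counts
rng = np.random.default_rng(0)
batch = rng.poisson(2.0, size=(8, 50, 30)).astype(float)
corrupted, mask = dynamic_mask(batch, rng=rng)
# an imputation model would be trained to predict batch[mask] from
# `corrupted`; the held-out bins provide the reconstruction targets
```

Re-drawing the mask each epoch means every bin is eventually held out in some batch, which is what makes this an augmentation rather than a fixed train/validation split.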
<b>Machine-Learning Based Assessment of Cystic Fibrosis</b> Juan Antonio Kim Hoo Chong Chie (18010987), 28 February 2024
<p dir="ltr">Cystic fibrosis is a genetic disease that affects over 162,428 people worldwide. Currently, assessing cystic fibrosis from medical images requires a trained expert to manually annotate regions in the patient's lungs to determine the stage and severity of the disease, a process that takes a substantial amount of time and effort to yield an accurate assessment. </p><p dir="ltr">Recent advancements in machine learning and deep learning have been effective in solving classification, decision-making, identification, and segmentation problems in various disciplines. In medical research, these techniques have been used to perform image analyses that aid in organ identification, tissue classification, and lesion segmentation, reducing the time physicians need to analyze medical images. However, these techniques have yet to be widely applied to the assessment of cystic fibrosis. </p><p dir="ltr">This thesis describes an automated framework for assessing the severity and extent of cystic fibrosis. The framework comprises three analysis stages: airway analysis, texture analysis, and lung lesion detection, which extract cystic fibrosis features from CT scans and use them to assess the severity and extent of the disease. The framework achieved an accuracy of 86.96% in the staging process. The main contribution of this work is a data-driven methodology for designing a quantitative cystic fibrosis staging and grading model.</p>
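Schematically, the three analysis stages each yield a feature vector per CT scan, and staging reduces to classification over their concatenation. The nearest-centroid classifier and all names below are purely illustrative stand-ins for the thesis's staging model, with synthetic numbers in place of real CT features.

```python
import numpy as np

def scan_descriptor(airway_f, texture_f, lesion_f):
    # concatenate the per-stage feature vectors into one descriptor per scan
    return np.concatenate([airway_f, texture_f, lesion_f], axis=1)

def fit_centroids(F, stages):
    # one centroid per severity stage
    return {s: F[stages == s].mean(axis=0) for s in np.unique(stages)}

def predict_stage(f, centroids):
    return min(centroids, key=lambda s: np.linalg.norm(f - centroids[s]))

# toy data: two severity stages with well-separated feature clusters
rng = np.random.default_rng(0)
airway = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(4, 1, (20, 3))])
texture = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(4, 1, (20, 4))])
lesion = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
F = scan_descriptor(airway, texture, lesion)
stages = np.r_[np.zeros(20, int), np.ones(20, int)]
centroids = fit_centroids(F, stages)
preds = np.array([predict_stage(f, centroids) for f in F])
acc = (preds == stages).mean()
```

The design point is the decomposition: each stage can be developed and validated independently, and the staging model only sees the combined descriptor.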