Global ETD Search

1	On Online Unsupervised Domain Adaptation Jihoon Moon (17121610) 10 October 2023 (has links) <p dir="ltr">Recent advances in Artificial Intelligence (AI) have been markedly accelerated by the convergence of advances in Machine Learning (ML) and the exponential growth in computational power. Within this dynamic landscape, the concept of Domain Adaptation (DA) is dedicated to the seamless transference of knowledge across domains characterized by disparate data distributions. This thesis ventures into the challenging and nuanced terrain of Online Unsupervised Domain Adaptation (OUDA), where the unlabeled data stream arrives from the target domain incrementally and gradually diverges from the source domain. This thesis presents two innovative and complementary approaches -- a manifold-based approach and a time-domain-based approach -- to effectively tackle the intricate OUDA challenges.</p><p dir="ltr">The manifold-based approach seeks to address this gap by incorporating the domain alignment process in an incremental computation manner, and this novel technique leverages the computation of transformation matrices, based on the projection of both source and target data onto the Grassmann manifold. This projection aligns both domains by incrementally minimizing their dissimilarities, effectively ameliorating the divergence between the source and target data. This manifold-based approach capitalizes on the cumulative temporal information within the data stream, utilizing the Incremental Computation of Mean-Subspace (ICMS) technique. This technique efficiently computes the average subspace of target subspaces on the Grassmann manifold, adeptly capturing the evolving dynamics of the data distribution. The alignment process is further fortified by integrating the flow of target subspaces on the manifold. As the target data stream unfolds over time, this approach incorporates this information, yielding robust and adaptive transformation matrices. 
In addition, the efficient computation of the mean-subspace, closely aligned with the Karcher mean, attests to the computational feasibility of the manifold-based approach, thus enabling real-time feedback computations for the OUDA problem.</p><p dir="ltr">The time-domain-based approach utilizes the cluster-wise information and its flow information from each time-step to accurately predict target labels in the incoming target data, consistently propagate the class labels to future incoming target data, and efficiently utilize the predicted labels in the target data together with the source data to incrementally update the learning model. This process effectively transforms the OUDA problem into a supervised-learning scenario. We leverage a neural-network-based model to align target features, cluster them class-wise and extend them linearly from the origin of the latent space as the time-step progresses. This alignment process enables accurate predictions and target label propagation based on the trajectories of the target features. We achieve target label propagation through the novel Flow-based Hierarchical Optimal Transport (FHOT) method, which considers element-wise, cluster-wise, and distribution-wise correspondences of adjacent target features. The learning model is continuously updated with incoming target data and their predicted labels.</p><p dir="ltr">To comprehensively assess the impact and contributions of these two approaches to the OUDA problem, we conducted extensive experiments across diverse datasets. Our analysis covered each stage of the manifold-based approach, comparing its performance with prior methods in terms of classification accuracy and computational efficiency. The time-domain-based approach was validated through linear feature alignment in the latent space, resulting in accurate label predictions. 
Notably, the flow-based hierarchical optimal transport technique substantially enhanced classification accuracy, particularly with increasing time-steps. Furthermore, learning model updates using target data and predicted labels significantly improved classification accuracy.</p> Online Unsupervised Domain Adaptation
2	Sparse Deep Learning and Stochastic Neural Network Yan Sun (12425889) 13 May 2022 (has links) <p>Deep learning has achieved state-of-the-art performance on many machine learning tasks, but the deep neural network (DNN) model still suffers from a few issues. An over-parameterized neural network generally has a better optimization landscape, but it is computationally expensive and hard to interpret, and the model usually cannot correctly quantify the prediction uncertainty. On the other hand, a small DNN model can suffer from local traps and be hard to optimize. In this dissertation, we tackle these issues from two directions: sparse deep learning and stochastic neural networks. </p> <p><br></p> <p>For sparse deep learning, we proposed a Bayesian neural network (BNN) model with a mixture-of-normals prior. Theoretically, we established posterior consistency and structure selection consistency, which ensure that the sparse DNN model can be consistently identified. We also demonstrated the asymptotic normality of the prediction, which ensures that the prediction uncertainty is correctly quantified. Computationally, we proposed a prior annealing approach to optimize the posterior of the BNN. The proposed methods have computational complexity similar to that of the standard stochastic gradient descent method for training DNNs. Experimental results show that our model performs well on high-dimensional variable selection as well as neural network pruning.</p> <p><br></p> <p>For stochastic neural networks, we proposed a Kernel-Expanded Stochastic Neural Network model, or K-StoNet for short. We reformulate the DNN as a latent variable model and incorporate support vector regression (SVR) as the first hidden layer. The latent variable formulation breaks the training into a series of convex optimization problems, and the model can be easily trained using the imputation-regularized optimization (IRO) algorithm. 
We provide theoretical guarantees for the convergence of the algorithm and for the prediction uncertainty quantification. Experimental results show that the proposed model can achieve good prediction performance and provide correct confidence regions for prediction. </p> Computational statistics Sparse Deep Learning Stochastic Neural Network
3	ANOMALY DETECTION USING MACHINE LEARNING FOR INTRUSION DETECTION Vaishnavi Rudraraju (18431880) 02 May 2024 (has links) <p dir="ltr">This thesis examines machine learning approaches for anomaly detection in network security, particularly focusing on intrusion detection using TCP and UDP protocols. It uses logistic regression models to effectively distinguish between normal and abnormal network actions, demonstrating a strong ability to detect possible security concerns. The study uses the UNSW-NB15 dataset for model validation, allowing a thorough evaluation of the models' capacity to detect anomalies in real-world network scenarios. The UNSW-NB15 dataset is a comprehensive network attack dataset frequently used in research to evaluate intrusion detection systems and anomaly detection algorithms because of its realistic attack scenarios and various network activities.</p><p dir="ltr">Further investigation is carried out using a Multi-Task Neural Network built for binary and multi-class classification tasks. This method allows for the in-depth study of network data, making it easier to identify potential threats. The model is fine-tuned during successive training epochs, focusing on validation measures to ensure its generalizability. The thesis also applied early stopping mechanisms to enhance the ML model, which helps optimize the training process, reduces the risk of overfitting, and improves the model's performance on new, unseen data.</p><p dir="ltr">This thesis also uses blockchain technology to track model performance indicators, a novel strategy that improves data integrity and reliability. This blockchain-based logging system keeps an immutable record of the models' performance over time, which helps to build a transparent and verifiable anomaly detection framework.</p><p dir="ltr">In summary, this research enhances Machine Learning approaches for network anomaly detection. 
It proposes scalable and effective approaches for early detection and mitigation of network intrusions, ultimately improving the security posture of network systems.</p> anomaly detection, Machine Learning Algorithm etc
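The early stopping mechanism mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `early_stopping_epoch` and the `patience` and `min_delta` parameters are assumptions chosen for illustration, and the validation losses are made-up numbers rather than results on UNSW-NB15.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    min_delta for `patience` consecutive epochs. Names are hypothetical.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Stop here; the weights from best_epoch would be restored.
                return best_epoch, epoch
    # Patience never ran out: training used every epoch.
    return best_epoch, len(val_losses) - 1


# Illustrative losses only: the improvement at the last epoch is never seen.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
best_epoch, stop_epoch = early_stopping_epoch(losses, patience=3)
```

With these illustrative losses, training halts three epochs after the best epoch, so the later improvement is never reached. That is the usual trade-off when choosing `patience`.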
4	Differentially Private Federated Learning Algorithms for Sparse Basis Recovery Ajinkya K Mulay (18823252) 14 June 2024 (has links) <p dir="ltr">Sparse basis recovery is an important learning problem when the number of model dimensions (<i>p</i>) is much larger than the number of samples (<i>n</i>). However, there has been little work that studies sparse basis recovery in the Federated Learning (FL) setting, where the Differential Privacy (DP) of the client data must also be simultaneously protected. Notably, the performance guarantees of existing DP-FL algorithms (such as DP-SGD) will degrade significantly when the system is ill-determined (i.e., <i>p >> n</i>), and thus they will fail to accurately learn the true underlying sparse model. The goal of my thesis is therefore to develop DP-FL sparse basis recovery algorithms that can recover the true underlying sparse basis provably accurately even when <i>p >> n</i>, yet still guaranteeing the differential privacy of the client data.</p><p dir="ltr">During my PhD studies, we developed three DP-FL sparse basis recovery algorithms for this purpose. Our first algorithm, SPriFed-OMP, based on the Orthogonal Matching Pursuit (OMP) algorithm, can achieve high accuracy even when <i>n = O(\sqrt{p})</i> under the stronger Restricted Isometry Property (RIP) assumption for least-square problems. Our second algorithm, Humming-Bird, based on a carefully modified variant of the Forward-Backward Algorithm (FoBA), can achieve differentially private sparse recovery for the same setup while requiring the much weaker Restricted Strong Convexity (RSC) condition. We further extend Humming-Bird to support loss functions beyond least-square satisfying the RSC condition. To the best of our knowledge, these are the first DP-FL results guaranteeing sparse basis recovery in the <i>p >> n</i> setting.</p> Data and information privacy Sparse Basis Recovery Differential Privacy Federated Learning
5	MUTUAL LEARNING ALGORITHMS IN MACHINE LEARNING Sabrina Tarin Chowdhury (14846524) 18 May 2023 (has links) <p>A mutual learning algorithm is a machine learning algorithm in which multiple learning agents learn from different sources and then share their knowledge among themselves so that all the agents can improve their classification and prediction accuracies simultaneously. Mutual learning can be an efficient mechanism for improving machine learning and neural network efficiency in a multi-agent system. Usually, in knowledge distillation algorithms, a big network plays the role of a static teacher and passes the data to smaller networks, known as student networks, to improve the efficiency of the latter. In this thesis, it is shown that two small networks can dynamically and interchangeably play the roles of teacher and student to share their knowledge, and hence the efficiency of both networks improves simultaneously. This type of dynamic learning mechanism can be very useful in mobile environments, where resource constraints limit training with big datasets. Data exchange in a multi-agent teacher-student network system can lead to efficient learning. </p> Machine Learning and Discovery mutual learning model image classification problems
6	IMAGE CAPTIONING USING TRANSFORMER ARCHITECTURE Wrucha A Nanal (14216009) 06 December 2022 (has links) <p>The domain of Deep Learning that is related to the generation of textual descriptions of images is called ‘Image Captioning.’ The central idea behind Image Captioning is to identify key features of an image and create meaningful sentences that describe the image. The current popular models include Convolutional Neural Network - Long Short-Term Memory (CNN-LSTM) based models and Attention based models. This research work first identifies the drawbacks of existing image captioning models, namely sequential style of execution, the vanishing gradient problem, and lack of context during training.</p> <p>This work aims at resolving the discovered problems by creating a Contextually Aware Image Captioning (CATIC) Model. The Transformer architecture, which solves the issues of vanishing gradients and sequential execution, forms the basis of the suggested model. In order to inject the contextualized embeddings of the caption sentences, this work uses Bidirectional Encoder Representations from Transformers (BERT). This work uses the Remote Sensing Image Captioning Dataset. The results of the CATIC model are evaluated using BLEU, METEOR and ROUGE scores. In comparison, the proposed model outperforms the CNN-LSTM model on all metrics. When compared to the Attention based model’s metrics, the CATIC model performs better on the BLEU2 and ROUGE metrics and gives competitive results for the others.</p> Natural language processing Computer vision Deep learning Transformer Architecture Remote Sensing Images
7	Neural Network Models For Neurophysiology Data Bryan Jimenez (13979295) 25 October 2022 (has links) <p>Over the last decade, measurement technologies that record neural activity, such as ECoG and Utah arrays, have dramatically improved. These advancements have given researchers access to recordings from multiple neurons simultaneously. Efficient computational and statistical methods are required to analyze this data type successfully. The time-series model is one of the most common approaches for analyzing this data type. Unfortunately, even with all the advances made with time-series models, it is not always enough, since these models often need massive amounts of data to achieve good results. This is especially true in the field of neuroscience, where the datasets are often limited, which imposes constraints on the type and complexity of the models we can use. In addition, the signal-to-noise ratio tends to be lower than in other machine learning datasets. This paper will introduce different architectures and techniques to overcome the constraints imposed by these small datasets. There are two major experiments that we will discuss. (1) We will strive to develop models for participants who lost the ability to speak by building upon the previous state-of-the-art model for decoding neural activity (ECoG data) into English text. (2) We will introduce two new models, RNNF and Neural RoBERTa. These new models impute missing neural data from neural recordings (Utah arrays) of monkeys performing kinematic tasks. With the help of novel data augmentation techniques (dynamic masking), these new models outperformed state-of-the-art models such as Neural Data Transformer (NDT) in the Neural Latents Benchmark competition. </p> Natural Language Processing Electrocorticography intracranial Electroencephalography
8	Machine-Learning Based Assessment of Cystic Fibrosis Juan Antonio Kim Hoo Chong Chie (18010987) 28 February 2024 (has links) <p dir="ltr">Cystic fibrosis is a genetic disease that affects over 162,428 people worldwide. Currently, assessing cystic fibrosis from medical images requires a trained expert to manually annotate regions in the patient's lungs to determine the stage and severity of the disease. This process takes a substantial amount of time and effort to achieve an accurate assessment. </p><p dir="ltr">Recent advancements in machine learning and deep learning have been effective in solving classification, decision-making, identification, and segmentation problems in various disciplines. In medical research, these techniques have been used to perform image analyses that aid in organ identification, tissue classification, and lesion segmentation, which reduces the time required for physicians to analyze medical images. However, these techniques have yet to be widely applied in the assessment of cystic fibrosis. </p><p dir="ltr">This thesis describes an automated framework employed to assess the severity and extent of cystic fibrosis. The framework comprises three analysis stages (airway analysis, texture analysis, and lung lesion detection) that extract cystic fibrosis features from CT scans; these features are then used to assess the severity and extent of the disease. The framework achieved an accuracy of 86.96% in the staging process. The main contribution of this work is the development of a data-driven methodology used to design a quantitative cystic fibrosis staging and grading model.</p> Signal processing Image processing Cystic Fibrosis Volume Segmentation texture analysis method Clinical Assessment
9	<b>PROBABILISTIC ENSEMBLE MACHINE LEARNING APPROACHES FOR UNSTRUCTURED TEXTUAL DATA CLASSIFICATION</b> Srushti Sandeep Vichare (17277901) 26 April 2024 (has links) <p dir="ltr">The volume of big data has surged, notably in unstructured textual data, comprising emails, social media, and more. Currently, unstructured data represents over 80% of global data, and its growth is propelled by digitalization. Unstructured text data analysis is crucial for various applications like social media sentiment analysis, customer feedback interpretation, and medical records classification. The complexity of this analysis is due to the variability in language use, context sensitivity, and the nuanced meanings that are expressed in natural language. Traditional machine learning approaches, while effective in handling structured data, frequently fall short when applied to unstructured text data because of these complexities. Extracting value from this data requires advanced analytics and machine learning. Recognizing the challenges, we developed innovative ensemble approaches that combine the strengths of multiple conventional machine learning classifiers through a probabilistic approach. Specifically, we developed two novel models: the Consensus-Based Integration Model (CBIM) and the Unified Predictive Averaging Model (UPAM). The CBIM and UPAM ensemble models were applied to Twitter (40,000 data samples) and the National Electronic Injury Surveillance System (NEISS) datasets (323,344 data samples), addressing various challenges in unstructured text analysis. On the NEISS dataset, the models achieved an unprecedented accuracy of 99.50%, demonstrating the effectiveness of ensemble models in extracting relevant features and making accurate predictions. On the Twitter dataset, used for sentiment analysis, the models showed a significant boost in accuracy over conventional approaches, reaching a maximum of 65.83%. 
The results highlighted the limitations of conventional machine learning approaches when dealing with complex, unstructured text data and the potential of ensemble models. The models exhibited high accuracy across various datasets and tasks, showcasing their versatility and effectiveness in obtaining valuable insights from unstructured text data. The results obtained extend the boundaries of text analysis and improve the field of natural language processing.</p> probabilistic approach Machine Learning Ensemble Method ensemble approaches unstructured text data Injury classification sentiment analysis
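The abstract does not spell out how CBIM or UPAM combine classifiers, so the sketch below shows only generic probability averaging (soft voting), one common way to combine classifiers probabilistically. It should not be read as either of the thesis's models. The function names and the per-model probabilities are hypothetical.

```python
def average_probabilities(per_model_probs):
    """Average class probabilities across models for one sample.

    per_model_probs is a list with one dict per model, mapping
    label -> predicted probability. A label missing from a model's
    dict is treated as probability 0.0 (an assumption of this sketch).
    """
    labels = sorted({label for probs in per_model_probs for label in probs})
    n_models = len(per_model_probs)
    return {
        label: sum(probs.get(label, 0.0) for probs in per_model_probs) / n_models
        for label in labels
    }


def soft_vote(per_model_probs):
    """Return the label with the highest averaged probability."""
    averaged = average_probabilities(per_model_probs)
    return max(averaged, key=averaged.get)


# Made-up outputs of three classifiers for a single text sample.
sample = [
    {"pos": 0.60, "neg": 0.40},
    {"pos": 0.30, "neg": 0.70},
    {"pos": 0.55, "neg": 0.45},
]
label = soft_vote(sample)
```

In this made-up sample, two of the three classifiers favor `pos`, yet the averaged probabilities favor `neg` because the dissenting classifier is much more confident. A plain majority vote would discard that confidence information.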
10	NONLINEAR DIFFUSIONS ON GRAPHS FOR CLUSTERING, SEMI-SUPERVISED LEARNING AND ANALYZING PREDICTIONS Meng Liu (14075697) 09 November 2022 (has links) <p>Graph diffusion is the process of spreading information from one or a few nodes to the rest of the graph through edges. The resulting distribution of the information often implies the latent structure of the graph, where more densely connected nodes can receive more signal. This makes graph diffusions a powerful tool for local clustering, which is the problem of finding a cluster or community of nodes around a given set of seeds. Most existing literature on graph diffusions for local graph clustering uses linear diffusions, as their dynamics can be fully interpreted through linear systems. They are also referred to as eigenvector, spectral, or random walk based methods. While efficient, they often have difficulty capturing the correct boundary of a target label or target cluster. In contrast, maxflow-mincut based methods, which can be thought of as 1-norm nonlinear variants of the linear diffusions, seek to "improve" or "refine" a given cluster and can often capture the boundary correctly. However, little literature adopts them for problems such as community detection, local graph clustering, and semi-supervised learning, because of the complexity of their formulation. We addressed these issues by performing extensive numerical experiments to demonstrate the performance of flow-based methods in graphs from various sources. We also developed an efficient LocalGraphClustering Python package that allows others to easily use these methods in their own problems. While studying these flow-based methods, we found that they cannot grow from a small seed set. Although there are hybrid procedures that incorporate ideas from both linear diffusions and flow-based methods, they have many hard-to-set parameters. 
To tackle these issues, we propose a simple generalization of the objective function behind linear diffusion and flow-based methods, which we call the generalized local graph min-cut problem. We further show that by involving a p-norm in this cut problem, we can develop a nonlinear diffusion procedure that can find local clusters from a small seed set and capture the correct boundary simultaneously. Our method can be thought of as a nonlinear generalization of the Andersen-Chung-Lang push procedure to approximate a personalized PageRank vector efficiently, and it is a strongly local algorithm, one whose runtime depends on the size of the output rather than the size of the graph. We also show that the p-norm cut functions improve on the standard Cheeger inequalities for linear diffusion methods. We further extend our generalized local graph min-cut problem and the corresponding diffusion solver to hypergraph-based machine learning problems. Although many methods for local graph clustering exist, there are relatively few for localized clustering in hypergraphs. Moreover, those that exist often lack the flexibility to model a general class of hypergraph cut functions or cannot scale to large problems. Our new hypergraph diffusion method, on the other hand, enables us to compute with a wide variety of cardinality-based hypergraph cut functions and still maintains the strongly local property. We also show that the clusters found by solving the new objective function satisfy a Cheeger-like quality guarantee.</p> <p>Besides clustering, recent work on graph-based learning often focuses on node embeddings and graph neural networks. Although these GNN based methods can beat traditional ones, especially when node attribute data is available, it is challenging to understand them because they are highly over-parameterized. 
To solve this issue, we propose a novel framework that combines topological data analysis and diffusion to transform the complex prediction space into human-understandable pictures. The method can be applied to other datasets not in graph format, scales up to large datasets across different domains, and enables us to find many useful insights about the data and the model.</p> Graph, social and multimedia data Topology topological data analysis clustering pagerank semi-supervised learning visualization neural networks diffusions graph social network hypergraph
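For readers unfamiliar with the push procedure this abstract builds on, here is a minimal sketch of the standard linear Andersen-Chung-Lang push for approximate personalized PageRank with a lazy random walk. It is the linear baseline only, not the thesis's p-norm nonlinear diffusion, and it does not use the LocalGraphClustering package's API. The function name, the adjacency-dict graph format, and the parameter defaults are assumptions of this sketch.

```python
from collections import deque


def approximate_ppr(graph, seed, alpha=0.15, eps=1e-6):
    """Approximate personalized PageRank from `seed` by local pushes.

    graph maps each node to a list of neighbours (undirected, no
    isolated nodes). Returns (p, r): the approximation p and the
    residual r, with r[u] < eps * degree(u) for every node on exit.
    """
    p = {}
    r = {seed: 1.0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg_u = len(graph[u])
        r_u = r.get(u, 0.0)
        if r_u < eps * deg_u:
            continue  # stale queue entry; residual is already small
        # Push: keep an alpha fraction at u, leave half of the rest as
        # residual at u, and spread the other half evenly to neighbours.
        p[u] = p.get(u, 0.0) + alpha * r_u
        r[u] = (1 - alpha) * r_u / 2
        share = (1 - alpha) * r_u / (2 * deg_u)
        for v in graph[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(graph[v]):
                queue.append(v)
        if r[u] >= eps * deg_u:
            queue.append(u)
    return p, r


# Two triangles joined by the edge 2-3; the seed sits in the first triangle.
toy_graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
p, r = approximate_ppr(toy_graph, seed=0)
```

Only nodes whose residual exceeds the threshold are ever touched, which is what makes the procedure strongly local: the work depends on `alpha` and `eps` rather than on the size of the graph. Sorting nodes by `p[u] / degree(u)` and sweeping over prefixes is the usual next step for extracting a cluster.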

Search results