11 |
Representation learning with a temporally coherent mixed-representation
Parkinson, Jon January 2017
Guiding a representation towards capturing temporally coherent aspects present in video improves object identity encoding. Existing models apply temporal coherence uniformly over all features based on the assumption that optimal encoding of object identity only requires temporally stable components. We test the validity of this assumption by exploring the effects of applying a mixture of temporally coherent invariant features, alongside variable features, in a single 'mixed' representation. Applying temporal coherence to different proportions of the available features, we evaluate a range of models on a supervised object classification task. This series of experiments was tested on three video datasets, each with a different complexity of object shape and motion. We also investigated whether a mixed-representation improves the capture of information components associated with object position, alongside object identity, in a single representation. Tests were initially applied using a single layer autoencoder as a test bed, followed by subsequent tests investigating whether similar behaviour occurred in the more abstract features learned by a deep network. A representation applying temporal coherence in some fashion produced the best results in all tests, on both single layered and deep networks. The majority of tests favoured a mixed representation, especially in cases where the quantity of labelled data available to the supervised task was plentiful. This work is the first time a mixed-representation has been investigated, and demonstrates its use as a method for representation learning.
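One way to picture "applying temporal coherence to a proportion of the features" is a penalty on how much a subset of the code changes between consecutive video frames. The sketch below is only an illustration under stated assumptions: the abstract does not give the loss, so the squared-difference form, the choice of the first k features as the coherent ones, and the names mixed_coherence_penalty and coherent_fraction are all hypothetical.

```python
import numpy as np

def mixed_coherence_penalty(h_t, h_next, coherent_fraction):
    """Mean squared change between consecutive-frame codes,
    computed only over the first k features (assumed convention)."""
    h_t = np.asarray(h_t, dtype=float)
    h_next = np.asarray(h_next, dtype=float)
    n_features = h_t.shape[-1]
    # Number of features that are asked to stay temporally stable.
    k = int(round(coherent_fraction * n_features))
    if k == 0:
        return 0.0  # no coherence constraint at all
    diff = h_next[..., :k] - h_t[..., :k]
    return float(np.mean(diff ** 2))

# Codes for two consecutive frames, four features each.
h_t = np.array([[0.0, 0.0, 0.0, 0.0]])
h_next = np.array([[1.0, 1.0, 5.0, 5.0]])
print(mixed_coherence_penalty(h_t, h_next, 0.5))  # only features 0 and 1 count
print(mixed_coherence_penalty(h_t, h_next, 1.0))  # uniform coherence over all features
```

Setting coherent_fraction to 1.0 corresponds to the uniform case the abstract attributes to existing models, while intermediate values leave the remaining features free to vary.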
|
12 |
Exploiting diversity for efficient machine learning
Geras, Krzysztof Jerzy January 2018
A common practice for solving machine learning problems is currently to consider each problem in isolation, starting from scratch every time a new learning problem is encountered or a new model is proposed. This is a perfectly feasible solution when the problems are sufficiently easy or, if the problem is hard, when a large amount of resources, both in terms of the training data and computation, is available. Although this naive approach has been the main focus of research in machine learning for a few decades and has had a lot of success, it becomes infeasible if the problem is too hard in proportion to the available resources. When using a complex model in this naive approach, it is necessary to collect large data sets (if possible at all) to avoid overfitting and hence it is also necessary to use large computational resources to handle the increased amount of data, first during training to process a large data set and then also at test time to execute a complex model. An alternative to this strategy of treating each learning problem independently is to leverage related data sets and computation encapsulated in previously trained models. By doing that we can decrease the amount of data necessary to reach a satisfactory level of performance and, consequently, improve the accuracy achievable and decrease training time. Our attack on this problem is to exploit diversity - in the structure of the data set, in the features learnt and in the inductive biases of different neural network architectures. In the setting of learning from multiple sources we introduce multiple-source cross-validation, which gives an unbiased estimator of the test error when the data set is composed of data coming from multiple sources and the data at test time are coming from a new unseen source. We also propose new estimators of variance of the standard k-fold cross-validation and multiple-source cross-validation, which have lower bias than previously known ones.
To improve unsupervised learning we introduce scheduled denoising autoencoders, which learn a more diverse set of features than the standard denoising autoencoder. This is thanks to their training procedure, which starts with a high level of noise, when the network is learning coarse features, and then lowers the noise gradually, which allows the network to learn more local features. A connection between this training procedure and curriculum learning is also drawn. We develop further the idea of learning a diverse representation by explicitly incorporating the goal of obtaining a diverse representation into the training objective. The proposed model, the composite denoising autoencoder, learns multiple subsets of features focused on modelling variations in the data set at different levels of granularity. Finally, we introduce the idea of model blending, a variant of model compression, in which the two models, the teacher and the student, are both strong models, but different in their inductive biases. As an example, we train convolutional networks using the guidance of bidirectional long short-term memory (LSTM) networks. This makes it possible to train the convolutional neural network to be more accurate than the LSTM network at no extra cost at test time.
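The scheduled-noise idea can be sketched as a corruption level that decreases over training epochs. The abstract only says the noise starts high and is lowered gradually, so the linear decay, the masking (zero-out) noise, and the names noise_schedule and corrupt are assumptions made for illustration; the autoencoder update itself is left out.

```python
import numpy as np

def noise_schedule(start, end, n_epochs):
    """Linearly spaced corruption levels from `start` down to `end` (assumed shape)."""
    return np.linspace(start, end, n_epochs)

def corrupt(x, noise_level, rng):
    """Masking noise: each entry is zeroed with probability `noise_level`."""
    keep = rng.random(x.shape) >= noise_level
    return x * keep

rng = np.random.default_rng(0)
x = np.ones((2, 5))
for epoch, level in enumerate(noise_schedule(0.7, 0.1, 4)):
    x_noisy = corrupt(x, level, rng)
    # A denoising autoencoder would be trained here to map x_noisy back to x.
    print(epoch, round(float(level), 2), int(x_noisy.sum()))
```

Early epochs see heavily corrupted inputs, which, according to the abstract, pushes the network towards coarse features; later epochs see cleaner inputs and can refine more local ones.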
|
13 |
TASK DETECTORS FOR PROGRESSIVE SYSTEMS
Maxwell Joseph Jacobson (10669431) 30 April 2021
While methods like learning-without-forgetting [11] and elastic weight consolidation [22] accomplish high-quality transfer learning while mitigating catastrophic forgetting, progressive techniques such as DeepMind’s progressive neural network accomplish this while completely nullifying forgetting. However, progressive systems like this strictly require task labels during test time. In this paper, I introduce a novel task recognizer built from anomaly detection autoencoders that is capable of detecting the nature of the required task from input data. Alongside a progressive neural network or other progressive learning system, this task-aware network is capable of operating without task labels during run time while maintaining any catastrophic forgetting reduction measures implemented by the task model.
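A task recognizer built from anomaly-detection autoencoders could, in the simplest reading, keep one autoencoder per task and pick the task whose autoencoder reconstructs the input best. The sketch below follows that reading under loud assumptions: the abstract does not state the decision rule, so the argmin-of-reconstruction-error rule is hypothetical, and a PCA-based linear autoencoder (LinearAutoencoder, an illustrative name) stands in for the trained neural autoencoders.

```python
import numpy as np

class LinearAutoencoder:
    """PCA-based linear autoencoder, a stand-in for a trained neural autoencoder."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the principal directions of the centred data.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def reconstruction_error(self, x):
        centred = np.asarray(x, dtype=float) - self.mean_
        code = centred @ self.components_.T   # encode
        recon = code @ self.components_       # decode
        return float(np.sum((centred - recon) ** 2))

def detect_task(x, autoencoders):
    """Return the index of the autoencoder with the lowest reconstruction error."""
    errors = [ae.reconstruction_error(x) for ae in autoencoders]
    return int(np.argmin(errors))

# Toy data: task 0 inputs lie along the x-axis, task 1 inputs along the y-axis.
t = np.linspace(-1.0, 1.0, 20)[:, None]
task0 = t * np.array([[1.0, 0.0, 0.0]])
task1 = t * np.array([[0.0, 1.0, 0.0]])
detectors = [LinearAutoencoder(1).fit(task0), LinearAutoencoder(1).fit(task1)]
print(detect_task([0.5, 0.0, 0.0], detectors))
print(detect_task([0.0, -0.3, 0.0], detectors))
```

The detected index could then select the matching column of a progressive network at run time, which is how the abstract describes removing the need for task labels.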
|
14 |
Structured Disentangling Networks for Learning Deformation Invariant Latent Spaces
January 2019
Disentangling latent spaces is an important research direction in the interpretability of unsupervised machine learning. Several recent works using deep learning are very effective at producing disentangled representations. However, in the unsupervised setting, there is no way to pre-specify which part of the latent space captures specific factors of variation. While this is generally a hard problem because of the non-existence of analytical expressions to capture these variations, there are certain factors like geometric transforms that can be expressed analytically. Furthermore, in existing frameworks, the disentangled values are also not interpretable. The focus of this work is to disentangle these geometric factors of variation (which turn out to be nuisance factors for many applications) from the semantic content of the signal in an interpretable manner which in turn makes the features more discriminative. Experiments are designed to show the modularity of the approach with other disentangling strategies as well as on multiple one-dimensional (1D) and two-dimensional (2D) datasets, clearly indicating the efficacy of the proposed approach. / Dissertation/Thesis / Masters Thesis Electrical Engineering 2019
|
15 |
Vertical federated learning using autoencoders with applications in electrocardiograms
Chorney, Wesley William 08 August 2023
Federated learning is a framework in machine learning that allows for training a model while maintaining data privacy. Moreover, it allows clients with their own data to collaborate in order to build a stronger, shared model. Federated learning is of particular interest to healthcare data, since it is of the utmost importance to respect patient privacy while still building useful diagnostic tools. However, healthcare data can be complicated — data format might differ across providers, leading to unexpected inputs and incompatibility between different providers. For example, electrocardiograms might differ in sampling rate or number of leads used, meaning that a classifier trained at one hospital might be useless to another. We propose using autoencoders to address this problem, transforming important information contained in electrocardiograms to a uniform input, where federated learning can then be used to train a strong classifier for multiple healthcare providers. Furthermore, we propose using statistically-guided hyperparameter tuning to ensure fast convergence of the model.
|
16 |
Neural Network-based Anomaly Detection Models and Interpretability Methods for Multivariate Time Series Data
Prasad, Deepthy, Hampapura Sripada, Swathi January 2023
Anomaly detection plays a crucial role in various domains, such as transportation, cybersecurity, and industrial monitoring, where the timely identification of unusual patterns or outliers is of utmost importance. Traditional statistical techniques have limitations in handling complex and high-dimensional data, which motivates the use of deep learning approaches. The project proposes designing and implementing deep neural networks, tailored explicitly for multivariate time series data from sensors incorporated in vehicles, to effectively capture intricate temporal dependencies and interactions among variables. As this project is conducted in collaboration with Scania, Sweden, the models are trained on datasets encompassing various vehicle sensor data. Different deep learning architectures, including Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs), are explored and compared to identify the most suitable model for anomaly detection tasks for the specified time series data, and the CNN was found to perform well on the data used in the study. Furthermore, interpretability techniques are incorporated into the developed models to enhance their transparency and provide insights into the reasons behind detected anomalies. Interpretability is crucial in real-world applications to facilitate trust, understanding, and decision-making. Both model-agnostic and model-specific interpretability methods were employed to highlight the relevant features and contribute to the interpretability of the anomaly detection models. The performance of the proposed models is evaluated using test datasets with anomaly data, and comparisons are made against existing anomaly detection methods to demonstrate their effectiveness. Evaluation metrics such as precision, recall, false positive rate, F1 score, and composite F1 score are employed to assess the anomaly detection models' detection accuracy and robustness.
For evaluating the interpretability method, the Kolmogorov-Smirnov test is used on counterfactual examples. The outcomes of this research project will contribute to developing advanced anomaly detection techniques that can effectively analyse multivariate time series data collected from sensors incorporated in vehicles. Incorporating interpretability techniques will provide valuable insights into the detected anomalies, enabling better decision-making and improved trust in the deployed models. These advancements can potentially enhance anomaly detection systems across various domains, leading to more reliable and secure operations.
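The point-wise evaluation metrics named in this abstract have standard definitions, sketched below for binary anomaly labels (1 = anomaly, 0 = normal). The function name detection_metrics and the toy labels are illustrative only; the composite F1 score is left out because the abstract does not define it.

```python
def detection_metrics(y_true, y_pred):
    """Precision, recall, false positive rate and F1 for binary anomaly labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Each ratio falls back to 0.0 when its denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "false_positive_rate": fpr, "f1": f1}

metrics = detection_metrics([1, 1, 0, 0, 0, 1], [1, 0, 1, 0, 0, 1])
print(metrics)
```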
|
17 |
Socially Connected Internet-of-things Devices for Crowd Management Systems
Hamrouni, Aymen 04 May 2023
Autonomously monitoring and analyzing the behavior of the crowd is an open research topic in the transportation field because of its criticality to the safety of people. Real-time identification, tracking, and prediction of crowd behavior are essential to ensuring smooth crowd management operations and the welfare of the public in many public areas, such as public transport stations and streets. That said, enabling such systems is not a straightforward procedure. First, the complexity brought by the interaction and fusion from individual to group needs to be assessed and analyzed. Second, the classification of these actions might be useful in identifying danger and avoiding any undesirable consequences. The adoption of the Internet-of-things (IoT) in such systems has made it possible to gather a large amount of data. However, it raises diverse compatibility and trustworthiness challenges, among others, hindering the use of conventional service discovery and network navigability processes for enabling crowd management systems. In fact, as the IoT network is known for its highly dynamic topology and frequently changing characteristics (e.g., the devices' status, such as availability, battery capacity, and memory usage), traditional methods fail to learn and understand the evolving behavior of the network so as to enable real-time and context-aware service discovery to assign and select relevant IoT devices for monitoring and managing the crowd. In large-scale IoT networks, crowd management systems usually collect large data streams of images from different heterogeneous sources (e.g., CCTVs, IoT devices, or people with their smartphones) in an inadvertent way.
Due to the limitations and challenges related to communication bandwidth, storage, and processing capabilities, it is unwise to transfer unselectively all the collected images since some of these images either contain duplicate information, are inaccurate, or might be falsely submitted by end-users; hence, a filtering and quality check mechanism must be put in place. As images can only provide limited information about the crowd by capturing only a snapshot of the scene at a specific point in time with limited context, an extension to deal with videos to enable efficient analysis such as crowd tracking and identification is essential for the success of crowd management systems.
In this thesis, we propose to design a smart image enhancement and quality control system for resource pooling and allocation in the Internet-of-Things applied to crowd management systems. We first rely on the Social IoT (SIoT) concept, which defines the relationships among the connected objects, to extract accurate information about the network and enable trustworthy and context-aware service exchange and resource allocation. We investigate the service discovery process in SIoT networks and essentially focus on graph-based techniques while overviewing their utilization in SIoT and discussing their advantages. We also propose an alternative to these scalable methods by introducing a low-complexity context-aware Graph Neural Network (GNN) approach to enable rapid and dynamic service discovery in a large-scale heterogeneous IoT network to enable efficient crowd management systems. Secondly, we propose to design a smart image selection procedure using an asymmetric multi-modal neural network autoencoder to select a subset of photos with high utility coverage for multiple incoming streams in the IoT network. The proposed architecture enables the selection of high-context data from an evolving picture stream and ensures relevance while discarding images that are irrelevant or falsely submitted by smartphones, for example. The approach uses the photo's metadata, such as geolocation and timestamps, along with the pictures' semantics to decide which photos can be submitted and which ones must be discarded. To extend our framework beyond just images and deal with real-time videos, we propose a transformer-based crowd management monitoring framework called V3Trans-Crowd that captures information from video data and extracts meaningful output to categorize the crowd's behavior. The proposed 3D Video Transformer is inspired by Video Swin Transformer/ViViT and provides an improved hierarchical transformer for multi-modal tasks with spatial and temporal fusion layers.
Our simulations show that due to its ability to embed the devices' features and relations, the GNN is capable of providing more concise clusters compared to traditional techniques, allowing for better IoT network learning and understanding. Moreover, we show that the GNN approach speeds up the service lookup search space and outperforms the traditional graph-based techniques to select suitable IoT devices for reporting and monitoring. Simulation results for three different multi-modal autoencoder architectures indicate that a hierarchical asymmetric autoencoder approach can yield better results, outperforming the mixed asymmetric autoencoder and a concatenated input autoencoder, while leveraging user-side rendering to reduce bandwidth consumption and computational overhead. Also, performance evaluation for the proposed V3Trans-Crowd model has shown great results in terms of accuracy for crowd behavior classification compared to state-of-the-art methods such as C3D pre-trained, I3D pre-trained, and ResNet 3D pre-trained on the Crowd-11 and MED datasets.
|
18 |
Towards Latent Space Disentanglement of Variational AutoEncoders for Language
García de Herreros García, Paloma January 2022
Variational autoencoders (VAEs) are a neural network architecture broadly used in image generation (Doersch 2016). VAEs are neural network models that encode data from some domain and project it into a latent space (Doersch 2016). In doing so, the resulting encoding space goes from being a discrete distribution of vectors to a series of continuous manifolds. The latent space is subject to a Gaussian prior, giving the space some convenient properties for the distribution of said manifolds. Several strategies have been presented to try to disentangle said latent space to force each of its dimensions to have an interpretable meaning, for example, β-VAE, Factor-VAE, and β-TCVAE. In this thesis, some previous VAE models for Natural Language Processing are combined with these disentangling techniques to see whether any understandable meaning can be found in the associated dimensions. These models are those of Park and Lee (2021), who fine-tune pretrained transformer models so that they behave as VAEs, and Bowman et al. (2015), who use a recurrent neural network language model to create a VAE that generates sentences in the continuous latent space. The obtained results indicate that the techniques cannot be applied to text-based data without causing the model to suffer from posterior collapse.
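For reference, the β-VAE objective named in this abstract can be written as follows. This is the standard formulation from the β-VAE literature rather than anything specific to this thesis, and the symbols (encoder q_φ, decoder p_θ, prior p(z)) are notation assumed here.

```latex
\mathcal{L}_{\beta}(\theta, \phi; x)
  = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \beta \, D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big),
\qquad p(z) = \mathcal{N}(0, I)
```

With β = 1 this is the usual VAE evidence lower bound, and β > 1 weights the KL term more heavily. Posterior collapse corresponds to the KL term going to zero, so that q_φ(z | x) ≈ p(z) and the latent code carries little information about x.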
|
19 |
An Investigation of Neural Network Structure with Topological Data Analysis / En undersökning av neuronnätsstruktur med topologisk dataanalys
Polianskii, Vladislav January 2018
Artificial neural networks are currently gaining notable popularity and show astounding results in many machine learning tasks. This, however, also has the drawback that understanding of the processes happening inside learning algorithms decreases. In many cases, the process of choosing a neural network architecture for a problem comes down to selection of network layers by intuition and to manual tuning of network parameters. Therefore, it is important to build a strong theoretical base in this area, both to try to reduce the amount of manual work in the future and to get a better understanding of the capabilities of neural networks. In this master's thesis, the ideas of applying different topological and geometric methods for the analysis of neural networks were investigated. Despite the difficulties which arise from the novelty of the approach, such as the limited number of related studies, some promising methods of network analysis were established and tested on baseline machine learning datasets. One of the most notable results of the study reveals how neural networks preserve topological features of the data when it is projected into a space with low dimensionality. For example, the persistence for the MNIST dataset with added rotations of images is preserved after projection into 3D space with the use of simple autoencoders; on the other hand, autoencoders with a relatively high weight regularization parameter might lose this ability. / Artificial neural networks have currently gained notable popularity and show astonishing results in many machine learning tasks. However, this also brings the drawback that the understanding of the processes taking place inside the learning algorithms decreases. In many cases, one has to rely on intuition and tune parameters manually when choosing a network architecture. It is therefore important to build a strong theoretical base in this area, both to try to reduce manual work in the future and to gain a better understanding of the capacity of neural networks. In this thesis, the ideas of applying different topological and geometric methods to the analysis of neural networks were investigated. Many of the difficulties stem from the novelty of the approach, such as a limited number of related studies, but some promising network analysis methods were established and tested on standard machine learning datasets. One of the most notable results of the thesis shows how neural networks preserve the topological properties of data when it is projected into a low-dimensional vector space. For example, the topological persistence of the MNIST dataset with added image rotations is preserved after projection into a three-dimensional vector space using a basic autoencoder; on the other hand, autoencoders with a relatively high weight regularization parameter may lose this property.
|
20 |
Quantifying uncertainty in structural condition with Bayesian deep learning : A study on the Z-24 bridge benchmark / Kvantifiering av osäkerhet i strukturella tillstånd med Bayesiansk djupinlärning
Asgrimsson, David Steinar January 2019
A machine learning approach to damage detection is presented for a bridge structural health monitoring system, validated on the renowned Z-24 bridge benchmark dataset where a sensor-instrumented, three-span bridge was realistically damaged in stages. A Bayesian autoencoder neural network is trained to reconstruct raw sensor data sequences, with uncertainty bounds in prediction. The reconstruction error is then compared with a healthy-state error distribution and the sequence is determined to come from a healthy state or not. Several realistic damage stages were successfully detected, making this a viable approach in a data-based monitoring system of an operational bridge. This is a fully operational, machine-learning-based bridge damage detection system that is learned directly from raw sensor data. / A machine learning method for structural damage detection in bridges is presented. The method is validated on the well-known Z-24 reference dataset, in which a sensor-instrumented three-span bridge was damaged in stages. A Bayesian autoencoder neural network is trained to reconstruct raw sensor data sequences, with uncertainty bounds on the prediction. The reconstruction error is compared with the error distribution in the undamaged state, and the sequence is judged to come from a damaged or undamaged state. Several realistic staged damage states were detected, which makes the method usable in a data-based damage detection system for a full-scale bridge. This is a promising step towards a fully operational data-based damage detection system.
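The decision step described here, comparing a reconstruction error against a healthy-state error distribution, can be sketched as a quantile threshold. The abstract does not say how the comparison is made, so the 99% quantile, the names fit_threshold and is_damaged, and the synthetic error values are all assumptions; the Bayesian uncertainty bounds are not modelled.

```python
import numpy as np

def fit_threshold(healthy_errors, quantile=0.99):
    """Threshold taken as a high quantile of healthy-state reconstruction errors."""
    return float(np.quantile(healthy_errors, quantile))

def is_damaged(error, threshold):
    """Flag a sequence whose reconstruction error exceeds the healthy threshold."""
    return bool(error > threshold)

# Synthetic healthy-state errors, evenly spread between 0 and 1.
healthy_errors = np.linspace(0.0, 1.0, 101)
threshold = fit_threshold(healthy_errors)
print(threshold)
print(is_damaged(0.5, threshold), is_damaged(1.5, threshold))
```

In a real system the healthy errors would come from the trained autoencoder applied to held-out data recorded before any damage was introduced.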
|