• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 12
  • Tagged with
  • 14
  • 14
  • 11
  • 11
  • 10
  • 8
  • 7
  • 6
  • 6
  • 5
  • 5
  • 4
  • 4
  • 3
  • 3
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Towards Latent Space Disentanglement of Variational AutoEncoders for Language

García de Herreros García, Paloma January 2022 (has links)
Variational autoencoders (VAEs) are a neural network architecture broadly used in image generation (Doersch 2016). VAEs are neural network models that encode data from some domain and project it into a latent space (Doersch 2016). In doing so, the resulting encoding space goes from being a discrete distribution of vectors to a series of continuous manifolds. The latent space is subject to a Gaussian prior, giving the space some convenient properties for the distribution of said manifolds. Several strategies have been presented to try to disentangle said latent space to force each of their dimensions to have an interpretable meaning, for example, 𝛽-VAE, Factor-VAE, 𝛽-TCVAE. In this thesis, some previous VAE models for NaturalLanguage Processing (like Park and Lee (2021), and Bowman et al. (2015), where they finetune pretrained transformer models so they behave as VAEs, and where they used recurrent neural network language model to create a VAEs model that generates sentences in the continuous latent space, respectively) are combined with these disentangling techniques, to show if we can find any understandable meaning in the associated dimensions. The obtained results indicate that the techniques cannot be applied to text-based data without causing the model to suffer from posterior collapse.
2

Sign of the Times : Unmasking Deep Learning for Time Series Anomaly Detection / Skyltarna på Tiden : Avslöjande av djupinlärning för detektering av anomalier i tidsserier

Richards Ravi Arputharaj, Daniel January 2023 (has links)
Time series anomaly detection has been a longstanding area of research with applications across various domains. In recent years, there has been a surge of interest in applying deep learning models to this problem domain. This thesis presents a critical examination of the efficacy of deep learning models in comparison to classical approaches for time series anomaly detection. Contrary to the widespread belief in the superiority of deep learning models, our research findings suggest that their performance may be misleading and the progress illusory. Through rigorous experimentation and evaluation, we reveal that classical models outperform deep learning counterparts in various scenarios, challenging the prevailing assumptions. In addition to model performance, our study delves into the intricacies of evaluation metrics commonly employed in time series anomaly detection. We uncover how it inadvertently inflates the performance scores of models, potentially leading to misleading conclusions. By identifying and addressing these issues, our research contributes to providing valuable insights for researchers, practitioners, and decision-makers in the field of time series anomaly detection, encouraging a critical reevaluation of the role of deep learning models and the metrics used to assess their performance. / Tidsperiods avvikelsedetektering har varit ett långvarigt forskningsområde med tillämpningar inom olika områden. Under de senaste åren har det uppstått ett ökat intresse för att tillämpa djupinlärningsmodeller på detta problemområde. Denna avhandling presenterar en kritisk granskning av djupinlärningsmodellers effektivitet jämfört med klassiska metoder för tidsperiods avvikelsedetektering. I motsats till den allmänna övertygelsen om överlägsenheten hos djupinlärningsmodeller tyder våra forskningsresultat på att deras prestanda kan vara vilseledande och framsteg illusoriskt. Genom rigorös experimentell utvärdering avslöjar vi att klassiska modeller överträffar djupinlärningsalternativ i olika scenarier och därmed utmanar de rådande antagandena. Utöver modellprestanda går vår studie in på detaljerna kring utvärderings-metoder som oftast används inom tidsperiods avvikelsedetektering. Vi avslöjar hur dessa oavsiktligt överdriver modellernas prestandapoäng och kan därmed leda till vilseledande slutsatser. Genom att identifiera och åtgärda dessa problem bidrar vår forskning till att erbjuda värdefulla insikter för forskare, praktiker och beslutsfattare inom området tidsperiods avvikelsedetektering, och uppmanar till en kritisk omvärdering av djupinlärningsmodellers roll och de metoder som används för att bedöma deras prestanda.
3

A New Approach to Synthetic Image Evaluation

Memari, Majid 01 December 2023 (has links) (PDF)
This study is dedicated to enhancing the effectiveness of Optical Character Recognition (OCR) systems, with a special emphasis on Arabic handwritten digit recognition. The choice to focus on Arabic handwritten digits is twofold: first, there has been relatively less research conducted in this area compared to its English counterparts; second, the recognition of Arabic handwritten digits presents more challenges due to the inherent similarities between different Arabic digits.OCR systems, engineered to decipher both printed and handwritten text, often face difficulties in accurately identifying low-quality or distorted handwritten text. The quality of the input image and the complexity of the text significantly influence their performance. However, data augmentation strategies can notably improve these systems' performance. These strategies generate new images that closely resemble the original ones, albeit with minor variations, thereby enriching the model's learning and enhancing its adaptability. The research found Conditional Variational Autoencoders (C-VAE) and Conditional Generative Adversarial Networks (C-GAN) to be particularly effective in this context. These two generative models stand out due to their superior image generation and feature extraction capabilities. A significant contribution of the study has been the formulation of the Synthetic Image Evaluation Procedure, a systematic approach designed to evaluate and amplify the generative models' image generation abilities. This procedure facilitates the extraction of meaningful features, computation of the Fréchet Inception Distance (LFID) score, and supports hyper-parameter optimization and model modifications.
4

Multivariate analysis of the parameters in a handwritten digit recognition LSTM system / Multivariat analys av parametrarna i ett LSTM-system för igenkänning av handskrivna siffror

Zervakis, Georgios January 2019 (has links)
Throughout this project, we perform a multivariate analysis of the parameters of a long short-term memory (LSTM) system for handwritten digit recognition in order to understand the model’s behaviour. In particular, we are interested in explaining how this behaviour precipitate from its parameters, and what in the network is responsible for the model arriving at a certain decision. This problem is often referred to as the interpretability problem, and falls under scope of Explainable AI (XAI). The motivation is to make AI systems more transparent, so that we can establish trust between humans. For this purpose, we make use of the MNIST dataset, which has been successfully used in the past for tackling digit recognition problem. Moreover, the balance and the simplicity of the data makes it an appropriate dataset for carrying out this research. We start by investigating the linear output layer of the LSTM, which is directly associated with the models’ predictions. The analysis includes several experiments, where we apply various methods from linear algebra such as principal component analysis (PCA) and singular value decomposition (SVD), to interpret the parameters of the network. For example, we experiment with different setups of low-rank approximations of the weight output matrix, in order to see the importance of each singular vector for each class of the digits. We found out that cutting off the fifth left and right singular vectors the model practically losses its ability to predict eights. Finally, we present a framework for analysing the parameters of the hidden layer, along with our implementation of an LSTM based variational autoencoder that serves this purpose. / I det här projektet utför vi en multivariatanalys av parametrarna för ett long short-term memory system (LSTM) för igenkänning av handskrivna siffror för att förstå modellens beteende. Vi är särskilt intresserade av att förklara hur detta uppträdande kommer ur parametrarna, och vad i nätverket som ligger bakom den modell som kommer fram till ett visst beslut. Detta problem kallas ofta för interpretability problem och omfattas av förklarlig AI (XAI). Motiveringen är att göra AI-systemen öppnare, så att vi kan skapa förtroende mellan människor. I detta syfte använder vi MNIST-datamängden, som tidigare framgångsrikt har använts för att ta itu med problemet med igenkänning av siffror. Dessutom gör balansen och enkelheten i uppgifterna det till en lämplig uppsättning uppgifter för att utföra denna forskning. Vi börjar med att undersöka det linjära utdatalagret i LSTM, som är direkt kopplat till modellernas förutsägelser. Analysen omfattar flera experiment, där vi använder olika metoder från linjär algebra, som principalkomponentanalys (PCA) och singulärvärdesfaktorisering (SVD), för att tolka nätverkets parametrar. Vi experimenterar till exempel med olika uppsättningar av lågrangordnade approximationer av viktutmatrisen för att se vikten av varje enskild vektor för varje klass av siffrorna. Vi upptäckte att om man skär av den femte vänster och högervektorn förlorar modellen praktiskt taget sin förmåga att förutsäga siffran åtta. Slutligen lägger vi fram ett ramverk för analys av parametrarna för det dolda lagret, tillsammans med vårt genomförande av en LSTM-baserad variational autoencoder som tjänar detta syfte.
5

Deep Synthetic Noise Generation for RGB-D Data Augmentation

Hammond, Patrick Douglas 01 June 2019 (has links)
Considerable effort has been devoted to finding reliable methods of correcting noisy RGB-D images captured with unreliable depth-sensing technologies. Supervised neural networks have been shown to be capable of RGB-D image correction, but require copious amounts of carefully-corrected ground-truth data to train effectively. Data collection is laborious and time-intensive, especially for large datasets, and generation of ground-truth training data tends to be subject to human error. It might be possible to train an effective method on a relatively smaller dataset using synthetically damaged depth-data as input to the network, but this requires some understanding of the latent noise distribution of the respective camera. It is possible to augment datasets to a certain degree using naive noise generation, such as random dropout or Gaussian noise, but these tend to generalize poorly to real data. A superior method would imitate real camera noise to damage input depth images realistically so that the network is able to learn to correct the appropriate depth-noise distribution.We propose a novel noise-generating CNN capable of producing realistic noise customized to a variety of different depth-noise distributions. In order to demonstrate the effects of synthetic augmentation, we also contribute a large novel RGB-D dataset captured with the Intel RealSense D415 and D435 depth cameras. This dataset pairs many examples of noisy depth images with automatically completed RGB-D images, which we use as proxy for ground-truth data. We further provide an automated depth-denoising pipeline which may be used to produce proxy ground-truth data for novel datasets. We train a modified sparse-to-dense depth-completion network on splits of varying size from our dataset to determine reasonable baselines for improvement. We determine through these tests that adding more noisy depth frames to each RGB-D image in the training set has a nearly identical impact on depth-completion training as gathering more ground-truth data. We leverage these findings to produce additional synthetic noisy depth images for each RGB-D image in our baseline training sets using our noise-generating CNN. Through use of our augmentation method, it is possible to achieve greater than 50% error reduction on supervised depth-completion training, even for small datasets.
6

Prediction of Persistence to Treatment for Patients with Rheumatoid Arthritis using Deep Learning / Prediktion av behandlingspersistens för patienter med Reumatoid Artrit med djupinlärning

Arda Yilal, Serkan January 2023 (has links)
Rheumatoid Arthritis is an inflammatory joint disease that is one of the most common autoimmune diseases in the world. The treatment usually starts with a first-line treatment called Methotrexate, but it is often insufficient. One of the most common second-line treatments is Tumor Necrosis Factor inhibitors (TNFi). Although some patients respond to TNFi, it has a risk of side effects, including infections. Hence, ability to predict patient responses to TNFi becomes important to choose the correct treatment. This work presents a new approach to predict if the patients were still on TNFi, 1 year after they started, by using a generative neural network architecture called Variational Autoencoder (VAE). We combined a VAE and a classifier neural network to create a supervised learning model called Supervised VAE (SVAE), trained on two versions of a tabular dataset containing Swedish register data. The datasets consist of 7341 patient records, and our SVAE achieved an AUROC score of 0.615 on validation data. Nevertheless, compared to machine learning models previously used for the same prediction task, SVAE achieved higher scores than decision trees and elastic net but lower scores than random forest and gradient-boosted decision tree. Despite the regularization effect that VAEs provide during classification training, the scores achieved by the SVAEs tested during this thesis were lower than the acceptable discrimination level. / Reumatoid artrit är en inflammatorisk ledsjukdom och är en av de vanligaste autoimmuna sjukdomarna i världen. Medicinsk behandling börjar ofta med Metotrexat. Vid brist på respons så fortsätter behandlingen ofta med Tumor Necrosis Inhibitors (TNFi). På grund av biverkningar av TNFi, såsom ökad risk för infektioner, är det viktigt att kunna prediktera patienters respons på behandlingen. Här presenteras ett nytt sätt att prediktera om patienter fortfarande stod på TNFi ett år efter initiering. Vi kombinerade Variational Autoencoder (VAE), ett generativt neuralt nätverk, med ett klassificeringsnätverk för att skapa en övervakad inlärningsmodell kallad Supervised VAE (SVAE). Denna tränades på två versioner av svenska registerdata, vilka innehöll information om 7341 patienter i tabellform. Vår SVAE-modell uppnådde 0,615 AUROC på valideringsdata. I jämförelse med maskininlärningsmodeller som tidigare använts för samma prediktionsuppgift uppnådde SVAE högre poäng än Decision Tree och Elastic Net men lägre poäng än Random Forest och Gradient-Boosted Decision Tree. Trots regulariseringseffekten som VAE ger under träning så var poängen som de testade SVAEmodellerna uppnår lägre än den acceptabla diskrimineringsnivån.
7

Cognitively Guided Modeling of Visual Perception in Intelligent Vehicles

Plebe, Alice 20 April 2021 (has links)
This work proposes a strategy for visual perception in the context of autonomous driving. Despite the growing research aiming to implement self-driving cars, no artificial system can claim to have reached the driving performance of a human, yet. Humans---when not distracted or drunk---are still the best drivers you can currently find. Hence, the theories about the human mind and its neural organization could reveal precious insights on how to design a better autonomous driving agent. This dissertation focuses specifically on the perceptual aspect of driving, and it takes inspiration from four key theories on how the human brain achieves the cognitive capabilities required by the activity of driving. The first idea lies at the foundation of current cognitive science, and it argues that thinking nearly always involves some sort of mental simulation, which takes the form of imagery when dealing with visual perception. The second theory explains how the perceptual simulation takes place in neural circuits called convergence-divergence zones, which expand and compress information to extract abstract concepts from visual experience and code them into compact representations. The third theory highlights that perception---when specialized for a complex task as driving---is refined by experience in a process called perceptual learning. The fourth theory, namely the free-energy principle of predictive brains, corroborates the role of visual imagination as a fundamental mechanism of inference. In order to implement these theoretical principles, it is necessary to identify the most appropriate computational tools currently available. Within the consolidated and successful field of deep learning, I select the artificial architectures and strategies that manifest a sounding resemblance with their cognitive counterparts. Specifically, convolutional autoencoders have a strong correspondence with the architecture of convergence-divergence zones and the process of perceptual abstraction. The free-energy principle of predictive brains is related to variational Bayesian inference and the use of recurrent neural networks. In fact, this principle can be translated into a training procedure that learns abstract representations predisposed to predicting how the current road scenario will change in the future. The main contribution of this dissertation is a method to learn conceptual representations of the driving scenario from visual information. This approach forces a semantic internal organization, in the sense that distinct parts of the representation are explicitly associated to specific concepts useful in the context of driving. Specifically, the model uses as few as 16 neurons for each of the two basic concepts here considered: vehicles and lanes. At the same time, the approach biases the internal representations towards the ability to predict the dynamics of objects in the scene. This property of temporal coherence allows the representations to be exploited to predict plausible future scenarios and to perform a simplified form of mental imagery. In addition, this work includes a proposal to tackle the problem of opaqueness affecting deep neural networks. I present a method that aims to mitigate this issue, in the context of longitudinal control for automated vehicles. A further contribution of this dissertation experiments with higher-level spaces of prediction, such as occupancy grids, which could conciliate between the direct application to motor controls and the biological plausibility.
8

Deep Time: Deep Learning Extensions to Time Series Factor Analysis with Applications to Uncertainty Quantification in Economic and Financial Modeling

Miller, Dawson Jon 12 September 2022 (has links)
This thesis establishes methods to quantify and explain uncertainty through high-order moments in time series data, along with first principal-based improvements on the standard autoencoder and variational autoencoder. While the first-principal improvements on the standard variational autoencoder provide additional means of explainability, we ultimately look to non-variational methods for quantifying uncertainty under the autoencoder framework. We utilize Shannon's differential entropy to accomplish the task of uncertainty quantification in a general nonlinear and non-Gaussian setting. Together with previously established connections between autoencoders and principal component analysis, we motivate the focus on differential entropy as a proper abstraction of principal component analysis to this more general framework, where nonlinear and non-Gaussian characteristics in the data are permitted. Furthermore, we are able to establish explicit connections between high-order moments in the data to those in the latent space, which induce a natural latent space decomposition, and by extension, an explanation of the estimated uncertainty. The proposed methods are intended to be utilized in economic and financial factor models in state space form, building on recent developments in the application of neural networks to factor models with applications to financial and economic time series analysis. Finally, we demonstrate the efficacy of the proposed methods on high frequency hourly foreign exchange rates, macroeconomic signals, and synthetically generated autoregressive data sets. / Master of Science / This thesis establishes methods to quantify and explain uncertainty in time series data, along with improvements on some latent variable neural networks called autoencoders and variational autoencoders. Autoencoders and varitational autoencodes are called latent variable neural networks since they can estimate a representation of the data that has less dimension than the original data. These neural network architectures have a fundamental connection to a classical latent variable method called principal component analysis, which performs a similar task of dimension reduction but under more restrictive assumptions than autoencoders and variational autoencoders. In contrast to principal component analysis, a common ailment of neural networks is the lack of explainability, which accounts for the colloquial term black-box models. While the improvements on the standard autoencoders and variational autoencoders help with the problem of explainability, we ultimately look to alternative probabilistic methods for quantifying uncertainty. To accomplish this task, we focus on Shannon's differential entropy, which is entropy applied to continuous domains such as time series data. Entropy is intricately connected to the notion of uncertainty, since it depends on the amount of randomness in the data. Together with previously established connections between autoencoders and principal component analysis, we motivate the focus on differential entropy as a proper abstraction of principal component analysis to a general framework that does not require the restrictive assumptions of principal component analysis. Furthermore, we are able to establish explicit connections between high-order moments in the data to the estimated latent variables (i.e., the reduced dimension representation of the data). Estimating high-order moments allows for a more accurate estimation of the true distribution of the data. By connecting the estimated high-order moments in the data to the latent variables, we obtain a natural decomposition of the uncertainty surrounding the latent variables, which allows for increased explainability of the proposed autoencoder. The methods introduced in this thesis are intended to be utilized in a class of economic and financial models called factor models, which are frequently used in policy and investment analysis. A factor model is another type of latent variable model, which in addition to estimating a reduced dimension representation of the data, provides a means to forecast future observations. Finally, we demonstrate the efficacy of the proposed methods on high frequency hourly foreign exchange rates, macroeconomic signals, and synthetically generated autoregressive data sets. The results support the superiority of the entropy-based autoencoder to the standard variational autoencoder both in capability and computational expense.
9

Image generation through feature extraction and learning using a deep learning approach

Bruneel, Tibo January 2023 (has links)
With recent advancements, image generation has become more and more possible with the introduction of stronger generative artificial intelligence (AI) models. The idea and ability of generating non-existing images that highly resemble real world images is interesting for many use cases. Generated images could be used, for example, to augment, extend or replace real data sets for training AI models, therefore being capable of minimising costs on data collection and similar processes. Deep learning, a sub-field within the AI field has been on the forefront of such methodologies due to its nature of being able to capture and learn highly complex and feature-rich data. This work focuses on deep generative learning approaches within a forestry application, with the goal of generating tree log end images in order to enhance an AI model that uses such images. This approach would not only reduce costs of data collection for this model, but also many other information extraction models within the forestry field. This thesis study includes research on the state of the art within deep generative modelling and experiments using a full pipeline from a deep generative modelling stage to a log end recognition model. On top of this, a variant architecture and image sampling algorithm are proposed to add in this pipeline and evaluate its performance. The experiments and findings show that the applied generative model approaches show good feature learning, but lack the high-quality and realistic generation, resulting in more blurry results. The variant approach resulted in slightly better feature learning with a trade-off in generation quality. The proposed sampling algorithm proved to work well on a qualitative basis. The problems found in the generative models propagated further into the training of the recognition model, making the improvement of another AI model based on purely generated data impossible at this point in the research. The results of this research show that more work is needed on improving the application and generation quality to make it resemble real world data more, so that other models can be trained on artificial data. The variant approach does not improve much and its findings contribute to the field by proving its strengths and weaknesses, as with the proposed image sampling algorithm. At last this study provides a good starting point for research within this application, with many different directions and opportunities for future work.
10

Navigating the Metric Zoo: Towards a More Coherent Model For Quantitative Evaluation of Generative ML Models

Dozier, Robbie 26 August 2022 (has links)
No description available.

Page generated in 0.1264 seconds