  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Machine Learning for Disease Prediction

Frandsen, Abraham Jacob 01 June 2016
Millions of people in the United States alone suffer from undiagnosed or late-diagnosed chronic diseases such as Chronic Kidney Disease and Type II Diabetes. Catching these diseases earlier facilitates preventive healthcare interventions, which in turn can lead to tremendous cost savings and improved health outcomes. We develop algorithms for predicting disease occurrence by drawing from ideas and techniques in the field of machine learning. We explore standard classification methods such as logistic regression and random forest, as well as more sophisticated sequence models, including recurrent neural networks. We focus especially on the use of medical code data for disease prediction, and explore different ways for representing such data in our prediction algorithms.
62

Feature Fusion Deep Learning Method for Video and Audio Based Emotion Recognition

Yanan Song (11825003) 20 December 2021
In this thesis, we propose a deep learning based emotion recognition system to improve the classification success rate. We first use transfer learning to extract visual features and Mel-frequency cepstral coefficients (MFCC) to extract audio features, and then apply recurrent neural networks (RNN) with an attention mechanism to process the sequential inputs. The outputs of both channels are then fused in a concatenation layer, which is followed by batch normalization to reduce internal covariate shift. Finally, the classification result is obtained from the softmax layer. In our experiments on the RAVDESS dataset with eight emotion classes, the video and audio subsystems achieve 78% and 77% accuracy respectively, and the feature fusion system combining video and audio achieves 92% accuracy. Our proposed feature fusion system outperforms conventional methods in classification accuracy.
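The fusion step described above (concatenate, batch-normalize, softmax) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the feature sizes, the two-class output, the `weights` and `biases` values, and the function names are all hypothetical, and the batch normalization here omits the learned scale and shift that a trained layer would have.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance across the batch."""
    n = len(batch)
    dims = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
            for row in batch]

def softmax(logits):
    """Turn a list of logits into probabilities that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_and_classify(video_feats, audio_feats, weights, biases):
    """Concatenate per-sample features, batch-normalize, then apply a linear softmax layer."""
    fused = [v + a for v, a in zip(video_feats, audio_feats)]  # list concatenation
    normed = batch_norm(fused)
    probs = []
    for x in normed:
        logits = [sum(w * xi for w, xi in zip(w_row, x)) + b
                  for w_row, b in zip(weights, biases)]
        probs.append(softmax(logits))
    return probs

# Toy batch of three samples: 2 video features and 1 audio feature each (made-up values).
video = [[0.2, 1.0], [0.4, 0.0], [0.9, 0.5]]
audio = [[1.5], [0.5], [1.0]]
weights = [[0.1, -0.2, 0.3], [0.0, 0.4, -0.1]]  # hypothetical: 2 classes x 3 fused features
biases = [0.0, 0.1]
probs = fuse_and_classify(video, audio, weights, biases)
```

Each entry of `probs` is one sample's class distribution. In the system described in the abstract the output layer would cover eight emotion classes and the inputs would come from the two RNN channels.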
63

Translating LaTeX to Coq: A Recurrent Neural Network Approach to Formalizing Natural Language Proofs

Carman, Benjamin Andrew 18 May 2021
No description available.
64

Arrival Time Predictions for Buses using Recurrent Neural Networks / Ankomsttidsprediktioner för bussar med rekurrenta neurala nätverk

Fors Johansson, Christoffer January 2019
In this thesis, two different types of bus passengers are identified. These two types, namely current passengers and passengers-to-be, have different needs in terms of arrival time predictions. A set of machine learning models based on recurrent neural networks and long short-term memory units was developed to meet these needs. Furthermore, bus data from the public transport system in Östergötland county, Sweden, were collected and used for training new machine learning models. These new models are compared with the prediction system currently used to provide passengers with arrival time information. The models proposed in this thesis use a sequence of time steps as input and the observed arrival time as output. Each input time step contains information about the current state, such as the time of arrival, the departure time from the very first stop and the current position in Cartesian coordinates. The target value for each input is the arrival time at the next time step. To predict the rest of the trip, the prediction for the next step is simply used as input in the next time step. The results show that the proposed models improve the mean absolute error per stop by between 7.2% and 40.9% compared to the system used today on all eight routes tested. Furthermore, the choice of loss function yields models that can meet the identified passengers' needs by trading average prediction accuracy for a certainty that predictions do not overestimate or underestimate the target time in approximately 95% of the cases.
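The rollout described above, where each prediction is fed back in as the next input, might look like the sketch below. The state layout follows the abstract (arrival time, departure time from the first stop, Cartesian position), but `toy_step_model` is a hypothetical stand-in for the trained recurrent model, and the 90-second and 100-metre increments are invented for illustration.

```python
def rollout(step_model, first_state, n_steps):
    """Predict the rest of a trip by feeding each prediction back in as the next input."""
    state = first_state
    predictions = []
    for _ in range(n_steps):
        state = step_model(state)  # the prediction for step t becomes the input at step t+1
        predictions.append(state)
    return predictions

def toy_step_model(state):
    """Hypothetical stand-in for a trained model: every leg takes 90 s and covers 100 m in x."""
    arrival, first_departure, x, y = state
    return (arrival + 90.0, first_departure, x + 100.0, y)

# State: (arrival time in s, departure time from first stop in s, x in m, y in m).
start = (1000.0, 940.0, 0.0, 0.0)
trip = rollout(toy_step_model, start, 3)
arrivals = [s[0] for s in trip]
```

Because every step consumes the previous prediction, errors can accumulate along the trip, which is one reason per-stop error is a natural metric for this kind of model.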
65

Structure, Dynamics and Self-Organization in Recurrent Neural Networks: From Machine Learning to Theoretical Neuroscience

Vilimelis Aceituno, Pau 03 July 2020
At first glance, artificial neural networks, with engineered learning algorithms and carefully chosen nonlinearities, are nothing like the complicated self-organized spiking neural networks studied by theoretical neuroscientists. Yet both adapt to their inputs, keep information from the past in their state space and are capable of learning, implying that some information processing principles should be common to both. In this thesis we study those principles by incorporating notions of systems theory, statistical physics and graph theory into artificial neural networks and theoretical neuroscience models. The starting point for this thesis is reservoir computing (RC), a learning paradigm used both in machine learning \cite{jaeger2004harnessing} and in theoretical neuroscience \cite{maass2002real}. A neural network in RC consists of two parts: a reservoir (a directed and weighted network of neurons that projects the input time series onto a high-dimensional space) and a readout, which is trained to read the state of the neurons in the reservoir and combine them linearly to give the desired output. In classical RC, the reservoir is randomly initialized and left untrained, which reduces the training costs in comparison to other recurrent neural networks. However, this lack of training implies that reservoirs are not adapted to specific tasks, and thus their performance is often lower than that of other neural networks. Our contribution has been to show how knowledge about a task can be integrated into the reservoir architecture, so that reservoirs can be tailored to specific problems without training. We do this by identifying two features that are useful for machine learning: the memory of the reservoir and its power spectra.
First we show that the correlations between neurons limit the capacity of the reservoir to retain traces of previous inputs, and demonstrate that those correlations are controlled by the moduli of the eigenvalues of the adjacency matrix of the reservoir. Second, we prove that when the reservoir resonates at the frequencies that are present in the desired output signal, the performance of the readout increases. Knowing the features of the reservoir dynamics that we need, the next question is how to impose them. The simplest way to design a network that resonates at a certain frequency is by adding cycles, which act as feedback loops, but this also induces correlations and hence modifies the memory. To disentangle the frequency design from the memory design, we studied how the addition of cycles modifies the eigenvalues of the adjacency matrix of the network. Surprisingly, the shape of the eigenvalue distribution is quite beautiful \cite{aceituno2019universal} and can be characterized using random matrix theory tools. Combining this knowledge with our result relating eigenvalues and correlations, we designed a heuristic that tailors reservoirs to specific tasks and showed that it improves upon state-of-the-art RC in three different machine learning tasks. Although this idea works in the machine learning version of RC, there is one fundamental problem when we try to translate it to the world of theoretical neuroscience: the proposed frequency adaptation requires prior knowledge of the task, which might not be plausible in a biological neural network. The next questions are therefore whether those resonances can emerge by unsupervised learning, and which kind of learning rules would be required. Remarkably, these resonances can be induced by the well-known Spike Time-Dependent Plasticity (STDP) combined with homeostatic mechanisms.
We show this by deriving two self-consistent equations: one where the activity of every neuron can be calculated from its synaptic weights and its external inputs, and a second one where the synaptic weights can be obtained from the neural activity. By considering spatio-temporal symmetries in our inputs we obtained two families of solutions to those equations where a periodic input is enhanced by the neural network after STDP. This approach shows that periodic and quasiperiodic inputs can induce resonances that agree with the aforementioned RC theory. Those results, although rigorous, are expressed in the language of statistical physics and cannot be easily tested or verified in real, scarce data. To make them more accessible to the neuroscience community we showed that latency reduction, a well-known effect of STDP \cite{song2000competitive} which has been experimentally observed \cite{mehta2000experience}, generates neural codes that agree with the self-consistency equations and their solutions. In particular, this analysis shows that metabolic efficiency, synchronization and predictions can emerge from that same phenomenon of latency reduction, thus closing the loop with our original machine learning problem. To summarize, this thesis exposes principles of learning in recurrent neural networks that are consistent with adaptation in the nervous system and also improve current machine learning methods. This is done by leveraging features of the dynamics of recurrent neural networks, such as resonances and correlations, in machine learning problems, then imposing the required dynamics on reservoir computing through control theory notions such as feedback loops and spectral analysis. Then we assessed the plausibility of such adaptation in biological networks, deriving solutions from self-organizing processes that are biologically plausible and align with the machine learning prescriptions.
Finally, we relate those processes to learning rules in biological neurons, showing how small local adaptations of the spike times can lead to neural codes that are efficient and can be interpreted in machine learning terms.
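For readers unfamiliar with reservoir computing, the classical setup described above (a random, untrained reservoir plus a trained linear readout) can be sketched as follows. This is a generic echo-state-style illustration using NumPy, not the thesis's tailored design: the reservoir size, the target eigenvalue modulus of 0.9, the input scaling, the ridge penalty and the one-step recall task are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_steps, washout = 50, 500, 50  # assumed sizes for illustration

# Random, untrained reservoir, rescaled so the largest eigenvalue modulus is 0.9.
W = rng.normal(size=(n_neurons, n_neurons))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-0.5, 0.5, size=n_neurons)

# Drive the reservoir with a scalar input series and record its states.
u = rng.uniform(-1.0, 1.0, size=n_steps)
states = np.zeros((n_steps, n_neurons))
x = np.zeros(n_neurons)
for t in range(n_steps):
    x = np.tanh(W @ x + w_in * u[t])
    states[t] = x

# Toy memory task: the readout must recall the previous input u[t-1].
X = states[washout:]
y = u[washout - 1:n_steps - 1]

# Only the linear readout is trained (ridge regression in closed form).
ridge = 1e-6
w_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_neurons), X.T @ y)
mse = np.mean((X @ w_out - y) ** 2)
```

The eigenvalue rescaling is the only place where this sketch touches the reservoir's spectrum. The thesis goes further by shaping the eigenvalue moduli and adding cycles to control memory and resonance.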
66

Semantic Segmentation of Urban Scene Images Using Recurrent Neural Networks

Daliparthi, Venkata Satya Sai Ajay January 2020
Background: In Autonomous Driving Vehicles, the vehicle receives pixel-wise sensor data from RGB cameras, point-wise depth information from the cameras, and sensor data as input. The computer inside the Autonomous Driving vehicle processes the input data and provides the desired output, such as steering angle, torque, and brake. For the vehicle to make an accurate decision, the computer inside the vehicle should be completely aware of its surroundings and understand each pixel in the driving scene. Semantic Segmentation is the task of assigning a class label (such as Car, Road, Pedestrian, or Sky) to each pixel in the given image. So, a better performing Semantic Segmentation algorithm will contribute to the advancement of the Autonomous Driving field. Research Gap: Traditional methods, such as handcrafted features and feature extraction methods, were mainly used to solve Semantic Segmentation. Since the rise of deep learning, most works use deep learning to deal with Semantic Segmentation. The most commonly used neural network architecture for Semantic Segmentation has been the Convolutional Neural Network (CNN). Even though some works made use of the Recurrent Neural Network (RNN), the effect of RNNs on Semantic Segmentation has not yet been thoroughly studied. Our study addresses this research gap. Idea: After going through the existing literature, we came up with the idea of “Using RNNs as an add-on module, to augment the skip-connections in Semantic Segmentation Networks through residual connections.” Objectives and Method: The main objective of our work is to improve the Semantic Segmentation network’s performance by using RNNs. Experimentation was chosen as the methodology for our study. In our work, we propose three novel architectures called UR-Net, UAR-Net, and DLR-Net by applying our idea to the existing networks U-Net, Attention U-Net, and DeepLabV3+ respectively.
Results and Findings: We empirically showed that our proposed architectures segment edges and boundaries more accurately. Through our study, we found that there is a trade-off between using RNNs and the inference time of the model: if we use RNNs to improve the performance of Semantic Segmentation Networks, we must accept some extra seconds during inference. Conclusion: Our findings will not contribute to the Autonomous Driving field, where better performance is needed in real time. However, our findings will contribute to the advancement of biomedical image segmentation, where doctors can trade off those extra seconds during inference for better performance.
67

Multivariate analysis of the parameters in a handwritten digit recognition LSTM system / Multivariat analys av parametrarna i ett LSTM-system för igenkänning av handskrivna siffror

Zervakis, Georgios January 2019
Throughout this project, we perform a multivariate analysis of the parameters of a long short-term memory (LSTM) system for handwritten digit recognition in order to understand the model’s behaviour. In particular, we are interested in explaining how this behaviour arises from its parameters, and what in the network is responsible for the model arriving at a certain decision. This problem is often referred to as the interpretability problem, and falls under the scope of Explainable AI (XAI). The motivation is to make AI systems more transparent, so that we can establish trust between humans. For this purpose, we make use of the MNIST dataset, which has been successfully used in the past for tackling the digit recognition problem. Moreover, the balance and the simplicity of the data make it an appropriate dataset for carrying out this research. We start by investigating the linear output layer of the LSTM, which is directly associated with the model’s predictions. The analysis includes several experiments, where we apply various methods from linear algebra, such as principal component analysis (PCA) and singular value decomposition (SVD), to interpret the parameters of the network. For example, we experiment with different setups of low-rank approximations of the output weight matrix, in order to see the importance of each singular vector for each class of the digits. We found that when the fifth left and right singular vectors are cut off, the model practically loses its ability to predict eights. Finally, we present a framework for analysing the parameters of the hidden layer, along with our implementation of an LSTM-based variational autoencoder that serves this purpose.
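The low-rank experiment described above, removing one singular component of the output weight matrix and checking which digit classes are affected, can be sketched with NumPy as below. The matrix here is random and merely stands in for trained weights; its 10 x 32 shape, the helper name `drop_component`, and the use of row-norm change as an importance measure are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for trained output weights: 10 digit classes x 32 hidden units.
W_out = rng.normal(size=(10, 32))

U, s, Vt = np.linalg.svd(W_out, full_matrices=False)

def drop_component(U, s, Vt, k):
    """Rebuild the matrix with the k-th singular triplet (0-based) removed."""
    s_kept = s.copy()
    s_kept[k] = 0.0
    return (U * s_kept) @ Vt  # scales column j of U by s_kept[j] before multiplying

# Cut off the fifth left and right singular vectors (index 4).
W_without_fifth = drop_component(U, s, Vt, 4)

# How much each class's weight row changes when the component is removed.
row_change = np.linalg.norm(W_out - W_without_fifth, axis=1)
most_affected_class = int(np.argmax(row_change))
```

With real trained weights, a class whose row changes the most is a candidate for the kind of effect the abstract reports for the digit eight. With the random matrix used here, `most_affected_class` carries no meaning.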
68

Universality and Individuality in Recurrent Networks extended to Biologically inspired networks

Joshi, Nishant January 2020
Activities in the motor cortex are found to be dynamical in nature. Modeling these activities and comparing them with neural recordings helps in understanding the underlying mechanism for their generation. For this purpose, recurrent neural networks (RNNs) have emerged as an appropriate tool. A clear understanding of how the design choices associated with these networks affect the learned dynamics and internal representation still remains elusive. A previous work exploring discrete-time RNN architectures (LSTM, UGRNN, GRU, and Vanilla) found that dynamical properties such as the fixed point topology and the linearised dynamics remain invariant when the networks are trained on the 3-bit Flip-Flop task. In contrast, it showed that these networks have unique representational geometry. The goal of this work is to understand whether these observations also hold for networks that are more biologically realistic in terms of neural activity. Therefore, we chose to analyze rate networks, which have continuous dynamics and biologically realistic connectivity constraints, and spiking neural networks, where the neurons communicate via discrete spikes as observed in the brain. We reproduce the aforementioned study for discrete architectures and then show that the fixed point topology and linearized dynamics remain invariant for the rate networks, but the methods are insufficient for finding the fixed points of spiking networks. The representational geometry of the rate networks and spiking networks is found to be different from that of the discrete architectures, but the two are very similar to each other, although a small subset of discrete architectures (LSTM) is observed to be close in representation to the rate networks. We show that although these different network architectures with varying degrees of biological realism have individual internal representations, the underlying dynamics while performing the task are universal.
We also observe that some discrete networks have close representational similarities with rate networks, along with the dynamics. Hence, these discrete networks can be good candidates for reproducing and examining the dynamics of rate networks.
69

Application of probabilistic deep learning models to simulate thermal power plant processes

Raidoo, Renita Anand 18 April 2023
Deep learning has gained traction in thermal engineering due to its applications to process simulations, the deeper insights it can provide and its ability to circumvent the shortcomings of classic thermodynamic simulation approaches by capturing complex inter-dependencies. This work sets out to apply probabilistic deep learning to power plant operations using historic plant data. The first study presented entails the development of a steady-state mixture density network (MDN) capable of predicting effective heat transfer coefficients (HTC) for the various heat exchanger components inside a utility-scale boiler. Selected directly controllable input features, including the excess air ratio, steam temperatures, flow rates and pressures, are used to predict the HTCs. In the second case study, an encoder-decoder mixture density network (MDN) is developed using recurrent neural networks (RNN) for the prediction of utility-scale air-cooled condenser (ACC) backpressure. The effects of ambient conditions and plant operating parameters, such as extraction flow rate, on ACC performance are investigated. In both case studies, hyperparameter searches are done to determine the best performing architectures for these models. Comparisons are drawn between the MDN model and a standard model architecture in both case studies. The HTC predictor model achieved 90% accuracy, which equates to an average error of 4.89 W/m²K across all heat exchangers. The resultant time-series ACC model achieved an average error of 3.14 kPa, which translates into a model accuracy of 82%.
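A mixture density network outputs the parameters of a mixture distribution instead of a single value and is typically trained by minimizing the negative log-likelihood of the observed target. The sketch below shows that loss for a one-dimensional Gaussian mixture. It is a generic illustration, not the thesis's model: the function names and the example mixture parameters are hypothetical, and production code would normally evaluate the sum in log space for numerical stability.

```python
import math

def mixture_nll(target, weights, means, stds):
    """Negative log-likelihood of a scalar target under a 1-D Gaussian mixture."""
    density = 0.0
    for w, mu, sigma in zip(weights, means, stds):
        z = (target - mu) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return -math.log(density)

def mixture_mean(weights, means):
    """Point prediction: the expected value of the mixture."""
    return sum(w * mu for w, mu in zip(weights, means))

# Hypothetical network output for one sample: two components (made-up numbers).
weights, means, stds = [0.7, 0.3], [10.0, 14.0], [1.0, 2.0]
loss_near = mixture_nll(10.2, weights, means, stds)  # target close to the main mode
loss_far = mixture_nll(25.0, weights, means, stds)   # target far from both modes
point_estimate = mixture_mean(weights, means)
```

Unlike a standard regression output, the predicted standard deviations give an uncertainty estimate alongside the point prediction, which is the probabilistic aspect the abstract refers to.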
70

INVESTIGATION OF DIFFERENT DATA DRIVEN APPROACHES FOR MODELING ENGINEERED SYSTEMS

Shrenik Vijaykumar Zinage (14212484) 05 December 2022
Every engineered system behaves slightly differently because of manufacturing and operational uncertainties. The ability to build system-specific predictive models that adapt to manufactured systems, also known as digital twins, opens up many possibilities for reducing operating and maintenance costs. Nonlinear dynamical systems with unknown governing equations and states characterize many engineered systems. As a result, learning their dynamics from data has become both an active research area and one of the biggest challenges. In this thesis, we investigate different data-driven approaches for modeling various engineered systems. Firstly, we develop a model to predict the transient and steady-state behavior of a turbocharger turbine using the Koopman operator, which can be helpful for modelling, analysis and control design. Our approach is as follows. We use experimental data from a Cummins heavy-duty diesel engine to develop a turbine model using Extended Dynamic Mode Decomposition (EDMD), which approximates the action of the Koopman operator on a finite-dimensional subspace of the space of observables. The results demonstrate comparable performance with a tuned nonlinear autoregressive network with an exogenous input (NARX) model widely used in the literature. The performance of these two models is analyzed based on their ability to predict turbine transient and steady-state behavior. Furthermore, we assess the ability of liquid time-constant (LTC) networks to learn the dynamics of various oscillatory systems using noisy data. In this study, we analyze and compare the performance of the LTC network with various commonly used recurrent neural network (RNN) architectures, such as the long short-term memory (LSTM) network and gated recurrent units (GRU). Our approach is as follows. We first systematically generate synthetic data by exciting the system of interest with band-limited white noise and simulating it using a forward Euler discretization scheme.
After the output has been simulated, we then corrupt it with different levels of noise to replicate a practically measured signal and train the RNN architectures with that corrupted output. The model is then tested on various types of forcing excitations to analyze the robustness of these networks in capturing different behaviors exhibited by the system. We also analyze the ability of these networks to capture the resonance effect for various parameter settings. Case studies discussing standard benchmark oscillatory systems (i.e., the spring-mass-damper (S-M-D) system, the single degree of freedom (DOF) Bouc-Wen oscillator, and the forced Van der Pol oscillator) are used to test the performance of these methodologies. The results reveal that the LTC network showed better performance in modeling the S-M-D system and the 1-DOF Bouc-Wen oscillator than an LSTM network but was outperformed by the GRU network. None of the networks were able to model the forced Van der Pol oscillator with reasonable accuracy. Since the GRU network outperformed the other networks in terms of computational time and model accuracy for most of the scenarios, we applied it to a real-world experimental dataset (i.e., turbocharger turbine dynamics) to compare it against the EDMD and NARX models. The results showed better performance of the GRU network in modeling the transient behaviours of the turbine. However, it failed to predict the turbine outlet temperature with reasonable accuracy in most of the regions for the steady-state dataset. As future work, we plan to consider training the GRU network with a data sampling frequency of 100 Hz for a fair comparison with the NARX and the Koopman approach.
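The synthetic-data recipe described above (simulate with forward Euler, then corrupt the output with noise) can be sketched for the spring-mass-damper case as follows. This is a simplified illustration under stated assumptions: the mass, damping and stiffness values, the step size, the noise level and the function names are all hypothetical, and a constant force replaces the band-limited white-noise excitation used in the thesis so that the steady state is easy to check.

```python
import random

def simulate_smd(force, dt, mass=1.0, damping=0.5, stiffness=4.0, x0=0.0, v0=0.0):
    """Forward Euler integration of m*x'' + c*x' + k*x = f(t); returns the positions."""
    x, v = x0, v0
    positions = []
    for f in force:
        positions.append(x)
        a = (f - damping * v - stiffness * x) / mass
        x, v = x + dt * v, v + dt * a  # both updates use the state from the previous step
    return positions

def add_noise(signal, noise_std, seed=0):
    """Corrupt a clean signal with zero-mean Gaussian noise to mimic a measurement."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, noise_std) for s in signal]

dt = 0.01
force = [1.0] * 3000  # constant unit force for 30 s (simplification, see lead-in)
clean = simulate_smd(force, dt)
noisy = add_noise(clean, noise_std=0.05)
```

Under a constant force the position settles near force/stiffness, here 0.25, which gives a quick sanity check on the integrator before the noisy signal is handed to a recurrent model for training.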
