51 |
Translating LaTeX to Coq: A Recurrent Neural Network Approach to Formalizing Natural Language Proofs
Carman, Benjamin Andrew 18 May 2021 (has links)
No description available.
|
52 |
Arrival Time Predictions for Buses using Recurrent Neural Networks / Ankomsttidsprediktioner för bussar med rekurrenta neurala nätverk
Fors Johansson, Christoffer January 2019 (has links)
In this thesis, two types of bus passengers are identified: current passengers and passengers-to-be, who have different needs in terms of arrival time predictions. A set of machine learning models based on recurrent neural networks and long short-term memory units was developed to meet these needs. Furthermore, bus data from the public transport in Östergötland county, Sweden, were collected and used to train the new machine learning models. These new models are compared with the prediction system currently used to provide passengers with arrival time information. The models proposed in this thesis use a sequence of time steps as input and the observed arrival time as output. Each input time step contains information about the current state, such as the time of arrival, the departure time from the very first stop and the current position in Cartesian coordinates. The target value for each input is the arrival time at the next time step. To predict the rest of the trip, the prediction for the next step is simply used as input in the next time step. The results show that the proposed models can improve the mean absolute error per stop by between 7.2% and 40.9% compared to the system used today, on all eight routes tested. Furthermore, the choice of loss function yields models that can meet the identified passengers' needs by trading average prediction accuracy for a guarantee that predictions do not overestimate or underestimate the target time in approximately 95% of the cases.
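The recursive multi-step scheme described above, where each one-step prediction is fed back as the next input, can be sketched as follows. This is a minimal illustration with a hypothetical stand-in for the trained LSTM; the toy dynamics and the two-feature state are assumptions, not the thesis model.

```python
import numpy as np

# Hypothetical stand-in for the trained one-step model: given the current
# state, predict the arrival time at the next stop. In the thesis this is
# an LSTM over a sequence of time steps, not a closed-form function.
def predict_next_arrival(state):
    position, elapsed = state
    # toy dynamics: each stop takes ~90 s plus a position-dependent term
    return elapsed + 90.0 + 0.1 * position

def rollout(initial_state, n_stops):
    """Predict arrival times for the rest of the trip by feeding each
    prediction back in as the input of the next time step."""
    position, elapsed = initial_state
    arrivals = []
    for _ in range(n_stops):
        elapsed = predict_next_arrival((position, elapsed))
        position += 1  # advance one stop
        arrivals.append(elapsed)
    return arrivals

print(rollout((0.0, 0.0), 3))
```

The key point is that the model never sees ground truth beyond the current stop: errors can compound along the rollout, which is why per-stop mean absolute error is the natural metric here.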
|
53 |
Semantic Segmentation of Urban Scene Images Using Recurrent Neural Networks
Daliparthi, Venkata Satya Sai Ajay January 2020 (has links)
Background: In autonomous driving vehicles, the vehicle receives pixel-wise sensor data from RGB cameras, point-wise depth information, and other sensor data as input. The computer inside the autonomous driving vehicle processes the input data and provides the desired output, such as steering angle, torque, and brake. To make accurate decisions, the computer inside the vehicle must be fully aware of its surroundings and understand each pixel in the driving scene. Semantic Segmentation is the task of assigning a class label (such as Car, Road, Pedestrian, or Sky) to each pixel in a given image. A better-performing Semantic Segmentation algorithm will therefore contribute to the advancement of the autonomous driving field. Research Gap: Traditional methods, such as handcrafted features and feature extraction methods, were mainly used to solve Semantic Segmentation. Since the rise of deep learning, most works use deep learning for Semantic Segmentation, most commonly with Convolutional Neural Network (CNN) architectures. Even though some works made use of Recurrent Neural Networks (RNN), the effect of RNNs on Semantic Segmentation had not yet been thoroughly studied. Our study addresses this research gap. Idea: After going through the existing literature, we came up with the idea of using RNNs as an add-on module, to augment the skip-connections in Semantic Segmentation networks through residual connections. Objectives and Method: The main objective of our work is to improve the Semantic Segmentation network's performance by using RNNs. An experiment was chosen as the methodology for our study. We propose three novel architectures, called UR-Net, UAR-Net, and DLR-Net, by applying our idea to the existing networks U-Net, Attention U-Net, and DeepLabV3+ respectively.
Results and Findings: We empirically show that our proposed architectures improve the segmentation of edges and boundaries. Through our study, we found that there is a trade-off between using RNNs and the inference time of the model: using RNNs to improve the performance of Semantic Segmentation networks costs some extra seconds during inference. Conclusion: Our findings will not benefit the autonomous driving field, where better performance is needed in real time. They will, however, contribute to the advancement of bio-medical image segmentation, where doctors can trade those extra seconds of inference time for better performance.
|
54 |
Multivariate analysis of the parameters in a handwritten digit recognition LSTM system / Multivariat analys av parametrarna i ett LSTM-system för igenkänning av handskrivna siffror
Zervakis, Georgios January 2019 (has links)
Throughout this project, we perform a multivariate analysis of the parameters of a long short-term memory (LSTM) system for handwritten digit recognition in order to understand the model's behaviour. In particular, we are interested in explaining how this behaviour precipitates from its parameters, and what in the network is responsible for the model arriving at a certain decision. This problem is often referred to as the interpretability problem, and falls under the scope of Explainable AI (XAI). The motivation is to make AI systems more transparent, so that trust can be established between humans and such systems. For this purpose, we make use of the MNIST dataset, which has been successfully used in the past for tackling the digit recognition problem. Moreover, the balance and simplicity of the data make it an appropriate dataset for carrying out this research. We start by investigating the linear output layer of the LSTM, which is directly associated with the model's predictions. The analysis includes several experiments in which we apply various methods from linear algebra, such as principal component analysis (PCA) and singular value decomposition (SVD), to interpret the parameters of the network. For example, we experiment with different setups of low-rank approximations of the output weight matrix, in order to see the importance of each singular vector for each digit class. We found that when the fifth left and right singular vectors are cut off, the model practically loses its ability to predict eights. Finally, we present a framework for analysing the parameters of the hidden layer, along with our implementation of an LSTM-based variational autoencoder that serves this purpose.
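The singular-vector probing described in this abstract can be sketched with plain numpy. This is a minimal illustration on a random matrix standing in for the trained output weight matrix (10 digit classes by an assumed hidden size of 64); the thesis applies this to an actual trained LSTM.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the trained LSTM output weight matrix: one row per digit class.
W = rng.standard_normal((10, 64))

U, s, Vt = np.linalg.svd(W, full_matrices=False)

def low_rank_without(k):
    """Reconstruct W with the k-th singular component zeroed out, to probe
    how much each left/right singular vector pair matters per class."""
    s_mod = s.copy()
    s_mod[k] = 0.0
    return (U * s_mod) @ Vt

W_cut = low_rank_without(4)  # drop the fifth singular component
# Per-class reconstruction error shows which digit rows rely on it most;
# in the thesis, dropping the fifth component destroyed the "eight" class.
per_class_err = np.linalg.norm(W - W_cut, axis=1)
print(per_class_err.round(3))
```

Each class's error here equals the dropped singular value times the magnitude of that class's entry in the corresponding left singular vector, which is exactly why the effect is so class-specific.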
|
55 |
Application of probabilistic deep learning models to simulate thermal power plant processes
Raidoo, Renita Anand 18 April 2023 (has links) (PDF)
Deep learning has gained traction in thermal engineering due to its applications to process simulations, the deeper insights it can provide, and its ability to circumvent the shortcomings of classic thermodynamic simulation approaches by capturing complex inter-dependencies. This work sets out to apply probabilistic deep learning to power plant operations using historic plant data. The first study entails the development of a steady-state mixture density network (MDN) capable of predicting effective heat transfer coefficients (HTC) for the various heat exchanger components inside a utility-scale boiler. Selected directly controllable input features, including the excess air ratio, steam temperatures, flow rates and pressures, are used to predict the HTCs. In the second case study, an encoder-decoder mixture density network (MDN) is developed using recurrent neural networks (RNN) for the prediction of utility-scale air-cooled condenser (ACC) backpressure. The effects of ambient conditions and plant operating parameters, such as extraction flow rate, on ACC performance are investigated. In both case studies, hyperparameter searches are performed to determine the best-performing architectures, and the MDN models are compared against standard model architectures. The HTC predictor model achieved 90% accuracy, which equates to an average error of 4.89 W/m²K across all heat exchangers. The resulting time-series ACC model achieved an average error of 3.14 kPa, which translates into a model accuracy of 82%.
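The mixture density networks used in both case studies predict the parameters of a Gaussian mixture rather than a point estimate, and are trained by minimizing the mixture's negative log-likelihood. A minimal numpy sketch of that loss (the shapes and toy values are assumptions for illustration, not the thesis models):

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, y):
    """Negative log-likelihood of targets y under a 1-D Gaussian mixture,
    the loss an MDN head is trained with. Parameter arrays have shape
    (batch, components); y has shape (batch,)."""
    # softmax over mixture weights
    pi = np.exp(pi_logits - pi_logits.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)
    sigma = np.exp(log_sigma)  # keeps standard deviations positive
    # per-component Gaussian log-densities
    log_p = (-0.5 * ((y[:, None] - mu) / sigma) ** 2
             - np.log(sigma) - 0.5 * np.log(2 * np.pi))
    # stable log-sum-exp of the pi-weighted mixture
    m = log_p.max(axis=1, keepdims=True)
    mix = np.log((pi * np.exp(log_p - m)).sum(axis=1)) + m[:, 0]
    return -mix.mean()

# toy check: a mixture centred on the targets scores better than one far off
y = np.array([0.0, 1.0])
good = mdn_nll(np.zeros((2, 2)), np.array([[0.0, 1.0], [0.0, 1.0]]),
               np.zeros((2, 2)), y)
bad = mdn_nll(np.zeros((2, 2)), np.full((2, 2), 10.0), np.zeros((2, 2)), y)
print(good < bad)  # → True
```

In a real MDN, `pi_logits`, `mu` and `log_sigma` are the outputs of the network's final layer, so the same loss trains the whole model end to end.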
|
56 |
INVESTIGATION OF DIFFERENT DATA DRIVEN APPROACHES FOR MODELING ENGINEERED SYSTEMS
Shrenik Vijaykumar Zinage (14212484) 05 December 2022 (has links)
<p>Every engineered system behaves slightly differently because of manufacturing and operational uncertainties. The ability to build system-specific predictive models that adapt to manufactured systems, also known as digital twins, opens up many possibilities for reducing operating and maintenance costs. Nonlinear dynamical systems with unknown governing equations and states characterize many engineered systems. As a result, learning their dynamics from data has become both an active research area and one of the biggest challenges. In this thesis, we investigate different data-driven approaches for modeling various engineered systems. First, we develop a model to predict the transient and steady-state behavior of a turbocharger turbine using the Koopman operator, which can be helpful for modelling, analysis and control design. Our approach is as follows: we use experimental data from a Cummins heavy-duty diesel engine to develop a turbine model using Extended Dynamic Mode Decomposition (EDMD), which approximates the action of the Koopman operator on a finite-dimensional subspace of the space of observables. The results demonstrate performance comparable to a tuned nonlinear autoregressive network with exogenous input (NARX) model widely used in the literature. The performance of these two models is analyzed based on their ability to predict turbine transient and steady-state behavior. Furthermore, we assess the ability of liquid time-constant (LTC) networks to learn the dynamics of various oscillatory systems from noisy data. In this study, we analyze and compare the performance of the LTC network with commonly used recurrent neural network (RNN) architectures such as the long short-term memory (LSTM) network and gated recurrent units (GRU). Our approach is as follows: we first systematically generate synthetic data by exciting the system of interest with band-limited white noise and simulating it using a forward Euler discretization scheme.
After simulating the output, we corrupt it with different levels of noise to replicate a practically measured signal, and train the RNN architectures on that corrupted output. The models are then tested on various types of forcing excitations to analyze the robustness of these networks in capturing different behaviors exhibited by the system. We also analyze the ability of these networks to capture the resonance effect for various parameter settings. Case studies involving standard benchmark oscillatory systems (i.e., a spring-mass-damper (S-M-D) system, a single degree of freedom (DOF) Bouc-Wen oscillator, and a forced Van der Pol oscillator) are used to test the performance of these methodologies. The results reveal that the LTC network performed better than an LSTM network in modeling the S-M-D system and the 1-DOF Bouc-Wen oscillator, but was outperformed by the GRU network. None of the networks were able to model the forced Van der Pol oscillator with reasonable accuracy. Since the GRU network outperformed the other networks in computational time and model accuracy for most scenarios, we applied it to a real-world experimental dataset (i.e., turbocharger turbine dynamics) to compare it against the EDMD and NARX models. The results showed better performance of the GRU network in modeling the transient behaviour of the turbine. However, it failed to predict the turbine outlet temperature with reasonable accuracy in most regions of the steady-state dataset. As future work, we plan to train the GRU network with a data sampling frequency of 100 Hz for a fair comparison with the NARX and Koopman approaches.</p>
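The EDMD step mentioned in this abstract reduces to a least-squares problem once a dictionary of observables is chosen. A minimal numpy sketch on an assumed toy nonlinear map (the dynamics and the polynomial dictionary are illustrative, not the turbine model):

```python
import numpy as np

rng = np.random.default_rng(1)

def dynamics(x):
    # hypothetical one-step nonlinear map standing in for the real system
    return 0.9 * x - 0.1 * x ** 3

def lift(x):
    # dictionary of observables: [1, x, x^2, x^3]
    return np.stack([np.ones_like(x), x, x ** 2, x ** 3], axis=1)

# sample state pairs (x_t, x_{t+1}) and lift both sides
x = rng.uniform(-1, 1, 500)
y = dynamics(x)
Phi_x, Phi_y = lift(x), lift(y)

# K approximates the Koopman operator on the span of the dictionary:
# Phi_y ≈ Phi_x @ K, solved in the least-squares sense (the EDMD problem)
K, *_ = np.linalg.lstsq(Phi_x, Phi_y, rcond=None)

# one-step prediction of the state via the 'x' observable column of K
x_pred = lift(x) @ K[:, 1]
print(np.max(np.abs(x_pred - y)))
```

Because the toy dynamics lie exactly in the span of the dictionary, the recovered column reproduces the map to machine precision; for real turbine data the dictionary is only approximate and the fit error is nonzero.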
|
57 |
Predicting Electricity Consumption with ARIMA and Recurrent Neural Networks
Enerud, Klara January 2024 (has links)
Due to the growing share of renewable energy in countries' power systems, the need for precise forecasting of electricity consumption will increase. This paper considers two different approaches to time series forecasting, autoregressive moving average (ARMA) models and recurrent neural networks (RNNs). These are applied to Swedish electricity consumption data, with the aim of deriving simple yet efficient predictors. An additional aim is to analyse the impact of day of week and temperature on forecast accuracy. The models are evaluated on both long- and mid-term forecasting horizons, ranging from one day to one month. The results show that neural networks are superior for this task, although stochastic seasonal ARMA models also perform quite well. Including external variables only marginally improved the ARMA predictions, and had somewhat unclear effects on the RNN forecasting accuracy. Depending on the network model used, adding external variables had either a slightly positive or slightly negative impact on prediction accuracy.
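The autoregressive part of the ARMA models used above can be fit by ordinary least squares: each value is regressed on its p predecessors, and forecasts are rolled forward recursively. A minimal sketch (the synthetic series is an assumption for illustration; the thesis uses Swedish consumption data and adds a moving-average part):

```python
import numpy as np

def fit_ar(series, p):
    """Fit an AR(p) model by least squares: row t of the design matrix
    holds [y_{t-1}, ..., y_{t-p}], the target is y_t."""
    X = np.column_stack([series[p - j - 1:-j - 1] for j in range(p)])
    y = series[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def forecast(series, coef, steps):
    """Roll the fitted AR model forward, feeding predictions back in."""
    hist = list(series)
    out = []
    for _ in range(steps):
        nxt = sum(c * hist[-j - 1] for j, c in enumerate(coef))
        hist.append(nxt)
        out.append(nxt)
    return out

# sanity check on a noiseless AR(1) series y_t = 0.8 * y_{t-1}
series = 0.8 ** np.arange(40)
coef = fit_ar(series, 1)
print(coef.round(4))  # ≈ [0.8]
```

Seasonal and external terms (day of week, temperature) enter the same way, as extra columns in the design matrix, which is why adding them changes the fit only marginally when the lags already capture most of the structure.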
|
58 |
Deep Neural Networks for Improved Terminal Voltage and State-of-Charge Estimation of Lithium-Ion Batteries for Traction Applications
Goncalves Vidal, Carlos Jose January 2020 (has links)
The growing interest in more electrified vehicles has been pushing industry and academia to pursue new and more accurate ways to estimate the State-of-Charge (SOC) of xEV batteries. The battery system still represents one of the many technical barriers that need to be eliminated or reduced to enable the proliferation of more xEVs in the market, which in turn can help reduce CO2 emissions. Battery modelling and SOC estimation of Lithium-ion (Li-ion) batteries over a wide temperature range, including negative temperatures, have been a challenge for many engineers.
For SOC estimation, several model configurations and approaches were developed and tested as part of this work, including non-recurrent neural networks, such as feedforward deep neural networks (FNN), and recurrent neural networks based on long short-term memory units (LSTM-RNN). The approaches considerably improve on the accuracy reported in the previous state of the art and expand the application across five different Li-ion cells over a wide temperature range, achieving errors as low as 0.66% Root Mean Square Error at -10 °C using an FNN approach and 0.90% using an LSTM-RNN. The deep neural networks developed in this work can therefore increase the potential for xEV applications, especially where accuracy at negative temperatures is essential.
For Li-ion modelling, a cell model using an LSTM-RNN (LSTM-VM) was developed for the first time to estimate the battery cell terminal voltage, and is compared against a gated recurrent unit approach (GRU-VM) and a third-order equivalent circuit model based on the Thevenin theorem (ECM). The models were extensively compared for different Li-ion cells over a wide range of temperature conditions. The LSTM-VM proved more accurate than the two other benchmarks, achieving a 43 mV Root Mean Square Error at -20 °C, a third of the ECM error in the same situation. The difference between the LSTM-VM and GRU-VM, however, is less pronounced.
Finally, several methods to improve robustness, accuracy and training time are introduced throughout the work, including Transfer Learning applied to the development of SOC estimation models, which shows great potential to reduce the amount of data necessary to train an LSTM-RNN as well as to improve its accuracy. / Thesis / Doctor of Philosophy (PhD)
|
59 |
On Deep Multiscale Recurrent Neural Networks
Chung, Junyoung 04 1900 (has links)
No description available.
|
60 |
Représentation dynamique dans le cortex préfrontal : comparaison entre reservoir computing et neurophysiologie du primate / Dynamic representation in the prefrontal cortex : insights from comparing reservoir computing and primate neurophysiology
Enel, Pierre 02 June 2014 (has links)
In order to adapt to new situations, primates must be able to recognize these situations. How the cortex represents contingencies in its activity is the main subject of this thesis. Complex new situations are often explained by the interaction between sensory, internal and motor information. Recent studies have shown that single-neuron activities referred to as mixed selectivity, which are ubiquitous in the prefrontal cortex (PFC), are a possible mechanism for representing arbitrary interactions between the pieces of information defining a contingency. In parallel, a recent area of research referred to as Reservoir Computing has demonstrated that recurrent neural networks have the property of recombining present and past inputs into a higher-dimensional space, thereby providing a pre-coding of an essentially universal set of combinations which can then be selected and used arbitrarily for their relevance to the task at hand. Combining these two approaches, we argue that the highly recurrent nature of local prefrontal connectivity is at the origin of a dynamic form of mixed selectivity. We also attempt to demonstrate that a simple linear regression, implementable by a single neuron, can extract any information or contingency encoded in these highly complex and dynamic combinations. In addition, previous inputs to these PFC networks, whether sensory or motor, must be maintained in order to influence current processing and behavioral demands. We argue that representations of the contexts defined by these past inputs must be expressed explicitly and fed back to the local PFC networks in order to influence the current combinations at the origin of contingency representations.
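The Reservoir Computing claim above, that a fixed recurrent network pre-codes combinations of present and past inputs which a single linear readout can then extract, can be sketched as a minimal echo state network. The reservoir size, input scaling, and delay-recall task here are illustrative assumptions, not the models used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)
n_res, T = 200, 1000

# fixed random input weights and recurrent weights (never trained)
W_in = rng.uniform(-0.5, 0.5, (n_res, 1))
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

# drive the reservoir with a random input signal and record its states
u = rng.uniform(-1, 1, T)
states = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in[:, 0] * u[t] + W @ x)
    states[t] = x

# task requiring memory of past input: recall u from 5 steps earlier
y = np.roll(u, 5)

# a single linear readout (one regression, one 'neuron') does the extraction
w_out, *_ = np.linalg.lstsq(states[10:], y[10:], rcond=None)
pred = states[10:] @ w_out
corr = np.corrcoef(pred, y[10:])[0, 1]
print(round(corr, 3))
```

Only `w_out` is learned; the recurrent dynamics simply hold a mixed, high-dimensional trace of recent inputs, which is the analogue of the dynamic mixed selectivity argued for in PFC.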
|