61 |
Reducing Training Time in Text Visual Question Answering
Behboud, Ghazale 15 July 2022
Artificial Intelligence (AI) and Computer Vision (CV) have brought the promise of many applications along with many challenges to solve. The majority of current AI research has been dedicated to single-modal data processing, meaning that it uses only one modality, as in visual recognition or text recognition. However, real-world challenges often combine different modalities of data such as text, audio and images. This thesis focuses on solving the Visual Question Answering (VQA) problem, which is a significant multi-modal challenge. VQA is defined as a computer vision system that, when given a question about an image, answers based on an understanding of both the question and the image. The goal is to reduce the training time of VQA models. In this thesis, Look, Read, Reason and Answer (LoRRA), which is a state-of-the-art architecture, is used as the base model. Then, Reduce Uni-modal Biases (RUBi) is applied to this model to reduce the importance of uni-modal biases in training. Finally, an early stopping strategy is employed to stop the training process once the model accuracy has converged, to prevent the model from overfitting. Numerical results are presented which show that training LoRRA with RUBi and early stopping can converge in less than 5 hours. The impact of the batch size, learning rate and warm-up hyperparameters is also investigated and experimental results are presented. / Graduate
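The early stopping strategy mentioned in the abstract above can be illustrated with a minimal, framework-free sketch. It is not the thesis's implementation: the function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the accuracy values are all assumptions chosen for illustration. Training is considered converged once validation accuracy has failed to improve for `patience` consecutive epochs.

```python
def early_stopping_epoch(val_accuracies, patience=3, min_delta=0.0):
    """Return the epoch index at which training would stop.

    Hypothetical helper: stops once validation accuracy has not improved
    by more than `min_delta` for `patience` consecutive epochs. If that
    never happens, the last epoch index is returned.
    """
    best = float("-inf")
    stale = 0  # epochs since the last improvement
    for epoch, acc in enumerate(val_accuracies):
        if acc > best + min_delta:
            best = acc
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_accuracies) - 1


# Made-up validation accuracies, one value per epoch.
history = [0.40, 0.52, 0.58, 0.60, 0.60, 0.59, 0.60, 0.61]
stop_epoch = early_stopping_epoch(history, patience=3)
```

With these illustrative numbers the accuracy plateaus at 0.60, so the loop stops before the final epoch is reached. A larger `patience` trades longer training for less risk of stopping on a temporary plateau.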
|
62 |
Supervised Algorithm for Predictive Maintenance / Övervakad algoritm för prediktivt underhåll
Lu, Haida January 2023
Predictive maintenance plays a crucial role in preventing unexpected equipment failures and keeping assets in good operating condition in various systems. One scenario where predictive maintenance has been widely used is battery management systems for electric vehicles based on lithium batteries, where the risk of failure can be reduced by predicting the remaining useful life of the lithium battery. This project developed a deep learning (DL) model based on Long Short-Term Memory networks which was able to generalize to new and varied kinds of batteries. The model was implemented on a low-cost, low-power microcontroller using embedded artificial intelligence, which enables local model execution, reducing the costs, time, and risks associated with transferring data to the cloud. To further optimize the model and reduce its memory usage, quantization was applied before porting it to an embedded system based on the STM32 MCU. The results show that the model migration was successful, with low memory cost and no significant degradation in accuracy. Finally, the memory usage of the prediction model was also analyzed. / Predictiv underhåll har en avgörande roll för att förebygga oväntade utrustningsfel och bibehålla tillgångar i god driftsvillkor i olika system. Ett scenario där predictivt underhåll har använts mycket är i batterihanteringssystem för elfordon baserade på litiumbatterier, där risken för fel kan reduceras genom att förutsäga den återstående användbarhetsperioden för litiumbatteriet. I det här projektet utvecklades djupinlärningsprediktiva modeller med hjälp av Keras sekventiella modell för att representera en flerlagersneural nätverk och en Lång Korttidsminne modell för tidserieprediktion. Dessa modeller implementerades på en lågkostnad, låglägesmikrokontroller med inbyggd artificial intelligence, vilket möjliggör lokal modellkörning, vilket reducerar kostnader, tid och risker med att överföra data till molnet. 
För att ytterligare optimera modellen och minska dess minnesfotavtryck tillämpades kvantisering innan den portades till en inbyggd system baserat på STM32 mikrokontroller. Resultaten visar att modellmigrationen var framgångsrik, med låg minneskostnad och ingen signifikant försämring av precisionen. Slutligen analyserades även minnesanvändningen av prediktionsmodellen.
|
63 |
Short Term Stock Price Prediction Using Machine Learning
Rahm, Olov, Wikström, Alexander January 2022
This report assesses the accuracy of different machine learning models in predicting whether a stock will go up or down in value in the short term. The models used are linear regression, LSTM and Elman RNN. These models were trained on historical price data from the Nasdaq Stock Exchange. The idea that there exists a relationship between the price movement of a stock and its future value is called ’technical analysis’. The results show that neither LSTM nor Elman RNN achieves a statistically significant accuracy for any of the implementations. Linear regression provides a significant accuracy for longer time series prediction of the price when trained on 100 days of data and predicting its movement after five more days. / I denna rapport undersöks olika maskininlärningsmodeller noggrannhet för att förutspå om en aktie kommer att gå upp eller ner i värde på kort sikt. De evaluerade maskininlärningsmodellerna är följande: linjär regression, LSTM och Elman RNN. Dessa modeller tränades med hjälp av historisk prisdata från Nasdaq Stock Exchange. Idén om att det finns ett samband mellan prisrörelsen av en aktie och dess kortsiktiga framtida värde är benämnt som ’teknisk analys’. Resultaten visar att varken LSTM eller Elman RNN förmedlar en noggrannhet med statistisk signifikans för någon av de använda implementationerna. Linjär regression förmedlar en statistisk signifikant noggrannhet för längre tidserie förutsägelser med träningsdata om 100 dagar och förutsägelse av aktiens rörelse efter fem fler dagar. / Kandidatexjobb i elektroteknik 2022, KTH, Stockholm
|
64 |
Evaluation and Optimization of Deep Learning Networks for Plant Disease Forecasting And Assessment of their Generalizability for Early Warning Systems
Hannah Elizabeth Klein (15375262) 05 May 2023
This research focused on developing adaptable models and protocols for early warning systems for forecasting plant diseases and datasets. It compared the performance of deep learning models in predicting soybean rust disease outbreaks using three years of public epidemiological data and gridded weather data. The models selected were a dense network and a Long Short-Term Memory (LSTM) network. The objectives included evaluating the effectiveness of small citizen science datasets and gridded meteorological weather data in sequential forecasting, assessing the ideal window size and important inputs, and exploring the generalizability of the model protocol and models to other diseases. The model protocol was developed using a soybean rust dataset. Both the dense and the LSTM networks produced accuracies of over 90% during optimization. When tested for forecasting, both networks could forecast with an accuracy of 85% or higher over various window sizes. Experiments on window size indicated a minimum input of 8-11 days. Generalizability was demonstrated by applying the same protocol to a southern corn rust dataset, resulting in 87.8% accuracy. In addition, transfer learning and pre-trained models were tested. Direct transfer learning between diseases was not successful, while pre-training models produced both positive and negative results. Preliminary results are reported for building generalizable disease models using epidemiological and weather data that researchers could apply to generate forecasts for new diseases and locations.
|
65 |
Anomaly detection for non-recurring traffic congestions using Long short-term memory networks (LSTMs) / Avvikelsedetektering för icke återkommande trafikstockningar med hjälp av LSTM-nätverk
Svanberg, John January 2018
In this master's thesis, we implement a two-step anomaly detection mechanism for non-recurrent traffic congestions with data collected from public transport buses in Stockholm. We investigate the use of machine learning to model time series data with LSTMs and evaluate the results against a baseline prediction model. The anomaly detection algorithm embodies both collective and contextual expressivity, meaning it is capable of finding collections of delayed buses and also takes the temporality of the data into account. Results show that the anomaly detection performance benefits from the lower prediction errors produced by the LSTM network. The intersection rule significantly decreases the number of false positives while maintaining the true positive rate at a sufficient level. The performance of the anomaly detection algorithm has been found to depend on the road segment it is applied to; some segments have been identified as particularly hard, whereas others have been identified as easier. The best-performing setup of the anomaly detection mechanism had a true positive rate of 84.3 % and a true negative rate of 96.0 %. / I den här masteruppsatsen implementerar vi en tvåstegsalgoritm för avvikelsedetektering för icke återkommande trafikstockningar. Data är insamlad från kollektivtrafikbussarna i Stockholm. Vi undersöker användningen av maskininlärning för att modellera tidsseriedata med hjälp av LSTM-nätverk och evaluerar sedan dessa resultat med en grundmodell. Avvikelsedetekteringsalgoritmen inkluderar både kollektiv och kontextuell uttrycksfullhet, vilket innebär att kollektiva förseningar kan hittas och att även temporaliteten hos datan beaktas. Resultaten visar att prestandan hos avvikelsedetekteringen förbättras av mindre prediktionsfel genererade av LSTM-nätverket i jämförelse med grundmodellen. 
En regel för avvikelser baserad på snittet av två andra regler reducerar märkbart antalet falska positiva medan den höll kvar antalet sanna positiva på en tillräckligt hög nivå. Prestandan hos avvikelsedetekteringsalgoritmen har setts bero av vilken vägsträcka den tillämpas på, där några vägsträckor är svårare medan andra är lättare för avvikelsedetekteringen. Den bästa varianten av algoritmen hittade 84.3 % av alla avvikelser och 96.0 % av all avvikelsefri data blev markerad som normal data.
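The two headline figures in the abstract above are a true positive rate and a true negative rate. As a minimal sketch of how such rates are computed from binary labels (1 = anomaly, 0 = normal), assuming made-up labels and a hypothetical helper name `detection_rates` that does not come from the thesis:

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, true negative rate) for binary labels.

    Hypothetical helper for illustration. A rate is reported as None when
    its denominator is zero (no positives or no negatives in y_true).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies found
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal kept normal
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tpr = tp / (tp + fn) if (tp + fn) else None
    tnr = tn / (tn + fp) if (tn + fp) else None
    return tpr, tnr


# Illustrative labels only; these are not data from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
tpr, tnr = detection_rates(y_true, y_pred)
```

Reporting both rates matters for anomaly detection because anomalies are rare: a detector that flags nothing has a perfect true negative rate and a true positive rate of zero.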
|
66 |
Portfolio Performance Optimization Using Multivariate Time Series Volatilities Processed With Deep Layering LSTM Neurons and Markowitz / Portföljprestanda optimering genom multivariata tidsseriers volatiliteter processade genom lager av LSTM neuroner och Markowitz
Andersson, Aron, Mirkhani, Shabnam January 2020
The stock market is a non-linear field, but many of the best-known portfolio optimization algorithms are based on linear models. In recent years, the rapid development of machine learning has produced flexible models capable of complex pattern recognition. In this paper, we propose two different methods of portfolio optimization. One is based on the development of a multivariate time-dependent neural network, the long short-term memory (LSTM), capable of finding long short-term price trends. The other is the linear Markowitz model, where we add an exponential moving average to the input price data to capture underlying trends. The input data to our neural network are daily prices, volumes and market indicators such as the volatility index (VIX). The output variables are the prices predicted for each asset the following day, which are then further processed to produce metrics such as expected returns, volatilities and prediction error to design a portfolio allocation that optimizes a custom utility function like the Sharpe Ratio. The LSTM model produced a portfolio with a return and risk that was close to the actual market conditions for the date in question, but with a high error value, indicating that our LSTM model is insufficient as a sole forecasting tool. However, the ability to predict upward and downward trends was somewhat better than expected, and therefore we conclude that multiple neural networks can be used as indicators, each responsible for some specific aspect of what is to be analysed, to draw a conclusion from the result. The findings also suggest that the input data should be more thoroughly considered, as the prediction accuracy is enhanced by the choice of variables and the external information used for training. / Aktiemarknaden är en icke-linjär marknad, men många av de mest kända portföljoptimerings algoritmerna är baserad på linjära modeller. 
Under de senaste åren har den snabba utvecklingen inom maskininlärning skapat flexibla modeller som kan extrahera information ur komplexa mönster. I det här examensarbetet föreslår vi två sätt att optimera en portfölj, ett där ett neuralt nätverk utvecklas med avseende på multivariata tidsserier och ett annat där vi använder den linjära Markowitz modellen, där vi även lägger ett exponentiellt rörligt medelvärde på prisdatan. Ingångsdatan till vårt neurala nätverk är de dagliga slutpriserna, volymerna och marknadsindikatorer som t.ex. volatilitetsindexet VIX. Utgångsvariablerna kommer vara de predikterade priserna för nästa dag, som sedan bearbetas ytterligare för att producera mätvärden såsom förväntad avkastning, volatilitet och Sharpe ratio. LSTM-modellen producerar en portfölj med avkastning och risk som ligger närmre de verkliga marknadsförhållandena, men däremot gav resultatet ett högt felvärde och det visar att vår LSTM-modell är otillräckligt för att använda som ensamt predikteringssverktyg. Med det sagt så gav det ändå en bättre prediktion när det gäller trender än vad vi antog den skulle göra. Vår slutsats är därför att man bör använda flera neurala nätverk som indikatorer, där var och en är ansvarig för någon specifikt aspekt man vill analysera, och baserat på dessa dra en slutsats. Vårt resultat tyder också på att inmatningsdatan bör övervägas mera noggrant, eftersom predikteringsnoggrannheten.
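Two building blocks named in the abstract above, the exponential moving average applied to price data and the Sharpe ratio used as a utility function, can be sketched in a few lines. This is a textbook-style illustration under stated assumptions, not the authors' code: the smoothing factor `alpha`, the risk-free rate, the function names, and the sample numbers are all hypothetical, and no annualization is applied.

```python
import statistics


def exponential_moving_average(prices, alpha=0.5):
    """Smooth a price series: ema[0] = p[0], ema[t] = a*p[t] + (1-a)*ema[t-1]."""
    ema = [prices[0]]
    for price in prices[1:]:
        ema.append(alpha * price + (1 - alpha) * ema[-1])
    return ema


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by the sample standard deviation of returns."""
    excess = [r - risk_free_rate for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Illustrative inputs only.
smoothed = exponential_moving_average([1.0, 2.0, 3.0], alpha=0.5)
ratio = sharpe_ratio([0.01, 0.03])
```

A smaller `alpha` weights past prices more heavily and yields a smoother trend line, at the cost of reacting more slowly to recent price changes.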
|
67 |
Predicting trajectories of golf balls using recurrent neural networks / Förutspå bollbanan för en golfboll med neurala nätverk
Jansson, Anton January 2017
This thesis is concerned with the problem of predicting the remaining part of the trajectory of a golf ball as it travels through the air, where only the three-dimensional position of the ball is captured. The approach taken to solve this problem relied on recurrent neural networks in the form of long short-term memory (LSTM) networks. The motivation behind this choice was that this type of network had led to state-of-the-art performance for similar problems such as predicting the trajectory of pedestrians. The results show that using LSTMs led to an average reduction of 36.6 % of the error in the predicted impact position of the ball, compared to previous methods based on numerical simulations of a physical model, when the model was evaluated on the same driving range that it was trained on. Evaluating the model on a different driving range than it was trained on leads to improvements in general, but not for all driving ranges, in particular when the ball was captured at a different frequency compared to the data that the model was trained on. This problem was solved to some extent by retraining the model with small amounts of data from the new driving range. / Detta examensarbete har studerat problemet att förutspå den fullständiga bollbanan för en golfboll när den flyger i luften där endast den tredimensionella positionen av bollen observerades. Den typ av metod som användes för att lösa problemet använde sig av recurrent neural networks, i form av long short-term memory nätverk (LSTM). Motivationen bakom detta var att denna typ av nätverk hade lett till goda resultatet för liknande problem. Resultatet visar att använda sig av LSTM nätverk leder i genomsnitt till en 36.6 % förminskning av felet i den förutspådda nedslagsplatsen för bollen jämfört mot tidigare metoder som använder sig av numeriska simuleringar av en fysikalisk modell, om modellen användes på samma golfbana som den tränades på. 
Att använda en modell som var tränad på en annan golfbana leder till förbättringar i allmänhet, men inte om modellen användes på en golfbana där bollen fångades in med en annan frekvens. Detta problem löstes till en viss mån genom att träna om modellen med lite data från den nya golfbanan.
|
68 |
Spatio-temporal prediction of residential burglaries using convolutional LSTM neural networks
Holm, Noah, Plynning, Emil January 2018
The low number of solved residential burglaries calls for new and innovative methods in the prevention and investigation of these cases. There were 22 600 reported residential burglaries in Sweden in 2017, but only four to five percent of these will ever be solved. There are many initiatives both in Sweden and abroad for reducing the number of residential burglaries, and one of the areas being tested is the use of prediction methods for more efficient preventive actions. This thesis investigates a potential method of prediction that uses neural networks to identify areas that have a higher risk of burglaries on a daily basis. The model uses reported burglaries to learn patterns in both space and time. The rationale for the existence of patterns is based on near-repeat theories in criminology, which state that after a burglary both the burgled victim and an area around that victim have an increased risk of additional burglaries. The work has been conducted in cooperation with the Swedish Police Authority. The machine learning is implemented with convolutional long short-term memory (LSTM) neural networks with max pooling in three dimensions that learn from ten years of residential burglary data (2007-2016) in a study area in Stockholm, Sweden. The model's accuracy is measured by performing predictions of burglaries during 2017 on a daily basis. It classifies cells in a 36x36 grid with 600-meter square grid cells as areas with elevated risk or not. By classifying 4% of all grid cells during the year as risk areas, 43% of all burglaries are correctly predicted. The performance of the model could potentially be improved by further configuration of the parameters of the neural network, along with the use of more data on factors that are correlated with burglaries, for instance weather. Consequently, further work in these areas could increase the accuracy. 
The conclusion is that neural networks, or machine learning in general, could be a powerful and innovative tool for the Swedish Police Authority to predict and, moreover, prevent certain crimes. This thesis serves as a first prototype of how such a system could be implemented and used.
|
69 |
Plant yield prediction in indoor farming using machine learning
Ashok, Anjali, Adesoba, Mary January 2023
The agricultural industry has started to rely more on data-driven approaches to improve productivity and utilize its resources effectively. This thesis project was carried out in collaboration with Ljusgårda AB and explores plant yield prediction using machine learning models and hyperparameter tweaking. The work is based on data gathered from the company, and the plant yield prediction is carried out on two scenarios, each focused on a different time frame of the growth stage. The first scenario predicts yield from day 8 to day 22 of DAT (Day After Transplant), while the second scenario predicts yield from day 1 to day 22 of DAT. Three machine learning algorithms were investigated: Support Vector Regression (SVR), Long Short-Term Memory (LSTM) and Artificial Neural Network (ANN). The models' performance was evaluated using the metrics Mean Square Error (MSE), Mean Absolute Error (MAE), and r-squared. The evaluation results showed that ANN performed best on MSE and r-squared with dataset 1, while SVR performed best on MAE with dataset 2. Thus, both ANN and SVR meet the objective of this thesis work. The hyperparameter tweaking experiment on the three models further demonstrated the significance of hyperparameter tuning in improving the models and making them more suitable to the available data.
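The three evaluation metrics named in the abstract above have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the sample values are assumptions for illustration and are not taken from the thesis or its data.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, r-squared) for two equal-length sequences.

    Hypothetical helper using the standard definitions:
    MSE = mean squared error, MAE = mean absolute error,
    r-squared = 1 - (residual sum of squares / total sum of squares).
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return mse, mae, r2


# Illustrative yields only.
mse, mae, r2 = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
```

MSE penalizes large errors more heavily than MAE, which is one reason different models can rank best on different metrics, as the abstract reports for ANN and SVR.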
|
70 |
Federated Learning for Time Series Forecasting Using LSTM Networks: Exploiting Similarities Through Clustering / Federerad inlärning för tidserieprognos genom LSTM-nätverk: utnyttjande av likheter genom klustring
Díaz González, Fernando January 2019
Federated learning poses a statistical challenge when training on highly heterogeneous sequence data. For example, time-series telecom data collected over long intervals regularly shows mixed fluctuations and patterns. These distinct distributions are an inconvenience when a node not only plans to contribute to the creation of the global model but also plans to apply it on its local dataset. In this scenario, adopting a one-size-fits-all approach might be inadequate, even when using state-of-the-art machine learning techniques for time series forecasting, such as Long Short-Term Memory (LSTM) networks, which have proven to be able to capture many idiosyncrasies and generalise to new patterns. In this work, we show that clustering the clients using these patterns and selectively aggregating their updates in different global models can improve local performance with minimal overhead, as we demonstrate through experiments using real-world time series datasets and a basic LSTM model. / Federated Learning utgör en statistisk utmaning vid träning med starkt heterogen sekvensdata. Till exempel så uppvisar tidsseriedata inom telekomdomänen blandade variationer och mönster över längre tidsintervall. Dessa distinkta fördelningar utgör en utmaning när en nod inte bara ska bidra till skapandet av en global modell utan även ämnar applicera denna modell på sin lokala datamängd. Att i detta scenario införa en global modell som ska passa alla kan visa sig vara otillräckligt, även om vi använder oss av de mest framgångsrika modellerna inom maskininlärning för tidsserieprognoser, Long Short-Term Memory (LSTM) nätverk, vilka visat sig kunna fånga komplexa mönster och generalisera väl till nya mönster. 
I detta arbete visar vi att genom att klustra klienterna med hjälp av dessa mönster och selektivt aggregera deras uppdateringar i olika globala modeller kan vi uppnå förbättringar av den lokal prestandan med minimala kostnader, vilket vi demonstrerar genom experiment med riktigt tidsseriedata och en grundläggande LSTM-modell.
|