Global ETD Search

91	Comparing decentralized learning to Federated Learning when training Deep Neural Networks under churn
Vikström, Johan (January 2021)
Decentralized Machine Learning could address some problematic aspects of Federated Learning. In a decentralized setting there is no central server acting as an arbiter of who or what may benefit from Machine Learning models built from the vast amounts of data that have become available in recent years. Decentralization could also increase the reliability and scalability of Machine Learning systems, which in turn makes more data accessible for training. Gossip Learning is one such protocol, but it has primarily been designed with linear models in mind. How does Gossip Learning perform when training Deep Neural Networks? Could it be a viable alternative to Federated Learning? In this thesis, we implement Gossip Learning using two different model merging strategies. We also design and implement two extensions to this protocol with the goal of achieving higher performance when training under churn. The training methods are compared on two tasks: image classification on the Federated Extended MNIST dataset and time-series forecasting on the NN5 dataset. We also run an experiment in which learners churn, alternating between being available and unavailable. We find that Gossip Learning performs slightly better in settings where learners do not churn but is vastly outperformed in the setting where they do.
Keywords: Peer-to-Peer; Decentralized Machine Learning; Federated Learning; Gossip Learning; Long Short-Term Memory; Convolutional Neural Network; Computer Sciences
92	A comparative study of Neural Network Forecasting models on the M4 competition data
Ridhagen, Markus; Lind, Petter (January 2021)
The development of machine learning research has provided statistical innovations and further developments within the field of time series analysis. This study investigates two approaches to artificial neural network models based on different learning techniques. It asks how well the neural network approach compares with a basic autoregressive approach, and how the artificial neural network models compare to each other. The models were compared and analyzed with regard to univariate forecast accuracy on 20 randomly drawn time series from two different time frequencies in the M4 competition dataset. Forecasts depended on one time lag (t-1) and were made three and six steps ahead respectively. The artificial neural network models outperformed the baseline autoregressive model, showing notably lower mean absolute percentage error overall. The multilayer perceptron models performed better than the long short-term memory model overall, whereas the long short-term memory model showed improvement on longer prediction horizons. As the training was done univariately on a limited set of time steps, it is believed that the one-layer approach gave a good enough approximation of the data, whereas the added layer could not fully utilize its additional processing power. Likewise, the long short-term memory model could not fully demonstrate the advantages of recurrent learning. Using the same dataset, further studies could take another approach to data processing. By clustering the data with an unsupervised method before analysis, the same models could be tested with multivariate analysis on models trained on multiple time series simultaneously.
Keywords: Time series analysis; M4; neural network; NN; artificial neural network; ANN; feedforward neural network; FNN; multilayer perceptron; MLP; recurrent neural network; RNN; long short-term memory; LSTM; Probability Theory and Statistics
93	Insurance Fraud Detection using Unsupervised Sequential Anomaly Detection / Detektion av försäkringsbedrägeri med oövervakad sekvensiell anomalitetsdetektion
Hansson, Anton; Cedervall, Hugo (January 2022)
Fraud is a common crime within the insurance industry, and insurance companies want to quickly identify fraudulent claimants, since fraud often results in higher premiums for honest customers. Because digital transformation has increased the sheer volume and complexity of available data, manual fraud detection is no longer suitable. This work aims to automate the detection of fraudulent claimants and gain practical insights into fraudulent behavior using unsupervised anomaly detection, which, compared to supervised methods, allows for a more cost-efficient and practical application in the insurance industry. To obtain interpretable results and benefit from the temporal dependencies in human behavior, we propose two variations of LSTM-based autoencoders to classify sequences of insurance claims. Autoencoders can provide feature importances that give insight into the models' predictions, which is essential when models are put into practice. This approach relies on the assumption that outliers in the data are fraudulent. The models were trained and evaluated on a dataset we engineered using data from a Swedish insurance company, where the few labeled frauds that existed were used solely for validation and testing. Experimental results show state-of-the-art performance, and further evaluation shows that the combination of autoencoders and LSTMs is efficient but performs similarly to the employed baselines. This thesis provides an entry point for interested practitioners to learn key aspects of anomaly detection within fraud detection by thoroughly discussing the subject at hand and the details of our work.
Note: Conducted digitally via Zoom.
Keywords: Insurance Fraud Detection; Anomaly Detection; Long Short-Term Memory Networks (LSTM); Unsupervised Learning; Autoencoder (AE); Variational Autoencoder (VAE); Interpretable Machine Learning; Feature Engineering; Feature Selection; Feature Importance; Computer Sciences
94	Using a Character-Based Language Model for Caption Generation / Användning av teckenbaserad språkmodell för generering av bildtext
Keisala, Simon (January 2019)
Using AI to automatically describe images is a challenging task. The aim of this study has been to compare the use of character-based language models with one of the current state-of-the-art token-based language models, im2txt, to generate image captions, with a focus on morphological correctness. Previous work has shown that character-based language models are able to outperform token-based language models in morphologically rich languages. Other studies show that simple multi-layered LSTM blocks are able to learn to replicate the syntax of their training data. To study the usability of character-based language models, an alternative model based on TensorFlow im2txt has been created. The model changes the token-generation architecture to handle character-sized tokens instead of word-sized tokens. The results suggest that a character-based language model could outperform the current token-based language models, although due to time and computing power constraints this study fails to draw a clear conclusion. A problem with one of the methods, subsampling, is discussed. When the original method is applied to character-sized tokens, it removes characters (including special characters) instead of full words. To solve this issue, a two-phase approach is suggested: the training data is first separated into word-sized tokens, on which subsampling is performed, and the remaining tokens are then separated into character-sized tokens. Future work that applies the modified subsampling and fine-tunes the hyperparameters is suggested to reach a clearer conclusion about the performance of character-based language models.
Keywords: Natural Language Processing; NLP; Machine Learning; ML; Neural Network; Caption Generation; Deep Learning; Recurrent Neural Network; Long-Short-Term-Memory; LSTM; word2vec; Language Model
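The two-phase subsampling idea described in entry 94 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes a word2vec-style rule in which a word with relative frequency f is kept with probability min(1, sqrt(t / f)), and the names `relative_frequencies`, `keep_probability`, `two_phase_subsample` and the threshold `t` are hypothetical.

```python
import math
import random
from collections import Counter


def relative_frequencies(captions):
    """Relative frequency of each whitespace-separated word in the corpus."""
    counts = Counter(word for caption in captions for word in caption.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def keep_probability(freq, threshold):
    """Assumed word2vec-style rule: frequent words are kept less often."""
    return min(1.0, math.sqrt(threshold / freq))


def two_phase_subsample(caption, word_freq, threshold=1e-3, rng=None):
    """Subsample whole words first, then split the survivors into characters."""
    rng = rng or random.Random()
    # Phase 1: word-level subsampling, so only complete words are dropped.
    kept = [
        word
        for word in caption.split()
        if rng.random() < keep_probability(word_freq[word], threshold)
    ]
    # Phase 2: character-sized tokens (spaces included) from the kept words.
    return list(" ".join(kept))


captions = ["a dog on a beach", "a cat on a mat"]
freq = relative_frequencies(captions)
tokens = two_phase_subsample(captions[0], freq, threshold=0.05, rng=random.Random(0))
print("".join(tokens))
```

Because words are dropped before the split into characters, the character sequence never contains a partially deleted word, which is the problem the abstract attributes to applying subsampling directly to character-sized tokens.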
95	An Information Management and Decision Support tool for Predictive Alerting of Energy for Aircraft
Engelmann, James E. (17 September 2020)
No description available.
Keywords: Electrical Engineering; Computer Science; Aerospace Engineering; Aircraft State Awareness; IMDS; AIME; Aircraft Energy State Prediction; Machine Learning; Deep Learning; Long Short Term Memory units
96	Artificial Neural Network in Exhaust Temperature Modelling : Viability of ANN Usage in Gasoline Engine Modelling
Nibras, Musa; Linus, Roos (January 2022)
Developing and improving upon a good empirical model for an engine can be time-consuming and costly. The goal of this thesis has been to evaluate data-driven modelling, specifically neural networks, to see how well it can handle training for some static models, such as the mass flow of air into the cylinder, mean effective pressure and pump mean effective pressure, and also for transient modelling, specifically the exhaust gas temperature. These models are evaluated against the classical empirical models to see if neural networks are a viable modelling option. Five different types of neural networks are trained: the feed-forward neural network, the nonlinear autoregressive exogenous (NARX) network, the layer recurrent network, the long short term memory network and the gated recurrent network. The inputs were determined by looking at simpler physical models and by looking at the covariance to determine the usefulness of each input. If the calculation time is small for a specific network, the neural network structure is tested and optimized by training many networks and finding the median or mean result for that specific test. The results show that the static models are handled very well by the simplest feed-forward network. For the exhaust temperature, both the NARX and layer recurrent networks could predict it well, giving results very close to the empirical models, and could be a viable option for transient modelling. The long short term memory, gated recurrent and feed-forward networks, on the other hand, had trouble predicting the exhaust gas temperature and returned bad results while training.
Keywords: Artificial Neural Network; Data-driven Modelling; Engine Modelling; Transient behaviour; Static behaviour; Feed Forward Network; Layer Recurrent Network; NARX; Long Short Term Memory; Gated Recurrent Network; Data processing; Vehicle Engineering; Electrical Engineering and Electronics
97	Digital Signal Characterization for Seizure Detection Using Frequency Domain Analysis
Li, Jing (January 2021)
A significant proportion of the world's population is affected by cerebral diseases such as epilepsy. In this study, frequency domain features of electroencephalography (EEG) signals were studied and analyzed, with a view to detecting epileptic seizures more easily. The power spectrum and spectrogram were determined using the fast Fourier transform (FFT), and the scalogram was found by performing a continuous wavelet transform (CWT) on the test EEG signal. In addition, two schemes, method 1 and method 2, were implemented for detecting epileptic seizures, and the applicability of the two methods to electrocardiogram (ECG) signals was tested. A third method for anomaly detection in ECG signals was also tested.
Keywords: Fourier Transform; Wavelet Transform; EEG and ECG; Anomaly Detection; Approximate Entropy; Hellinger Distance; Long Short-Term Memory; Computer and Information Sciences
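As a rough illustration of how a power spectrum is obtained with the FFT, as mentioned in entry 97, the sketch below applies a textbook radix-2 FFT to a synthetic sine wave. It makes several assumptions that do not come from the thesis: the signal is synthetic rather than EEG, the length is a power of two, the normalization |X[k]|^2 / n is one common convention among several, and the function names `fft` and `power_spectrum` are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0:
        raise ValueError("empty input")
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrum(signal, sample_rate):
    """One-sided power spectrum: (frequencies in Hz, |X[k]|^2 / n)."""
    n = len(signal)
    spectrum = fft(signal)
    freqs = [k * sample_rate / n for k in range(n // 2 + 1)]
    power = [abs(spectrum[k]) ** 2 / n for k in range(n // 2 + 1)]
    return freqs, power


# Synthetic stand-in for a recorded signal: an 8 Hz sine sampled at 64 Hz.
sample_rate = 64
signal = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(64)]
freqs, power = power_spectrum(signal, sample_rate)
peak = max(range(len(power)), key=power.__getitem__)
print(freqs[peak])
```

A spectrogram follows the same idea applied to successive short windows of the signal, so that the spectrum can be tracked over time.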
98	Explainable AI For Predictive Maintenance
Karlsson, Nellie; Bengtsson, My (January 2022)
As the complexity of deep learning models increases, the transparency of the systems decreases. It may be hard to understand the predictions a deep learning model makes, but it is even harder to understand why these predictions are made. Using eXplainable AI (XAI), we can gain greater knowledge of how the model operates and how the input the model receives can change its predictions. In this thesis, we apply Integrated Gradients (IG), an XAI method primarily used on image data, to datasets containing tabular and time-series data. We also evaluate how the results of IG differ across various types of models and how a change of baseline can change the outcome. In these results, we observe that IG can be applied to both sequenced and nonsequenced data, with varying results. We can see that the gradient baseline does not affect the results of IG on models such as RNN, LSTM, and GRU, where the data contains time series, as much as it does for models like MLP with nonsequenced data. To confirm this, we also applied IG to SVM models, which showed that the choice of gradient baseline has a significant impact on the results of IG.
Keywords: Predictive Maintenance; Artificial Intelligence; AI; Explainable Artificial Intelligence; XAI; Integrated Gradients; LIME; SHAP; Deep Learning; Machine Learning; Recurrent Neural Network; RNN; Long Short-term Memory; LSTM; GRU; Multilayer Perceptron; MLP; Computer and Information Sciences
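To make the role of the baseline in Integrated Gradients concrete, here is a minimal sketch that uses a small logistic model with an analytic gradient in place of the deep models studied in entry 98. The weights, inputs, step count and function names are all hypothetical, and the path integral is approximated with a midpoint Riemann sum, which is one of several possible choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Toy stand-in for a trained network: logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann approximation of IG along the straight-line path."""
    total = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


w, b = [2.0, -1.0, 0.5], 0.0
x = [1.0, 2.0, 3.0]
zero_baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, zero_baseline, w, b)
# Completeness check: attributions should sum to F(x) - F(baseline).
print(sum(attributions), model(x, w, b) - model(zero_baseline, w, b))
```

Rerunning the sketch with a different baseline, for example `[1.0, 1.0, 1.0]`, gives different attributions for the same input, which is the kind of baseline sensitivity the abstract discusses.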
99	Detecting Anomalous Behavior in Radar Data
Rook, Jayson Carr (01 June 2021)
No description available.
Keywords: Computer Science; Electrical Engineering; Remote Sensing; Computer Engineering; Electronic Warfare; Electronic Countermeasure; Multifunction Radar; Anomaly Detection; Hidden Markov Model; Long Short-Term Memory; Pattern Detection; Cross-Correlation; Overlapping Index
100	LSTM-based Directional Stock Price Forecasting for Intraday Quantitative Trading / LSTM-baserad aktieprisprediktion för intradagshandel
Mustén Ross, Isabella (January 2023)
Deep learning techniques have exhibited remarkable capabilities in capturing nonlinear patterns and dependencies in time series data. This study therefore investigates the application of the Long-Short-Term-Memory (LSTM) algorithm to stock price prediction in intraday quantitative trading, using Swedish stocks in the OMXS30 index from February 28, 2013, to March 1, 2023. Contrary to previous research [12, 32] suggesting that past movements or trends in stock prices cannot predict future movements, our analysis finds limited evidence supporting this claim during periods of high volatility. We find that incorporating stock-specific technical indicators does not significantly enhance the predictive capacity of the model. Instead, we observe a trade-off: by removing the seasonal component and leveraging feature engineering and hyperparameter tuning, the LSTM model becomes proficient at predicting stock price movements. Consequently, the model consistently demonstrates high accuracy in determining price direction due to consistent seasonality. Additionally, training the model on predicted return differences, rather than the magnitude of prices, further improves accuracy. By incorporating a novel long-only and long-short trading strategy using the one-day-ahead predicted price, our model effectively captures stock price movements and exploits market inefficiencies, ultimately maximizing portfolio returns. Consistent with prior research [14, 15, 31, 32], our LSTM model outperforms the ARIMA model in accurately predicting one-day-ahead stock prices. Portfolio returns consistently outperform the stock market index, generating profits over the entire time period. The optimal portfolio achieves an average daily return of 1.2%, surpassing the 0.1% average daily return of the OMXS30 Index. The algorithmic trading model achieves 0.996 accuracy when executing trades based on predicted directional stock movements. This performance leads to cumulative and annualized excess returns that surpass the index return for the same period by a factor of 800.
Keywords: Deep Learning; Long-Short-Term-Memory (LSTM); ARIMA; Financial Time Series Forecasting; Algorithmic Trading; Intraday Trading; Stock Prediction; Computer and Information Sciences