Global ETD Search

241	The past, present or future? : A comparative NLP study of Naive Bayes, LSTM and BERT for classifying Swedish sentences based on their tense Navér, Norah January 2021 Natural language processing (NLP) is a field in computer science that is becoming increasingly important. One important part of NLP is the ability to sort text into the past, present or future, depending on when the event occurred or will occur. The objective of this thesis was to use text classification to classify Swedish sentences based on their tense, either past, present or future. Furthermore, the objective was also to compare how lemmatisation would affect the performance of the models. The problem was tackled by implementing three machine learning models on both lemmatised and non-lemmatised data. The machine learning models were Naive Bayes, LSTM and BERT. The results showed that the overall performance was affected negatively when the data was lemmatised. The best performing model was BERT with an accuracy of 96.3%. The result was useful as the best performing model had very high accuracy and performed well on newly constructed sentences. / Language technology is an area of computer science that has become increasingly important. An important part of language technology is the ability to sort texts into the past, the present or the future, depending on when an event took place or will take place. The purpose of this thesis was to use text classification to classify Swedish sentences based on their tense: past, present or future. A further purpose was to compare how lemmatisation would affect the models' performance. The problem was addressed by implementing three machine learning models on both lemmatised and non-lemmatised data. The machine learning models were Naive Bayes, LSTM and BERT. The result was that overall performance was negatively affected when the data was lemmatised. The best performing model was BERT, with an accuracy of 96.3%. The result was useful because the best performing model had very high accuracy and worked well on newly constructed sentences. LSTM Naive Bayes BERT tense NLP text classification machine learning Computer Sciences Datavetenskap (datalogi)
242	VISION-BASED ROBOT CONTROLLER FOR HUMAN-ROBOT INTERACTION USING PREDICTIVE ALGORITHMS Nitz Pettersson, Hannes, Vikström, Samuel January 2021 The demand for robots to work in environments together with humans is growing. This places new requirements on robot systems, such as the need to be perceived as responsive and accurate in human interactions. This thesis explores the possibility of using AI methods to predict the movement of a human and evaluates whether that information can assist a robot with human interactions. The AI methods used were a Long Short-Term Memory (LSTM) network and an artificial neural network (ANN). Both networks were trained on data from a motion capture dataset and on four different prediction times: 1/2, 1/4, 1/8 and 1/16 of a second. The evaluation was performed directly on the dataset to determine the prediction error. The neural networks were also evaluated on a robotic arm in a simulated environment, to show whether the prediction methods would be suitable for a real-life system. Both methods show promising results when comparing the prediction error. From the simulated system, it could be concluded that with the LSTM prediction the robotic arm would generally precede the actual position. The results indicate that the methods described in this thesis could be used as a stepping stone for a human-robot interactive system. stereo vision lstm pso computer vision motion prediction neural network human-robot interaction simulation ai artificial intelligence Robotics Robotteknik och automation Computer Sciences Datavetenskap (datalogi)
243	Predicting Road Rut with a Multi-time-series LSTM Model Backer-Meurke, Henrik, Polland, Marcus January 2021 Road ruts are depressions or grooves worn into a road. Increases in rut depth are highly undesirable due to the heightened risk of hydroplaning. Accurately predicting increases in road rut depth is important for maintenance planning within the Swedish Transport Administration. At the time of writing, the agency uses a linear regression model and is developing a feed-forward neural network for road rut predictions. The aim of the study was to evaluate the possibility of using a Recurrent Neural Network to predict road rut. Through design science research, an artefact in the form of an LSTM model was designed, developed, and evaluated. The dataset consisted of multiple-multivariate short time series, a setting in which prior research was limited. Case studies were conducted which inspired the conceptual design of the model. The baseline LSTM model proposed in this paper utilizes the full dataset in combination with time-series individualization through an added index feature. Additional features thought to correlate with rut depth were also studied through multiple training set variations. The model was evaluated by calculating the Root Mean Squared Error (RMSE) and the Mean Absolute Error (MAE) for each training set variation. The baseline model predicted rut depth with an MAE of 0.8110 mm and an RMSE of 1.124 mm, outperforming a control set without the added index. The feature with the highest correlation to rut depth was curvature, with an MAE of 0.8031 and an RMSE of 1.1093. Initial findings show that it is possible to use an LSTM model trained on multiple-multivariate time series to predict rut depth. Time series individualization through an added index feature yielded better results than the control, indicating that it had the desired effect on model performance.
Multiple-multivariate time-series Multi-time-series LSTM model Recurrent Neural Networks Machine Learning Road rut forecasting Information Systems
244	Caractérisation du niveau d’amusement grâce à des techniques d’apprentissage machine Toupin, Gabrielle 05 1900 Introduction. Humour is a complex cognitive process that can lead to a positive emotional state of amusement. The emotional response triggered by humour has several health benefits, and its use in research and clinical trials is increasingly common. Unfortunately, humour appreciation varies considerably from one individual to another and leads to very different emotional responses. This variability, rarely taken into account in research studies, is therefore important to quantify in order to robustly evaluate the effects of humour on health. Objectives. This master's project aims to explore different modalities for establishing an objective measure of humour appreciation using machine learning and deep learning techniques. Video characteristics, facial expressions and brain activity were tested as potential predictors of amusement intensity. Study 1. In our first study, participants (n = 40) watched and rated humorous and neutral videos while their facial expressions were recorded. For each video, we computed the average motion, the saliency and two semantic scores. A random forest algorithm was trained on the video characteristics and the participants' smiles to predict how funny the participant rated the video, at three moments during the video (beginning, middle and end). In addition, we used the participant's facial expression to explore the temporal dynamics of humour appreciation throughout the video and its impact on the following video.
Our results showed that video characteristics classify neutral and humorous videos well but do not differentiate between humour intensities. Conversely, smiling is a good predictor of amusement intensity within humorous videos (contribution = 0.53) and is the only modality that fluctuates over time, showing that humour appreciation is greater at the end of the video and after the video. Study 2. Our second study used deep learning techniques to predict the intensity of amusement felt by participants (n = 10) as they watched humorous videos while wearing a commercial EEG headset. We used an LSTM algorithm to predict amusement intensities (low, moderate, high, very high) from one second of brain activity. The results showed good transferability across participants and a decoding accuracy exceeding 80%. Conclusion. Video characteristics, participants' facial expressions and brain activity made it possible to predict humour appreciation. Across these three modalities, we found that physiological reactions (facial expression and brain activity) better predict amusement intensities while offering better temporal precision for the dynamics of humour appreciation. Future studies employing humour would benefit from including the level of appreciation, measured via smiling or brain activity, as a variable of interest in their experimental protocols. / Introduction. Humour is a complex cognitive process that can result in a positive emotional state of amusement. The emotional response triggered by humour has several health benefits and is used as a treatment in many research studies and clinical trials.
Humour appreciation varies greatly between participants and can trigger different levels of emotional response. Unfortunately, research rarely considers these individual differences, which could affect the use of humour in research. Such studies would benefit from an objective method to detect humour appreciation. Objectives. This master's thesis seeks to provide an appropriate solution for an objective measure of humour appreciation by using machine learning and deep learning techniques to predict how individuals react to humorous videos. Video characteristics, facial expressions and brain activity were tested as potential predictors of amusement intensity. Study 1. In our first study, participants (n=40) watched and rated humorous and neutral videos while their facial expressions were recorded. For each video, we computed the average movement, saliency and semantics associated with the video. A Random Forest classifier was used to predict how funny the participant rated the video at three moments during the clip (beginning, middle, end) based on the video's characteristics and the smiles of the participant. Furthermore, we used the participant's facial expression to explore the temporal dynamics of humour appreciation throughout the video and its impacts on the following video. Our results showed that video characteristics are better at distinguishing between neutral and humorous videos but cannot differentiate humour intensities. On the other hand, smiling was better at determining how funny the humorous videos were rated. The proportion of smiles also had more significant fluctuations in time, showing that humour appreciation is greater at the end of the video and the moment just after. Study 2. Our second study used deep learning techniques to predict how funny participants (n=10) rated humorous videos, using recordings from a commercial EEG headset.
We used an LSTM algorithm to predict the intensities of amusement (low, medium, high, very high) based on one second of brain activity. Results showed good transferability across participants, and decoding accuracy reached over 80%. Conclusion. Video characteristics, participant's facial expressions and brain activity allowed us to predict humour appreciation. From these three, we found that physiological reactions (facial expression and brain activity) better predict funniness intensities while also offering a better temporal precision as to when humour appreciation occurs. Further studies using humour would benefit from adding physiological responses as a variable of interest in their experimental protocol. amusement humour apprentissage machine apprentissage profond LSTM Forêt d'arbre décisionnels machine learning deep learning machine learning Random Forest
245	Understanding the Determinants of Car Ownership : A Regression and Neural Network Study / Faktorerna bakom bilägandeskap : En regression och neural nätverksstudie Kindvall, Olle, Pettersson, Vegard January 2023 This thesis aims to understand the determinants of car ownership in the Swedish regions containing the largest cities: Skåne, Stockholm, and Västra Götaland. This is done by performing a fixed effects regression analysis as well as creating and comparing different predictive models. Both socioeconomic and spatial factors are examined. The data used in the study is at the Demographic Statistic Zone level for the years 2016-2021. The data consists of approximately 90 variables, which are narrowed down, based on the level of multicollinearity, to the 12 variables used in the final models. The results from the fixed effects regression show that variables such as population density, age, income, home ownership type, and house type are the main influences on car ownership. These results are similar for the specific regions; however, some differences are discovered, pointing out the disadvantages of creating a generalized model. The results of the predictive models show that a Long Short-Term Memory model performs better than Random Forest Regression and OLS; however, the performance of the latter two models is considered satisfactory, making them preferable as they are easier to interpret and more established within the industry. The region-specific predictive models perform as well as the ones created from all the data. In conclusion, the determinants of car ownership identified here align well with previous studies and are considered reliable. Regarding which predictive model to use, OLS should be considered sufficient even if more complex methods perform better.
Machine learning panel data panel data regression LSTM parking norm car ownership Computer and Information Sciences Data- och informationsvetenskap Probability Theory and Statistics Sannolikhetsteori och statistik
246	Dataset Drift in Radar Warning Receivers : Out-of-Distribution Detection for Radar Emitter Classification using an RNN-based Deep Ensemble Coleman, Kevin January 2023 Changes to the signal environment of a radar warning receiver (RWR) over time through dataset drift can negatively affect a machine learning (ML) model deployed for radar emitter classification (REC). The training data comes from a simulator at Saab AB, in the form of pulsed radar in a time-series. In order to investigate this phenomenon on a neural network (NN), this study first implements an underlying classifier (UC) in the form of a deep ensemble (DE), where each ensemble member consists of an NN with two independently trained bidirectional LSTM channels for each of the signal features pulse repetition interval (PRI), pulse width (PW) and carrier frequency (CF). In tests, the UC performs best for REC when using all three features. Because detecting dataset drift can be treated as detecting out-of-distribution (OOD) samples over time, the aim is to reduce NN overconfidence on data from unseen radar emitters in order to enable OOD detection. The method estimates uncertainty with predictive entropy and classifies samples whose entropy exceeds a threshold as OOD. In the first set of tests, OOD is defined by holding out one feature modulation from the training dataset, and choosing this as the only modulation in the OOD dataset used during testing. With this definition, Stagger and Jitter are most difficult to detect as OOD. Moreover, using DEs with 6 ensemble members and adding LogitNorm to the architecture improves the OOD detection performance. Furthermore, the OOD detection method performs well for up to 300 emitter classes and predictive entropy outperforms the baseline for almost all tests. Finally, the model performs worse when OOD is simply defined as signals from unseen emitters, because of a precision decrease.
In conclusion, the implemented changes managed to reduce the overconfidence for this particular NN, and improve OOD detection for REC. Radar Emitter Classification Pulse Descriptor Word Out of Distribution Detection Dataset Drift Uncertainty Estimation Deep Ensembles Recurrent Neural Networks LSTM Computer Sciences Datavetenskap (datalogi)
247	A Deep Learning Approach To Vehicle Fault Detection Based On Vehicle Behavior Khaliqi, Rafi, Iulian, Cozma January 2023 Vehicles and machinery play a crucial role in our daily lives, contributing to our transportation needs and supporting various industries. As society strives for sustainability, the advancement of technology and efficient resource allocation become paramount. However, vehicle faults continue to pose a significant challenge, leading to accidents and unfortunate consequences. In this thesis, we aim to address this issue by exploring the effectiveness of an ensemble of deep learning models for supervised classification. Specifically, we propose to evaluate the performance of 1D-CNN-Bi-LSTM and 1D-CNN-Bi-GRU models. The Bi-LSTM and Bi-GRU models incorporate a multi-head attention mechanism to capture intricate patterns in the data. The methodology involves initial feature extraction using 1D-CNN, followed by learning the temporal dependencies in the time series data using Bi-LSTM and Bi-GRU. These models are trained and evaluated on a labeled dataset, yielding promising results. The successful completion of this thesis has met the objectives and scope of the research, and it also paves the way for future investigations and further research in this domain. CNN Bi-LSTM Bi-GRU Supervised classification
248	Forecasting Codeword Errors in Networks with Machine Learning / Prognostisering av kodordsfel i nätverk med maskininlärning Hansson Svan, Angus January 2023 With an increasing demand for rapid high-capacity internet, the telecommunication industry is constantly driven to explore and develop new technologies to ensure stable and reliable networks. To provide a competitive internet service in this growing market, proactive detection and prevention of disturbances are key elements for an operator. Therefore, analyzing network traffic for forecasting disturbances is a well-researched area. This study explores the advantages and drawbacks of implementing a long short-term memory model for forecasting codeword errors in a hybrid fiber-coaxial network. Also, the impact of using multivariate and univariate data for training the model is explored. The performance of the long short-term memory model is compared with a multilayer perceptron model. Analysis of the results shows that the long short-term memory model, in the vast majority of the tests, performs better than the multilayer perceptron model. This result aligns with the hypothesis that the long short-term memory model’s ability to handle sequential data would be superior to the multilayer perceptron. However, the difference in performance between the models varies significantly based on the characteristics of the data set used. On the set with heavy fluctuations in the sequential data, the long short-term memory model performs on average 44% better. When training the models on data sets with longer sequences of similar values and with less volatile fluctuations, the results are much more alike. The long short-term memory model still achieves a lower error on most tests, but the difference is never larger than 7%. If a low error is the sole criterion, the long short-term memory model is the overall superior model.
However, in a production environment, factors such as data storage capacity and model complexity should be taken into consideration. When training the models on multivariate and univariate datasets, the results are unambiguous. When trained on all three features (the ratios of uncorrectable and correctable codewords, and the signal-to-noise ratio), the models always perform better than when uncorrectable codewords are the only training data. This aligns with the hypothesis, which is based on the know-how of hybrid fiber-coaxial experts, that correctable codewords and signal-to-noise ratio have an impact on the occurrence of uncorrectable codewords. / Because of the increased demand for high-quality internet, the telecom industry is driven to continually explore and develop new technologies that can ensure stable and reliable networks. To offer competitive internet services, operators must be able to predict and prevent disturbances in their networks. Research on how to analyse and predict disturbances in a network is therefore a well-explored area. This study examined the advantages and disadvantages of using a long short-term memory (LSTM) model to predict codeword errors in a hybrid fiber-coaxial network. In addition, the effect of multidimensional training data on performance was examined. A multilayer perceptron (MLP) and its results were used for comparison. Analysis of the results showed that the LSTM model performed better than the MLP model in the majority of the tests. However, the difference in performance varied considerably depending on which data set was used to train and test the models. The conclusion is that LSTM is the best model in this study, but that it cannot be said that LSTM performs better on an arbitrary data set. Both models performed better when trained on multidimensional data.
Further research is required to determine whether LSTM is the most obvious choice of model for predicting codeword errors in a hybrid fiber-coaxial network. Forecasting Codeword Errors Hybrid Fiber-Coaxial Long Short-Term Memory Multilayer Perceptron Hybridfiber-koaxialt nätverk HFC LSTM MLP Computer and Information Sciences Data- och informationsvetenskap
249	Articulatory Copy Synthesis Based on the Speech Synthesizer VocalTractLab Gao, Yingming 04 August 2022 Articulatory copy synthesis (ACS), a subarea of speech inversion, refers to the reproduction of natural utterances and involves both the physiological articulatory processes and their corresponding acoustic results. This thesis proposes two novel methods for the ACS of human speech using the articulatory speech synthesizer VocalTractLab (VTL) to address or mitigate the existing problems of speech inversion, such as non-unique mapping, acoustic variation among different speakers, and the time-consuming nature of the process. The first method involved finding appropriate VTL gestural scores for given natural utterances using a genetic algorithm. It consisted of two steps: gestural score initialization and optimization. In the first step, gestural scores were initialized using the given acoustic signals with speech recognition, grapheme-to-phoneme (G2P), and a VTL rule-based method for converting phoneme sequences to gestural scores. In the second step, the initial gestural scores were optimized by a genetic algorithm via an analysis-by-synthesis (ABS) procedure that sought to minimize the cosine distance between the acoustic features of the synthetic and natural utterances. The articulatory parameters were also regularized during the optimization process to restrict them to reasonable values. The second method was based on long short-term memory (LSTM) and convolutional neural networks, which were responsible for capturing the temporal dependence and the spatial structure of the acoustic features, respectively. Neural network regression models were trained that used acoustic features as inputs and produced articulatory trajectories as outputs.
In addition, to cover as much of the articulatory and acoustic space as possible, the training samples were augmented by manipulating the phonation type, speaking effort, and the vocal tract length of the synthetic utterances. Furthermore, two regularization methods were proposed: one based on the smoothness loss of articulatory trajectories and another based on the acoustic loss between original and predicted acoustic features. The best-performing genetic algorithm and convolutional LSTM systems (evaluated in terms of the difference between the estimated and reference VTL articulatory parameters) obtained average correlation coefficients of 0.985 and 0.983 for speaker-dependent utterances, respectively, and their reproduced speech achieved recognition accuracies of 86.25% and 64.69% for speaker-independent utterances of German words, respectively. When applied to German sentence utterances, as well as English and Mandarin Chinese word utterances, the neural network based ACS systems achieved recognition accuracies of 73.88%, 52.92%, and 52.41%, respectively. The results showed that both of these methods reproduced not only the articulatory processes but also the acoustic signals of reference utterances. Moreover, the regularization methods led to more physiologically plausible articulatory processes and made the estimated articulatory trajectories more articulatorily preferred by VTL, thus reproducing more natural and intelligible speech. This study also found that the convolutional layers, when used in conjunction with batch normalization layers, automatically learned more distinctive features from log power spectrograms. Furthermore, the neural network based ACS systems trained using German data could be generalized to the utterances of other languages. ddc:006 ddc:621
250	Multi-objective optimization for model selection in music classification / Flermålsoptimering för modellval i musikklassificering Ujihara, Rintaro January 2021 With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features and state-of-the-art machine learning models. Still, it is known that how to preprocess music samples and which classification algorithm to use depend on the data set and the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive/negative classes. To enhance the accuracy of the existing system, this project aims to identify the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. For each model, hyperparameter tuning is performed and the model evaluation is carried out according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model accomplished 95% correct classification with an execution time of less than 15 seconds. / With the breakthrough of machine learning techniques, research on emotion classification in music has seen significant progress by combining various music analysis tools with new machine learning models. Nevertheless, how the audio data is preprocessed and which machine classification algorithm is applied depend on the type of data being worked with and the goal of the project. The collaboration partner for this thesis, Ichigoichie AB, is currently developing a system to categorize music data according to positive and negative emotions.
To increase the system's accuracy, the goal of this thesis is to find the best model experimentally, based on six music features (Mel spectrogram, MFCC, HPSS, Onset, CENS and Tonnetz) and a number of different machine learning models, including deep learning models. Each model is hyperparameter-optimized and evaluated according to Pareto optimality with respect to accuracy and computation time. The results show that the most promising model achieved 95% correct classification with a computation time of less than 15 seconds. Music emotion recognition Mel spectrogram MFCC CENS Onset Tonnetz HPSS 1D convolutional neural network Attention LSTM 1DCNN BiLSTM Pareto optimality Mathematics Matematik
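
The Pareto-optimality criterion mentioned in entry 250 (accuracy versus execution time) can be illustrated with a minimal sketch. The model names and numbers below are hypothetical placeholders, not results from the thesis, and the helper functions are assumptions made for illustration. A candidate is kept if no other candidate is at least as accurate and at least as fast while being strictly better on one of the two objectives.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str        # hypothetical model label
    accuracy: float  # higher is better
    seconds: float   # execution time, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.seconds <= b.seconds
    strictly_better = a.accuracy > b.accuracy or a.seconds < b.seconds
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up numbers for illustration only.
models = [
    Candidate("model_a", 0.95, 14.0),
    Candidate("model_b", 0.93, 6.0),
    Candidate("model_c", 0.90, 9.0),  # dominated by model_b
    Candidate("model_d", 0.88, 2.0),
]

front = pareto_front(models)
print([c.name for c in front])
```

Under these made-up numbers, three of the four candidates remain on the front; choosing among them still requires a preference between accuracy and speed, which this sketch does not encode.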
