Global ETD Search

61	A Comparative Study : Time-Series Analysis Methods for Predicting COVID-19 Case Trend / En jämförande studie : Tidsseriens analysmetoder för att förutsäga fall av COVID-19 Xu, Chenhui January 2021 Since 2019, COVID-19, a new acute respiratory disease, has struck the whole world, causing millions of deaths and threatening the economy, politics, and civilization. An accurate prediction of the future spread of COVID-19 is therefore crucial. In this comparative study, four time-series analysis models, namely the ARIMA model, the Prophet model, the Long Short-Term Memory (LSTM) model, and the Transformer model, are investigated to determine which performs best when predicting future COVID-19 case trends in six countries. After obtaining the publicly available COVID-19 case data from the Johns Hopkins University Center for Systems Science and Engineering database, we conduct repeated experiments that use the data to predict future trends with all models. Performance is then evaluated with the mean squared error (MSE) and mean absolute error (MAE) metrics. The results show that, overall, the LSTM model performs best for all countries, achieving extremely low MSE and MAE. The Transformer model has the second-best performance, with highly satisfactory results in some countries, and the other models perform worse. This project highlights the high accuracy of the LSTM model, which can be used to predict the spread of COVID-19 so that countries can be better prepared and informed when controlling the spread.
/ Since 2019, COVID-19, a new acute respiratory disease, has affected the whole world, causing millions of deaths and threatening the economy, politics, and civilization. An accurate forecast of the future spread of COVID-19 is therefore decisive in such a situation. This comparative study examines four time-series models, namely the ARIMA model, the Prophet model, the Long Short-Term Memory (LSTM) model, and the Transformer model, to determine which performs best when forecasting future COVID-19 case trends in six countries. After obtaining publicly available COVID-19 case data from the Johns Hopkins University Center for Systems Science and Engineering database, we run repeated experiments that use the data to forecast future trends with every model. Performance is then evaluated with the mean squared error (MSE) and the mean absolute error (MAE). The results show that the LSTM model performs best overall for all countries, reaching extremely low MSE and MAE. The Transformer model has the second-best performance, with very satisfactory results in some countries, and the other models perform worse. This project highlights the high accuracy of the LSTM model, which can be used to forecast the spread of COVID-19 so that countries can be better prepared and informed when controlling it. Time-series analysis ARIMA Prophet LSTM Transformer Tidsserieanalys ARIMA Prophet LSTM Transformer Computer and Information Sciences Data- och informationsvetenskap
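Entry 61 evaluates its forecasts with mean squared error (MSE) and mean absolute error (MAE). As a minimal sketch of how these two metrics are computed, assuming plain Python lists of actual and predicted daily case counts (the function names and the numbers are made up for illustration and do not come from the thesis):

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences; penalizes large misses more heavily."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of absolute differences; stays in the units of the data."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only (hypothetical daily case counts).
actual_cases = [100, 120, 130, 150]
forecast_cases = [110, 115, 135, 140]

mse = mean_squared_error(actual_cases, forecast_cases)
mae = mean_absolute_error(actual_cases, forecast_cases)
```

Because MSE squares each error, a model can rank differently under the two metrics when a few forecasts miss badly, which is one reason studies like this one report both.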
62	Time Series forecasting of the SP Global Clean Energy Index using a Multivariate LSTM Larsson, Klara, Ling, Freja January 2021 Clean energy and machine learning are subjects that play significant roles in shaping our future. The current climate crisis has forced the world to take action towards more sustainable solutions. Arrangements such as the UN’s Sustainable Development Goals and the Paris Agreement are causing an increased interest in renewable energy solutions. Further, the EU Taxonomy Regulation, applied in 2020, aims to scale up sustainable investments and to direct cash flows toward sustainable projects and activities. These measures create interest in investing in renewable energy alternatives and in predicting future movements of stocks related to these businesses. Machine learning models have previously been used to predict time series with promising results. However, predicting time series in the form of stock price indices has proved difficult in previous attempts because of the complexity of the variables that drive the indices’ movements. This paper uses the machine learning algorithm long short-term memory (LSTM) to predict the S&P Global Clean Energy Index. The research question is how well the LSTM model performs on this specific index and how the result is affected when past returns from correlated variables are added to the model. The variables studied are the crude oil price, the gold price, and the interest rate. A model was created for each correlated variable, as well as one with all three and one standard model that used only historical data from the index. The study found that, while the model with the most strongly correlated variable performed best among the multivariate models, the standard model using only the target variable gave the most accurate result of all the LSTM models.
/ The ongoing climate crisis has forced more and more countries to take action, and the UN's Sustainable Development Goals and the Paris Agreement are increasing interest in renewable energy. Furthermore, on 21 April 2021 the European Commission launched a comprehensive package of measures aimed at increasing investment in sustainable activities. This in turn creates growing interest in renewable energy investments and in methods for predicting the stock prices of these companies. Machine learning models have previously been used for time-series analysis with good results, but predicting stock indices has proved difficult, largely because of the complexity of the task and the number of variables that affect the stock market. This thesis uses the machine learning model long short-term memory (LSTM) to predict the S&P Global Clean Energy Index. The aim is to find out how accurately an LSTM model can predict this index, and how the result is affected when the model is given additional variables that correlate with the index. The variables examined are the price of crude oil, the price of gold, and the interest rate. A model was created for each variable, along with one model with all variables and one using only historical data from the index. The results show that the model using the variable most strongly correlated with the index performed best among the multivariate models, but the model using only historical data from the index gave the most accurate result. Machine learning clean energy neural networks stock market LSTM time series multivariate LSTM correlation. Computer and Information Sciences Data- och informationsvetenskap
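Entry 62 compares an LSTM fed only with the index's own history against variants that also receive correlated series. One common way to prepare such inputs is to cut the series into fixed-length lookback windows. The sketch below shows that step under stated assumptions: the function name `make_windows`, the lookback length, and the sample values are hypothetical, and the thesis may have framed its data differently.

```python
from typing import List, Tuple


def make_windows(
    rows: List[List[float]], target_col: int, lookback: int
) -> Tuple[List[List[List[float]]], List[float]]:
    """Turn a multivariate series into (window, next-value) training pairs.

    rows: one inner list per time step, one value per feature.
    target_col: index of the feature to predict (here, the index return).
    lookback: number of past time steps given to the model.
    """
    if lookback <= 0 or lookback >= len(rows):
        raise ValueError("lookback must be positive and shorter than the series")
    inputs, targets = [], []
    for t in range(lookback, len(rows)):
        inputs.append(rows[t - lookback:t])   # the `lookback` steps before t
        targets.append(rows[t][target_col])   # value to predict at step t
    return inputs, targets


# Hypothetical daily returns: [clean_energy_index, crude_oil]
series = [
    [0.010, -0.004],
    [0.002, 0.006],
    [-0.007, 0.001],
    [0.004, -0.002],
    [0.009, 0.003],
]

windows, next_values = make_windows(series, target_col=0, lookback=2)
```

Dropping the second column from every row gives the univariate baseline that the study found most accurate, so the same helper covers both setups.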
63	Anomaly Detection using LSTM N. Networks and Naive Bayes Classifiers in Multi-Variate Time-Series Data from a Bolt Tightening Tool / Anomali detektion med LSTM neuralt nätverk och Naive Bayes klassificerare av multivariabel tidsseriedata från en mutterdragare Selander, Karl-Filip January 2021 In this thesis, an anomaly detection framework has been developed to aid in the maintenance of tightening tools. The framework is built using LSTM networks and Gaussian naive Bayes classifiers. The suitability of LSTM networks for multivariate sensor data and time-series prediction as a basis for anomaly detection has been explored. Current literature and research are mostly concerned with univariate data, where LSTM-based approaches have had variable but often good results. However, most real-world settings with sensor networks, such as the environment and tool from which this thesis's data is gathered, are multivariate. Thus, there is a need to research the effectiveness of the LSTM model in this setting. The thesis has emphasized the need for well-defined evaluation metrics for anomaly detection approaches and the difficulties of defining anomalies and anomaly datasets, and has illustrated the effectiveness of LSTM networks in multivariate environments. / In this thesis, an anomaly detection framework has been developed to support the maintenance of tightening tools. The framework is built on LSTM neural networks and Gaussian naive Bayes classifiers. The usefulness of LSTM networks for multivariate data and time-series prediction as a basis for anomaly detection has been investigated. Current literature and research mostly concern univariate data, where LSTM-based methods have often performed well. However, most real-world systems are multivariate rather than univariate, like the environment in which the tool whose data is studied in this thesis operates. There is therefore considered to be a need to investigate the usefulness of LSTM models in this type of environment.
This work has emphasized the importance of well-defined evaluation metrics for anomaly detection and the difficulties of defining anomalies and anomaly datasets, and has illustrated the usefulness of LSTM networks in multivariate environments. LSTM anomaly detection time-series multi-variable sensor deep learning LSTM anomalidetektion tidsserie multivariabel sensor djupinlärning Engineering and Technology Teknik och teknologier
64	Forecasting the Future: Integrating Predictive Modeling into Production Planning : A Quantitative Case Study Andersson, Gustav January 2024 With Industry 4.0, companies are faced with the challenge of managing an ever-increasing amount of data and of re-evaluating and innovating their production planning methods. An important aspect of demand forecasting is the accuracy of forecasts compared to outcomes. Research has shown that more complex models perform better in demand forecasting; however, this research has focused on demand forecasting in the IT, finance, and e-commerce sectors. This thesis investigates the application of predictive modelling to demand forecasting in the context of production planning for a medium-sized manufacturing company. The study mainly compares the performance of two predictive models, Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, with the aim of assessing their usefulness in improving the accuracy of demand forecasts. Based on historical sales data, this quantitative case study investigates how these models can improve operational efficiency in production planning processes such as inventory optimization and production scheduling. The study found that the LSTM model, through Automated Machine Learning (AutoML), was significantly better than the ARIMA model in terms of forecast accuracy. This was evidenced by lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values, indicating that LSTM's ability to capture long-term dependencies and adapt to non-linear patterns provides a more robust tool for demand forecasting in production planning. This study contributes to the field of industrial engineering by demonstrating the practical benefits of integrating advanced predictive models into manufacturing companies' production planning processes.
It highlights the potential of machine learning techniques to transform traditional production planning systems and thus provides insights into the strategic implementation of AI in industrial operations. Future research could explore and compare more models to get a broader picture of how different models perform against each other in terms of prediction errors. / With Industry 4.0, companies face the challenge of handling an ever-increasing amount of data and of re-evaluating and renewing their production planning methods. An important aspect of demand forecasting is the accuracy of forecasts compared with outcomes. Research has shown that more complex models perform better in demand forecasting, but this research has focused on the IT, finance, and e-commerce sectors. This study investigates the application of predictive modelling to demand forecasting in the context of production planning for a medium-sized manufacturing company. The study mainly compares the performance of two predictive models, Autoregressive Integrated Moving Average (ARIMA) and Long Short-Term Memory (LSTM) networks, in order to assess how useful they are for improving the accuracy of demand forecasts. Based on historical sales data, this quantitative case study examines how these models can improve operational efficiency in production planning processes such as inventory management and production schedules. The study showed that the LSTM model, through automated machine learning (AutoML), was significantly better than the ARIMA model in terms of forecast accuracy. This was shown by lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values, which suggests that LSTM's ability to capture long-term dependencies and adapt to non-linear patterns provides a more robust tool for demand forecasting in production planning.
This study contributes to the field of industrial engineering and management by showing the practical benefits of integrating advanced predictive models into manufacturing companies' production planning processes. It highlights the potential of machine learning techniques to transform traditional production planning systems and thereby provides insights into the strategic implementation of AI in industrial operations. Future research could explore and compare more models to get a broader picture of how different models perform against each other in terms of prediction error. Demand Forecasting ARIMA LSTM Production Planning Efterfrågeprognoser ARIMA LSTM Produktionsplanering Övrig annan teknik
65	Statistické strojové učení s aplikacemi v hudbě / Statistical machine learning with applications in music Janásková, Eliška January 2019 The aim of this thesis is to review the current state of machine learning in music composition and to train a computer on Beatles songs, using the Magenta research project from the Google Brain Team, to produce its own music. In order to explore the qualities of the generated music more thoroughly, we restrict ourselves to monophonic melodies only. We train three deep learning models with three different configurations (Basic, Lookback, and Attention) and compare the generated results. Even though the generated music is not as interesting as the original Beatles songs, it is quite likable. According to our analysis based on musically informed metrics, generated melodies differ from the original ones especially in the lengths of notes and in the pitch differences between consecutive notes. Generated melodies tend to use shorter notes and larger pitch differences. In the theoretical background, we cover the most commonly used machine learning algorithms, introduce neural networks, and review related work on music generation.
66	Deep neural semantic parsing: translating from natural language into SPARQL / Análise semântica neural profunda: traduzindo de linguagem natural para SPARQL Luz, Fabiano Ferreira 07 February 2019 Semantic parsing is the process of mapping a natural-language sentence into a machine-readable, formal representation of its meaning. The LSTM Encoder-Decoder is a neural architecture with the ability to map a source language into a target one. We are interested in the problem of mapping natural language into SPARQL queries, and we seek to contribute strategies that do not rely on handcrafted rules, high-quality lexicons, manually built templates, or other complex handmade structures. In this context, we present two contributions to the problem of semantic parsing, both departing from the LSTM encoder-decoder. While natural language has well-defined vector representation methods that use a very large volume of text, formal languages such as SPARQL queries suffer from a lack of suitable methods for vector representation. In the first contribution, we improve the representation of SPARQL vectors. We start by obtaining an alignment matrix between the two vocabularies, natural language and SPARQL terms, which allows us to refine a vector representation of SPARQL items. With this refinement, we obtained better results in the subsequent training of the semantic parsing model. In the second contribution, we propose a neural architecture, which we call Encoder CFG-Decoder, whose output conforms to a given context-free grammar. Unlike the traditional LSTM encoder-decoder, our model provides a grammatical guarantee for the mapping process, which is particularly important for practical cases where grammatical errors can cause critical failures. Results confirm that any output generated by our model obeys the given CFG, and we observe a translation accuracy improvement when compared with other results from the literature.
/ Semantic parsing is the process of mapping a natural-language sentence into a formal, machine-interpretable representation of its meaning. The LSTM Encoder-Decoder is a neural network architecture capable of mapping a source sequence to a target sequence. We are interested in the problem of mapping natural language into SPARQL queries, and we seek to contribute strategies that do not depend on handcrafted rules, high-quality lexicons, manually built templates, or other complex handmade structures. In this context, we present two contributions to the semantic parsing problem that start from the LSTM Encoder-Decoder architecture. While natural language has well-defined vector representation methods that use a very large volume of text, formal languages such as SPARQL queries suffer from a lack of suitable vector representation methods. In the first contribution, we improve the representation of SPARQL vectors. We start by obtaining an alignment matrix between the two vocabularies, natural language and SPARQL terms, which allows us to refine a vector representation of the SPARQL terms. With this refinement, we obtained better results in the subsequent training of the semantic parsing model. In the second contribution, we propose a neural architecture, which we call Encoder CFG-Decoder, whose output conforms to a given context-free grammar. Unlike the traditional LSTM Encoder-Decoder model, our model provides a grammatical guarantee for the mapping process, which is particularly important in practical cases where grammatical errors can cause critical failures in a compiler or interpreter. The results confirm that any output generated by our model obeys the given CFG, and we observe an improvement in translation accuracy compared with other results from the literature.
Análise semântica CFG Codificação decodificação Encoder decoder GLC Gramáticas Grammars LSTM LSTM NLP Ontologias Ontology Palavras associadas PLN RDF RDF RNN RNN Semantic parsing SPARQL SPARQL Word embeddings
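Entry 66 describes a decoder whose output always conforms to a given context-free grammar. The toy sketch below illustrates only that general idea, not the thesis's Encoder CFG-Decoder: at each step the decoder may choose only among productions the grammar allows, so every result is grammatical by construction. The grammar, the function `constrained_decode`, and the scorer `prefer_last` are all hypothetical stand-ins; in a neural model the scorer would be the network's output distribution. The loop is guaranteed to end only because this toy grammar is non-recursive.

```python
from typing import Callable, Dict, List

Grammar = Dict[str, List[List[str]]]

# Hypothetical, non-recursive grammar for a tiny SPARQL-like fragment.
TOY_GRAMMAR: Grammar = {
    "Query": [["SELECT", "Var", "WHERE", "{", "Triple", "}"]],
    "Triple": [["Var", "Pred", "Obj"]],
    "Var": [["?x"], ["?y"]],
    "Pred": [[":author"], [":title"]],
    "Obj": [["Var"], ['"Dune"']],
}


def constrained_decode(
    grammar: Grammar,
    start: str,
    score: Callable[[str, List[str], List[str]], float],
) -> List[str]:
    """Expand the leftmost nonterminal, picking the best-scoring allowed rule."""
    stack = [start]
    output: List[str] = []
    while stack:
        symbol = stack.pop()
        if symbol not in grammar:        # terminal: emit it
            output.append(symbol)
            continue
        options = grammar[symbol]        # only grammatical choices are offered
        best = max(options, key=lambda rhs: score(symbol, rhs, output))
        stack.extend(reversed(best))     # leftmost symbol ends up on top
    return output


def prefer_last(symbol: str, rhs: List[str], so_far: List[str]) -> float:
    """Stand-in scorer: favors the last production listed for each symbol."""
    return float(TOY_GRAMMAR[symbol].index(rhs))


tokens = constrained_decode(TOY_GRAMMAR, "Query", prefer_last)
query = " ".join(tokens)
```

Whatever scorer is plugged in, the decoder can never emit a token sequence outside the grammar, which is the kind of guarantee the abstract contrasts with an unconstrained encoder-decoder.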
67	Learning socio-communicative behaviors of a humanoid robot by demonstration / Apprendre les comportements socio-communicatifs d'un robot humanoïde par la démonstration Nguyen, Duc-Canh 22 October 2018 A socially assistive robot (SAR) is intended to engage people in situated interaction such as monitoring physical exercise, neuropsychological rehabilitation, or cognitive training. While the interactive behaviors of these systems are generally scripted, we discuss here the framework for learning multimodal interactive behaviors proposed by the SOMBRERO project. In our work, we used learning by demonstration to give the robot the skills needed to perform collaborative tasks with human partners. Learning interaction by demonstration has three main steps: (1) collecting representative interactive behaviors demonstrated by human tutors; (2) building models of the observed behaviors while taking a priori knowledge into account (task and user model, etc.); and then (3) providing the target robot with appropriate gesture controllers to execute the desired behaviors. Multimodal HRI (Human-Robot Interaction) models are strongly inspired by human-human interactions (HHI).
Transferring HHI behaviors to HRI models runs into several problems: (1) adapting human behaviors to the robot's interactive capabilities, given its physical limitations and its limited perception, action, and reasoning capabilities; (2) the drastic changes in the behavior of human partners when facing robots or virtual agents; (3) the modeling of joint interactive behaviors; (4) the validation of the robotic behaviors by human partners until they are perceived as adequate and meaningful. In this thesis, we study and make progress on these four challenges. In particular, we address the first two problems (transfer from HHI to HRI) by adapting the scenario and using immersive teleoperation. In addition, we use recurrent neural networks to model multimodal interactive behaviors (such as speech, gaze, arm movements, head movements, and backchannels). These recent techniques outperform traditional methods (Hidden Markov Model, Dynamic Bayesian Network, etc.) in accuracy and in coordination between modalities. At the end of this thesis, we evaluate a first version of an autonomous robot equipped with the models built by learning. / A socially assistive robot (SAR) is meant to engage people in situated interaction such as monitoring physical exercise, neuropsychological rehabilitation, or cognitive training. While the interactive behavioral policies of such systems are mainly hand-scripted, we discuss here key features of the training of multimodal interactive behaviors in the framework of the SOMBRERO project. In our work, we used learning by demonstration in order to provide the robot with adequate skills for performing collaborative tasks in human-centered environments.
There are three main steps of learning interaction by demonstration: we should (1) collect representative interactive behaviors from human coaches; (2) build comprehensive models of these overt behaviors while taking into account a priori knowledge (task and user model, etc.); and then (3) provide the target robot with appropriate gesture controllers to execute the desired behaviors. Multimodal HRI (Human-Robot Interaction) models are mostly inspired by Human-Human Interaction (HHI) behaviors. Transferring HHI behaviors to HRI models faces several issues: (1) adapting the human behaviors to the robot’s interactive capabilities with regard to its physical limitations and impoverished perception, action, and reasoning capabilities; (2) the drastic changes of human partner behaviors in front of robots or virtual agents; (3) the modeling of joint interactive behaviors; (4) the validation of the robotic behaviors by human partners until they are perceived as adequate and meaningful. In this thesis, we study and make progress on those four challenges. In particular, we address the first two issues (transfer from HHI to HRI) by adapting the scenario and using immersive teleoperation. In addition, we use Recurrent Neural Networks to model multimodal interactive behaviors (such as speech, gaze, arm movements, head motion, and backchannels); these models surpass traditional methods (Hidden Markov Model, Dynamic Bayesian Network, etc.) in both accuracy and coordination between the modalities. We also build and evaluate a proof-of-concept autonomous robot to perform the tasks. Socially Assistive Robot Comportements Interactifs Multimodaux Lstm Téléopération Immersive Apprendre par Démonstration Socially Assistive Robot Multimodal interactive behaviors Lstm Immersive Teleoperation Learning by Demonstration 004 620
68	Strojové učení a zpracování signálu / Machine learning and signal processing Kolář, Michael January 2019 This thesis deals with the possibility of using neural networks to analyze racing vehicle telemetry data. The analysis consists of finding and classifying the driving states of the vehicle, as well as reconstructing and predicting the data signal. The thesis includes a survey of machine learning with a focus on neural networks. It describes the most commonly used types of neural networks, their structure, and their learning process. It also deals with the classification problem and gives an in-depth analysis of signal prediction and reconstruction. The conclusion is devoted to the implementation of the network in the TeleMatrix application.
69	Steps towards end-to-end neural speaker diarization / Étapes vers un système neuronal de bout en bout pour la tâche de segmentation et de regroupement en locuteurs Yin, Ruiqing 26 September 2019 Speaker diarization is the task of identifying "who speaks when" in an audio stream, without prior knowledge of the number of speakers or of their respective speaking time. Speaker diarization systems are generally built by combining four main stages. First, non-speech regions such as silence, music, and noise are removed by voice activity detection (VAD). Next, speech regions are split into speaker-homogeneous segments by speaker change detection and then grouped according to speaker identity. Finally, speech turn boundaries and their labels are refined with a re-segmentation stage. In this thesis, we propose to address these four stages with neural network approaches. We first formulate the initial segmentation (voice activity detection and speaker change detection) and the final re-segmentation as a set of sequence labeling problems, and then solve them with Bi-LSTM (Bidirectional Long Short-Term Memory) recurrent neural networks. At the speech region clustering stage, we propose to use the affinity propagation algorithm on neural embeddings of these speech turns in the speaker vector space. Experiments on a television dataset show that affinity propagation clustering is more suitable than hierarchical agglomerative clustering when applied to neural speaker embeddings.
The recurrent-network-based segmentation and affinity propagation are also combined and jointly optimized to form a speaker diarization pipeline. Compared with a system whose modules are optimized independently, the new pipeline brings a significant improvement. In addition, we propose to improve the estimation of the similarity matrix with recurrent neural networks and then to apply spectral clustering to this improved similarity matrix. The proposed system achieves state-of-the-art performance on the CALLHOME telephone conversation database. Finally, we formulate the sequential clustering of speech turns as a supervised sequence labeling task and address this problem with stacked recurrent networks. To better understand the system's behavior, an analysis based on an encoder-decoder architecture is proposed. On synthetic examples, our systems bring a significant improvement over traditional clustering methods. / Speaker diarization is the task of determining "who speaks when" in an audio stream that usually contains an unknown amount of speech from an unknown number of speakers. Speaker diarization systems are usually built as the combination of four main stages. First, non-speech regions such as silence, music, and noise are removed by Voice Activity Detection (VAD). Next, speech regions are split into speaker-homogeneous segments by Speaker Change Detection (SCD) and later grouped according to the identity of the speaker using unsupervised clustering approaches. Finally, speech turn boundaries and labels are (optionally) refined with a re-segmentation stage. In this thesis, we propose to address these four stages with neural network approaches.
We first formulate both the initial segmentation (voice activity detection and speaker change detection) and the final re-segmentation as a set of sequence labeling problems and then address them with Bidirectional Long Short-Term Memory (Bi-LSTM) networks. In the speech turn clustering stage, we propose to use affinity propagation on top of neural speaker embeddings. Experiments on a broadcast TV dataset show that affinity propagation clustering is more suitable than hierarchical agglomerative clustering when applied to neural speaker embeddings. The LSTM-based segmentation and affinity propagation clustering are also combined and jointly optimized to form a speaker diarization pipeline. Compared to the pipeline with independently optimized modules, the new pipeline brings a significant improvement. In addition, we propose to improve the similarity matrix with a bidirectional LSTM and then apply spectral clustering on top of the improved similarity matrix. The proposed system achieves state-of-the-art performance on the CALLHOME telephone conversation dataset. Finally, we formulate sequential clustering as a supervised sequence labeling task and address it with stacked RNNs. To better understand its behavior, we propose an analysis based on an encoder-decoder architecture. Our proposed systems bring a significant improvement compared with traditional clustering methods on toy examples. Détection des changements de locuteurs Segmentation LSTM Propagation d'affinité Partitionnement spectral Speaker diarization Speaker change detection Speech segmentation LSTM Affinity propagation Spectral clustering
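Entry 69 treats voice activity detection as a sequence labeling problem: a network assigns each audio frame a speech score, and the scores are then turned into speech regions. The sketch below covers only that last post-processing step, under assumptions of my own: the function name `frames_to_segments`, the 0.5 threshold, and the frame step are hypothetical, and the thesis may post-process its Bi-LSTM outputs differently.

```python
from typing import List, Sequence, Tuple


def frames_to_segments(
    scores: Sequence[float], threshold: float = 0.5, frame_step: float = 0.01
) -> List[Tuple[float, float]]:
    """Convert per-frame speech scores into (start, end) times in seconds.

    scores: one score per frame, e.g. the output of a frame-level classifier.
    threshold: frames scoring at or above this value count as speech.
    frame_step: duration covered by each frame, in seconds.
    """
    segments: List[Tuple[float, float]] = []
    start = None
    for i, s in enumerate(scores):
        is_speech = s >= threshold
        if is_speech and start is None:
            start = i                                   # a speech region opens
        elif not is_speech and start is not None:
            segments.append((start * frame_step, i * frame_step))
            start = None                                # the region closes
    if start is not None:                               # speech runs to the end
        segments.append((start * frame_step, len(scores) * frame_step))
    return segments


# Hypothetical frame scores; one-second frames keep the arithmetic easy to read.
example_scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6]
speech_regions = frames_to_segments(example_scores, threshold=0.5, frame_step=1.0)
```

The same thresholding pattern applies to speaker change detection, where the peaks of a change score mark segment boundaries instead of speech regions.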
70	Pattern Recognition in the Usage Sequences of Medical Apps / Analyse des Séquences d'Usage d'Applications Médicales Adam, Chloé 01 April 2019 Radiologists use medical imaging solutions daily for diagnosis. Improving the user experience is always a major part of the continuous effort to improve the overall quality and usability of software products. Monitoring applications make it possible, in particular, to record the successive actions performed by users in the software interface. These interactions can be represented as sequences of actions. Based on this data, this work addresses two industrial topics: software failures and software usability. Both topics involve, on the one hand, understanding usage patterns and, on the other, developing prediction tools either to anticipate failures or to adapt the software interface dynamically to users' needs. First, we aim to identify the origins of software crashes, which is essential for fixing them. To do so, we propose using a binomial test to determine which type of pattern is most appropriate for representing crash signatures. Improving the user experience by customizing and adapting systems to the user's specific needs requires very good knowledge of how users use the software. To highlight usage trends, we propose grouping similar sessions. We compare three types of session representation in different clustering algorithms. The second contribution of this thesis concerns the dynamic monitoring of software use.
We propose two methods, based on different representations of the input actions, to address two distinct industrial problems: next-action prediction and software crash risk detection. Both methodologies take advantage of the recurrent structure of LSTM networks to capture the dependencies in our sequential data, as well as their capacity to potentially handle different types of input representations for the same data. / Radiologists use medical imaging solutions on a daily basis for diagnosis. Improving user experience is a major line of the continuous effort to enhance the global quality and usability of software products. Monitoring applications make it possible to record the evolution of various software and system parameters during their use and, in particular, the successive actions performed by the users in the software interface. These interactions may be represented as sequences of actions. Based on this data, this work deals with two industrial topics: software crashes and software usability. Both topics involve, on the one hand, understanding the patterns of use and, on the other, developing prediction tools either to anticipate crashes or to dynamically adapt the software interface according to users' needs. First, we aim at identifying crash root causes, which is essential in order to fix the original defects. For this purpose, we propose to use a binomial test to determine which type of pattern is the most appropriate to represent crash signatures. The improvement of software usability through customization and adaptation of systems to each user's specific needs requires a very good knowledge of how users use the software. In order to highlight the trends of use, we propose to group similar sessions into clusters. We compare three session representations as inputs of different clustering algorithms. The second contribution of our thesis concerns the dynamic monitoring of software use.
We propose two methods -- based on different representations of input actions -- to address two distinct industrial issues: next action prediction and software crash risk detection. Both methodologies take advantage of the recurrent structure of LSTM neural networks to capture dependencies among our sequential data as well as their capacity to potentially handle different types of input representations for the same data. Exploration de motifs fréquents Représentations pour l’apprentissage Représentations d’action Clustering Réseaux de Neurones Récurrents LSTM Frequent pattern mining Representation learning Action embeddings Clustering LSTM Recurrent Neural Networks
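Entry 70 uses a binomial test when assessing candidate crash signatures. As a rough illustration of the statistic involved, the sketch below computes a one-sided binomial tail probability: how likely it is to see a pattern in at least `k` of `n` crashing sessions if it occurred at some baseline rate. The function name, the baseline rate, and the counts are hypothetical, and the thesis's exact test setup is not described in the abstract.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): a one-sided p-value.

    k: number of sessions in which the pattern was observed.
    n: total number of sessions examined.
    p: baseline probability of the pattern under the null hypothesis.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts: the pattern appears in 9 of 20 crashing sessions,
# against an assumed baseline rate of 0.2 in ordinary sessions.
p_value = binomial_upper_tail(k=9, n=20, p=0.2)
looks_crash_related = p_value < 0.05  # the 0.05 cutoff is a convention, not from the thesis
```

A small tail probability suggests the pattern is over-represented before crashes, which makes it a candidate signature worth inspecting.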
