51

Recognition and Tracking of Vehicles in Highways using Deep Learning / Reconhecimento e Rastreamento de Veículos em Rodovias usando Deep Learning

Cala, Ludwin Lope 08 March 2019 (has links)
Unmanned aerial vehicles (UAVs) have become increasingly popular, and their ability to analyze images collected in real time has drawn the attention of researchers to their use in several tasks, such as surveillance of environments, pursuit, and image collection, among others. This dissertation proposes a vehicle tracking system through which UAVs can recognize a vehicle and track it on highways. The system is based on a combination of the bio-inspired machine learning algorithms VOCUS2, CNN, and LSTM, and was tested with real images collected by an aerial robot. The results show that the system is simpler than, and outperforms, other more complex algorithms in terms of precision.
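
As an illustration of the kind of architecture this abstract describes, the sketch below combines a small CNN frame encoder with an LSTM that integrates features over time and regresses a bounding box per frame. It is a minimal PyTorch toy under assumed shapes and layer sizes, not the author's VOCUS2-based pipeline; the saliency stage is omitted entirely.

```python
import torch
import torch.nn as nn

class CNNLSTMTracker(nn.Module):
    """Toy recognise-and-track model: a small CNN encodes each frame,
    an LSTM integrates the sequence, a linear head regresses the box."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())          # -> (B*T, 32)
        self.lstm = nn.LSTM(32, hidden, batch_first=True)    # temporal model
        self.head = nn.Linear(hidden, 4)                     # (x, y, w, h)

    def forward(self, frames):                # frames: (B, T, 3, H, W)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out)                 # box prediction per time step

boxes = CNNLSTMTracker()(torch.randn(2, 8, 3, 64, 64))  # shape (2, 8, 4)
```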
52

A natural language processing solution to probable Alzheimer’s disease detection in conversation transcripts

Comuni, Federica January 2019 (has links)
This study compares the accuracy of two of the best-performing machine learning algorithms in natural language processing, the Bayesian Network and the Long Short-Term Memory (LSTM) Recurrent Neural Network, at detecting Alzheimer’s disease symptoms in conversation transcripts. Because of the current global rise in life expectancy, the number of seniors affected by Alzheimer’s disease worldwide is increasing each year. Early detection is important to ensure that affected seniors can take measures to relieve symptoms when possible or prepare plans before further cognitive decline occurs. The literature shows that natural language processing can be a valid tool for early diagnosis of the disease. This study found that mild dementia and possible Alzheimer’s can be detected in conversation transcripts with promising results, and that the LSTM is particularly accurate, reaching an accuracy of 86.5% on the chosen dataset. The Bayesian Network classified with an accuracy of 72.1%. The study confirms the effectiveness of a natural language processing approach to detecting Alzheimer’s disease.
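
A minimal sketch of the LSTM side of such a comparison, assuming transcripts have already been tokenised to integer ids; the random data and layer sizes are placeholders, not the thesis dataset or model configuration.

```python
import numpy as np
from tensorflow.keras import layers, models

vocab_size, max_len = 5000, 400
x = np.random.randint(1, vocab_size, size=(32, max_len))   # dummy transcripts
y = np.random.randint(0, 2, size=(32,))                    # 1 = probable AD

model = models.Sequential([
    layers.Embedding(vocab_size, 64),      # word ids -> dense vectors
    layers.LSTM(64),                       # sequence -> fixed-size summary
    layers.Dense(1, activation="sigmoid")  # binary: AD symptoms vs. control
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=2, batch_size=8, verbose=0)
```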
53

Generating rhyming poetry using LSTM recurrent neural networks

Peterson, Cole 30 April 2019 (has links)
Current approaches to generating rhyming English poetry with a neural network involve constraining the output to enforce the condition of rhyme. We investigate whether this approach is necessary, or whether recurrent neural networks can learn rhyme patterns on their own. We compile a new dataset of amateur poetry which allows rhyme to be learned without external constraints because of the dataset’s size and high frequency of rhymes. We then evaluate models trained on the new dataset using a novel framework that automatically measures the system’s knowledge of poetic form and its generalizability. We find that our trained model is able to generalize the pattern of rhyme and generate rhymes unseen in the training data, and that the learned word embeddings for rhyming sets of words are linearly separable. Our model generates couplets that rhyme 68.15% of the time; this is the first time that a recurrent neural network has been shown to generate rhyming poetry a high percentage of the time. Additionally, we show that crowdsourced workers can distinguish between our generated couplets and couplets from our dataset only 63.3% of the time, indicating that our model generates poetry with coherency, semantic meaning, and fluency comparable to couplets written by humans.
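
The rhyme-rate evaluation could look roughly like the sketch below: it scores the fraction of generated couplets whose line-final words rhyme. The orthographic last-letters test is a crude stand-in for a phonetic rhyme check and is not the evaluation framework used in the thesis.

```python
def naive_rhyme(w1: str, w2: str, k: int = 3) -> bool:
    """Crude orthographic stand-in for a phonetic rhyme test:
    two distinct words 'rhyme' if their last k letters match."""
    w1, w2 = w1.lower().strip(".,;!?"), w2.lower().strip(".,;!?")
    return w1 != w2 and w1[-k:] == w2[-k:]

def couplet_rhyme_rate(couplets):
    """Fraction of couplets whose two line-final words rhyme."""
    hits = sum(naive_rhyme(a.split()[-1], b.split()[-1]) for a, b in couplets)
    return hits / len(couplets)

sample = [("the night was cold and deep", "the town had gone to sleep"),
          ("a river in the sun", "the clouds were passing by")]
print(couplet_rhyme_rate(sample))  # 0.5 on this toy pair
```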
54

Scalable System-Wide Traffic Flow Predictions Using Graph Partitioning and Recurrent Neural Networks

Reginbald Ivarsson, Jón January 2018 (has links)
Traffic flow predictions are an important part of an Intelligent Transportation System, as the ability to accurately forecast the traffic conditions in a transportation system allows for proactive rather than reactive traffic control. Providing accurate real-time traffic predictions is a challenging problem because of the nonlinear and stochastic features of traffic flow. The increasingly widespread deployment of traffic sensors in a growing transportation system produces greater volumes of traffic flow data, which leads to problems concerning fast, reliable, and scalable traffic predictions. The thesis explores the feasibility of increasing the scalability of real-time traffic predictions by partitioning the transportation system into smaller subsections. This is done by using data collected by Trafikverket from traffic sensors in Stockholm and Gothenburg to construct a traffic sensor graph of the transportation system. In addition, three graph partitioning algorithms are designed to divide the traffic sensor graph according to vehicle travel time. Finally, the produced transportation system partitions are used to train multi-layered long short-term memory recurrent neural networks for traffic density predictions. Four different types of models are produced and evaluated based on root mean squared error, training time, and prediction time: a whole transportation system model, partitioned transportation models, single sensor models, and overlapping partition models. The results of the thesis show that partitioning a transportation system is a viable way to produce traffic prediction models, as the average prediction accuracy for each traffic sensor is comparable across the different types of prediction models. This solution tackles the scalability issues caused by the increased deployment of traffic sensors in the transportation system by reducing the number of traffic sensors each prediction model is responsible for, which results in less complex models with less input data. A more decentralized and effective solution can be achieved since the models can be distributed to the edge of the transportation system, i.e. near the physical location of the traffic sensors, reducing the prediction and response time of the models.
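
A toy sketch of the partitioning idea: sensors connected by edges with short travel times are grouped into the same partition, and a separate LSTM prediction model would then be trained per partition. The greedy BFS grouping and the threshold are illustrative assumptions, not the three algorithms designed in the thesis.

```python
from collections import deque

def partition_by_travel_time(edges, max_minutes=5.0):
    """Group sensors joined by edges whose travel time is below max_minutes."""
    graph = {}
    for a, b, t in edges:                       # (sensor, sensor, minutes)
        if t <= max_minutes:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
        else:
            graph.setdefault(a, set())
            graph.setdefault(b, set())
    seen, parts = set(), []
    for node in graph:
        if node in seen:
            continue
        queue, comp = deque([node]), set()
        while queue:                            # BFS over short-travel-time edges
            n = queue.popleft()
            if n in comp:
                continue
            comp.add(n)
            queue.extend(graph[n] - comp)
        seen |= comp
        parts.append(sorted(comp))
    return parts

edges = [("s1", "s2", 2.0), ("s2", "s3", 3.5), ("s3", "s4", 12.0), ("s4", "s5", 1.0)]
print(partition_by_travel_time(edges))   # [['s1', 's2', 's3'], ['s4', 's5']]
```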
55

Optimizing text-independent speaker recognition using an LSTM neural network

Larsson, Joel January 2014 (has links)
In this paper a novel speaker recognition system is introduced. With the advances in computer science, automated speaker recognition has become increasingly popular as an aid in crime investigations and authorization processes. Here, a recurrent neural network approach is used to learn to identify ten speakers within a set of 21 audiobooks. Audio signals are processed via spectral analysis into Mel Frequency Cepstral Coefficients, which serve as speaker-specific features and are input to the neural network. The Long Short-Term Memory algorithm is examined for the first time within this area, with interesting results. Experiments are conducted to find the optimal network model for the problem. These show that the network learns to identify the speakers well, text-independently, when the recording conditions are the same. However, the system has difficulty recognizing speakers from different recordings, which is probably due to the noise sensitivity of the speech processing algorithm in use.
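
A minimal sketch of the MFCC-plus-LSTM pipeline the abstract describes, using librosa for feature extraction and a small Keras classifier; the fixed frame count, layer sizes, and random training data are placeholders rather than the audiobook setup used in the paper.

```python
import numpy as np
import librosa
from tensorflow.keras import layers, models

def mfcc_sequence(path, sr=16000, n_mfcc=13):
    """Load an audio file and return its MFCC frames as a (time, n_mfcc) array."""
    y, sr = librosa.load(path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

n_speakers, frames, n_mfcc = 10, 200, 13
model = models.Sequential([
    layers.Input(shape=(frames, n_mfcc)),
    layers.LSTM(64),                                   # summarise the utterance
    layers.Dense(n_speakers, activation="softmax")     # one class per speaker
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Dummy training data in place of real MFCC sequences.
x = np.random.randn(64, frames, n_mfcc).astype("float32")
y = np.random.randint(0, n_speakers, size=(64,))
model.fit(x, y, epochs=2, batch_size=16, verbose=0)
```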
56

Inferring Genetic Regulatory Networks Using Cost-based Abduction and Its Relation to Bayesian Inference

Andrews, Emad Abdel-Thalooth 16 July 2014 (has links)
Inferring Genetic Regulatory Networks (GRN) from multiple data sources is a fundamental problem in computational biology. Computational models for GRN range from simple Boolean networks to stochastic differential equations. To successfully model GRN, a computational method has to be scalable and capable of integrating different biological data sources effectively and homogeneously. In this thesis, we introduce a novel method to model GRN using Cost-Based Abduction (CBA) and study the relation between CBA and Bayesian inference. CBA is an important AI formalism for reasoning under uncertainty that can integrate different biological data sources effectively. We use three different yeast genome data sources—protein-DNA, protein-protein, and knock-out data—to build a skeleton (unannotated) graph which acts as a theory to build a CBA system. The Least Cost Proof (LCP) for the CBA system fully annotates the skeleton graph to represent the learned GRN. Our results show that CBA is a promising tool in computational biology in general and in GRN modeling in particular, because CBA knowledge representation can intrinsically implement the AND/OR logic in GRN while effectively enforcing cis-regulatory logic constraints, allowing the method to operate on a genome-wide scale. Besides allowing us to successfully learn yeast pathways such as the pheromone pathway, our method is scalable enough to analyze the full yeast genome in a single CBA instance, without sub-networking. The scalability of our method comes from the fact that our CBA model size grows in a quadratic, rather than exponential, manner with respect to data size and path length. We also introduce a new algorithm to convert CBA into an equivalent binary linear program that computes the exact LCP for the CBA system, thus reaching the optimal solution. Our work establishes a framework to solve Bayesian networks using integer linear programming and high-order recurrent neural networks, with CBA as an intermediate representation.
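
The conversion of a least-cost proof into a binary linear program can be illustrated on a toy instance: a goal explained either by two cheap hypotheses or by one expensive one. This PuLP sketch is only a hand-built example of the encoding idea, not the thesis algorithm or its yeast-genome data.

```python
from pulp import LpProblem, LpVariable, LpMinimize, lpSum, value

# Toy cost-based abduction instance: goal g is explained either by
# hypotheses h1 AND h2, or by h3 alone; choose the least-cost proof.
costs = {"h1": 3.0, "h2": 2.0, "h3": 6.0}
h = {k: LpVariable(k, cat="Binary") for k in costs}
r1 = LpVariable("rule_h1_h2", cat="Binary")   # proof via h1 & h2
r2 = LpVariable("rule_h3", cat="Binary")      # proof via h3

prob = LpProblem("least_cost_proof", LpMinimize)
prob += lpSum(costs[k] * h[k] for k in costs)   # total assumption cost
prob += r1 + r2 >= 1                            # the goal must be proved
prob += r1 <= h["h1"]                           # rule 1 needs both h1 and h2
prob += r1 <= h["h2"]
prob += r2 <= h["h3"]                           # rule 2 needs h3
prob.solve()

print({k: int(value(v)) for k, v in h.items()})  # {'h1': 1, 'h2': 1, 'h3': 0}
print(value(prob.objective))                     # 5.0
```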
57

Comment le langage impose-t-il la structure du sens : construal et narration / How Language Imposes Structure on Meaning : Construal and Narrative

Mealier, Anne-Laure 12 December 2016 (has links)
This thesis was carried out in the context of the European project WYSIWYD (What You Say Is What You Did). The goal of this project is to make human-robot interaction more natural and transparent, in particular by means of language. The deployment of companion and service robots requires that humans and robots can understand each other and communicate. Humans have developed an advanced codification of their behavior that provides the basis for the transparency of most of their actions and of their communication. Until now, robots have not shared this code of behavior and are therefore not able to explain their own actions to humans. We know that in spoken language there is a direct mapping between language and meaning that allows a listener to focus attention on a specific aspect of an event. This is particularly true in language production. Moreover, visual perception allows the extraction of the "who did what to whom" aspects in the understanding of social events. However, in the context of human interaction, other important aspects cannot be determined from the visual image alone. The exchange of an object can be interpreted from different points of view, for example that of the giver or that of the taker. This introduces the notion of construal: the way a person interprets the world or understands a particular situation. Furthermore, events are related in time, but there are also causal and intentional links that cannot be seen from a purely visual standpoint. An agent performs an action because it knows that this action satisfies the desire of another agent, which may not be directly visible in the visual scene. Language makes it possible to express this: "He gave you the book because you wanted it." The first question we address in this work is how language can be used to represent these construals, that is, how a speaker chooses one grammatical construction over another depending on the focus of attention. In response, we developed a system in which a mental model represents an action event. This model is determined by the correspondence between two abstract vectors: the force vector exerted by the action and the result vector corresponding to the effect of the exerted force. An attentional process selects one of the two vectors, thus generating the construal of the event. The second question we study is how narrative discourse constructions can be learned with a narrative discourse model. This model builds on existing neural networks for sentence production and comprehension, which we enrich with additional structures to represent a discourse context. We also present how this model can be integrated into an overall cognitive system able to understand and generate new narrative discourse constructions that have a similar structure but different arguments. For each of the works mentioned above, we show how these theoretical models are integrated into the development platform of the iCub humanoid robot. This thesis thus mainly studies two mechanisms that enrich the meaning of events through language. The work is situated between computational neuroscience, with the development of neural network models for the comprehension and production of narrative discourse, and cognitive linguistics, where understanding and explaining meaning as a function of attention is crucial.
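
The force/result construal mechanism can be caricatured in a few lines: the same give-event yields a different sentence depending on which abstract vector attention selects. The vectors and sentences below are invented placeholders, not the model implemented on the iCub.

```python
import numpy as np

def construe(force_vec, result_vec, attention="force"):
    """Toy construal selector: the same give-event is described from the
    giver's perspective (focus on the exerted force) or the taker's
    perspective (focus on the resulting state), depending on attention."""
    if attention == "force":
        return "John gave the book to you.", np.asarray(force_vec)
    return "You got the book from John.", np.asarray(result_vec)

force = [1.0, 0.2]    # abstract 'force exerted' vector (illustrative values)
result = [0.1, 0.9]   # abstract 'resulting effect' vector
print(construe(force, result, attention="force")[0])
print(construe(force, result, attention="result")[0])
```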
58

Ring topology of an optical phase delayed nonlinear dynamics for neuromorphic photonic computing / Topologie en anneau d’une dynamique non linéaire à retard en phase optique, pour le calcul photonique neuromorphique

Baylon Fuentes, Antonio 13 December 2016 (has links)
Nowadays, most computers are still based on concepts developed more than 60 years ago by Alan Turing and John von Neumann. However, these digital computers have already begun to reach certain physical limits of their implementation in silicon microelectronics technology (dissipation, speed, integration limits, energy consumption). Alternative approaches that are more powerful, more efficient, and less energy-consuming have been a major scientific issue for several years. Many of these approaches naturally draw inspiration from the human brain, whose operating principles are still far from being understood. In this line of research, a simpler variation of the recurrent neural network (RNN), sometimes even more efficient for certain processing tasks, appeared in the early 2000s and is now known as Reservoir Computing (RC), an emerging brain-inspired computational paradigm. Its structure is quite similar to classical RNN concepts, generally exhibiting three parts: an input layer that injects the information into a nonlinear dynamical system (write-in), a second layer where the input information is projected into a high-dimensional space called the dynamical reservoir, and an output layer from which the processed information is extracted through a so-called read-out function. In the RC approach, the learning procedure is performed in the output layer only, while the input and reservoir layers are fixed randomly; this is the main originality of RC compared to RNN methods. This feature yields greater efficiency, speed, and learning convergence, and lends itself to experimental implementation. This PhD thesis is dedicated to one of the first photonic RC implementations using telecommunication devices. Our experimental implementation is based on a nonlinear delayed dynamical system, which relies on an electro-optic (EO) oscillator with differential phase modulation. This EO oscillator was extensively studied in the context of optical chaos cryptography. The dynamics exhibited by such systems are indeed known to develop complex behaviors in an infinite-dimensional phase space, and analogies with spatio-temporal dynamics (of which neural networks are one kind) are also found in the literature. Such peculiarities of delay systems supported the idea of replacing the traditional RNN (usually difficult to design technologically) by a nonlinear EO delay architecture. In order to evaluate the computational power of our RC approach, we implemented two spoken digit recognition tests (classification tests) taken from standard databases in artificial intelligence (TI-46 and AURORA-2), obtaining results very close to state-of-the-art performance and establishing a new state of the art in classification speed. Our photonic RC approach allowed us to process around 1 million words per second, improving information processing speed by a factor of about 3.
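
A numerical caricature of a time-multiplexed delay reservoir: a masked scalar input drives a delayed sin² nonlinearity (a rough stand-in for the electro-optic phase oscillator), and a ridge-regression read-out is trained on the virtual-node states. The parameters and the toy memory task are arbitrary assumptions, not the experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_virtual, beta, phi = 50, 0.9, 0.3           # virtual nodes, feedback gain, offset

def delay_reservoir(u):
    """Each scalar input is masked onto n_virtual 'virtual nodes' of a
    delayed sin^2 nonlinearity; the node states form the reservoir output."""
    mask = rng.uniform(-1, 1, n_virtual)
    x = np.zeros(n_virtual)
    states = []
    for u_t in u:
        x = np.sin(beta * x + mask * u_t + phi) ** 2   # delayed nonlinear update
        states.append(x.copy())
    return np.array(states)                            # (T, n_virtual)

# Toy task: reconstruct the input delayed by one step with a ridge read-out.
u = rng.uniform(0, 0.5, 500)
X, y = delay_reservoir(u)[1:], u[:-1]
W = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_virtual), X.T @ y)   # ridge read-out
print(float(np.mean((X @ W - y) ** 2)))    # small training error on the toy task
```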
59

Understand and Utilise Unformatted Text Documents by Natural Language Processing algorithms

Lindén, Johannes January 2017 (has links)
News companies need to automate and make the editorial process of writing about breaking events more effective. Current technologies involve robotic programs that fill in values in templates, and website listeners that notify the editors when changes are made so that the editor can read up on the change at the source website. Editors can provide news faster and better if they are directly provided with abstracts of the external sources. This study applies deep learning algorithms to automatically formulate abstracts and tag sources with appropriate tags based on the context. The study is a full-stack solution, which manages both the editors' need for speed and the training, testing, and validation of the algorithms. Decision Tree, Random Forest, and Multi-Layer Perceptron classifiers with phrase and document vectors are used to evaluate the categorisation, and a Recurrent Neural Network is used to paraphrase unformatted texts. In the evaluation, models trained by the algorithms with a variety of parameters are compared based on the F-score. The results show that the F-scores increase with the number of training documents and decrease with the number of categories the algorithm needs to consider. The Multi-Layer Perceptron performs best, followed by Random Forest and finally Decision Tree. Document length matters: when larger documents are considered during training, the score increases considerably. A user survey about the paraphrasing algorithm shows that the paraphrase results are insufficient to satisfy editors' needs, and confirms a need for more memory to conduct longer experiments.
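
The classifier comparison can be sketched with scikit-learn as below, reporting macro F1 for a Decision Tree, Random Forest, and Multi-Layer Perceptron; the 20 Newsgroups corpus and TF-IDF features are public stand-ins for the news company's articles and phrase/document vectors used in the thesis.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Public stand-in corpus with two categories.
data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X = TfidfVectorizer(max_features=5000).fit_transform(data.data)
X_tr, X_te, y_tr, y_te = train_test_split(X, data.target, random_state=0)

for clf in (DecisionTreeClassifier(), RandomForestClassifier(),
            MLPClassifier(max_iter=300)):
    clf.fit(X_tr, y_tr)
    score = f1_score(y_te, clf.predict(X_te), average="macro")
    print(type(clf).__name__, round(score, 3))
```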
60

Pokročilá klasifikace poruch srdečního rytmu v EKG / Advanced classification of cardiac arrhythmias in ECG

Sláma, Štěpán January 2020 (has links)
This work focuses on a theoretical explanation of heart rhythm disorders and the possibility of their automatic detection using deep learning networks. For the purposes of this work, a total of 6884 ten-second ECG recordings with eight measured leads were used. The recordings were divided into five groups according to heart rhythm: records with atrial fibrillation, sinus rhythms, supraventricular rhythms, ventricular rhythms, and a final group of all other records. The groups are unbalanced, with more than 85 % of all records belonging to the sinus rhythm group. The classification methods therefore served effectively as detectors of the largest group; the most effective of all was a procedure based on a 2D convolutional neural network into which the data enter in the form of scalograms (classification procedure 3). It achieved a precision of 91 %, a recall of 96 %, and an F1-score of 0.93. In contrast, when classifying all groups at the same time, results of this quality were not obtained for all groups. The most effective procedure appears to be a variant that applies PCA to the eight input signals to obtain a single output signal, which becomes the input of a 1D convolutional neural network (classification procedure 5). This procedure achieved the following F1-scores: 1) records with atrial fibrillation 0.54, 2) sinus rhythms 0.91, 3) supraventricular rhythms 0.65, 4) ventricular rhythms 0.68, 5) other records 0.65.
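
A rough sketch of classification procedure 5 as described above: PCA reduces the eight leads to a single signal that feeds a small 1D convolutional network. The random data, assumed sampling rate, and layer sizes are placeholders for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA
from tensorflow.keras import layers, models

# Dummy data shaped like the inputs described: 10 s ECG, 8 leads, 5 classes.
n_rec, n_samples, n_leads, n_classes = 64, 5000, 8, 5
ecg = np.random.randn(n_rec, n_samples, n_leads).astype("float32")
labels = np.random.randint(0, n_classes, size=(n_rec,))

# PCA across the 8 leads, keeping the first component as a single signal.
flat = ecg.reshape(-1, n_leads)                       # (n_rec * n_samples, 8)
single = PCA(n_components=1).fit_transform(flat).reshape(n_rec, n_samples, 1)

model = models.Sequential([
    layers.Input(shape=(n_samples, 1)),
    layers.Conv1D(16, 7, strides=2, activation="relu"),
    layers.MaxPooling1D(4),
    layers.Conv1D(32, 5, strides=2, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(single, labels, epochs=2, batch_size=8, verbose=0)
```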
