301

apprentissage de séquences et extraction de règles de réseaux récurrents : application au traçage de schémas techniques. / sequence learning and rules extraction from recurrent neural networks : application to the drawing of technical diagrams

Chraibi Kaadoud, Ikram 02 March 2018
There are two important aspects of the knowledge that an individual acquires through experience. One corresponds to semantic memory (explicit knowledge, such as the learning of concepts and categories describing the objects of the world) and the other to procedural or syntactic memory (knowledge relating to the learning of rules or syntax). This "syntactic memory" is built from experience, and particularly from the observation of sequences of objects whose organization obeys syntactic rules. It must later support both recognizing and generating valid sequences, i.e., sequences that respect the learnt rules.
This production of valid sequences can be done either explicitly, that is, by evoking the underlying rules, or implicitly, when the learning phase has captured the principle of organization of the sequences without explicit recourse to the rules. Although the implicit process is faster, more robust and less costly in terms of cognitive load than explicit reasoning, it has the disadvantage of not giving access to the rules, and is therefore less flexible and less explicable. These memory mechanisms also apply to business expertise. Capitalizing on knowledge is a major issue for any company and concerns both explicit and implicit knowledge. At first, the expert makes a choice by explicitly following the rules of the trade. Then, through repetition, the choice becomes automatic, without explicit evocation of the underlying rules. This change in how rules are encoded, in individuals in general and in business experts in particular, can be problematic when the expert has to explain or transmit his or her knowledge. While business concepts can be formalized, the same is generally not true of expertise, which is more difficult to extract and transmit. In our work, we study sequences of electrical components, and in particular the problem of extracting the rules hidden in these sequences, which is an important aspect of extracting business expertise from technical drawings. We place ourselves in the connectionist domain, and we have particularly considered neural models capable of processing sequences. We implemented two recurrent neural networks: the Elman model and a model with LSTM (Long Short Term Memory) units. We evaluated these two models on different artificial grammars (Reber's grammar and its variations) in terms of learning, generalization ability and handling of sequential dependencies.
Finally, we also showed that it is possible to extract the rules encoded (from the sequences) in the recurrent network with LSTM units, in the form of an automaton. The electrical domain is particularly relevant for this problem because it is more constrained, with more limited combinatorics, than task planning in more general cases such as navigation, which could be a direction for future work.
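As an illustration of the kind of artificial grammar mentioned in this abstract, here is a minimal sketch of a Reber grammar written as a finite-state automaton. The transition table is the commonly cited version of the grammar and is an assumption here; the abstract does not give the exact grammar or variations used in the thesis. The names `REBER`, `generate` and `accepts` are hypothetical.

```python
import random

# Commonly cited Reber grammar (assumed; the thesis's exact variants are not
# given in the abstract). Maps state -> list of (symbol, next_state).
REBER = {
    0: [("B", 1)],
    1: [("T", 2), ("P", 3)],
    2: [("S", 2), ("X", 4)],
    3: [("T", 3), ("V", 5)],
    4: [("X", 3), ("S", 6)],
    5: [("P", 4), ("V", 6)],
    6: [("E", 7)],
}
FINAL = 7  # accepting state, reached only after the closing "E"


def generate(rng):
    """Random walk through the automaton; returns one valid string."""
    state, symbols = 0, []
    while state != FINAL:
        symbol, state = rng.choice(REBER[state])
        symbols.append(symbol)
    return "".join(symbols)


def accepts(string):
    """True if the string follows the transitions and ends in FINAL."""
    state = 0
    for ch in string:
        transitions = dict(REBER.get(state, []))
        if ch not in transitions:
            return False
        state = transitions[ch]
    return state == FINAL


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(generate(rng))
```

A recurrent network trained on such strings is typically asked to predict the next symbol. An automaton extracted from the trained network can then be compared with a table like `REBER`.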
302

Le rôle de la balance entre excitation et inhibition dans l'apprentissage dans les réseaux de neurones à spikes / The role of balance between excitation and inhibition in learning in spiking networks

Bourdoukan, Ralph 10 October 2016
When performing a task, neural circuits must represent and manipulate continuous stimuli using discrete action potentials. It is commonly assumed that neurons represent continuous quantities with their firing rates, and that they do so independently of one another. However, such independent coding is very inefficient because it requires the generation of a large number of action potentials to achieve a given level of accuracy. We show that neurons in a spiking recurrent network can learn, using a local plasticity rule, to coordinate their action potentials in order to represent information with high accuracy while firing minimally. The learning rule, which acts on recurrent connections, leads to such efficient coding by imposing a precise balance between excitation and inhibition at the level of each neuron. This balance is frequently observed in the brain and is central to our work. We also derive two biologically plausible learning rules that respectively allow the network to adapt to the statistics of its inputs and to perform complex and dynamic transformations on them. Finally, in these networks, the stochasticity of spike timing is not a signature of noise but rather of precision and efficiency. In fact, the random nature of the spike times results from the degeneracy of the representation. This constitutes a new and radically different interpretation of the irregularity found in spike trains.
303

Real-time Process Modelling Based on Big Data Stream Learning

He, Fan January 2017
Most control systems are assumed to be unchanging, but this is an idealization. In real applications they are subject to many changes, some arising from the environment and some from system requirements. The goal of this thesis is therefore to model a dynamic, adaptive, real-time control system process from a big data stream. In this way, the control system model can adjust itself using example measurements acquired during operation and give a suggestion for the next arriving input, which also means that the accuracy of the states under control depends heavily on the quality of the process model. In this thesis, we choose a recurrent neural network to model the process because it is a cheap and fast form of artificial intelligence. Most existing artificial intelligence methods require a database, and the bigger the database, the more accurate the result can be. For example, in case-based reasoning, a test case is compared with all cases in the database, and the result of the closest one is taken as the reference. A neural network, however, needs no large database to store and search, only simple calculations, because the information is stored in the connections. Each small unit, called a neuron, is a linear combination, but a neural network made up of neurons can perform complex, non-linear functions. For training, backpropagation and a Kalman filter are used together. Backpropagation is a widely used and stable optimization algorithm. The Kalman filter is new to gradient-based optimization, but it has been shown to converge faster than traditional first-order gradient-based algorithms. Several experiments were prepared to compare the new and existing algorithms under various circumstances. The first set of experiments uses static systems and serves only to investigate the convergence rate and accuracy of the different algorithms. The second set uses time-varying systems, in order to take one more attribute, adaptivity, into consideration.
304

On The Effectiveness of Multi-Task Learning: An evaluation of Multi-Task Learning techniques in deep learning models

Tovedal, Sofiea January 2020
Multi-Task Learning is an interesting and promising field that many consider essential for reaching the next level of advancement in machine learning. In practice, however, Multi-Task Learning is used far more rarely in real-world implementations than its more popular cousin, Transfer Learning. The question is why that is, and whether Multi-Task Learning outperforms its Single-Task counterparts. In this thesis, different Multi-Task Learning architectures were used to build a model that labels real technical issues within two categories. The model faces a challenging, imbalanced data set with many labels to choose from and short texts on which to base its predictions. Can task-sharing be the answer to these problems? This thesis investigated three Multi-Task Learning architectures and compared their performance to a Single-Task model. An authentic data set and two labeling tasks were used to train the models with supervised learning. The four model architectures (Single-Task, Multi-Task, Cross-Stitched and Shared-Private) first went through a hyperparameter tuning process using one of the two layer options, LSTM and GRU. They were then boosted by auxiliary tasks and finally evaluated against each other.
305

Machine Learning for Disease Prediction

Frandsen, Abraham Jacob 01 June 2016
Millions of people in the United States alone suffer from undiagnosed or late-diagnosed chronic diseases such as Chronic Kidney Disease and Type II Diabetes. Catching these diseases earlier facilitates preventive healthcare interventions, which in turn can lead to tremendous cost savings and improved health outcomes. We develop algorithms for predicting disease occurrence by drawing from ideas and techniques in the field of machine learning. We explore standard classification methods such as logistic regression and random forest, as well as more sophisticated sequence models, including recurrent neural networks. We focus especially on the use of medical code data for disease prediction, and explore different ways for representing such data in our prediction algorithms.
306

Feature Fusion Deep Learning Method for Video and Audio Based Emotion Recognition

Yanan Song (11825003) 20 December 2021
In this thesis, we propose a deep learning based emotion recognition system in order to improve the classification success rate. We first use transfer learning to extract visual features and Mel Frequency Cepstral Coefficients (MFCC) to extract audio features, and then apply recurrent neural networks (RNN) with an attention mechanism to process the sequential inputs. After that, the outputs of both channels are fused in a concatenation layer, which is followed by batch normalization to reduce internal covariate shift. Finally, the classification result is obtained from the softmax layer. In our experiments on the RAVDESS dataset with eight emotion classes, the video and audio subsystems achieve 78% and 77% accuracy, respectively, and the feature fusion system combining video and audio achieves 92% accuracy. Our proposed feature fusion system outperforms conventional methods in terms of classification prediction.
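The fusion step described in this abstract (concatenate the two channel outputs, apply batch normalization, then softmax) can be sketched in plain Python as below. This is an illustrative sketch only, not the thesis's implementation. The function names, the single linear layer before the softmax, and the omission of batch normalization's learned scale and shift are all assumptions.

```python
import math


def batch_norm(batch, eps=1e-5):
    """Normalize each feature column to zero mean and unit variance over the
    batch (learned scale and shift omitted for brevity; an assumption)."""
    n, dims = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps)
             for j in range(dims)] for row in batch]


def softmax(logits):
    """Numerically stable softmax over one vector of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(video_feats, audio_feats, weights, bias):
    """Concatenate per-sample video and audio features, batch-normalize the
    fused vectors, apply a hypothetical linear layer, and return class
    probabilities for each sample."""
    fused = [v + a for v, a in zip(video_feats, audio_feats)]  # concatenation
    normed = batch_norm(fused)
    probs = []
    for row in normed:
        logits = [sum(w * x for w, x in zip(w_row, row)) + b
                  for w_row, b in zip(weights, bias)]
        probs.append(softmax(logits))
    return probs
```

Here `video_feats` and `audio_feats` stand in for the outputs of the two RNN channels, and `weights` has one row per emotion class.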
307

Translating LaTeX to Coq: A Recurrent Neural Network Approach to Formalizing Natural Language Proofs

Carman, Benjamin Andrew 18 May 2021
No description available.
308

Solving Prediction Problems from Temporal Event Data on Networks

Sha, Hao 08 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Many complex processes can be viewed as sequential events on a network. In this thesis, we study the interplay between a network and the event sequences on it. We first focus on predicting events on a known network. Examples include modeling retweet cascades, forecasting earthquakes, and tracing the source of a pandemic. Specifically, given the network structure, we solve two types of problems: (1) forecasting future events based on historical events, and (2) identifying the initial event(s) based on later observations of the dynamics. The inverse problem of inferring the unknown network topology or links from the events is also of great importance. Examples along this line include constructing influence networks among Twitter users from their tweets, soliciting new members to join an event based on their participation history, and recommending positions to job seekers according to their work experience. Following this direction, we study two types of problems: (1) recovering influence networks, and (2) predicting links between a node and a group of nodes, from event sequences.
309

Comparing Encoder-Decoder Architectures for Neural Machine Translation: A Challenge Set Approach

Doan, Coraline 19 November 2021
Machine translation (MT) as a field of research has seen significant advances in recent years, with increased interest in neural machine translation (NMT). By combining deep learning with translation, researchers have been able to deliver systems that perform better than most, if not all, of their predecessors. While the general consensus regarding NMT is that it renders higher-quality translations that are overall more idiomatic, researchers recognize that NMT systems still struggle with certain classic difficulties, and that their performance may vary depending on their architecture. In this project, we implement a challenge-set based approach to evaluating examples of three main NMT architectures: convolutional neural network-based (CNN) systems, recurrent neural network-based (RNN) systems, and attention-based systems, all trained on the same data set for English to French translation. The challenge set focuses on a selection of lexical and syntactic difficulties (e.g., ambiguities) drawn from literature on human translation, machine translation, and writing for translation, and also includes variations in sentence lengths and structures that are recognized as sources of difficulty even for NMT systems. This set allows us to evaluate performance in multiple areas of difficulty for the systems overall, as well as to evaluate any differences between architectures’ performance. Through our challenge set, we found that our CNN-based system tends to reword sentences, sometimes shifting their meaning, while our RNN-based system seems to perform better when provided with a larger context, and our attention-based system seems to struggle more the longer a sentence becomes.
310

Arrival Time Predictions for Buses using Recurrent Neural Networks / Ankomsttidsprediktioner för bussar med rekurrenta neurala nätverk

Fors Johansson, Christoffer January 2019
In this thesis, two different types of bus passengers are identified. These two types, namely current passengers and passengers-to-be, have different needs in terms of arrival time predictions. A set of machine learning models based on recurrent neural networks and long short-term memory units was developed to meet these needs. Furthermore, bus data from public transport in Östergötland county, Sweden, were collected and used for training new machine learning models. These new models are compared with the prediction system used today to provide passengers with arrival time information. The models proposed in this thesis use a sequence of time steps as input and the observed arrival time as output. Each input time step contains information about the current state, such as the time of arrival, the departure time from the very first stop and the current position in Cartesian coordinates. The target value for each input is the arrival time at the next time step. To predict the rest of the trip, the prediction for the next step is simply used as input in the next time step. The results show that, on all eight routes tested, the proposed models improve the mean absolute error per stop by between 7.2% and 40.9% compared to the system used today. Furthermore, the choice of loss function yields models that can meet the identified passengers' needs by trading average prediction accuracy for certainty that predictions do not overestimate or underestimate the target time in approximately 95% of cases.
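The recursive prediction scheme described in this abstract, in which each prediction is fed back as the next input, can be sketched as follows. The abstract does not name the loss function used. The pinball (quantile) loss shown here is one well-known loss with the stated property and is an assumption, as are the names `rollout`, `pinball_loss` and `step_model`.

```python
def rollout(step_model, first_state, n_steps):
    """Predict the rest of a trip by feeding each prediction back in as the
    input for the next time step. `step_model` is a hypothetical callable
    mapping the current state to the predicted next state."""
    predictions = []
    state = first_state
    for _ in range(n_steps):
        state = step_model(state)
        predictions.append(state)
    return predictions


def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for a single prediction. With q = 0.95,
    predicting too low costs 19 times more than predicting too high, so a
    model minimizing it tends to predict at or above the target in roughly
    95% of cases."""
    diff = y_true - y_pred
    return q * diff if diff >= 0 else (q - 1) * diff


if __name__ == "__main__":
    # Toy step model: every stop is reached 60 seconds after the previous one.
    print(rollout(lambda t: t + 60, 0, 3))
    print(pinball_loss(10.0, 8.0, 0.95), pinball_loss(10.0, 12.0, 0.95))
```

Using a small `q` (for example 0.05) gives the opposite guarantee, with predictions that rarely exceed the target. This is one way the two passenger types could be served by different models.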
