  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
541

Rekurentní neuronové sítě v počítačovém vidění / Recurrent Neural Networks in Computer Vision

Křepský, Jan January 2011 (has links)
The thesis concentrates on the use of recurrent neural networks in computer vision. The theoretical part summarizes the basics of artificial neural networks with a focus on recurrent architectures and presents some of their possible applications to real-world problems. The practical part concentrates on face recognition from an image sequence using the Elman simple recurrent network, trained with the backpropagation and backpropagation-through-time algorithms.
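The Elman network mentioned above can be sketched in a few lines: the hidden state is fed back as a "context" input at the next time step. This is a minimal illustrative forward pass in NumPy, not the thesis's implementation; all dimensions and weight scales here are arbitrary choices.

```python
import numpy as np

def elman_step(x, h_prev, W_xh, W_hh, W_hy, b_h, b_y):
    """One step of an Elman simple recurrent network: the previous
    hidden state re-enters via the recurrent weights W_hh (the context)."""
    h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)   # hidden + context
    y = W_hy @ h + b_y                             # output (pre-activation)
    return h, y

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 8, 2
W_xh = rng.normal(scale=0.1, size=(n_hid, n_in))
W_hh = rng.normal(scale=0.1, size=(n_hid, n_hid))
W_hy = rng.normal(scale=0.1, size=(n_out, n_hid))
b_h, b_y = np.zeros(n_hid), np.zeros(n_out)

h = np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):   # a 5-frame input sequence
    h, y = elman_step(x, h, W_xh, W_hh, W_hy, b_h, b_y)
```

Training such a network with backpropagation through time amounts to unrolling this loop over the sequence and backpropagating through the unrolled graph.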
542

Rekurentní neuronové sítě pro rozpoznávání řeči / Recurrent Neural Networks for Speech Recognition

Nováčik, Tomáš January 2016 (has links)
This master's thesis deals with the implementation of various types of recurrent neural networks in the Lua programming language using the Torch library. It focuses on finding an optimal strategy for training recurrent neural networks and on minimizing training time. Various regularization techniques are also investigated and implemented in the recurrent network architecture. The implemented networks are compared on a speech recognition task using the AMI dataset, where they model the acoustic information, and their performance is compared to a standard feedforward neural network. The best results are achieved with the BLSTM architecture. The recurrent networks are also trained with the CTC objective function on the TIMIT dataset, where the best result is again achieved with the BLSTM architecture.
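The CTC objective mentioned above sums the probability of every frame-level alignment that collapses to the target label sequence. A minimal NumPy sketch of the standard forward (alpha) recursion over the blank-extended label sequence is shown below; it is an illustration of the technique, not the thesis's Torch code, and it works in raw probabilities (real implementations use log-space for stability).

```python
import numpy as np

def ctc_forward(log_probs, labels, blank=0):
    """Total probability of `labels` under CTC for a (T, K) matrix of
    per-frame log-probabilities, via the forward (alpha) recursion."""
    T, _ = log_probs.shape
    ext = [blank]
    for l in labels:
        ext += [l, blank]                 # blank-extended sequence
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = np.exp(log_probs[0, blank])
    alpha[0, 1] = np.exp(log_probs[0, ext[1]])
    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]
            if s > 0:
                a += alpha[t - 1, s - 1]
            # skipping a blank is allowed between distinct labels
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[t - 1, s - 2]
            alpha[t, s] = a * np.exp(log_probs[t, ext[s]])
    return alpha[T - 1, S - 1] + alpha[T - 1, S - 2]

# Toy check: T=2 frames, alphabet {blank, 'a'}, uniform probabilities.
lp = np.log(np.full((2, 2), 0.5))
print(ctc_forward(lp, [1]))   # 0.75: the paths (a,a), (-,a), (a,-)
```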
543

Cohorte de réseaux de neurones récurrents pour la reconnaissance de l'écriture / Cohort of recurrent neural networks for handwriting recognition

Stuner, Bruno 11 June 2018 (has links)
State-of-the-art methods for handwriting recognition are based on LSTM recurrent neural networks (RNNs), which achieve high recognition performance. In this thesis, we propose lexicon verification and cohort generation as two new building blocks to tackle three problems of handwriting recognition: i) the large-vocabulary problem and the use of lexicon-driven methods, ii) the combination of multiple optical models, and iii) the need for large labeled datasets for training RNNs. Lexicon verification is an alternative to lexicon-driven decoding that can handle lexicons of 3 million words; it has been little studied because of the weak performance of historical optical models (HMMs), but we show that it becomes an interesting alternative when built on high-performing optical models such as LSTM RNNs. Cohort generation is a method to obtain, easily and quickly, a large number of complementary recurrent networks extracted from a single training run. From these two techniques we build and propose a new cascade scheme for isolated word recognition, a new line-level combination method (LV-ROVER), and a new self-training strategy for LSTM RNNs for isolated handwritten word recognition. The proposed cascade combines thousands of LSTM RNNs with lexicon verification and achieves state-of-the-art word recognition performance on the Rimes and IAM datasets. The Lexicon Verified ROVER (LV-ROVER) has reduced complexity compared to the original ROVER algorithm and combines hundreds of recognizers without language models, while surpassing the state of the art for handwritten text lines on the Rimes dataset. Our self-training strategy uses both labeled and unlabeled data, the unlabeled data being self-labeled by its own lexicon-verified predictions; it enables self-training from a single BLSTM, without extra parameters, thanks to the cohort and lexicon verification, and shows excellent results on the Rimes and IAM datasets.
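The accept/reject core of lexicon verification can be illustrated in a few lines. This is a deliberately simplified toy (the actual cascade combines thousands of networks and uses recognizer confidences, neither of which is modeled here): each recognizer in the cohort proposes a word in turn, and the first hypothesis that is a valid lexicon entry is accepted.

```python
def lexicon_verified_cascade(hypotheses, lexicon):
    """Cascade with lexicon verification: scan hypotheses in recognizer
    order and accept the first one that is a valid lexicon entry;
    otherwise reject the input (no lexicon-driven decoding involved)."""
    for word in hypotheses:
        if word in lexicon:
            return word
    return None   # rejected: no recognizer produced a lexicon word

lexicon = {"bonjour", "merci", "cohorte"}
print(lexicon_verified_cascade(["bonjqur", "bonjour"], lexicon))  # bonjour
```

The key contrast with lexicon-driven decoding is that the recognizer outputs freely and the lexicon is only consulted afterwards as a verifier, which scales to very large lexicons.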
544

Prognosis of cancer patients : input of standard and joint frailty models / Pronostic en cancérologie : apport des modèles à fragilité standards et conjoints

Mauguen, Audrey 28 November 2014 (has links)
Research on cancer treatment has been evolving in recent years mainly in one direction: personalised medicine. Ideally, the choice of treatment should be based on the characteristics of the patient and of the tumour. This goal requires biostatistical developments in order to assess prognostic models and ultimately propose the best one. In a first part, we consider the problem of assessing a prognostic score when multicentre data are used. We extend two concordance measures to clustered data analysed with a shared frailty model. Both the between-cluster and within-cluster levels are studied, and the impact of the number and size of the clusters on the performance of the measures is investigated. In a second part, we propose to improve the prediction of the risk of death by accounting for previously observed relapses. For that, we develop predictions from a joint model for a recurrent event and a terminal event. The proposed individual predictions are dynamic: both the time and the horizon of prediction can evolve, so that the prediction can be updated at each new event. The predictions are developed on a French hospital series and externally validated on population-based data from English and Dutch cancer registries. Their performance is compared to that of a landmarking approach. In a third part, we explore the use of the proposed prediction to reduce clinical trial duration. The unobserved death times of the last included patients are imputed using the information from patients with longer follow-up. We compare three imputation methods: a mean survival time, a time sampled from a parametric distribution, and a time sampled from a non-parametric distribution of the survival times. The methods are compared in terms of parameter estimation (coefficient and standard error), type-I error, and power.
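As background for the concordance measures the thesis extends, the plain (unclustered) Harrell's C-index can be computed as follows. This sketch shows the standard definition only; the thesis's contribution, the between- and within-cluster versions under a shared frailty model, is not reproduced here.

```python
def c_index(times, events, risk_scores):
    """Harrell's concordance index: among comparable pairs (the earlier
    time is an observed event), count pairs where the higher risk score
    goes with the shorter survival time; score ties count one half."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

A C-index of 1.0 means the score ranks every comparable pair correctly; 0.5 is chance level.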
545

Recurrent neural models and related problems in natural language processing

Zhang, Saizheng 04 1900 (has links)
No description available.
546

Essays on monetary macroeconomics

Almosova, Anna 05 September 2019 (has links)
This thesis addresses three topics that are relevant for central bank policy design: forecasting of macroeconomic time series, accurate monetary policy formulation in a general equilibrium macroeconomic model, and monitoring of novel developments in the monetary system. All three are analysed in a nonlinear framework with a suitable model. The first part of the thesis shows that nonlinear recurrent neural networks, a method from the machine learning literature, outperform the usual benchmark forecasting models and deliver accurate inflation predictions 1 to 12 months ahead. The second part analyses a nonlinear formulation of the Taylor rule. Using nonlinear Bayesian estimation of a DSGE model, it shows that the Taylor rule in the US is asymmetric: the central bank reacts more strongly to inflation when it is above the target than when it is below it, and similarly reacts more strongly to output growth when growth is too weak than when it is too strong. The last part of the thesis develops a theoretical model suitable for the analysis of decentralised digital currencies, and uses it to derive the conditions under which competition between digital and fiat currencies imposes restrictions on monetary policy design.
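The asymmetry described above can be illustrated with a stylized rule in which the reaction coefficients differ on either side of the targets. All coefficients below are hypothetical round numbers for illustration, not the thesis's estimates.

```python
def asymmetric_taylor_rate(inflation, output_growth, *, pi_target=2.0,
                           g_target=2.0, r_neutral=2.0,
                           a_above=1.5, a_below=0.8,
                           b_weak=1.0, b_strong=0.3):
    """Stylized asymmetric Taylor rule: the policy rate responds more
    strongly to above-target inflation and to below-target output
    growth. Coefficients are illustrative, not estimated values."""
    pi_gap = inflation - pi_target
    g_gap = output_growth - g_target
    infl_term = (a_above if pi_gap > 0 else a_below) * pi_gap
    out_term = (b_weak if g_gap < 0 else b_strong) * g_gap
    return r_neutral + inflation + infl_term + out_term
```

With these numbers, a one-point inflation overshoot moves the rate by more than a one-point undershoot, matching the qualitative finding.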
547

Prediction of the number of weekly covid-19 infections : A comparison of machine learning methods

Branding, Nicklas January 2022 (has links)
The thesis had a two-fold aim: to identify and evaluate candidate Machine Learning (ML) methods and performance measures for predicting the weekly number of covid-19 infections. This aim arose from studying public health research, where several challenges were identified; one was the lack of sophisticated and hybrid ML methods in the public health research area. In this thesis a comparison of ML methods for predicting the number of weekly covid-19 infections has been performed. A dataset from the Public Health Agency of Sweden, consisting of 101 weeks divided into a 60% training set and a 40% testing set, was used in the evaluation. Five candidate ML methods were investigated: Support Vector Regressor (SVR), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bidirectional LSTM (BI-LSTM) and LSTM-Convolutional Neural Network (LSTM-CNN). These methods were evaluated on three performance measures: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and R2. The LSTM-CNN model performed best on all three measures.
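The three performance measures used above are standard and easy to state precisely; a minimal NumPy version (matching the usual textbook definitions, not code from the thesis) is:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

A perfect forecast gives RMSE = MAE = 0 and R2 = 1; a constant forecast at the mean gives R2 = 0.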
548

A comparative study of Neural Network Forecasting models on the M4 competition data

Ridhagen, Markus, Lind, Petter January 2021 (has links)
The development of machine learning research has provided statistical innovations and further developments within the field of time series analysis. This study investigates two artificial neural network models based on different learning techniques, asking how well the neural network approach compares with a basic autoregressive approach, and how the neural network models compare with each other. The models were compared and analyzed with regard to univariate forecast accuracy on 20 randomly drawn time series from two different time frequencies of the M4 competition dataset. Forecasts were made dependent on one time lag (t-1) and predicted three and six steps ahead, respectively. The artificial neural network models outperformed the baseline autoregressive model, showing notably lower mean absolute percentage error overall. The multilayer perceptron models performed better than the long short-term memory model overall, whereas the long short-term memory model improved on longer prediction horizons. As training was done univariately on a limited set of time steps, it is believed that the one-layer approach gave a good enough approximation of the data, whereas the added layer could not fully utilize its processing power. Likewise, the long short-term memory model could not fully demonstrate the advantages of recurrent learning. Using the same dataset, further studies could take another approach to data processing: by clustering the data in an unsupervised step before analysis, the same models could be tested with multivariate analysis, trained on multiple time series simultaneously.
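The baseline and the accuracy measure used above are both simple enough to state directly. This sketch shows a lag-1 autoregressive forecast iterated several steps ahead and the mean absolute percentage error; the AR coefficient here is a hypothetical given value, not one fitted to the M4 series.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))

def ar1_forecast(series, coef, steps=3):
    """Iterate a (pre-fitted) zero-intercept AR(1) model forward:
    each forecast depends only on the previous value (lag t-1)."""
    preds, last = [], series[-1]
    for _ in range(steps):
        last = coef * last
        preds.append(last)
    return preds
```

Multi-step forecasts are produced by feeding each prediction back in as the next lag, which is also how the neural models in the study were iterated ahead.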
549

Homeostatic Plasticity in Input-Driven Dynamical Systems

Toutounji, Hazem 26 February 2015 (has links)
The degree to which a species can adapt to the demands of its changing environment defines how well it can exploit the resources of new ecological niches. Since the nervous system is the seat of an organism's behavior, studying adaptation starts there. The nervous system adapts through neuronal plasticity, which may be considered the brain's reaction to environmental perturbations. In a natural setting, these perturbations are always changing. As such, a full understanding of how the brain functions requires studying neuronal plasticity under temporally varying stimulation conditions, i.e., studying the role of plasticity in carrying out spatiotemporal computations. Only then can we exploit the full potential of neural information processing to build powerful brain-inspired adaptive technologies. Here, we focus on homeostatic plasticity, where certain properties of the neural machinery are regulated so that they remain within a functionally and metabolically desirable range. Our main goal is to illustrate how homeostatic plasticity, interacting with associative mechanisms, is functionally relevant for spatiotemporal computations. The thesis consists of three studies that share two features: (1) homeostatic and synaptic plasticity act on a dynamical system such as a recurrent neural network; (2) the dynamical system is nonautonomous, that is, subject to temporally varying stimulation. In the first study, we develop a rigorous theory of spatiotemporal representations and computations and of the role of plasticity. Within this theory, we show that homeostatic plasticity increases the capacity of the network to encode spatiotemporal patterns, and that synaptic plasticity associates these patterns to network states. The second study applies the insights of the first to the single-node delay-coupled reservoir computing architecture (DCR), whose activity is sampled at several computational units. We derive a homeostatic plasticity rule acting on these units and analytically show that it balances the two processes necessary for spatiotemporal computations identified in the first study; as a result, the computational power of the DCR significantly increases. The third study considers minimal neural control of robots. We show that recurrent neural control with homeostatic synaptic dynamics endows robots with memory, and demonstrate that this memory is necessary for generating behaviors such as obstacle avoidance in a wheeled robot and stable hexapod locomotion.
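The general idea of homeostatic regulation in an input-driven recurrent network can be shown with a toy synaptic-scaling rule: each unit's incoming weights are multiplicatively scaled toward a target average activity. This is a generic illustration of homeostasis in a nonautonomous dynamical system, not the specific rule derived in the thesis.

```python
import numpy as np

def homeostatic_reservoir(W, inputs, target=0.1, eta=0.05):
    """Drive a recurrent tanh reservoir with an input sequence while a
    homeostatic rule multiplicatively scales each unit's incoming
    weights toward a target average activity (toy synaptic scaling)."""
    n = W.shape[0]
    h = np.zeros(n)
    avg = np.full(n, target)
    for x in inputs:
        h = np.tanh(W @ h + x)                       # driven dynamics
        avg = 0.9 * avg + 0.1 * np.abs(h)            # running activity estimate
        W *= (1.0 + eta * (target - avg))[:, None]   # scale incoming weights
    return W, avg

rng = np.random.default_rng(1)
n = 20
W0 = rng.normal(scale=0.3, size=(n, n))
W1, avg = homeostatic_reservoir(W0.copy(), rng.normal(size=(200, n)))
```

Because the strong random drive keeps activity above the target, the rule steadily scales the recurrent weights down, keeping the driven dynamics in a usable operating range.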
550

Using a Character-Based Language Model for Caption Generation / Användning av teckenbaserad språkmodell för generering av bildtext

Keisala, Simon January 2019 (has links)
Using AI to automatically describe images is a challenging task. The aim of this study has been to compare character-based language models with one of the current state-of-the-art token-based language models, im2txt, for generating image captions, with a focus on morphological correctness. Previous work has shown that character-based language models can outperform token-based models in morphologically rich languages, and other studies show that simple multi-layered LSTM blocks can learn to replicate the syntax of their training data. To study the usability of character-based language models, an alternative model based on TensorFlow im2txt was created, changing the token-generation architecture to handle character-sized tokens instead of word-sized tokens. The results suggest that a character-based language model could outperform current token-based language models, although due to time and computing-power constraints this study fails to draw a clear conclusion. A problem with one of the methods, subsampling, is discussed: when the original method is applied to character-sized tokens, it removes individual characters (including special characters) instead of full words. To solve this issue, a two-phase approach is suggested, in which training data is first separated into word-sized tokens, on which subsampling is performed, and the remaining tokens are then split into character-sized tokens. Future work performing the modified subsampling and fine-tuning the hyperparameters is suggested to reach a clearer conclusion about the performance of character-based language models.
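The two-phase subsampling fix described above can be sketched directly: subsample at the word level using the usual word2vec-style drop probability, then split only the surviving words into character tokens. This is one plausible reading of the suggested approach, with an assumed threshold value, not the thesis's code.

```python
import math
import random
from collections import Counter

def two_phase_char_tokens(text, threshold=1e-3, seed=0):
    """Two-phase subsampling: drop frequent *words* with probability
    p = 1 - sqrt(threshold / freq) (word2vec-style), then split the
    surviving words into character-sized tokens."""
    rng = random.Random(seed)
    words = text.split()
    counts = Counter(words)
    total = len(words)
    kept = []
    for w in words:                              # phase 1: word-level drop
        freq = counts[w] / total
        p_drop = max(0.0, 1.0 - math.sqrt(threshold / freq))
        if rng.random() >= p_drop:
            kept.append(w)
    return [ch for w in kept for ch in w]        # phase 2: char tokens
```

Running subsampling before the character split is what prevents the method from deleting individual characters, which was the failure mode identified in the study.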
