Global ETD Search

21	Da computação paraconsistente a computação quantica [From paraconsistent computation to quantum computation] Agudelo, Juan Carlos Agudelo 05 August 2006 Advisor: Walter Alexandre Carnielli / Dissertation (master's) - Universidade Estadual de Campinas, Instituto de Filosofia e Ciências Humanas / Abstract: The interpretations of quantum mechanics raise serious philosophical questions about the nature of the physical world and the status of physical theories. Such interpretations play an important role in understanding models of quantum computing, and models of quantum computing in turn make it possible to confront and test the philosophical theses that dare to address such problems. Despite the promising pragmatic and technological applications of quantum computing, this work takes another path: it offers a new paradigm of computation based on the paraconsistency paradigm, and a new interpretation of quantum computing through this model, thereby contributing simultaneously to the philosophical discussion of the concepts of computability and of quantum mechanics. This work introduces the definition of what will be called the model of paraconsistent Turing machines (PTMs). This computational model is a generalization of the classical model of Turing machines. In the PTM model, unlike the classical Turing machine model, simultaneous execution of multiple instructions is allowed, giving rise to a multiplicity of symbols in different cells of the tape, a multiplicity of machine states and a multiplicity of machine positions.
Such multiplicity of configurations, though essential to PTMs, can be seen as incoherence with respect to classical Turing machines; to compensate for this, PTMs allow consistency and inconsistency conditions to be added to the execution of instructions in order to control the global state of incoherence of the system. After introducing the PTMs, the model of quantum Turing machines (QTMs) and the model of quantum circuits (QCs) are presented. These models are due to David Deutsch and are, respectively, generalizations of the model of classical Turing machines and of the model of classical Boolean circuits, using the laws of quantum mechanics. Finally, relations between the model of PTMs and models of quantum computing are established, which makes it possible to simulate simple quantum algorithms (a QC that solves the so-called Deutsch problem and a QC that solves the so-called Deutsch-Jozsa problem) by PTMs. It is also shown that quantum parallelism, an essential characteristic of quantum computation, may be simulated in some cases by PTMs.
Although the particular model of PTMs presented here has some restrictions in its capacity to simulate certain characteristics of quantum computing, the work opens the possibility of defining other PTM models that could simulate such characteristics. To sum up, the present work, by offering a new paradigm of computation (namely, paraconsistent computation) and a new interpretation of quantum computing (namely, the interpretation of quantum computation by means of paraconsistent computation), contributes to the philosophical discussion about the interpretation of quantum computation and possibly of quantum mechanics itself. No less important, although the work does not set out to address purely technical questions of computability theory, it opens a vast line of research on logic-relativized computation and its implications, in the present case relativized to paraconsistent logic. / Master's / Logic / Master in Philosophy
Turing machines; Computable functions; Logic (study and teaching); Logical circuits; Quantum computation
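The Deutsch problem mentioned in the abstract above asks whether a function f: {0,1} -> {0,1} is constant or balanced, and the quantum circuit settles it with a single oracle query. The following is a minimal sketch of that circuit as an ordinary classical state-vector simulation. It is not the paraconsistent Turing machine construction of the thesis, and the names `hadamard_on`, `oracle` and `deutsch` are hypothetical, chosen only for this illustration.

```python
import math


def hadamard_on(state, qubit):
    """Apply a Hadamard gate to one qubit of a 2-qubit state.

    The state is a list of 4 real amplitudes indexed by 2*x + y,
    so qubit 0 is x (high bit) and qubit 1 is y (low bit).
    """
    s = 1 / math.sqrt(2)
    shift = 1 if qubit == 0 else 0
    mask = 1 << shift
    out = [0.0] * 4
    for i, amp in enumerate(state):
        bit = (i >> shift) & 1
        # H|0> = (|0> + |1>)/sqrt(2), H|1> = (|0> - |1>)/sqrt(2)
        out[i & ~mask] += s * amp
        out[i | mask] += s * amp * (-1 if bit else 1)
    return out


def oracle(state, f):
    """Apply U_f: |x, y> -> |x, y XOR f(x)> (a permutation of basis states)."""
    out = [0.0] * 4
    for i, amp in enumerate(state):
        x, y = i >> 1, i & 1
        out[2 * x + (y ^ f(x))] += amp
    return out


def deutsch(f):
    """Classify f as 'constant' or 'balanced' with one oracle application."""
    state = [0.0, 1.0, 0.0, 0.0]      # start in |x=0, y=1>
    state = hadamard_on(state, 0)     # superpose both inputs x
    state = hadamard_on(state, 1)     # prepare y in (|0> - |1>)/sqrt(2)
    state = oracle(state, f)          # single query to f
    state = hadamard_on(state, 0)     # interfere the two branches
    p_x_is_one = state[2] ** 2 + state[3] ** 2
    return "balanced" if p_x_is_one > 0.5 else "constant"


if __name__ == "__main__":
    for name, f in [("zero", lambda x: 0), ("one", lambda x: 1),
                    ("identity", lambda x: x), ("negation", lambda x: 1 - x)]:
        print(name, deutsch(f))
```

The simulation stores all four amplitudes explicitly, so it only illustrates what the circuit computes; it says nothing about how a PTM would represent the same superposition.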
22	Classical and quantum computing. Hardy, Yorick 29 May 2008 Prof. W.H. Steeb
recursion theory; coding theory; turing machines; neural networks; quantum computers; quantum theory; algorithms; boolean algebra; data encryption (computer science)
23	Systémy převodníků / Transducer Systems Skácel, Jiří January 2016 This document defines systems of pushdown transducers. The idea of cooperating distributed grammar systems, whose components work on a single word, is adapted to use transducers instead of grammars. The transducers cooperate by passing the output of one component to the input of another. The document discusses their descriptive power and the equivalence between systems with arbitrary numbers of components. The main conclusion is a comparison of their descriptive power with that of Turing machines, with regard to the translations they define and the languages they accept.
24	Teaching Formal Languages through Visualizations, Machine Simulations, Auto-Graded Exercises, and Programmed Instruction Mohammed, Mostafa Kamel Osman 14 July 2021 The material taught in a Formal Languages course is mathematical in nature and requires students to practice proofs and algorithms to understand the content. Traditional Formal Languages textbooks are heavy on prose, and homework typically consists of solving many paper exercises. Some instructors make use of finite state machine simulators like the JFLAP package. JFLAP helps students by allowing them to build models and apply various algorithms to these models, which improves student interaction with the studied material. However, students still need to read a significant amount of text and practice problems by hand to achieve understanding. Inspired by the principles of the Programmed Instruction (PI) teaching method, we seek to develop a new Formal Languages eTextbook capable of conveying these concepts more intuitively. The PI approach has students read a little, ideally a sentence or a paragraph, and then answer a question or complete an exercise related to that information. Based on their response, students either continue to other information frames or retry the exercise. Our goal is to present all algorithms using algorithm visualizations and to produce proficiency exercises that let students demonstrate understanding. To evaluate the pedagogical effectiveness of our new eTextbook, we conduct time and performance evaluations across two offerings of the course CS4114 Formal Languages and Automata. In the time evaluation, the time students spend looking at instructional content with text and visualizations is compared with the time spent on PI frames to determine levels of student engagement. In the performance evaluation, students' grades are compared to assess learning gains with text and paper exercises only, with text and visualizations with exercises, and with PI frames.
/ Doctor of Philosophy / Theory textbooks in computer science are hard to read and understand. Traditionally, instructors use books that are heavy on mathematical prose and paper exercises. Sometimes, instructors use simulators to allow students to create, simulate, and test models. Previously, we found that students tend to skip reading the text presented in the books. This leads to less understanding of the topics taught in the course. To increase student engagement, we developed a new eTextbook for the Formal Languages course. We used pedagogy based on Programmed Instruction, presenting the content as short pieces of prose, each followed by a related question. If students answer the question correctly, they have understood the content and are ready to move forward. To help both instructors and students, we developed a new Formal Languages simulator named OpenFLAP. OpenFLAP allows instructors to create many exercises and can grade these exercises automatically.
Formal Languages; Programmed Instruction; visualizations; pedagogical evaluation; automated assessment; Turing Machines; JFLAP; OpenFLAP; Engagement Taxonomy; Auto-graded exercises; Proficiency exercises
25	Construction de liens entre algorithmique et logique par du calcul à temps infini / From algorithmics to logic through infinite time computations Ouazzani, Sabrina 02 December 2016 This thesis is centred on the study of infinite time computations. Infinite time here means a time axis indexed by ordinals, which are convenient objects along which to count. The infinite time Turing machine model was introduced by Hamkins and Lewis in 2000. It generalises the Turing machine computation model to ordinal time: stages are indexed by (countable) ordinals, while the tape is still indexed by the natural numbers as in the classical model. The inputs of the model are therefore infinite strings. The main focus of this thesis is the study of some of the new and surprising behaviours that these machines exhibit. Naturally, the longer the computations run (that is, the greater the ordinal), the more powerful the model is, in the sense that it decides more sets. When computations run beyond certain ordinal times, new structural properties also appear. One of these is the existence of gaps in the halting times of the machines. When these gaps were discovered, first links were established between them and the admissibility of the ordinals that start them. Our approach uses algorithmics as a means to make precise the links between the logical properties of the ordinals involved and the computational properties of infinite time Turing machines.
Moreover, thanks to specific algorithms, we discover and prove new properties of these gaps, leading to a better understanding of their structure. We show in particular that gaps can have every writable (limit) ordinal size, and that there are gaps whose length is greater than or equal to their starting ordinal. Up to the first such gap, the gaps appear in increasing order of size. We also show that although, before this special gap, the admissible ordinals are exactly the ordinals that begin gaps, beyond that point the gap structure becomes quite disordered, with admissible ordinals appearing not only at the beginning of but also inside some (huge) gaps.
Computation models; Infinite time computations; Turing machines; Clockable ordinals; Writable ordinals; Admissible ordinals
26	Indução finita, deduções e máquina de Turing / Finite induction, deductions and Turing machine Almeida, João Paulo da Cruz [UNESP] 29 June 2017 This work presents a proposal for the teaching and practice of formal deductive thinking in Mathematics. Three essentially interconnected themes are presented within the set of natural numbers: induction/well-ordering, deduction, and computation schemes represented by the theoretical Turing machine. The three themes come together in the logical theory of deduction and touch upon the foundations of Mathematics, its own undecidability, and the extent and limits of what can be deduced using Aristotle's logic, a path used so deeply in the works of Gödel, Church, Turing, Robinson and others. Numerous deduction schemes are presented for the "formulas" and theorems that pervade elementary and basic education, in appropriate language intended to train students (and teachers) in an approach more proper to Mathematics.
Natural numbers; Peano's axioms; Induction; First element of the natural numbers; Turing machines; Turing-Church thesis
27	Aspects of algorithms and dynamics of cellular paradigms Pazienza, Giovanni Egidio 15 December 2008 Cellular paradigms, such as Cellular Neural Networks (CNNs) and Cellular Automata (CA), are an excellent tool for computation, since they are equivalent to a universal Turing machine. The introduction of the Cellular Neural Network - Universal Machine (CNN-UM) made it possible to develop hardware whose computational core works according to the principles of cellular paradigms; such hardware has found application in a number of fields over the last decade. Nevertheless, there are still many open questions about how to define algorithms for a CNN-UM and how to study the dynamics of Cellular Automata. This dissertation tackles both problems: first, we prove that it is possible to bound the space of all CNN-UM algorithms and to explore it through genetic techniques; second, we explain the fundamentals of the nonlinear perspective on CA (according to Chua's definition) and illustrate how this technique has allowed us to find novel results.
turing machines; cellular automata; cellular paradigms; genetic programming; cellular neural networks; 621.3
28	Games and Probabilistic Infinite-State Systems Sandberg, Sven January 2007 Computer programs keep finding their way into new safety-critical applications, while at the same time growing more complex. This calls for new and better methods to verify the correctness of software. We focus on one approach to verifying systems, namely that of model checking. First, we investigate two categories of problems related to model checking: games and stochastic infinite-state systems. In the end, we join these two lines of research by studying stochastic infinite-state games.
Game theory has been used in verification for a long time. We focus on finite-state 2-player parity and limit-average (mean payoff) games. These problems have applications in model checking for the μ-calculus, one of the most expressive logics for programs. We give a simplified proof of memoryless determinacy. The proof applies both to parity and limit-average games. Moreover, we suggest a strategy improvement algorithm for limit-average games. The algorithm is discrete and strongly subexponential.
We also consider probabilistic infinite-state systems (Markov chains) induced by three types of models. Lossy channel systems (LCS) have been used to model processes that communicate over an unreliable medium. Petri nets model systems with unboundedly many parallel processes. Noisy Turing machines can model computers where the memory may be corrupted in a stochastic manner. We introduce the notion of eagerness and prove that all these systems are eager. We give a scheme to approximate the value of a reward function defined on paths. Eagerness allows us to prove that the scheme terminates. For probabilistic LCS, we also give an algorithm that approximates the limit-average reward. This quantity describes the long-run behavior of the system.
Finally, we investigate Büchi games on probabilistic LCS. Such games can be used to model a malicious cracker trying to break a network protocol. We give an algorithm to solve these games.
program verification; model checking; determinacy; strategy improvement; infinite games; parity games; mean payoff games; stochastic games; Büchi games; reachability games; limiting average; limiting behavior; Markov chains; infinite-state systems; lossy channel systems; vector addition systems; noisy Turing machines; Computer science; Datavetenskap
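The limit-average (mean payoff) reward mentioned above is the long-run average of the rewards along an infinite path. For a path that eventually repeats a finite cycle forever, that limit is simply the average reward on the cycle, and the finite prefix does not matter. The sketch below illustrates only this definition for a deterministic, eventually periodic path. It is not one of the algorithms from the thesis, and the function names `mean_payoff` and `running_average` are hypothetical.

```python
from fractions import Fraction


def mean_payoff(prefix, cycle):
    """Limit-average reward of the infinite path: prefix, then cycle repeated forever.

    The prefix contributes a bounded amount, so only the cycle average survives
    in the limit.
    """
    if not cycle:
        raise ValueError("the repeated cycle must contain at least one reward")
    return Fraction(sum(cycle), len(cycle))


def running_average(prefix, cycle, n):
    """Average of the first n rewards along the same path (n >= 1)."""
    total = 0
    for i in range(n):
        if i < len(prefix):
            total += prefix[i]
        else:
            total += cycle[(i - len(prefix)) % len(cycle)]
    return Fraction(total, n)


if __name__ == "__main__":
    prefix, cycle = [10, -3], [1, 2, 3]
    print("limit:", mean_payoff(prefix, cycle))
    for n in (5, 50, 500, 5000):
        print(n, float(running_average(prefix, cycle, n)))
```

Exact fractions are used so that the finite averages can be compared with the limit without floating-point noise.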
30	On challenges in training recurrent neural networks Anbil Parthipan, Sarath Chandar 11 1900 In a multi-step prediction problem, the prediction at each time step can depend on the input at any previous time step, however far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, because the magnitude of the gradients can grow or shrink exponentially with the length of the sequence (the problem of exploding and vanishing gradients), they can only model short-term dependencies. This thesis explores the problem of vanishing gradients in recurrent neural networks and proposes novel solutions to it. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients vanish in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning. In the lifelong learning setting, the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across tasks.
This chapter discusses two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation. It further proposes a solution to both problems for training a recurrent neural network.
Recurrent Neural Networks; Long-term Dependencies; Non-saturating Recurrent Units; Memory Augmented Neural Networks; Neural Turing Machines; LSTMs; Skip connections; lifelong learning; catastrophic forgetting; capacity saturation
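The abstract above attributes the difficulty of learning long-term dependencies to gradients that shrink or grow exponentially with sequence length. A toy scalar recurrence makes this concrete: the gradient of the final state with respect to the initial state is a product of one factor per time step. The sketch below assumes a single recurrent weight `w` and no input, and is only an illustration. It is not the NRU or the memory-augmented architecture proposed in the thesis, and `gradient_through_time` is a hypothetical name.

```python
import math


def gradient_through_time(w, steps, h0=0.1, activation="tanh"):
    """Return d h_T / d h_0 for the scalar recurrence h_t = act(w * h_{t-1}).

    By the chain rule the gradient is the product over time steps of
    w * act'(w * h_{t-1}). With tanh, act' = 1 - h_t**2 <= 1, so |w| < 1
    forces exponential decay; with a linear activation the gradient is
    exactly w**steps and explodes when |w| > 1.
    """
    h, grad = h0, 1.0
    for _ in range(steps):
        pre = w * h
        if activation == "tanh":
            h = math.tanh(pre)
            local = 1.0 - h * h      # derivative of tanh at pre
        else:
            h = pre                  # linear (identity) activation
            local = 1.0
        grad *= w * local
    return grad


if __name__ == "__main__":
    for steps in (1, 10, 50):
        print(steps,
              gradient_through_time(0.5, steps),
              gradient_through_time(1.5, steps, activation="linear"))
```

In this toy setting the saturating activation can only shrink each factor, which is in line with the abstract's motivation for non-saturating units and additive updates, although the real architectures are vector-valued and considerably more involved.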
