  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

A Dual Pathway Approach for Solving the Spatial Credit Assignment Problem in a Biological Way

Connor, Patrick 01 November 2013
To survive, many biological organisms need to accurately infer which features of their environment predict future rewards and punishments. In machine learning terms, this is the problem of spatial credit assignment, for which many supervised learning algorithms have been developed. In this thesis, I mainly propose that a dual-pathway, regression-like strategy and associated biological implementations may be used to solve this problem. Using David Marr's (1982) three-level philosophy of computational neuroscience, the thesis and its contributions are organized as follows:
- Computational Level: Here, the spatial credit assignment problem is formally defined and modeled using probability density functions. The specific challenges of the problem faced by organisms and machine learning algorithms alike are also identified.
- Algorithmic Level: I present and evaluate the novel hypothesis that the general strategy used by animals is to perform a regression over past experiences. I also introduce an extension of a probabilistic model for regression that substantially improves generalization without resorting to regularization. This approach subdues residual associations to irrelevant features, as does regularization.
- Physical Level: Here, the neuroscience of classical conditioning and of the basal ganglia is briefly reviewed. Then, two novel models of the basal ganglia are put forward: 1) an online-learning model that supports the regression hypothesis and 2) a biological implementation of the probabilistic model previously introduced. Finally, we compare these models to others in the literature.
In short, this thesis establishes a theoretical framework for studying the spatial credit assignment problem, offers a simple hypothesis for how biological systems solve it, and implements basal ganglia-based algorithms in support. The thesis brings to light novel approaches for machine learning and several explanations for biological structures and classical conditioning phenomena. / Note: While the thesis contains content from two articles (one journal, one conference), their publishers do not require special permission for their use in dissertations (information confirming this is in an appendix of the thesis itself).
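The abstract's idea of "a regression over past experiences" can be illustrated with a generic online delta-rule (least-mean-squares) regression. This is only a minimal sketch under stated assumptions: it is not the thesis's dual-pathway model or its basal ganglia implementation, and the function names, the two binary features, and the learning rate are all hypothetical choices made for illustration.

```python
import random

def delta_rule_regression(trials, n_features, learning_rate=0.1):
    """Online least-mean-squares regression with one weight per feature.

    Assumption: a generic delta rule, used here only to illustrate
    regression-based credit assignment; not the thesis's model.
    """
    weights = [0.0] * n_features
    for features, reward in trials:
        prediction = sum(w * x for w, x in zip(weights, features))
        error = reward - prediction
        # Each feature present on this trial shares in the prediction error.
        for i, x in enumerate(features):
            weights[i] += learning_rate * error * x
    return weights

def make_trials(n_trials, seed=0):
    """Hypothetical environment: feature 0 predicts reward, feature 1 is irrelevant."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        predictive = rng.randint(0, 1)
        irrelevant = rng.randint(0, 1)
        trials.append(((predictive, irrelevant), float(predictive)))
    return trials

weights = delta_rule_regression(make_trials(2000), n_features=2)
print(weights)  # weight 0 should end near 1, weight 1 near 0
```

In this toy setting the irrelevant feature initially picks up some association whenever it co-occurs with reward, and further experience drives that residual weight back toward zero.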
2

Understanding Generalization, Credit Assignment and the Regulation of Learning Rate in Human Motor Learning

Gonzalez Castro, Luis Nicolas January 2011
Understanding the neural processes underlying motor learning in humans is important to facilitate the acquisition of new motor skills and to aid the relearning of skills lost after neurologic injury. Although it is known that the learning of a new movement is guided by the error feedback received after each repeated attempt to produce the movement, how the central nervous system (CNS) processes individual errors and how it modulates its learning rate in response to the history of errors experienced remain to be elucidated. To address these issues, we studied the generalization of learning and of learning decay, that is, the transfer of what has been learned, or unlearned, in a particular movement condition to new movement conditions. Generalization offers a window into the process of error credit assignment during motor learning, since it allows us to measure which actions benefit the most in terms of learning after experiencing an error. We found that the distributions that describe generalization after learning are unimodal and biased towards the motion directions experienced during training, a finding that suggests that the credit for the learning experienced after a particular trial is assigned to the actual motion (motion-referenced learning) and not to the planned motion (plan-referenced learning), as had previously been assumed in the motor learning literature. In addition, after training the same action along multiple directions, we found that the pattern of learning decay has two distinct components: one that is time-dependent and affects all trained directions, and one that is trial-dependent and affects mostly the direction where decay was induced, generalizing narrowly with a unimodal pattern similar to the one observed for learning generalization. Finally, we studied the effect that the consistency of the error perturbations in the training environment has on the learning rate adopted by the CNS. We found that the learning rate increases when the perturbations experienced in training are consistent, and decreases when these perturbations are inconsistent. Besides increasing our understanding of the mechanisms underlying motor learning, the findings described in the present dissertation will enable the principled design of skill training and rehabilitation protocols that accelerate learning. / Engineering and Applied Sciences
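To make the notion of a trial-by-trial "learning rate" concrete, the sketch below uses a generic single-state error-correction model of adaptation, in which the internal estimate is partly retained and partly corrected by the error on each trial. This is an illustrative assumption, not the model used in the dissertation: the names `retention` and `learning_rate` and all parameter values are hypothetical.

```python
def simulate_adaptation(perturbations, retention=0.99, learning_rate=0.2):
    """Single-state error-correction model (illustrative assumption).

    On each trial the error is the perturbation minus the current estimate;
    the next estimate is retention * estimate + learning_rate * error.
    Returns the estimate held at the start of each trial.
    """
    state = 0.0
    states = []
    for perturbation in perturbations:
        states.append(state)
        error = perturbation - state
        state = retention * state + learning_rate * error
    return states

# A consistent (constant) perturbation, adapted to with two hypothetical learning rates.
constant_perturbation = [1.0] * 50
fast = simulate_adaptation(constant_perturbation, learning_rate=0.2)
slow = simulate_adaptation(constant_perturbation, learning_rate=0.05)
print(fast[10], slow[10])  # the higher learning rate has compensated more by trial 10
```

In this sketch the learning rate is a fixed parameter. The abstract reports that the CNS adjusts it according to how consistent the perturbations are, which this model does not attempt to capture.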
3

Captação de recursos por empresas em recuperação judicial e Fundos de Investimento em Direitos Creditórios (FIDC) / Fund-raising by companies in judiciary reorganization and Receivables Funds

Trovo, Beatriz Villas Boas Pimentel 29 May 2013
This study examines, from the perspective of Brazilian law, how viable companies in crisis can raise funds through the capital markets during judicial reorganization proceedings, specifically by assigning receivables to Receivables Investment Funds (Fundos de Investimento em Direitos Creditórios, FIDCs). In some cases, FIDCs can serve as a recurring source of funding at considerably lower cost than that charged by financial institutions. However, many precautions must be taken when assigning receivables to FIDCs in order to ensure security and transparency for investors and for the creditors of the companies under reorganization.
4

Captação de recursos por empresas em recuperação judicial e Fundos de Investimento em Direitos Creditórios (FIDC) / Fund-raising by companies in judiciary reorganization and Receivables Funds

Beatriz Villas Boas Pimentel Trovo 29 May 2013
This study examines, from the perspective of Brazilian law, how viable companies in crisis can raise funds through the capital markets during judicial reorganization proceedings, specifically by assigning receivables to Receivables Investment Funds (Fundos de Investimento em Direitos Creditórios, FIDCs). In some cases, FIDCs can serve as a recurring source of funding at considerably lower cost than that charged by financial institutions. However, many precautions must be taken when assigning receivables to FIDCs in order to ensure security and transparency for investors and for the creditors of the companies under reorganization.
5

Parsimonious reasoning in reinforcement learning for better credit assignment

Ma, Michel 08 1900
This thesis explores the question of long-term credit assignment in reinforcement learning from the perspective of a parsimony inductive bias. In this context, a parsimonious agent seeks to understand its environment using as few variables as possible. Put differently, given some credit or blame for some behavior, parsimony forces the agent to assign this credit (or blame) to only a select few latent variables. Before proposing novel methods for parsimonious credit assignment, we introduce previous work on long-term credit assignment in relation to the idea of sparsity. Then, we develop two new ideas for credit assignment in reinforcement learning that are motivated by parsimonious reasoning: one in the model-free setting, and one for model-based learning. To do so, we build upon various parsimony-related concepts from causality, supervised learning, and simulation, and apply them to the Markov Decision Process framework. The first, called counterfactual policy evaluation, considers minor deviations of what could have been given what has been. By restricting the space in which the agent can reason about alternatives, counterfactual policy evaluation is shown to have favorable variance properties for policy evaluation. Counterfactual policy evaluation also offers a new perspective on hindsight, generalizing previous work in hindsight credit assignment. The second contribution of this thesis is a latent-attention-augmented algorithm for model-based reinforcement learning: Latent Sparse Attentive Value Gradients (LSAVG). By fully integrating attention into the structure for policy optimization, we show that LSAVG is able to solve active memory tasks that its model-free counterpart was designed to tackle, without resorting to heuristics or biasing the original estimator.
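For readers unfamiliar with long-term credit assignment, the textbook discounted return shows the baseline way a delayed reward is credited to earlier time steps in a Markov Decision Process. The sketch below is that standard computation only. It is not counterfactual policy evaluation or LSAVG, and the function name, the toy episode, and the discount value are hypothetical.

```python
def discounted_returns(rewards, discount=0.9):
    """Return G_t = r_t + discount * G_{t+1} for every step of one episode.

    Assumption: the standard discounted return, shown only as background
    for the credit assignment problem; not a method from the thesis.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Sweep backwards so each step accumulates all later rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount * running
        returns[t] = running
    return returns

# Hypothetical episode: a single reward arrives only at the final step.
episode = [0.0, 0.0, 0.0, 1.0]
print(discounted_returns(episode))
```

Every earlier step receives some credit for the final reward, shrinking geometrically with distance. A parsimonious approach, as described in the abstract, would instead aim to concentrate that credit on a small number of relevant variables.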
