About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Individual differences in structure learning

Newlin, Philip 13 May 2022 (has links)
Humans have a tendency to impute structure spontaneously even in simple learning tasks; however, the way they approach structure learning can vary drastically. The present study sought to determine why individuals learn structure differently. One hypothesized explanation for differences in structure learning is individual differences in cognitive control. Cognitive control allows individuals to maintain representations of a task and may interact with reinforcement learning systems. It was expected that individual differences in the propensity to apply cognitive control, which shares component processes with hierarchical reinforcement learning, might explain how individuals learn structure differently in a simple structure learning task. Results showed that proactive control and model-based control explained differences in the rate at which individuals applied structure learning.
12

Hierarchical reinforcement learning for spoken dialogue systems

Cuayáhuitl, Heriberto January 2009 (has links)
This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs) and proposes two hierarchical reinforcement learning methods to optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, where every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first by constraining every SMDP in the hierarchy with prior expert knowledge; for this, a learning algorithm called 'HAM+HSMQ-Learning' is proposed, which combines two existing algorithms from the hierarchical reinforcement learning literature. Whilst the first method generates fully-learnt behaviour, the second generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed in simulated and real environments based on a travel planning spoken dialogue system. Experimental results provided evidence to support the following claims. First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative (because of their higher overall performance) to hand-coded or fully-learnt dialogue behaviours. Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi-)automatic design of adaptive behaviours in larger-scale spoken dialogue systems. This research makes the following contributions to spoken dialogue systems that learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for simultaneously integrating hand-coded and learnt spoken dialogue behaviours into a single learning framework. Third, an evaluation with real users of hierarchical reinforcement learning dialogue agents was essential to validate their effectiveness in a realistic environment.
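As editorial context: the core idea of the thesis, learning each sub-dialogue with its own SMDP that sees only the state and actions relevant to it, can be illustrated with a minimal sketch in the spirit of HSMQ-Learning. This is not the thesis's HAM+HSMQ-Learning algorithm; the slot names, toy simulator, reward scheme, and hyperparameters below are all illustrative assumptions.

```python
import random
from collections import defaultdict

GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1   # illustrative hyperparameters

def name_of(a):
    return a.name if isinstance(a, Subtask) else a

class Subtask:
    """One SMDP in the hierarchy, seeing only the actions (primitive
    or child Subtask) relevant to its sub-dialogue."""
    def __init__(self, name, actions):
        self.name, self.actions = name, actions
        self.q = defaultdict(float)          # Q(state, action name)

    def choose(self, state):
        if random.random() < EPS:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, name_of(a))])

class SlotEnv:
    """Toy stand-in for the thesis's travel-planning simulator: asking for
    an unfilled slot succeeds 80% of the time (a crude recognition error rate)."""
    def done(self, task, state):
        if task.name == "root":
            return state == frozenset({"origin", "destination"})
        return task.name.removeprefix("get_") in state

    def step(self, state, action):
        slot = action.removeprefix("ask_")
        if slot not in state and random.random() < 0.8:
            return 0.0, state | {slot}
        return -1.0, state                   # wasted dialogue turn

def hsmq(task, env, state):
    """HSMQ-style recursion: a child subtask runs to termination, and the
    parent books its cumulative reward and duration as one temporally
    extended SMDP transition."""
    total, steps = 0.0, 0
    while not env.done(task, state):
        a = task.choose(state)
        if isinstance(a, Subtask):
            r, state2, k = hsmq(a, env, state)   # composite action
            if k == 0:
                r = -0.5   # discourage re-invoking a finished sub-dialogue
        else:
            r, state2 = env.step(state, a)       # primitive action
            k = 1
        best = max(task.q[(state2, name_of(b))] for b in task.actions)
        key = (state, name_of(a))
        task.q[key] += ALPHA * (r + GAMMA ** max(k, 1) * best - task.q[key])
        total, steps, state = total + GAMMA ** steps * r, steps + k, state2
    return total, state, steps

root = Subtask("root", [Subtask("get_origin", ["ask_origin"]),
                        Subtask("get_destination", ["ask_destination"])])
for _ in range(2000):
    hsmq(root, SlotEnv(), frozenset())
```

The call-and-return recursion is what makes the approach scale: each Subtask's table ranges only over its own restricted state, so adding slots grows the tables of the affected sub-dialogues rather than one monolithic table.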
13

A Learning-based Control Architecture for Socially Assistive Robots Providing Cognitive Interventions

Chan, Jeanie 05 December 2011 (has links)
Due to the world’s rapidly growing elderly population, dementia is becoming increasingly prevalent. This poses considerable health, social, and economic concerns as it impacts individuals, families and healthcare systems. Current research has shown that cognitive interventions may slow the decline of or improve brain functioning in older adults. This research investigates the use of intelligent socially assistive robots to engage individuals in person-centered cognitively stimulating activities. Specifically, in this thesis, a novel learning-based control architecture is developed to enable socially assistive robots to act as social motivators during an activity. A hierarchical reinforcement learning approach is used in the architecture so that the robot can learn appropriate assistive behaviours based on activity structure and personalize an interaction based on the individual’s behaviour and user state. Experiments show that the control architecture is effective in determining the robot’s optimal assistive behaviours for a memory game interaction and a meal assistance scenario.
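A minimal sketch of the behaviour-selection layer such an architecture implies: a value table indexed by activity step and observed user state, updated from interaction rewards. The state and behaviour inventories below are hypothetical, and the thesis's hierarchical decomposition is collapsed here into a single activity-step index for brevity.

```python
import random
from collections import defaultdict

# Hypothetical discretizations; the thesis's actual user states and
# assistive behaviours differ.
USER_STATES = ("engaged", "confused", "inactive")
BEHAVIOURS = ("wait", "verbal_prompt", "demonstrate", "celebrate")
ALPHA, GAMMA, EPS = 0.2, 0.9, 0.1

Q = defaultdict(float)   # Q[(activity_step, user_state, behaviour)]

def select_behaviour(step, user_state):
    """Epsilon-greedy choice of assistive behaviour for the current
    activity step and observed user state."""
    if random.random() < EPS:
        return random.choice(BEHAVIOURS)
    return max(BEHAVIOURS, key=lambda b: Q[(step, user_state, b)])

def update(step, user_state, behaviour, reward, next_step, next_state):
    """One-step Q-learning update; in the real interaction the reward
    would reflect user compliance and progress through the activity."""
    best = max(Q[(next_step, next_state, b)] for b in BEHAVIOURS)
    key = (step, user_state, behaviour)
    Q[key] += ALPHA * (reward + GAMMA * best - Q[key])

# Example interaction step.
b = select_behaviour(0, "confused")
update(0, "confused", b, reward=1.0, next_step=1, next_state="engaged")
```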
15

Evaluating behaviour tree integration in the option critic framework in Starcraft 2 mini-games with training restricted by consumer level hardware

Lundberg, Fredrik January 2022 (has links)
This thesis investigates the performance of the option critic (OC) framework combined with behaviour trees (BTs) in Starcraft 2 mini-games when training time is restricted to what consumer-level hardware allows. We test two such combination models, BTs as macro actions (OCBT) and BTs as options (OCBToptions), and measure their performance relative to the plain OC model through an ablation study. The tests were conducted in two of the mini-games, build marines (BM) and defeat zerglings and banelings (DZAB), and a set of metrics, including game score, was collected. We find that BTs improve performance in the BM mini-game under both OCBT and OCBToptions, while in DZAB the models performed equally. Additionally, the results indicate that the improvement in BM scores does not stem solely from the complexity of the BTs but from the OC model learning to use the BTs effectively and learning beneficial options alongside the BT options. It is therefore concluded that BTs can improve performance when training time is limited by consumer-level hardware.
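To make the two integration points concrete, here is a minimal sketch of a behaviour tree slotted into option-critic execution as an option: the tree's tick supplies the intra-option policy, and the tree's completion supplies the termination condition. The toy environment, tree, and class names are assumptions for illustration, not the thesis's code.

```python
import random

class BTOption:
    """A behaviour tree wrapped as an option (the OCBToptions idea as we
    read it): tick() is the intra-option policy; the option terminates
    when the tree finishes."""
    def __init__(self, tree):
        self.tree = tree
    def act(self, obs):
        return self.tree.tick(obs)
    def terminates(self, obs):
        return self.tree.status(obs) in ("SUCCESS", "FAILURE")

class LearnedOption:
    """An ordinary option-critic option: learned intra-option policy and
    learned termination probability (both stubbed here)."""
    def __init__(self, policy, beta):
        self.policy, self.beta = policy, beta
    def act(self, obs):
        return self.policy(obs)
    def terminates(self, obs):
        return random.random() < self.beta(obs)

def run_episode(env, options, policy_over_options):
    """Call-and-return execution: the policy over options picks an option,
    which then acts until its termination condition fires. The option-critic
    updates (critic, intra-option policies, terminations) are omitted."""
    obs, done, current = env.reset(), False, None
    while not done:
        if current is None or current.terminates(obs):
            current = policy_over_options(obs)   # critic-guided in real OC
        obs, reward, done = env.step(current.act(obs))  # reward feeds updates

# Toy stand-ins so the sketch runs end to end.
class ToyEnv:
    def reset(self): self.t = 0; return 0
    def step(self, action): self.t += 1; return self.t, 0.0, self.t >= 10

class ToyTree:
    """Stands in for a hand-authored StarCraft II tree (e.g. 'train marines')."""
    def tick(self, obs): return "noop"
    def status(self, obs): return "SUCCESS" if obs >= 5 else "RUNNING"

options = [BTOption(ToyTree()),
           LearnedOption(lambda obs: "noop", lambda obs: 0.1)]
run_episode(ToyEnv(), options, lambda obs: random.choice(options))
```

The OCBT variant would instead expose each tree as a single temporally extended primitive inside an option's action set, rather than as an option of its own.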
16

Biased Exploration in Offline Hierarchical Reinforcement Learning

Miller, Eric D. 26 January 2021 (has links)
No description available.
17

Model-Free Reinforcement Learning for Hierarchical OO-MDPs

Goldblatt, John Dallan 23 May 2022 (has links)
No description available.
18

A hierarchical neural network approach to learning sensor planning and control

Löfwenberg, Nicke January 2023 (has links)
The ability to search the environment is one of the most fundamental skills for any living creature, and visual search in particular is common to almost all animals. This act of searching is generally active in nature: vision does not simply react to incoming stimuli but also actively searches the environment for potential stimuli (for example, by moving the head or eyes). Automatic visual search, likewise, is a crucial and powerful tool in a wide variety of fields. However, performing such an active search is a nontrivial problem for many machine learning approaches. The added complexity of choosing which area to observe, as well as the common case of a camera with an adaptive field of view, further complicates the problem. Hierarchical reinforcement learning has in recent years proven to be a particularly powerful means of solving hard machine learning problems through a divide-and-conquer methodology, where one highly complex task is broken down into smaller sub-tasks that on their own may be more easily learnable. In this thesis, we present a hierarchical reinforcement learning system for solving a visual search problem in a stationary camera environment with adjustable pan, tilt and field-of-view capabilities. The hierarchical model also incorporates non-reinforcement-learning agents in its workflow, to better exploit the strengths of different agents and form a more powerful overall model. The model is compared to a non-hierarchical baseline as well as to several learning-free approaches.
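A minimal sketch of the control loop such a system implies: a learned high-level agent proposes pan/tilt/field-of-view commands, while a separate non-learned detector scores each resulting view and its confidence doubles as the reward. All class names and the toy dynamics are assumptions for illustration.

```python
import random

class ToyCamera:
    """Stand-in for the stationary pan/tilt/zoom camera; observe() would
    return an image crop in the real system, not the pose itself."""
    def __init__(self):
        self.pose = (0, 0, 60)            # pan, tilt, field of view
    def move(self, pan, tilt, fov):
        self.pose = (pan, tilt, fov)
    def observe(self):
        return self.pose

def detector(view):
    """Stand-in for the non-RL component (e.g. a pretrained object
    detector) that scores the current view: high confidence only when
    pointed at the hidden target with a narrow field of view."""
    pan, tilt, fov = view
    if (pan, tilt) == (2, 1):             # hidden target direction
        return 0.95 if fov == 10 else 0.5
    return random.uniform(0.0, 0.2)

class RandomSearchAgent:
    """Placeholder for the learned high-level policy: proposes where to
    look and how narrow to zoom; a real agent would train on the rewards."""
    def act(self, obs):
        return (random.randint(-2, 2), random.randint(-1, 1),
                random.choice((60, 30, 10)))
    def learn(self, reward, obs):
        pass                              # learning update omitted

def search(camera, agent, max_steps=200, threshold=0.9):
    """Active search loop: the agent moves the camera, the detector scores
    the view, and the detector's confidence serves as the reward."""
    obs = camera.observe()
    for _ in range(max_steps):
        camera.move(*agent.act(obs))
        obs = camera.observe()
        confidence = detector(obs)
        agent.learn(reward=confidence, obs=obs)
        if confidence > threshold:
            return obs                    # target found
    return None                           # search budget exhausted

print(search(ToyCamera(), RandomSearchAgent()))
```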
19

Learning competitive ensemble of information-constrained primitives

Sodhani, Shagun 07 1900 (has links)
No description available.
20

Lifelong learning of concepts in CRAFT

Vasishta, Nithin Venkatesh 08 1900 (has links)
Planning at higher levels of abstraction is critical when it comes to solving long-horizon tasks with hierarchical complexities. To plan successfully at a given level of abstraction, an agent must understand how the environment functions at that particular level. This understanding may be implicit in terms of policies, value functions, and world models, or it can be defined explicitly. In this work, we introduce concepts as a means to explicitly represent and accumulate information about the environment. Concepts are defined in terms of a state transition and the conditions required for that transition to take place. The simplicity of this definition offers flexibility and control over the learning process. Since concepts are highly interpretable in nature, it is easy to encode prior knowledge and intervene during the learning process if necessary. This definition also makes it relatively straightforward to transfer concepts across different domains wherever applicable. Concepts, at a given level of abstraction, are intricately linked to skills, or temporally abstracted actions: every state transition significant enough to be represented by a concept occurs only after the successful execution of a skill. Exploiting this relationship, we introduce a framework that aids in lifelong learning and refining of concepts across different levels of abstraction. The framework has three components. System 1 segments a stream of experience (e.g. a demonstration) into a sequence of skills; this segmentation can be done at different levels of abstraction. System 2 analyses these segments to refine and upgrade the set of concepts, whenever applicable. System 3 utilises the available concepts to generate a sub-task dependency graph, which can be used for planning at different levels of abstraction. We demonstrate the applicability of this framework in the 2D hierarchical environment CRAFT. We perform experiments to explore how concepts can be learned from different streams of experience, and how the quality of the concept base affects the optimality of the overall plan. In tasks with complex sub-task dependencies, where most algorithms fail to generalise or take an impractical amount of time to converge, we demonstrate that concepts can be used to significantly simplify planning. The framework can also be used to understand the intention of a given demonstration in terms of concepts, which makes it easy for the agent to replicate the demonstration in different environments. We show that this method of imitation is much more robust to changes in environment configuration than traditional methods. In our problem formulation, we make two assumptions: 1) that we have access to a sufficiently exhaustive set of skills, and 2) that our agent has access to practice environments, which can be used to refine concepts when needed. The objective behind this work is to explore the practicality of learning concepts as a means to improve an agent's understanding of its environment. Overall, we demonstrate that learning concepts can be a lightweight yet efficient way to increase the capability of a system.
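A minimal sketch of the concept representation and the System-3-style dependency graph, following the abstract's definition of a concept as a state transition plus its preconditions. The CRAFT-like item names are illustrative assumptions, not the thesis's exact inventory.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """A concept as the abstract defines it: a state transition (its
    effect) plus the conditions required for the transition to occur."""
    effect: str                 # the state change, e.g. "have_plank"
    preconditions: frozenset    # what must hold first

def dependency_graph(concepts):
    """System-3-style construction: link each concept to the concepts
    whose effects satisfy its preconditions."""
    by_effect = {c.effect: c for c in concepts}
    return {c.effect: [p for p in c.preconditions if p in by_effect]
            for c in concepts}

def plan(goal, graph, have=frozenset(), seen=None):
    """Depth-first walk over the dependency graph: achieve unmet
    preconditions first, then the goal. 'seen' avoids re-planning
    shared prerequisites."""
    seen = set() if seen is None else seen
    steps = []
    for p in graph.get(goal, []):
        if p not in have and p not in seen:
            steps += plan(p, graph, have, seen)
    seen.add(goal)
    return steps + [goal]

# Illustrative CRAFT-like concepts.
concepts = [Concept("have_wood", frozenset()),
            Concept("at_workbench", frozenset()),
            Concept("have_plank", frozenset({"have_wood", "at_workbench"}))]
graph = dependency_graph(concepts)
print(plan("have_plank", graph))   # e.g. ['have_wood', 'at_workbench', 'have_plank']
```

Because a concept only records an effect and its preconditions, swapping in a new environment amounts to relearning which skills realise which transitions; the graph construction and the planner above are unchanged, which is the transfer property the abstract emphasises.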
