About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Individual differences in structure learning

Newlin, Philip 13 May 2022 (has links)
Humans have a tendency to impute structure spontaneously even in simple learning tasks; however, the way they approach structure learning can vary drastically. The present study sought to determine why individuals learn structure differently. One hypothesized explanation for differences in structure learning is individual differences in cognitive control. Cognitive control allows individuals to maintain representations of a task and may interact with reinforcement learning systems. It was expected that individual differences in the propensity to apply cognitive control, which shares component processes with hierarchical reinforcement learning, might explain how individuals learn structure differently in a simple structure learning task. Results showed that proactive control and model-based control explained differences in the rate at which individuals applied structure learning.
12

Hierarchical reinforcement learning for spoken dialogue systems

Cuayáhuitl, Heriberto January 2009 (has links)
This thesis focuses on the problem of scalable optimization of dialogue behaviour in speech-based conversational systems using reinforcement learning. Most previous investigations in dialogue strategy learning have proposed flat reinforcement learning methods, which are more suitable for small-scale spoken dialogue systems. This research formulates the problem in terms of Semi-Markov Decision Processes (SMDPs) and proposes two hierarchical reinforcement learning methods to optimize sub-dialogues rather than full dialogues. The first method uses a hierarchy of SMDPs, where every SMDP ignores irrelevant state variables and actions in order to optimize a sub-dialogue. The second method extends the first by constraining every SMDP in the hierarchy with prior expert knowledge; for this, a learning algorithm called 'HAM+HSMQ-Learning' is proposed, which combines two existing algorithms from the hierarchical reinforcement learning literature. Whilst the first method generates fully-learnt behaviour, the second generates semi-learnt behaviour. In addition, this research proposes a heuristic dialogue simulation environment for automatic dialogue strategy learning. Experiments were performed in simulated and real environments based on a travel planning spoken dialogue system. Experimental results provided evidence to support the following claims. First, both methods scale well at the cost of near-optimal solutions, resulting in slightly longer dialogues than the optimal solutions. Second, dialogue strategies learnt with coherent user behaviour and conservative recognition error rates can outperform a reasonable hand-coded strategy. Third, semi-learnt dialogue behaviours are a better alternative (because of their higher overall performance) to hand-coded or fully-learnt dialogue behaviours. Last, hierarchical reinforcement learning dialogue agents are feasible and promising for the (semi-)automatic design of adaptive behaviours in larger-scale spoken dialogue systems. This research makes the following contributions to spoken dialogue systems that learn their dialogue behaviour. First, the Semi-Markov Decision Process (SMDP) model was proposed to learn spoken dialogue strategies in a scalable way. Second, the concept of 'partially specified dialogue strategies' was proposed for simultaneously integrating hand-coded and learnt spoken dialogue behaviours into a single learning framework. Third, an evaluation with real users of hierarchical reinforcement learning dialogue agents was essential to validate their effectiveness in a realistic environment.
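As editorial context: the core idea of the thesis, learning each sub-dialogue with its own SMDP that sees only the state and actions relevant to it, can be illustrated with a minimal sketch in the spirit of HSMQ-Learning. This is not the thesis's HAM+HSMQ-Learning algorithm; the slot names, toy simulator, reward scheme, and hyperparameters below are all illustrative assumptions.

```python
import random
from collections import defaultdict

GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1   # illustrative hyperparameters

def name_of(a):
    return a.name if isinstance(a, Subtask) else a

class Subtask:
    """One SMDP in the hierarchy, seeing only the actions (primitive
    or child Subtask) relevant to its sub-dialogue."""
    def __init__(self, name, actions):
        self.name, self.actions = name, actions
        self.q = defaultdict(float)          # Q(state, action name)

    def choose(self, state):
        if random.random() < EPS:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, name_of(a))])

class SlotEnv:
    """Toy stand-in for the thesis's travel-planning simulator: asking for
    an unfilled slot succeeds 80% of the time (a crude recognition error rate)."""
    def done(self, task, state):
        if task.name == "root":
            return state == frozenset({"origin", "destination"})
        return task.name.removeprefix("get_") in state

    def step(self, state, action):
        slot = action.removeprefix("ask_")
        if slot not in state and random.random() < 0.8:
            return 0.0, state | {slot}
        return -1.0, state                   # wasted dialogue turn

def hsmq(task, env, state):
    """HSMQ-style recursion: a child subtask runs to termination, and the
    parent books its cumulative reward and duration as one temporally
    extended SMDP transition."""
    total, steps = 0.0, 0
    while not env.done(task, state):
        a = task.choose(state)
        if isinstance(a, Subtask):
            r, state2, k = hsmq(a, env, state)   # composite action
            if k == 0:
                r = -0.5   # discourage re-invoking a finished sub-dialogue
        else:
            r, state2 = env.step(state, a)       # primitive action
            k = 1
        best = max(task.q[(state2, name_of(b))] for b in task.actions)
        key = (state, name_of(a))
        task.q[key] += ALPHA * (r + GAMMA ** max(k, 1) * best - task.q[key])
        total, steps, state = total + GAMMA ** steps * r, steps + k, state2
    return total, state, steps

root = Subtask("root", [Subtask("get_origin", ["ask_origin"]),
                        Subtask("get_destination", ["ask_destination"])])
for _ in range(2000):
    hsmq(root, SlotEnv(), frozenset())
```

The call-and-return recursion is what makes the approach scale: each Subtask's table ranges only over its own restricted state, so adding slots grows the tables of the affected sub-dialogues rather than one monolithic table.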
13

A Learning-based Control Architecture for Socially Assistive Robots Providing Cognitive Interventions

Chan, Jeanie 05 December 2011 (has links)
Due to the world’s rapidly growing elderly population, dementia is becoming increasingly prevalent. This poses considerable health, social, and economic concerns as it impacts individuals, families and healthcare systems. Current research has shown that cognitive interventions may slow the decline of or improve brain functioning in older adults. This research investigates the use of intelligent socially assistive robots to engage individuals in person-centered cognitively stimulating activities. Specifically, in this thesis, a novel learning-based control architecture is developed to enable socially assistive robots to act as social motivators during an activity. A hierarchical reinforcement learning approach is used in the architecture so that the robot can learn appropriate assistive behaviours based on activity structure and personalize an interaction based on the individual’s behaviour and user state. Experiments show that the control architecture is effective in determining the robot’s optimal assistive behaviours for a memory game interaction and a meal assistance scenario.
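A minimal sketch of the behaviour-selection layer such an architecture implies: a value table indexed by activity step and observed user state, updated from interaction rewards. The state and behaviour inventories below are hypothetical, and the thesis's hierarchical decomposition is collapsed here into a single activity-step index for brevity.

```python
import random
from collections import defaultdict

# Hypothetical discretizations; the thesis's actual user states and
# assistive behaviours differ.
USER_STATES = ("engaged", "confused", "inactive")
BEHAVIOURS = ("wait", "verbal_prompt", "demonstrate", "celebrate")
ALPHA, GAMMA, EPS = 0.2, 0.9, 0.1

Q = defaultdict(float)   # Q[(activity_step, user_state, behaviour)]

def select_behaviour(step, user_state):
    """Epsilon-greedy choice of assistive behaviour for the current
    activity step and observed user state."""
    if random.random() < EPS:
        return random.choice(BEHAVIOURS)
    return max(BEHAVIOURS, key=lambda b: Q[(step, user_state, b)])

def update(step, user_state, behaviour, reward, next_step, next_state):
    """One-step Q-learning update; in the real interaction the reward
    would reflect user compliance and progress through the activity."""
    best = max(Q[(next_step, next_state, b)] for b in BEHAVIOURS)
    key = (step, user_state, behaviour)
    Q[key] += ALPHA * (reward + GAMMA * best - Q[key])

# Example interaction step.
b = select_behaviour(0, "confused")
update(0, "confused", b, reward=1.0, next_step=1, next_state="engaged")
```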
15

Evaluating behaviour tree integration in the option critic framework in Starcraft 2 mini-games with training restricted by consumer level hardware

Lundberg, Fredrik January 2022 (has links)
This thesis investigates the performance of the option critic (OC) framework combined with behaviour trees (BTs) in Starcraft 2 mini-games when training time is restricted to what consumer-level hardware allows. We test two such combination models, BTs as macro actions (OCBT) and BTs as options (OCBToptions), and measure their performance relative to the plain OC model through an ablation study. The tests were conducted in two of the mini-games, build marines (BM) and defeat zerglings and banelings (DZAB), and a set of metrics, including game score, was collected. We find that BTs improve performance in the BM mini-game under both OCBT and OCBToptions, while in DZAB the models performed equally. Additionally, the results indicate that the improvement in BM scores does not stem solely from the complexity of the BTs but from the OC model learning to use the BTs effectively and learning beneficial options alongside the BT options. It is therefore concluded that BTs can improve performance when training time is limited by consumer-level hardware.
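To make the two integration points concrete, here is a minimal sketch of a behaviour tree slotted into option-critic execution as an option: the tree's tick supplies the intra-option policy, and the tree's completion supplies the termination condition. The toy environment, tree, and class names are assumptions for illustration, not the thesis's code.

```python
import random

class BTOption:
    """A behaviour tree wrapped as an option (the OCBToptions idea as we
    read it): tick() is the intra-option policy; the option terminates
    when the tree finishes."""
    def __init__(self, tree):
        self.tree = tree
    def act(self, obs):
        return self.tree.tick(obs)
    def terminates(self, obs):
        return self.tree.status(obs) in ("SUCCESS", "FAILURE")

class LearnedOption:
    """An ordinary option-critic option: learned intra-option policy and
    learned termination probability (both stubbed here)."""
    def __init__(self, policy, beta):
        self.policy, self.beta = policy, beta
    def act(self, obs):
        return self.policy(obs)
    def terminates(self, obs):
        return random.random() < self.beta(obs)

def run_episode(env, options, policy_over_options):
    """Call-and-return execution: the policy over options picks an option,
    which then acts until its termination condition fires. The option-critic
    updates (critic, intra-option policies, terminations) are omitted."""
    obs, done, current = env.reset(), False, None
    while not done:
        if current is None or current.terminates(obs):
            current = policy_over_options(obs)   # critic-guided in real OC
        obs, reward, done = env.step(current.act(obs))  # reward feeds updates

# Toy stand-ins so the sketch runs end to end.
class ToyEnv:
    def reset(self): self.t = 0; return 0
    def step(self, action): self.t += 1; return self.t, 0.0, self.t >= 10

class ToyTree:
    """Stands in for a hand-authored StarCraft II tree (e.g. 'train marines')."""
    def tick(self, obs): return "noop"
    def status(self, obs): return "SUCCESS" if obs >= 5 else "RUNNING"

options = [BTOption(ToyTree()),
           LearnedOption(lambda obs: "noop", lambda obs: 0.1)]
run_episode(ToyEnv(), options, lambda obs: random.choice(options))
```

The OCBT variant would instead expose each tree as a single temporally extended primitive inside an option's action set, rather than as an option of its own.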
16

Biased Exploration in Offline Hierarchical Reinforcement Learning

Miller, Eric D. 26 January 2021 (has links)
No description available.
17

Model-Free Reinforcement Learning for Hierarchical OO-MDPs

Goldblatt, John Dallan 23 May 2022 (has links)
No description available.
18

A hierarchical neural network approach to learning sensor planning and control

Löfwenberg, Nicke January 2023 (has links)
The ability to search the environment is one of the most fundamental skills for any living creature, and visual search in particular is common to almost all animals. This act of searching is generally active in nature: vision does not simply react to incoming stimuli but also actively searches the environment for potential stimuli (for example, by moving the head or eyes). Automatic visual search, likewise, is a crucial and powerful tool in a wide variety of fields. However, performing such an active search is a nontrivial problem for many machine learning approaches. The added complexity of choosing which area to observe, as well as the common case of a camera with an adaptive field of view, further complicates the problem. Hierarchical reinforcement learning has in recent years proven to be a particularly powerful means of solving hard machine learning problems through a divide-and-conquer methodology, where one highly complex task is broken down into smaller sub-tasks that on their own may be more easily learnable. In this thesis, we present a hierarchical reinforcement learning system for solving a visual search problem in a stationary camera environment with adjustable pan, tilt and field-of-view capabilities. The hierarchical model also incorporates non-reinforcement-learning agents in its workflow, to better exploit the strengths of different agents and form a more powerful overall model. The model is compared to a non-hierarchical baseline as well as to several learning-free approaches.
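A minimal sketch of the control loop such a system implies: a learned high-level agent proposes pan/tilt/field-of-view commands, while a separate non-learned detector scores each resulting view and its confidence doubles as the reward. All class names and the toy dynamics are assumptions for illustration.

```python
import random

class ToyCamera:
    """Stand-in for the stationary pan/tilt/zoom camera; observe() would
    return an image crop in the real system, not the pose itself."""
    def __init__(self):
        self.pose = (0, 0, 60)            # pan, tilt, field of view
    def move(self, pan, tilt, fov):
        self.pose = (pan, tilt, fov)
    def observe(self):
        return self.pose

def detector(view):
    """Stand-in for the non-RL component (e.g. a pretrained object
    detector) that scores the current view: high confidence only when
    pointed at the hidden target with a narrow field of view."""
    pan, tilt, fov = view
    if (pan, tilt) == (2, 1):             # hidden target direction
        return 0.95 if fov == 10 else 0.5
    return random.uniform(0.0, 0.2)

class RandomSearchAgent:
    """Placeholder for the learned high-level policy: proposes where to
    look and how narrow to zoom; a real agent would train on the rewards."""
    def act(self, obs):
        return (random.randint(-2, 2), random.randint(-1, 1),
                random.choice((60, 30, 10)))
    def learn(self, reward, obs):
        pass                              # learning update omitted

def search(camera, agent, max_steps=200, threshold=0.9):
    """Active search loop: the agent moves the camera, the detector scores
    the view, and the detector's confidence serves as the reward."""
    obs = camera.observe()
    for _ in range(max_steps):
        camera.move(*agent.act(obs))
        obs = camera.observe()
        confidence = detector(obs)
        agent.learn(reward=confidence, obs=obs)
        if confidence > threshold:
            return obs                    # target found
    return None                           # search budget exhausted

print(search(ToyCamera(), RandomSearchAgent()))
```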
19

Learning competitive ensemble of information-constrained primitives

Sodhani, Shagun 07 1900 (has links)
No description available.
20

Lifelong learning of concepts in CRAFT

Vasishta, Nithin Venkatesh 08 1900 (has links)
Planning at higher levels of abstraction is critical when it comes to solving long-horizon tasks with hierarchical complexities. To plan successfully at a given level of abstraction, an agent must understand how the environment functions at that particular level. This understanding may be implicit in terms of policies, value functions, and world models, or it can be defined explicitly. In this work, we introduce concepts as a means to explicitly represent and accumulate information about the environment. Concepts are defined in terms of a state transition and the conditions required for that transition to take place. The simplicity of this definition offers flexibility and control over the learning process. Since concepts are highly interpretable in nature, it is easy to encode prior knowledge and intervene during the learning process if necessary. This definition also makes it relatively straightforward to transfer concepts across different domains wherever applicable. Concepts, at a given level of abstraction, are intricately linked to skills, or temporally abstracted actions: every state transition significant enough to be represented by a concept occurs only after the successful execution of a skill. Exploiting this relationship, we introduce a framework that aids in lifelong learning and refining of concepts across different levels of abstraction. The framework has three components. System 1 segments a stream of experience (e.g. a demonstration) into a sequence of skills; this segmentation can be done at different levels of abstraction. System 2 analyses these segments to refine and upgrade the set of concepts, whenever applicable. System 3 utilises the available concepts to generate a sub-task dependency graph, which can be used for planning at different levels of abstraction. We demonstrate the applicability of this framework in the 2D hierarchical environment CRAFT. We perform experiments to explore how concepts can be learned from different streams of experience, and how the quality of the concept base affects the optimality of the overall plan. In tasks with complex sub-task dependencies, where most algorithms fail to generalise or take an impractical amount of time to converge, we demonstrate that concepts can be used to significantly simplify planning. The framework can also be used to understand the intention of a given demonstration in terms of concepts, which makes it easy for the agent to replicate the demonstration in different environments. We show that this method of imitation is much more robust to changes in environment configuration than traditional methods. In our problem formulation, we make two assumptions: 1) that we have access to a sufficiently exhaustive set of skills, and 2) that our agent has access to practice environments, which can be used to refine concepts when needed. The objective behind this work is to explore the practicality of learning concepts as a means to improve an agent's understanding of its environment. Overall, we demonstrate that learning concepts can be a lightweight yet efficient way to increase the capability of a system.
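A minimal sketch of the concept representation and the System-3-style dependency graph, following the abstract's definition of a concept as a state transition plus its preconditions. The CRAFT-like item names are illustrative assumptions, not the thesis's exact inventory.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    """A concept as the abstract defines it: a state transition (its
    effect) plus the conditions required for the transition to occur."""
    effect: str                 # the state change, e.g. "have_plank"
    preconditions: frozenset    # what must hold first

def dependency_graph(concepts):
    """System-3-style construction: link each concept to the concepts
    whose effects satisfy its preconditions."""
    by_effect = {c.effect: c for c in concepts}
    return {c.effect: [p for p in c.preconditions if p in by_effect]
            for c in concepts}

def plan(goal, graph, have=frozenset(), seen=None):
    """Depth-first walk over the dependency graph: achieve unmet
    preconditions first, then the goal. 'seen' avoids re-planning
    shared prerequisites."""
    seen = set() if seen is None else seen
    steps = []
    for p in graph.get(goal, []):
        if p not in have and p not in seen:
            steps += plan(p, graph, have, seen)
    seen.add(goal)
    return steps + [goal]

# Illustrative CRAFT-like concepts.
concepts = [Concept("have_wood", frozenset()),
            Concept("at_workbench", frozenset()),
            Concept("have_plank", frozenset({"have_wood", "at_workbench"}))]
graph = dependency_graph(concepts)
print(plan("have_plank", graph))   # e.g. ['have_wood', 'at_workbench', 'have_plank']
```

Because a concept only records an effect and its preconditions, swapping in a new environment amounts to relearning which skills realise which transitions; the graph construction and the planner above are unchanged, which is the transfer property the abstract emphasises.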
