About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Représentations relationnelles et apprentissage interactif pour l'apprentissage efficace du comportement coopératif / Relational representations and interactive learning for efficient cooperative behavior learning

Munzer, Thibaut 21 April 2017 (has links)
This thesis presents new approaches toward efficient and intuitive high-level plan learning for cooperative robots. More specifically, this work studies Learning from Demonstration algorithms for relational domains. Using relational representations to model the world simplifies the representation of concurrent and cooperative behavior. We first developed and studied the first algorithm for Inverse Reinforcement Learning in relational domains. We then presented how the RAP formalism can be used to represent cooperative tasks involving a robot and a human operator. RAP is an extension of the Relational MDP framework that allows modeling concurrent activities. Using RAP allowed us to represent both the human and the robot in the same process, and also to model concurrent robot activities. Under this formalism, we demonstrated that it is possible to learn the behavior of a cooperative team, both as a policy and as a reward. If prior knowledge about the task is available, the same algorithm can be used to learn only the preferences of the operator, allowing the system to adapt to the user. We showed that, using relational representations, cooperative behaviors can be learned from a small number of demonstrations, and that these behaviors are robust to noise, generalize to new states, and transfer to different domains (for example, after adding objects). We also introduced an interactive training architecture that allows the system to make fewer mistakes while requiring less effort from the human operator. By estimating its confidence in its decisions, the robot is able to ask for instructions when it is unsure of the correct activity to perform. Lastly, we implemented these approaches on a real robot and showed their potential impact in a realistic scenario.
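The relational representation this thesis relies on can be illustrated with a minimal sketch (all predicate and object names below are illustrative, not taken from the thesis): a world state is a set of ground predicates, and an action rule is written over variables so it generalizes to any number of objects.

```python
# Minimal sketch of a relational state: ground predicates over objects.
# All predicate and object names here are illustrative, not from the thesis.

def holds(state, predicate, *args):
    """Check whether a ground atom is true in the state."""
    return (predicate, args) in state

def applicable_actions(state, objects):
    """A relational rule: 'pick(X)' is applicable for any clear, unheld
    object X. Because the rule ranges over objects, adding new objects
    to the domain requires no new code."""
    return [("pick", obj) for obj in objects
            if holds(state, "clear", obj) and not holds(state, "held", obj)]

state = {("clear", ("cube",)), ("clear", ("ball",)), ("held", ("peg",))}
objects = ["cube", "ball", "peg"]
print(applicable_actions(state, objects))
```

This object-indifference is what lets a behavior learned from few demonstrations transfer to a scene with extra objects.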
12

An Imitation-Learning based Agent playing Super Mario

Lindberg, Magnus January 2014 (has links)
Context. Developing an Artificial Intelligence (AI) agent that can predict and act in all possible situations in the dynamic environments that modern video games often consist of is nearly impossible in advance, and would cost a great deal of money and time to create by hand. Creating a learning AI agent that can study its environment with the help of Reinforcement Learning (RL) simplifies this task. Another frequently required feature is AI agents with natural behavior; one way to approach that problem is to imitate a human using Imitation Learning (IL). Objectives. The purpose of this investigation is to study whether it is possible, by combining the two learning techniques RL and IL, to create a learning AI agent able to play and complete some levels of a platform game. Methods. To investigate the research question, an implementation was built that combines one RL technique and one IL technique. A set of human players play the game, their behavior is recorded and applied to the agents, and RL is then used to train and tweak the agents' playing performance. A couple of experiments were executed to compare the trained agents against their respective human teachers. Results. The experiments showed promising indications that, during different phases of the experiments, the agents behaved similarly to their human trainers. The agents also performed well when compared to existing agents. Conclusions. In conclusion, the results of creating dynamic agents with natural behavior from the combination of RL and IL are promising, and with additional adjustments the approach could perform even better as a learning AI with more natural behavior.
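One common way to combine the two techniques, sketched below under assumed state and action names (not the thesis implementation), is to seed a tabular policy from demonstrated state-action pairs and then refine the values with standard Q-learning updates driven by a reward signal.

```python
import random

# Hedged sketch of an IL+RL combination: bias Q-values toward demonstrated
# actions, then refine them with Q-learning. States, actions, and rewards
# are illustrative placeholders, not taken from the thesis.

random.seed(0)

demos = [("gap_ahead", "jump"), ("enemy_ahead", "jump"), ("clear_path", "run")]
actions = ["run", "jump"]

# Imitation step: demonstrated actions start with a higher value.
Q = {(s, a): 0.0 for s, _ in demos for a in actions}
for s, a in demos:
    Q[(s, a)] += 1.0

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard Q-learning backup used to tweak the imitated policy."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# RL step: a reward observed during play adjusts the seeded values.
q_update(Q, "clear_path", "run", r=1.0, s_next="gap_ahead")
print(Q[("clear_path", "run")])
```

The demonstration bias makes the agent act human-like from the start, while the RL updates tune its performance over time.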
13

Programming by demonstration of robot manipulators

Skoglund, Alexander January 2009 (has links)
If a non-expert wants to program a robot manipulator he needs a natural interface that does not require rigorous robot programming skills. Programming-by-demonstration (PbD) is an approach which enables the user to program a robot by simply showing the robot how to perform a desired task. In this approach, the robot recognizes what task it should perform and learns how to perform it by imitating the teacher. One fundamental problem in imitation learning arises from the fact that embodied agents often have different morphologies. Thus, a direct skill transfer from a human to a robot is not possible in the general case. Therefore, we need a systematic approach to PbD that takes the capabilities of the robot into account, regarding both perception and body structure. In addition, the robot should be able to learn from experience and improve over time. This raises the question of how to determine the demonstrator's goal or intentions. We show that it is possible, to some degree, to infer these from multiple demonstrations. We address the problem of generating a reach-to-grasp motion that produces the same results as a human demonstration. It is also of interest to learn which parts of a demonstration provide important information about the task. The major contribution is the investigation of a next-state planner using a fuzzy time-modeling approach to reproduce a human demonstration on a robot. We show that the proposed planner can generate executable robot trajectories based on a generalization of multiple human demonstrations. We use the notion of hand-states as a common motion language between the human and the robot. It allows the robot to interpret the human motions as its own, and it also synchronizes reaching with grasping. Other contributions include the model-free learning of human-to-robot mapping, and how an imitation metric can be used for reinforcement learning of new robot skills.
The experimental part of this thesis presents the implementation of PbD of pick-and-place tasks on different robotic hands/grippers. The platforms consist of manipulators and motion-capture devices.
14

Novel Learning-Based Task Schedulers for Domain-Specific SoCs

January 2020 (has links)
abstract: This Master's thesis includes the design, on-chip integration, and evaluation of a set of imitation learning (IL)-based scheduling policies: deep neural network (DNN) and decision tree (DT). We first developed IL-based scheduling policies for heterogeneous systems-on-chips (SoCs). Then, we tested these policies using a system-level domain-specific system-on-chip simulation framework [11]. Finally, we transformed them into efficient code using a cloud engine [1] and implemented them on a user-space emulation framework [61] on a Unix-based SoC. IL is one area of machine learning (ML) and a useful method to train artificial intelligence (AI) models by imitating the decisions of an expert or Oracle that knows the optimal solution. The primary focus of this thesis is to adapt an ML model to work on-chip and optimize resource allocation for a set of domain-specific wireless and radar applications. Evaluation results with four streaming applications from the wireless communications and radar domains show that the proposed IL-based scheduler approximates an offline Oracle expert with more than 97% accuracy and 1.20× faster execution time. The models have been implemented as an add-on, making them easy to port to other SoCs. / Dissertation/Thesis / Masters Thesis Computer Engineering 2020
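The core idea of imitating an Oracle scheduler can be sketched as follows. This toy version (task sizes, core names, and the stump learner are all illustrative assumptions, not the thesis code) fits a one-feature decision stump to reproduce an expert's task-to-core assignments.

```python
# Hedged sketch of imitation-learning a scheduler: fit a one-feature
# decision stump to imitate an Oracle that assigns tasks to a fast or
# slow processing element. All names and data are illustrative.

oracle = [  # (task_size, oracle_choice) pairs collected from the expert
    (2, "little_core"), (3, "little_core"), (8, "big_core"), (9, "big_core"),
]

def fit_stump(examples):
    """Pick the size threshold that best reproduces the oracle's decisions."""
    best = None
    for thresh in range(1, 11):
        correct = sum(
            1 for size, choice in examples
            if ("big_core" if size >= thresh else "little_core") == choice
        )
        if best is None or correct > best[1]:
            best = (thresh, correct)
    return best[0]

thresh = fit_stump(oracle)

def schedule(size):
    """The learned policy: cheap to evaluate on-chip, no oracle needed."""
    return "big_core" if size >= thresh else "little_core"

print(thresh, schedule(7))
```

A real DT policy generalizes this to many features and many splits, but the appeal is the same: the learned model approximates the expensive offline expert at a fraction of its runtime cost.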
15

Apprendre par imitation : applications à quelques problèmes d'apprentissage structuré en traitement des langues / Imitation learning : application to several structured learning tasks in natural language processing

Knyazeva, Elena 25 May 2018 (has links)
Structured learning has become ubiquitous in Natural Language Processing; a multitude of applications, such as personal assistants, machine translation and speech recognition, to name just a few, rely on such techniques. The structured learning problems that must now be solved are becoming increasingly complex and require an increasing amount of information at different linguistic levels (morphological, syntactic, etc.). It is therefore crucial to find the best trade-off between the degree of modelling detail and the exactitude of the inference algorithm. Imitation learning aims to perform approximate learning and inference in order to better exploit richer dependency structures. In this thesis, we explore the use of this specific learning setting, in particular using the SEARN algorithm, both from a theoretical perspective and in terms of practical applications to Natural Language Processing tasks, especially to complex tasks such as machine translation. Concerning the theoretical aspects, we introduce a unified framework for different imitation learning algorithm families, allowing us to review and simplify the convergence properties of the algorithms. With regard to the more practical application of our work, we use imitation learning first to experiment with free-order sequence labelling and secondly to explore two-step decoding strategies for machine translation.
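One distinctive ingredient of SEARN is its policy-interpolation step, sketched below in a deliberately tiny form (the token task and policies are placeholder assumptions, not from the thesis): each iteration, the new policy follows the freshly learned classifier with probability beta and otherwise falls back on the previous policy.

```python
import random

# Minimal, illustrative sketch of SEARN's policy interpolation: the new
# policy is a stochastic mixture of the learned classifier h and the
# previous policy. The labelling task here is a trivial placeholder.

random.seed(1)

def make_mixture(h, pi_old, beta):
    """Follow h with probability beta, otherwise defer to pi_old."""
    def pi_new(state):
        return h(state) if random.random() < beta else pi_old(state)
    return pi_new

reference = lambda tok: tok.lower()   # expert policy: lowercase each token
learned   = lambda tok: tok.lower()   # classifier trained to imitate it
pi = reference
for _ in range(3):                    # SEARN loop: repeatedly mix policies
    pi = make_mixture(learned, pi, beta=0.3)

print([pi(tok) for tok in ["A", "B", "C"]])
```

Over iterations the mixture's reliance on the reference policy decays geometrically, which is what the convergence analyses alluded to above characterize.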
16

Creating Human-like AI Movement in Games Using Imitation Learning / Imitation Learning som verktyg för att skapa människolik rörelse för AI-karaktärer i spel

Renman, Casper January 2017 (has links)
The way characters move and behave in computer and video games is an important factor in their believability, which has an impact on the player's experience. This project explores Imitation Learning using limited amounts of data as an approach to creating human-like AI behaviour in games, and through a user study investigates which factors determine whether a character is human-like when observed through the character's first-person perspective. The idea is to create or shape AI behaviour by recording one's own actions. The implemented framework uses a Nearest Neighbour algorithm with a KD-tree as the policy which maps a state to an action. Results showed that the chosen approach was able to create human-like AI behaviour while respecting the performance constraints of a modern 3D game.
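The state-to-action policy described above can be sketched in a few lines. The thesis uses a KD-tree for fast lookup; for brevity, this illustrative version (with made-up states and actions) does a linear nearest-neighbour scan over recorded demonstrations.

```python
# Hedged sketch of a nearest-neighbour policy over recorded gameplay.
# The thesis accelerates lookup with a KD-tree; a linear scan is used
# here to keep the sketch self-contained. Data is illustrative.

recorded = [  # (state vector, action) pairs captured from a human player
    ((0.0, 0.0), "idle"),
    ((1.0, 0.0), "strafe_right"),
    ((0.0, 1.0), "move_forward"),
]

def nearest_action(state):
    """Map a game state to the action of the closest demonstrated state."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(recorded, key=lambda pair: sq_dist(pair[0], state))[1]

print(nearest_action((0.1, 0.9)))
```

Because the policy only replays demonstrated actions, it inherits the demonstrator's style for free, which is exactly the property the user study evaluates.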
17

Pose Imitation Constraints For Kinematic Structures

Glebys T Gonzalez (14486934) 09 February 2023 (has links)
The usage of robots has increased in different areas of society and human work, including medicine, transportation, education, space exploration, and the service industry. This phenomenon has generated sudden enthusiasm for developing more intelligent robots that are better equipped to perform tasks as well as humans do. Such jobs require human involvement as operators or teammates, since robots struggle with automation in everyday settings. Soon, the role of humans will extend far beyond users or stakeholders to include those responsible for training such robots. A popular form of teaching is to allow robots to mimic human behavior. This method is intuitive and natural and does not require specialized knowledge of robotics. While there are other methods for robots to complete tasks effectively, collaborative tasks require mutual understanding and coordination that is best achieved by mimicking human motion. This mimicking problem has been tackled through skill imitation, which reproduces human-like motion during a task shown by a trainer. Skill imitation builds on faithfully replicating the human pose and requires two steps. In the first step, an expert's demonstration is captured and pre-processed, and motion features are obtained; in the second step, a learning algorithm is used to optimize for the task. The learning algorithms are often paired with traditional control systems to transfer the demonstration to the robot successfully. However, this methodology currently faces a generalization issue, as most solutions are formulated for specific robots or tasks. The lack of generalization presents a problem, especially as the frequency at which robots are replaced and improved in collaborative environments is much higher than in traditional manufacturing. As with humans, we expect robots to have more than one skill, and the same skills to be performed by more than one type of robot. Thus, we address this issue by proposing a human motion imitation framework that can be efficiently computed and generalized for different kinematic structures (e.g., different robots).

This framework is developed by training an algorithm to augment collaborative demonstrations, facilitating generalization to unseen scenarios. We then create a model for pose imitation that converts human motion to a flexible constraint space. This space can be directly mapped to different kinematic structures by specifying a correspondence between the main human joints (i.e., shoulder, elbow, wrist) and robot joints. This model permits an unlimited number of robotic links between two assigned human joints, allowing different robots to mimic the demonstrated task and human pose. Finally, we incorporate the constraint model into a reward that informs a Reinforcement Learning algorithm during optimization. We tested the proposed methodology in different collaborative scenarios and assessed the task success rate, pose imitation accuracy, the occlusion that the robot produces in the environment, the number of collisions, and the learning efficiency of the algorithm.

The results show that the proposed framework creates effective collaboration across different robots and tasks.
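The idea of folding a pose constraint into an RL reward can be sketched as below. The joint bounds, penalty weight, and function names are illustrative assumptions; the thesis's constraint space is richer than per-joint limits.

```python
# Hedged sketch of a constraint-shaped RL reward: the agent earns task
# reward and is penalized when its joint angles leave the human-derived
# constraint region. Bounds and weights here are illustrative only.

def constrained_reward(task_reward, joint_angles, constraint_bounds, lam=0.5):
    """Subtract a penalty proportional to total constraint violation."""
    violation = sum(max(0.0, lo - a) + max(0.0, a - hi)
                    for a, (lo, hi) in zip(joint_angles, constraint_bounds))
    return task_reward - lam * violation

bounds = [(-1.0, 1.0), (0.0, 2.0)]   # per-joint limits from the human pose
print(constrained_reward(1.0, [0.5, 2.4], bounds))
```

Because the constraint term is defined on the mapped joint correspondence rather than on a specific arm, the same reward shaping can drive learning on robots with different link counts.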
18

Playstyle Generation with Multimodal Generative Adversarial Imitation Learning : Style-reward from Human Demonstration for Playtesting Agents / Spelstilsgenerering med Multimodal Generativ Motståndarimitationsinlärning : Spelstilsbelöning från Demonstrationer för Playtesting-Agenter

Ahlberg, William January 2023 (has links)
Playtesting plays a crucial role in video game production. The presence of gameplay issues and faulty design choices can greatly detract from the overall player experience. Machine learning has the potential to be applied to automated playtesting solutions, removing mundane and repetitive testing and allowing game designers and playtesters to focus their efforts on rewarding tasks. It is important in playtesting to consider the different playstyles players might use, so that game design choices can be adapted accordingly. With Reinforcement Learning, it is possible to create high-quality agents able to play and traverse complex game environments with fairly simple task rewards. However, an automated playtesting solution must also be able to incorporate unique behaviour which mimics human playstyles. It can often be difficult to handcraft a quantitative style reward to drive agent learning, especially for those with limited reinforcement learning experience, such as game developers. MultiGAIL (Multimodal Generative Adversarial Imitation Learning) is a proposed learning algorithm able to generate autonomous agents imbued with human playstyles from recorded playstyle demonstrations. The proposed method requires no handcrafted style reward and can generate novel intermediate playstyles from demonstrated ones. MultiGAIL is evaluated in game environments resembling complex 3D games with both discrete and continuous action spaces. The playstyle the agent exhibits is easily controllable at inference with an auxiliary input parameter. Evaluation shows the agent is able to successfully replicate the underlying playstyles in human demonstrations, and that novel playstyles generate explainable action distributions indicative of the level of blending the auxiliary input declares. The results indicate that MultiGAIL could be a suitable solution for incorporating style behaviours in playtesting autonomous agents, and can easily be used by those with limited domain knowledge of reinforcement learning.
19

Using Imitation Learning for Human Motion Control in a Virtual Simulation

Akrin, Christoffer January 2022 (has links)
Test Automation is becoming a more vital part of the software development cycle, as it aims to lower the cost of testing and allow for higher test frequency. However, automating manual tests can be difficult, as they tend to require complex human interaction. In this thesis, we aim to solve this by using Imitation Learning as a tool for automating manual software tests. The software under test consists of a virtual simulation connected to a physical input device in the form of a sight. The sight can rotate on two axes, yaw and pitch, which require human motion control. Based on this, we use a Behavioral Cloning approach with a k-NN regressor trained on human demonstrations. The model's resemblance to the human is evaluated by comparing the state paths taken by the model and the human. The model's task performance is measured with a score based on the time taken to stabilize the sight pointing at a given object in the virtual world. The results show that a simple k-NN regression model using high-level states and actions, and with limited data, can imitate the human motion well. The model tends to be slightly faster than the human on the task while keeping realistic motion. It also shows signs of human errors, such as overshooting the object at higher angular velocities. Based on the results, we conclude that using Imitation Learning for Test Automation can be practical for specific tasks where capturing human factors is of importance. However, further exploration is needed to identify the full potential of Imitation Learning in Test Automation.
20

Learning Preference Models for Autonomous Mobile Robots in Complex Domains

Silver, David 01 December 2010 (has links)
Achieving robust and reliable autonomous operation even in complex unstructured environments is a central goal of field robotics. As the environments and scenarios to which robots are applied have continued to grow in complexity, so has the challenge of properly defining preferences and tradeoffs between various actions and the terrain they traverse. These definitions and parameters encode the desired behavior of the robot; therefore their correctness is of the utmost importance. Current manual approaches to creating and adjusting these preference models and cost functions have proven to be incredibly tedious and time-consuming, while typically not producing optimal results except in the simplest of circumstances. This thesis presents the development and application of machine learning techniques that automate the construction and tuning of preference models within complex mobile robotic systems. Utilizing the framework of inverse optimal control, expert examples of robot behavior can be used to construct models that generalize demonstrated preferences and reproduce similar behavior. Novel learning from demonstration approaches are developed that offer the possibility of significantly reducing the amount of human interaction necessary to tune a system, while also improving its final performance. Techniques to account for the inevitability of noisy and imperfect demonstration are presented, along with additional methods for improving the efficiency of expert demonstration and feedback. The effectiveness of these approaches is confirmed through application to several real world domains, such as the interpretation of static and dynamic perceptual data in unstructured environments and the learning of human driving styles and maneuver preferences.
Extensive testing and experimentation both in simulation and in the field with multiple mobile robotic systems provides empirical confirmation of superior autonomous performance, with less expert interaction and no hand tuning. These experiments validate the potential applicability of the developed algorithms to a large variety of future mobile robotic systems.
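The inverse-optimal-control idea above can be sketched with a toy subgradient step (terrain types, paths, and the learning rate are illustrative assumptions, not the thesis algorithm): cost weights are nudged so the expert's demonstrated path becomes cheaper than the path the current planner prefers.

```python
# Hedged sketch of inverse optimal control for terrain costs: raise the
# cost of terrain the expert avoids, lower it on terrain the expert uses.
# Grid, paths, and step size are illustrative placeholders.

def path_features(path, terrain):
    """Count how many cells of each terrain type the path crosses."""
    counts = {"grass": 0, "mud": 0}
    for cell in path:
        counts[terrain[cell]] += 1
    return counts

terrain = {(0, 0): "grass", (0, 1): "mud", (1, 0): "grass", (1, 1): "grass"}
expert_path  = [(0, 0), (1, 0), (1, 1)]   # expert avoids the mud
planner_path = [(0, 0), (0, 1), (1, 1)]   # current weights go through mud

weights = {"grass": 1.0, "mud": 1.0}      # initially mud looks cheap
f_exp = path_features(expert_path, terrain)
f_plan = path_features(planner_path, terrain)
for t in weights:                         # subgradient step: penalize what
    weights[t] += 0.5 * (f_plan[t] - f_exp[t])   # the expert avoided

print(weights)
```

Iterating this feature-matching update until the planner reproduces the expert's routes is, in miniature, how demonstrated driving styles can tune a cost function with no hand tuning.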
