Reinforcement Learning from Demonstration

Off-the-shelf Reinforcement Learning (RL) algorithms learn slowly, partly because they are expected to learn a task from scratch, using only the agent's own experience. In this thesis, we show that learning from scratch limits learning performance, and that RL agents can learn a task faster when prior knowledge is available. We evaluate relevant previous work and our own algorithms in various experiments. Our first contribution is the first implementation and evaluation of an existing interactive RL algorithm in a real-world domain with a humanoid robot. Interactive RL had previously been evaluated in a simulated domain, which motivated us to evaluate its practicality on a robot. Our evaluation shows that guidance reduces learning time, and that its positive effects increase with state space size. A natural follow-up question was how other previous approaches compare to interactive RL. Our second contribution is an analysis of a user study in which naïve human teachers demonstrated a real-world object-catching task with a humanoid robot. We present the first comparison of several previous works in a common real-world domain with a user study. One conclusion of the user study was that RL has high potential despite poor usability caused by its slow learning rate. In an effort to improve the learning efficiency of RL learners, our third contribution is a novel human-agent knowledge transfer algorithm. Using demonstrations from three teachers with varying expertise in a simulated domain, we show that, regardless of the teacher's skill level, human demonstrations can improve the asymptotic performance of an RL agent. As an alternative approach for encoding human knowledge in RL, we investigated the use of reward shaping. Our final contributions are the Static Inverse Reinforcement Learning Shaping and Dynamic Inverse Reinforcement Learning Shaping algorithms, which use human demonstrations to recover a shaping reward function. Our experiments in simulated domains show that our approach outperforms the state of the art in cumulative reward, learning rate, and asymptotic performance. Overall, we show that human demonstrators with varying skills can help RL agents learn tasks more efficiently.

artificial intelligence

rule learning

machine learning

policy learning

robot learning from demonstration

transfer learning

agents

robot learning

learning from demonstration

reward shaping

reinforcement learning

Identifier	oai:union.ndltd.org:wpi.edu/oai:digitalcommons.wpi.edu:etd-dissertations-1172
Date	25 April 2016
Creators	Suay, Halit Bener
Contributors	Sonia Chernova, Advisor
Publisher	Digital WPI
Source Sets	Worcester Polytechnic Institute
Detected Language	English
Type	text
Format	application/pdf
Source	Doctoral Dissertations (All Dissertations, All Years)
