Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment. Generally, the problem to be solved contains subtasks that repeat in different regions of the state space. Without any guidance, an agent has to learn the solution of each subtask instance independently, which degrades the learning performance.
In this thesis, we propose two approaches that build connections between different regions of the search space, leading to better utilization of gained experience and accelerated learning. In the first approach, we extend the existing work of McGovern and propose the formalization of stochastic conditionally terminating sequences, which have higher representational power. We then describe how to efficiently discover and employ useful abstractions during learning based on such sequences. The method constructs a tree structure to keep track of frequently used action sequences together with the visited states. This tree is then used to select the actions to be executed at each step.
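The following is a minimal sketch of the idea behind the first approach, not the implementation from the thesis: a trie-like tree that counts how often action sequences are observed and suggests the next action by following the most frequent branch. All class and method names (SequenceTree, add, suggest) are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the thesis implementation): a trie that counts
# observed action sequences and suggests the next action for a given prefix.
class SequenceTree:
    def __init__(self):
        self.count = 0          # how many times this prefix has been observed
        self.children = {}      # action -> SequenceTree

    def add(self, actions):
        """Record one observed action sequence, updating prefix counts."""
        node = self
        for a in actions:
            node = node.children.setdefault(a, SequenceTree())
            node.count += 1

    def suggest(self, prefix):
        """Follow the already-executed prefix and return the most frequent next action."""
        node = self
        for a in prefix:
            if a not in node.children:
                return None     # prefix unseen: fall back to the base policy
            node = node.children[a]
        if not node.children:
            return None
        return max(node.children, key=lambda a: node.children[a].count)

# Usage with toy sequences: after observing the same sequence twice,
# the tree recommends continuing it.
tree = SequenceTree()
tree.add(["north", "north", "pickup"])
tree.add(["north", "north", "pickup"])
tree.add(["north", "east"])
print(tree.suggest(["north", "north"]))   # -> "pickup"
```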
In the second approach, we propose a novel method to identify states with similar sub-policies and show how they can be integrated into the reinforcement learning framework to improve the learning performance. The method uses an efficient data structure to find common action sequences starting from observed states and defines a similarity function between states based on the number of such sequences. Using this similarity function, updates on the action-value function of a state are reflected onto all similar states. Consequently, this allows experience acquired during learning to be applied in a broader context.
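Below is a minimal sketch of the second idea under simplifying assumptions: the similarity of two states is taken as the overlap of the action sequences observed from them, and a Q-learning update on one state is mirrored, with a weight, onto sufficiently similar states. The names, the Jaccard-style overlap measure, and the threshold are illustrative, not the exact definitions used in the thesis.

```python
# Minimal sketch (an assumption, not the thesis algorithm): state similarity from
# shared action sequences, with Q-updates reflected onto similar states.
from collections import defaultdict

ALPHA, GAMMA, SIM_THRESHOLD = 0.1, 0.9, 0.5

Q = defaultdict(float)          # (state, action) -> estimated value
sequences = defaultdict(set)    # state -> set of action sequences observed from it

def similarity(s1, s2):
    """Overlap of action sequences observed from the two states (illustrative measure)."""
    a, b = sequences[s1], sequences[s2]
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def update(s, a, r, s_next, actions_next):
    """Standard Q-learning update, mirrored onto states similar to s."""
    target = r + GAMMA * max((Q[(s_next, an)] for an in actions_next), default=0.0)
    delta = ALPHA * (target - Q[(s, a)])
    Q[(s, a)] += delta
    for other in list(sequences):
        if other != s:
            w = similarity(s, other)
            if w >= SIM_THRESHOLD:
                Q[(other, a)] += w * delta   # reflect the update onto the similar state

# Usage with toy data: both states have seen the same sequence,
# so an update at "s1" also nudges the value at "s2".
sequences["s1"].add(("north", "pickup"))
sequences["s2"].add(("north", "pickup"))
update("s1", "north", 1.0, "s_goal", ["noop"])
print(Q[("s1", "north")], Q[("s2", "north")])
```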
The effectiveness of both approaches is demonstrated empirically through extensive experiments on various domains.
Identifier | oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/12608257/index.pdf |
Date | 01 March 2007 |
Creators | Girgin, Sertan |
Contributors | Polat, Faruk |
Publisher | METU |
Source Sets | Middle East Technical Univ. |
Language | English |
Detected Language | English |
Type | Ph.D. Thesis |
Format | text/pdf |
Rights | To liberate the content for public access |