Automatic State Construction using Decision Trees for Reinforcement Learning Agents

Reinforcement Learning (RL) is a learning framework in which an agent learns a policy from continual interaction with the environment. A policy is a mapping from states to actions. The agent receives rewards as feedback on the actions it performs. The objective of RL is to design autonomous agents that search for the policy maximizing the expected cumulative reward. When the environment is partially observable, the agent cannot determine the states with certainty. Such states are called hidden in the literature. An agent that relies exclusively on its current observations will not always find the optimal policy. For example, to reach a specific door down a corridor of identical doors, a mobile robot needs to remember the number of doors it has passed. To overcome the problem of partial observability, an agent uses both current and past (memory) observations to construct an internal state representation, which is treated as an abstraction of the environment. This research focuses on how features of past events are extracted with variable granularity for the internal state construction. The project introduces a new method that applies Information Theory and decision tree techniques to derive a tree structure, which represents the state and the policy. The relevance of a candidate feature is assessed by its Information Gain Ratio ranking with respect to the cumulative expected reward. Experiments carried out on three different RL tasks have shown that our variant of the U-Tree (McCallum, 1995) produces a more robust state representation and faster learning. This better performance can be explained by the fact that the Information Gain Ratio exhibits a lower variance in return prediction than the Kolmogorov-Smirnov statistical test used in the original U-Tree algorithm.
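To make the ranking criterion concrete, the following is a minimal sketch of the standard information gain ratio (information gain divided by split information), applied to a candidate feature and a set of observed returns. It is not taken from the thesis: the function names `entropy` and `gain_ratio`, the toy feature values, and the choice to discretize returns into labelled bins are all assumptions made for illustration, and the thesis may define the statistic over returns differently.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def gain_ratio(feature_values, return_bins):
    """Information gain ratio of a candidate feature w.r.t. binned returns.

    feature_values: one discrete feature value per experience (hypothetical).
    return_bins: the discretized return observed for the same experience
                 (binning is an assumption of this sketch).
    """
    n = len(return_bins)
    # Group the binned returns by the value the candidate feature takes.
    groups = {}
    for f, r in zip(feature_values, return_bins):
        groups.setdefault(f, []).append(r)
    # Expected entropy of the returns after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(return_bins) - conditional
    # Split information penalizes features with many distinct values.
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data: a past observation that separates high from low returns,
# and one that carries no information about them.
returns = ["high", "high", "low", "low"]
informative = ["door", "door", "wall", "wall"]
uninformative = ["a", "b", "a", "b"]

informative_score = gain_ratio(informative, returns)
uninformative_score = gain_ratio(uninformative, returns)
```

In this toy case the informative feature scores higher than the uninformative one, so a tree builder that ranks candidates by this score would split on it first.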

http://eprints.qut.edu.au/15965/

Reinforcement learning

Automatic state construction

Decision tree

Partial observability

U-Tree

Kolmogorov-Smirnov two sample test

Information gain ratio test

Identifier	oai:union.ndltd.org:ADTP/264960
Date	January 2005
Creators	Au, Manix
Publisher	Queensland University of Technology
Source Sets	Australasian Digital Theses Program
Detected Language	English
Rights	Copyright Manix Au