
Reinforcement learning in the presence of rare events

Learning agents often find themselves in environments in which rare but significant events occur independently of their current choice of action. Traditional reinforcement learning algorithms sample events according to their natural probability of occurring, and therefore tend to exhibit slow convergence and high variance in such environments. In this thesis, we assume that learning is done in a simulated environment in which the probability of these rare events can be artificially altered. We present novel algorithms for both policy evaluation and control, using both tabular and function approximation representations of the value function. These algorithms automatically tune the rare-event probabilities to minimize variance and use importance sampling to correct for the changes in the dynamics. We prove that these algorithms converge, provide an analysis of their bias and variance, and demonstrate their utility in a number of domains, including a large network planning task.
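
As a rough illustration of the importance-sampling correction described in the abstract, the sketch below shows a tabular TD(0) evaluation in which a rare event is sampled at an artificially inflated probability and each update is reweighted by the one-step likelihood ratio. This is a minimal sketch under assumed conditions, not the thesis's actual algorithm (which also adaptively tunes the event probabilities to minimize variance): the toy chain, and the constants P_NAT, P_SIM, FAIL_COST, and STEP_REWARD, are all hypothetical.

```python
import random

# Hypothetical toy chain (illustrative, not from the thesis): from the
# single non-terminal state, a rare "failure" occurs with natural
# probability P_NAT and incurs a large cost; otherwise a small reward
# is received and the episode continues.
P_NAT = 1e-3       # natural probability of the rare event
P_SIM = 0.1        # inflated probability used in the simulator
GAMMA = 0.95       # discount factor
ALPHA = 0.05       # step size
FAIL_COST = -100.0
STEP_REWARD = 1.0

def simulate_step():
    """Sample a transition under the *altered* dynamics and return
    (reward, done, rho), where rho = P_nat(transition) / P_sim(transition)
    is the importance-sampling ratio for that transition."""
    if random.random() < P_SIM:
        return FAIL_COST, True, P_NAT / P_SIM
    return STEP_REWARD, False, (1.0 - P_NAT) / (1.0 - P_SIM)

def evaluate(num_episodes=20000, seed=0):
    """Importance-weighted TD(0) estimate of the value of the single
    recurrent state under the natural (unaltered) dynamics."""
    random.seed(seed)
    v = 0.0
    for _ in range(num_episodes):
        done = False
        while not done:
            reward, done, rho = simulate_step()
            target = reward if done else reward + GAMMA * v
            # Weighting the TD error by rho corrects for the change of
            # measure: the expected update under the simulated dynamics
            # then matches the update under the natural dynamics.
            v += ALPHA * rho * (target - v)
    return v

if __name__ == "__main__":
    print("IS-corrected value estimate:", evaluate())
```

The correction works because E_sim[rho] = 1 and E_sim[rho * target] = E_nat[target], so the fixed point of the weighted update is the Bellman equation under the natural dynamics, while the rare event is observed far more often than it would be at its natural probability.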

Identifier oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.111576
Date January 2009
Creators Frank, Jordan William, 1980-
Publisher McGill University
Source Sets Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language English
Detected Language English
Type Electronic Thesis or Dissertation
Format application/pdf
Coverage Master of Science (School of Computer Science)
Rights All items in eScholarship@McGill are protected by copyright with all rights reserved unless otherwise indicated.
Relation alephsysno: 003164349, proquestno: AAIMR66878, Theses scanned by UMI/ProQuest.