
Learning and planning in structured worlds

This thesis is concerned with the problem of how to make decisions in an uncertain
world. We use a model of uncertainty based on Markov decision problems, and
develop a number of algorithms for decision-making, both for the planning problem,
in which the model is known in advance, and for the reinforcement learning problem,
in which the decision-making agent does not know the model and must learn to make
good decisions by trial and error.
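To make the distinction concrete, here is a minimal sketch (in Python) contrasting the
two settings on a tiny, hypothetical two-state Markov decision problem: value iteration
plans with the known model, while a Q-learning agent must estimate values from sampled
transitions alone. The particular states, transition table P, and learning rate are
illustrative assumptions, not taken from the thesis.

    import random

    states = [0, 1]
    actions = [0, 1]
    gamma = 0.9

    # Known model: P[s][a] is a list of (probability, next_state, reward) triples.
    P = {
        0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
        1: {0: [(1.0, 0, 1.0)], 1: [(1.0, 1, 0.5)]},
    }

    # Planning: value iteration sweeps over all states using the known model.
    V = {s: 0.0 for s in states}
    for _ in range(100):
        V = {s: max(sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                    for a in actions)
             for s in states}

    # Learning: Q-learning updates from sampled transitions, with no access to P.
    def sample(s, a):
        p, s2, r = random.choices(P[s][a], weights=[t[0] for t in P[s][a]])[0]
        return s2, r

    Q = {(s, a): 0.0 for s in states for a in actions}
    s = 0
    for _ in range(5000):
        a = random.choice(actions)                # purely random exploration for now
        s2, r = sample(s, a)
        target = r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += 0.1 * (target - Q[(s, a)])   # learning rate 0.1
        s = s2

    print(V, Q)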
The basis for much of this work is the use of structured representations of
problems. If a problem is represented in a structured way, we can compute or
learn plans that take advantage of this structure for computational gains. This
is because the structure allows us to perform abstraction. Rather than reasoning
individually about each situation in which a decision must be made, abstraction
allows us to group situations together and reason about a whole set of them in a
single step. Our approach to abstraction has the additional advantage that we can
dynamically change the level of abstraction, splitting a group of situations in two if
they need to be reasoned about separately to find an acceptable plan, or merging
two groups together if they no longer need to be distinguished. We present two
planning algorithms and one learning algorithm that use this approach.
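The sketch below illustrates the general idea of planning with a dynamically changing
abstraction, not the specific algorithms developed in the thesis: ground states in a
group share a single value, a group is split when its members would be backed up to
noticeably different values, and groups whose values become indistinguishable are
merged again. The small deterministic MDP (states, T, R) and the tolerance tol are
hypothetical.

    gamma, tol = 0.9, 0.05

    states = [0, 1, 2, 3]
    actions = [0, 1]
    # T[s][a] maps next state -> probability; R[s][a] is the immediate reward.
    T = {s: {a: {(s + a) % 4: 1.0} for a in actions} for s in states}
    R = {s: {a: (1.0 if s == 3 and a == 0 else 0.0) for a in actions} for s in states}

    def backup(s, value):
        # One-step Bellman backup for a single ground state.
        return max(sum(p * (R[s][a] + gamma * value[s2]) for s2, p in T[s][a].items())
                   for a in actions)

    def abstract_value_iteration(groups, sweeps=50):
        value = {s: 0.0 for s in states}
        for _ in range(sweeps):
            new_groups = []
            for group in groups:
                backed = {s: backup(s, value) for s in group}
                spread = max(backed.values()) - min(backed.values())
                if spread > tol and len(group) > 1:
                    # Split: these states must be distinguished to plan acceptably.
                    mid = sum(backed.values()) / len(backed)
                    new_groups.append([s for s in group if backed[s] <= mid])
                    new_groups.append([s for s in group if backed[s] > mid])
                else:
                    new_groups.append(group)
                value.update(backed)
            # Merge groups whose values no longer need to be distinguished.
            groups = []
            for g in sorted(new_groups, key=lambda g: value[g[0]]):
                if groups and abs(value[g[0]] - value[groups[-1][0]]) <= tol:
                    groups[-1] = groups[-1] + g
                else:
                    groups.append(g)
        return groups, value

    groups, value = abstract_value_iteration([list(states)])  # start fully aggregated
    print(groups, value)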
A second idea we present in this thesis is a novel approach to the exploration
problem in reinforcement learning. The problem is to select which actions to perform,
given that we would like good performance both now and in the future. We can select
the action that currently appears best, but this may prevent us from discovering that
another action is better; alternatively, we can take an exploratory action, but we then
risk performing poorly in the short term. Our Bayesian approach makes this tradeoff explicit by
representing our uncertainty about the values of states and using this measure of
uncertainty to estimate the value of the information we could gain by performing
each action. We present both model-free and model-based reinforcement learning
algorithms that make use of this exploration technique.
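As one illustration of how such a value-of-information calculation might look, the sketch
below assumes a Normal posterior over each action's Q-value and estimates the expected
gain from learning an action's true value by Monte Carlo sampling; the posteriors, the
action names, and the sampling scheme are simplifying assumptions for illustration, not
the exact machinery used in the thesis.

    import random

    def vpi(action, posteriors, n_samples=2000):
        # Estimated expected gain from learning this action's true value exactly.
        means = {a: mu for a, (mu, sigma) in posteriors.items()}
        best = max(means, key=means.get)
        second = max((means[a] for a in means if a != best), default=float("-inf"))
        mu, sigma = posteriors[action]
        gain = 0.0
        for _ in range(n_samples):
            q = random.gauss(mu, sigma)            # hypothetical true value of 'action'
            if action == best:
                gain += max(second - q, 0.0)       # best action turns out worse than the runner-up
            else:
                gain += max(q - means[best], 0.0)  # another action turns out better than the current best
        return gain / n_samples

    def select_action(posteriors):
        # Act greedily with respect to expected value plus value of information.
        return max(posteriors, key=lambda a: posteriors[a][0] + vpi(a, posteriors))

    # Hypothetical posteriors over Q-values for three actions: (mean, std dev).
    posteriors = {"left": (1.0, 0.1), "right": (0.9, 0.5), "wait": (0.2, 0.05)}
    print(select_action(posteriors))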
Finally, we show how these ideas fit together to produce a reinforcement
learning algorithm that uses structure to represent both the problem being solved
and the plan it learns, and that selects actions to perform in order to learn,
using our Bayesian approach to exploration.

Faculty of Science, Department of Computer Science, Graduate
Identifier: oai:union.ndltd.org:UBC/oai:circle.library.ubc.ca:2429/11051
Date: 11 1900
Creators: Dearden, Richard W.
Source Sets: University of British Columbia
Language: English
Detected Language: English
Type: Text, Thesis/Dissertation
Format: 11244993 bytes, application/pdf
Rights: For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
