State-similarity metrics for continuous Markov decision processes

In recent years, various metrics have been developed for measuring the similarity of states in probabilistic transition systems (Desharnais et al., 1999; van Breugel & Worrell, 2001a). In the context of Markov decision processes, we have devised metrics providing a robust quantitative analogue of bisimulation. Most importantly, the metric distances can be used to bound differences in the optimal value function, which is integral to reinforcement learning (Ferns et al., 2004; 2005). More recently, we have discovered an efficient algorithm to calculate distances in the case of finite systems (Ferns et al., 2006). In this thesis, we seek to properly extend state-similarity metrics to Markov decision processes with continuous state spaces, both in theory and in practice. In particular, we provide the first distance-estimation scheme for metrics based on bisimulation for continuous probabilistic transition systems. Our work, based on statistical sampling and infinite-dimensional linear programming, is a crucial first step in real-world planning; many practical problems are continuous in nature, e.g., robot navigation, and often a parametric model or crude finite approximation does not suffice. State-similarity metrics allow us to reason about the quality of replacing one model with another. In practice, they can be used directly to aggregate states.

http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=103383

Markov processes.

Reinforcement learning.

Identifier	oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:QMM.103383
Date	January 2007
Creators	Ferns, Norman Francis.
Publisher	McGill University
Source Sets	Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language	English
Detected Language	English
Type	Electronic Thesis or Dissertation
Format	application/pdf
Coverage	Doctor of Philosophy (School of Computer Science.)
Rights	© Norman Francis Ferns, 2007
Relation	alephsysno: 002769473, proquestno: AAINR50813, Theses scanned by UMI/ProQuest.
