Global ETD Search

Return to search

Algorithms for stochastic finite memory control of partially observable systems

A partially observable Markov decision process (POMDP) is a mathematical framework for planning and control problems in which actions have stochastic effects and observations provide uncertain state information. It is widely used for research in decision-theoretic planning and reinforcement learning. % To cope with partial observability, a policy (or plan) must use memory, and previous work has shown that a finite-state controller provides a good policy representation. This thesis considers a previously-developed bounded policy iteration algorithm for POMDPs that finds policies that take the form of stochastic finite-state controllers. Two new improvements of this algorithm are developed. First improvement provides a simplification of the basic linear program, which is used to find improved controllers. This results in a considerable speed-up in efficiency of the original algorithm. Secondly, a branch and bound algorithm for adding the best possible node to the controller is presented, which provides an error bound and a test for global optimality. Experimental results show that these enhancements significantly improve the algorithm's performance.

POMDP

Identifer	oai:union.ndltd.org:MSSTATE/oai:scholarsjunction.msstate.edu:td-1432
Date	06 August 2005
Creators	Marwah, Gaurav
Publisher	Scholars Junction
Source Sets	Mississippi State University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations

Page generated in 0.0019 seconds

Algorithms for stochastic finite memory control of partially observable systems

Description

Links & Downloads

Tags

Additional Fields