Piecewise linear Markov decision processes with an application to partially observable Markov models

This dissertation applies policy improvement and successive approximation (value iteration) to a general class of Markov decision processes with discounted costs. In particular, a class of Markov decision processes, called piecewise-linear, is studied. Piecewise-linear processes are characterized by the property that the value function of a process observed for one period and then terminated is piecewise-linear if the terminal reward function is piecewise-linear. Partially observable Markov decision processes have this property.
It is shown that there are ε-optimal piecewise-linear value functions and piecewise-constant policies which are simple. Simple means that there are only finitely many pieces, each of which is defined on a convex polyhedral set. Algorithms based on policy improvement and successive approximation are developed to compute simple approximations to an optimal policy and the optimal value function.
Business, Sauder School of / Graduate
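The one-period property described in the abstract can be illustrated for a finite partially observable model. The sketch below is not taken from the dissertation: it uses the standard vector representation of a piecewise-linear function on the belief simplex, and the names `backup`, `direct_value`, `alphas`, `P`, `O`, `c` and `beta` are all assumptions introduced for illustration. If the terminal cost is the minimum of finitely many linear functions of the belief, one period of observation followed by termination again gives a minimum of finitely many linear functions.

```python
import itertools
import numpy as np

def backup(alphas, P, O, c, beta):
    """One-period backup of a piecewise-linear terminal cost function.

    Assumed (hypothetical) layout:
      alphas: (K, S) array; terminal cost is V(b) = min_k alphas[k] @ b
      P:      (A, S, S) transition probabilities P[a, s, s']
      O:      (A, S, Z) observation probabilities O[a, s', z]
      c:      (A, S) one-period costs c[a, s]
      beta:   discount factor
    Returns an array of vectors whose pointwise minimum over b is the
    backed-up function, so the result is again piecewise-linear.
    """
    A, S, Z = O.shape
    K = alphas.shape[0]
    new_vectors = []
    for a in range(A):
        # g[z, k, s] = sum_{s'} P[a, s, s'] * O[a, s', z] * alphas[k, s']
        g = np.einsum("st,tz,kt->zks", P[a], O[a], alphas)
        # One new vector per assignment of a terminal piece to each observation.
        for choice in itertools.product(range(K), repeat=Z):
            vec = c[a] + beta * sum(g[z, k] for z, k in enumerate(choice))
            new_vectors.append(vec)
    return np.array(new_vectors)

def direct_value(b, alphas, P, O, c, beta):
    """Evaluate the same backup at belief b without building vectors."""
    A, S, Z = O.shape
    best = np.inf
    for a in range(A):
        total = c[a] @ b
        for z in range(Z):
            # Unnormalized posterior over s' after action a and observation z.
            post = (b @ P[a]) * O[a][:, z]
            total += beta * np.min(alphas @ post)
        best = min(best, total)
    return best
```

For any belief `b`, `np.min(backup(alphas, P, O, c, beta) @ b)` should agree with `direct_value(b, alphas, P, O, c, beta)`. The exhaustive enumeration grows as A·K^Z vectors per period, which is one reason approximations with few pieces, such as those the abstract calls simple, are of interest.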

http://hdl.handle.net/2429/20920

Piecewise linear topology

Markov processes

Identifier	oai:union.ndltd.org:UBC/oai:circle.library.ubc.ca:2429/20920
Date	January 1977
Creators	Sawaki, Katsushige
Source Sets	University of British Columbia
Language	English
Detected Language	English
Type	Text, Thesis/Dissertation
Rights	For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use.
