Temporal Markov Decision Problems : Formalization and Resolution

This thesis addresses planning under uncertainty in a time-dependent, changing environment. The original motivation was the problem of building an autonomous agent able to coordinate with its uncertain environment, where that environment is composed of other agents communicating their intentions or of non-controllable processes for which a discrete-event model is available. We investigate several approaches for modeling continuous time-dependency in the framework of Markov Decision Processes (MDPs), which leads to a definition of Temporal Markov Decision Problems. Our approach then focuses on two separate paradigms. First, we investigate time-dependent problems as implicit-event processes and describe them through the formalism of Time-dependent MDPs (TMDPs). We extend existing results on optimality equations and present a new Value Iteration algorithm, based on piecewise polynomial function representations, that solves a more general class of TMDPs. This paves the way to a more general discussion of parametric actions in MDPs with hybrid state and action spaces and continuous time. Second, we investigate the option of separately modeling the concurrent contributions of exogenous events. This explicit-event modeling approach leads to the use of Generalized Semi-Markov Decision Processes (GSMDPs). We establish a link between the general framework of Discrete Event System Specification (DEVS) and the GSMDP formalism, which allows us to build sound discrete-event compatible simulators. We then introduce a simulation-based Policy Iteration approach for explicit-event Temporal Markov Decision Problems. This algorithmic contribution brings together results from simulation theory, forward search in MDPs, and statistical learning theory. The implicit-event approach was tested on a specific version of the Mars rover planning problem and on a drone patrol mission planning problem, while the explicit-event approach was evaluated on a subway network control problem.

Identifier: oai:union.ndltd.org:univ-toulouse.fr/oai:oatao.univ-toulouse.fr:17988
Date: 23 March 2009
Creators: Rachelson, Emmanuel
Source Sets: Université de Toulouse
Detected Language: English
Type: PhD Thesis, PeerReviewed, info:eu-repo/semantics/doctoralThesis
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: https://ori-oai.isae-supaero.fr/ori-oai-search/notice/view/isae-875, http://oatao.univ-toulouse.fr/17988/
