Spelling suggestions: "subject:"suboptimal bounds"" "subject:"suboptimale bounds""
1 |
Improved Heuristic Search Algorithms for Decision-Theoretic PlanningAbdoulahi, Ibrahim 08 December 2017 (has links)
A large class of practical planning problems that require reasoning about uncertain outcomes, as well as tradeoffs among competing goals, can be modeled as Markov decision processes (MDPs). This model has been studied for over 60 years, and has many applications that range from stochastic inventory control and supply-chain planning, to probabilistic model checking and robotic control. Standard dynamic programming algorithms solve these problems for the entire state space. A more efficient heuristic search approach focuses computation on solving these problems for the relevant part of the state space only, given a start state, and using heuristics to identify irrelevant parts of the state space that can be safely ignored. This dissertation considers the heuristic search approach to this class of problems, and makes three contributions that advance this approach. The first contribution is a novel algorithm for solving MDPs that integrates the standard value iteration algorithm with branch-and-bound search. Called branch-and-bound value iteration, the new algorithm has several advantages over existing algorithms. The second contribution is the integration of recently-developed suboptimality bounds in heuristic search algorithm for MDPs, making it possible for iterative algorithms for solving these planning problems to detect convergence to a bounded-suboptimal solution. The third contribution is the evaluation and analysis of some techniques that are widely-used by state-of-the-art planning algorithms, the identification of some weaknesses of these techniques, and the development of a more efficient implementation of one of these techniques -- a solved-labeling procedure that speeds converge by leveraging a decomposition of the state-space graph of a planning problem into strongly-connected components. The new algorithms and techniques introduced in this dissertation are experimentally evaluated on a range of widely-used planning benchmarks.
|
Page generated in 0.0533 seconds