1

Accelerating Successive Approximation Algorithm Via Action Elimination

Jaber, Nasser M. A. Jr. 20 January 2009 (has links)
This research is an effort to improve the performance of the successive approximation algorithm, with the primary aim of solving finite-state, finite-action, infinite-horizon, stationary, discrete, discounted Markov Decision Processes (MDPs). Successive approximation is a simple and commonly used method for solving MDPs, but it often proves intractable for large-scale MDPs because of its computational complexity. Action elimination, one of the techniques used to accelerate the solution of MDPs, reduces the problem size by identifying and eliminating sub-optimal actions; in some cases successive approximation can be terminated once all actions but one per state have been eliminated. Bounds on the value function are the key element in action elimination. New terms (action gain, action relative gain, and action cumulative relative gain) are introduced to construct tighter bounds on the value function and to propose an improved action elimination algorithm. When the span semi-norm is used, we show numerically that the actual convergence of successive approximation is faster than the known theoretical rate. The absence of easy-to-compute bounds on the actual convergence rate motivated this research to try a heuristic action elimination algorithm, which uses an estimated convergence rate in the span semi-norm to speed up action elimination. The heuristic demonstrated exceptional performance in terms of solution optimality and savings in computational time. Certain types of structured Markov processes are known to have monotone optimal policies, and two special action elimination algorithms are proposed to accelerate successive approximation for these MDPs. The first partitions the state space and prioritizes iterate-value updates so as to maximize the temporary elimination of sub-optimal actions implied by policy monotonicity. The second is an improved version that adds permanent action elimination. The performance of the proposed algorithms is assessed and compared to that of other algorithms; they demonstrated outstanding performance in both the number of iterations and the computational time needed to converge.
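For a concrete picture of the kind of bound-based action elimination this abstract describes, here is a minimal Python sketch of the classical MacQueen-style test that work in this area builds on, using the standard discounted-MDP bounds and a span semi-norm stopping rule. The array shapes, discount factor, and tolerance are illustrative assumptions; the thesis's tighter action-gain bounds are not reproduced here.

```python
import numpy as np

def vi_with_action_elimination(P, r, gamma=0.95, tol=1e-6, max_iter=10_000):
    """Value iteration with MacQueen-style permanent action elimination.

    P: (S, A, S) transition probabilities, r: (S, A) rewards -- a toy,
    assumed interface. A sketch of the classical test, not the thesis's
    improved bounds.
    """
    S, A = r.shape
    active = np.ones((S, A), dtype=bool)      # actions not yet eliminated
    v = np.zeros(S)
    for _ in range(max_iter):
        q = r + gamma * (P @ v)               # Q-values from the current iterate
        v_new = np.where(active, q, -np.inf).max(axis=1)
        delta = v_new - v
        c = gamma / (1.0 - gamma)
        # Standard bounds: v_new + c*min(delta) <= v* <= v_new + c*max(delta),
        # hence Q*(s,a) <= q(s,a) + c*max(delta). Eliminate action a at state s
        # when its upper bound falls below the lower bound on v*(s).
        active &= q + c * delta.max() >= (v_new + c * delta.min())[:, None]
        v = v_new
        if delta.max() - delta.min() < tol:   # span semi-norm stopping rule
            break
    policy = np.where(active, q, -np.inf).argmax(axis=1)
    return v, policy, active
```

Note that elimination here is permanent: once an action's upper bound drops below the lower bound on v*(s), it can never become optimal again, which is what makes the reduction safe.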
2

Improved Heuristic Search Algorithms for Decision-Theoretic Planning

Abdoulahi, Ibrahim 08 December 2017 (has links)
A large class of practical planning problems that require reasoning about uncertain outcomes, as well as trade-offs among competing goals, can be modeled as Markov decision processes (MDPs). This model has been studied for over 60 years and has applications ranging from stochastic inventory control and supply-chain planning to probabilistic model checking and robotic control. Standard dynamic programming algorithms solve these problems for the entire state space. A more efficient heuristic search approach instead focuses computation on the relevant part of the state space only, given a start state, using heuristics to identify irrelevant parts of the state space that can be safely ignored. This dissertation considers the heuristic search approach to this class of problems and makes three contributions that advance it. The first is a novel algorithm for solving MDPs that integrates the standard value iteration algorithm with branch-and-bound search; called branch-and-bound value iteration, the new algorithm has several advantages over existing algorithms. The second is the integration of recently developed suboptimality bounds into heuristic search algorithms for MDPs, making it possible for iterative algorithms solving these planning problems to detect convergence to a bounded-suboptimal solution. The third is the evaluation and analysis of some techniques widely used by state-of-the-art planning algorithms, the identification of weaknesses in these techniques, and the development of a more efficient implementation of one of them -- a solved-labeling procedure that speeds convergence by leveraging a decomposition of the state-space graph of a planning problem into strongly connected components. The new algorithms and techniques introduced in this dissertation are experimentally evaluated on a range of widely used planning benchmarks.
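The strongly-connected-component decomposition behind the solved-labeling contribution can be sketched compactly: compute the SCCs of the state-space graph, then solve components in reverse topological order, so that a component is labeled solved only after every component it can reach is solved. The sketch below uses Tarjan's algorithm; `succ` (a successor map) and `bellman_solve` (a per-component solver) are hypothetical interfaces, not the dissertation's actual implementation.

```python
def tarjan_scc(succ, states):
    """Tarjan's algorithm: strongly connected components of the state-space
    graph, emitted in reverse topological order (sink components first).
    Recursive for clarity; an iterative version avoids deep-graph limits."""
    index, low, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def visit(v):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on_stack.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop(); on_stack.discard(w); comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for s in states:
        if s not in index:
            visit(s)
    return sccs

def solve_by_components(succ, states, bellman_solve):
    """Process SCCs descendants-first: each component's values depend only on
    itself and components it can reach, so once those are solved the component
    can be solved to convergence once, labeled, and never revisited."""
    solved = set()
    for comp in tarjan_scc(succ, states):
        bellman_solve(comp)    # hypothetical: iterate backups within comp only
        solved.update(comp)
    return solved
```

The design point this illustrates is that a state's optimal value depends only on states reachable from it, so ordering the sweep by the condensation of the graph lets each component be solved exactly once instead of being re-swept until global convergence.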
