Return to search

Accelerating Successive Approximation Algorithm Via Action Elimination

This research is an effort to improve the performance of successive approximation algorithm with a prime aim of solving finite states and actions, infinite horizon, stationary, discrete and discounted
Markov Decision Processes (MDPs). Successive approximation is a simple and commonly used method to solve MDPs. Successive approximation often appears to be intractable for solving large scale MDPs due to its computational complexity. Action elimination, one of the techniques used to accelerate solving MDPs, reduces the
problem size through identifying and eliminating sub-optimal actions. In some cases successive approximation is terminated when all actions but one per state are eliminated.




The bounds on value functions are the key element in action elimination. New terms (action gain, action relative gain and action
cumulative relative gain) were introduced to construct tighter bounds on the value functions and to propose an improved action
elimination algorithm.




When span semi-norm is used, we show numerically that the actual convergence of successive approximation is faster than the known theoretical rate. The absence of easy-to-compute bounds on the actual convergence rate motivated the current research to try a
heuristic action elimination algorithm. The heuristic utilizes an estimated convergence rate in the span semi-norm to speed up action
elimination. The algorithm demonstrated exceptional performance in terms of solution optimality and savings in computational time.




Certain types of structured Markov processes are known to have monotone optimal policy. Two special action elimination algorithms
are proposed in this research to accelerate successive approximation for these types of MDPs. The first algorithm uses the state space partitioning and prioritize iterate values updating in a way that maximizes temporary elimination of sub-optimal actions based on the policy monotonicity. The second algorithm is an improved version that includes permanent action elimination to improve the performance of the algorithm. The performance of the proposed algorithms are assessed and compared to that of other algorithms. The proposed algorithms demonstrated outstanding performance in
terms of number of iterations and omputational time to converge.

Identiferoai:union.ndltd.org:TORONTO/oai:tspace.library.utoronto.ca:1807/16794
Date20 January 2009
CreatorsJaber, Nasser M. A. Jr.
ContributorsLee, Chi-Guhn
Source SetsUniversity of Toronto
Languageen_ca
Detected LanguageEnglish
TypeThesis
Format2790800 bytes, application/pdf

Page generated in 0.0029 seconds