1

Decision-Theoretic Planning under Risk-Sensitive Planning Objectives

Liu, Yaxin 18 April 2005 (has links)
Risk attitudes are important for human decision making, especially in scenarios where huge wins or losses are possible, as exemplified by planetary rover navigation, oil-spill response, and business applications. Decision-theoretic planners therefore need to take risk aspects into account to serve their users better. However, most existing decision-theoretic planners use simplistic planning objectives that are risk-neutral. This thesis is the first comprehensive study of how to incorporate risk attitudes into decision-theoretic planners and solve large-scale planning problems represented as Markov decision process models. The thesis consists of three parts. The first part studies risk-sensitive planning in the case where exponential utility functions are used to model risk attitudes. I show that existing decision-theoretic planners can be transformed to take risk attitudes into account. Moreover, different versions of the transformation are needed when the transition probabilities are given implicitly, namely as temporally extended probabilities or in factored form. The second part studies risk-sensitive planning in the case where general nonlinear utility functions are used to model risk attitudes. I show that a state-augmentation approach can be used to reduce a risk-sensitive planning problem to a risk-neutral planning problem with an augmented state space. I further use a functional interpretation of value functions and approximation methods to solve the planning problems efficiently with value iteration. I also present an exact method for solving risk-sensitive planning problems where one-switch utility functions are used to model risk attitudes. The third part studies risk-sensitive planning in the case where arbitrary rewards are used. I propose a spectrum of conditions that can be used to constrain the utility function and the planning problem so that the optimal expected utilities exist and are finite. I prove that the existence and finiteness properties hold for stationary plans, where the action to perform in each state does not change over time, under different sets of conditions.
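The exponential-utility objective of the first part has a clean computational consequence: the Bellman backup becomes multiplicative in the exponentiated reward, so standard value iteration carries over almost unchanged. The sketch below illustrates this classical exponential-utility backup for a goal-directed MDP; the function name, array layout, and parameters are assumptions made for illustration, not the planners or transformations developed in the thesis.

    import numpy as np

    def risk_sensitive_value_iteration(P, R, goal, lam, n_iter=10000, tol=1e-10):
        # Value iteration for a goal-directed MDP under the exponential utility
        # U(w) = -exp(-lam * w), which is risk-averse for lam > 0.
        # P[a] is an (S, S) transition matrix, R[a] an (S,) reward vector, and
        # goal the index of an absorbing goal state; these layouts are
        # illustrative assumptions, not the thesis's representation.
        n_actions, n_states = len(P), P[0].shape[0]
        V = -np.ones(n_states)          # V[s] = expected utility of total reward from s; U(0) = -1
        for _ in range(n_iter):
            # Multiplicative Bellman backup: exp(-lam * (r + rest)) factors into
            # exp(-lam * r) times the utility of the remaining reward.
            Q = np.stack([np.exp(-lam * R[a]) * (P[a] @ V) for a in range(n_actions)])
            V_new = Q.max(axis=0)
            V_new[goal] = -1.0          # no further reward is collected at the goal
            if np.max(np.abs(V_new - V)) < tol:
                V = V_new
                break
            V = V_new
        Q = np.stack([np.exp(-lam * R[a]) * (P[a] @ V) for a in range(n_actions)])
        return V, Q.argmax(axis=0)      # expected utilities and a greedy policy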
2

FASTER DYNAMIC PROGRAMMING FOR MARKOV DECISION PROCESSES

Dai, Peng 01 January 2007 (has links)
Markov decision processes (MDPs) are a general framework used by Artificial Intelligence (AI) researchers to model decision-theoretic planning problems. Solving real-world MDPs has been a major and challenging research topic in the AI literature. This paper discusses two main groups of approaches to solving MDPs. The first group combines heuristic search with dynamic programming to speed up convergence. The second exploits graphical structure in MDPs to reduce the effort of classic dynamic programming algorithms. Two new algorithms proposed by the author, MBLAO* and TVI, are described here.
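TVI (topological value iteration) is an example of the second group: it partitions the state graph into strongly connected components and solves each component with plain value iteration exactly once, in reverse topological order, so values never have to be propagated back upstream. The sketch below is a minimal rendering of that idea for a discounted MDP; the array layout and helper libraries are assumptions for illustration, not the author's implementation.

    import numpy as np
    from graphlib import TopologicalSorter
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import connected_components

    def topological_value_iteration(P, R, gamma=0.95, tol=1e-8):
        # P[a] is an (S, S) transition matrix and R[a] an (S,) reward vector.
        n_actions, n_states = len(P), P[0].shape[0]
        # Adjacency: s -> s' if some action can reach s' from s.
        adj = (sum(P) > 0).astype(int)
        n_comp, comp = connected_components(csr_matrix(adj), directed=True, connection='strong')
        # Build the component DAG; a component depends on its successor components,
        # so the topological order yields downstream components first.
        deps = {c: set() for c in range(n_comp)}
        for s in range(n_states):
            for s2 in np.nonzero(adj[s])[0]:
                if comp[s] != comp[s2]:
                    deps[comp[s]].add(comp[s2])
        order = list(TopologicalSorter(deps).static_order())
        V = np.zeros(n_states)
        for c in order:
            idx = np.nonzero(comp == c)[0]
            while True:                  # standard value iteration restricted to this component
                Q = np.stack([R[a][idx] + gamma * (P[a][idx] @ V) for a in range(n_actions)])
                new_v = Q.max(axis=0)
                delta = np.max(np.abs(new_v - V[idx]))
                V[idx] = new_v
                if delta < tol:
                    break
        return V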
3

Bidirectional LAO* Algorithm (A Faster Approach to Solve Goal-directed MDPs)

Bhuma, Venkata Deepti Kiran 01 January 2004 (has links)
Uncertainty is a feature of many AI applications. While there are polynomial-time algorithms for planning in stochastic systems, planning is still slow, in part because most algorithms plan for all eventualities. Algorithms such as LAO* are able to find good or optimal policies more quickly when the starting state of the system is known. In this thesis we present BLAO*, an extension of the LAO* algorithm to bidirectional search. We show that BLAO* finds optimal or ε-optimal solutions for goal-directed MDPs without necessarily evaluating the entire state space. BLAO* converges much faster than LAO* or RTDP on our benchmarks.
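LAO* interleaves two ingredients: it grows an explicit graph only along the current best partial policy from the known start state, and it re-runs dynamic programming over the states expanded so far. A compact, single-direction sketch of that loop is given below (BLAO*, the thesis's contribution, additionally searches backward from the goal); the callable interfaces and names are assumptions made for illustration.

    def lao_star_sketch(s0, actions, transitions, cost, goals, heuristic, tol=1e-6):
        # actions(s) -> iterable of actions, transitions(s, a) -> list of (s2, prob),
        # cost(s, a) -> float, heuristic(s) -> admissible cost-to-goal estimate.
        V, expanded = {}, set()

        def value(s):
            return 0.0 if s in goals else V.get(s, heuristic(s))

        def q(s, a):
            return cost(s, a) + sum(p * value(s2) for s2, p in transitions(s, a))

        def greedy(s):
            return min(actions(s), key=lambda a: q(s, a))

        while True:
            # Trace the greedy partial policy from s0; note any unexpanded tip state.
            envelope, fringe, stack, seen = [], None, [s0], set()
            while stack:
                s = stack.pop()
                if s in seen or s in goals:
                    continue
                seen.add(s)
                envelope.append(s)
                if s not in expanded:
                    fringe = s                       # an unexpanded tip of the solution graph
                    continue
                for s2, p in transitions(s, greedy(s)):
                    if p > 0:
                        stack.append(s2)
            if fringe is None:
                return V, {s: greedy(s) for s in envelope}   # policy closed with respect to goals
            expanded.add(fringe)
            # Dynamic-programming step: value iteration over the states expanded so far.
            delta = float("inf")
            while delta > tol:
                delta = 0.0
                for s in expanded:
                    new_v = min(q(s, a) for a in actions(s))
                    delta = max(delta, abs(new_v - value(s)))
                    V[s] = new_v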
4

Policy Explanation and Model Refinement in Decision-Theoretic Planning

Khan, Omar Zia January 2013 (has links)
Decision-theoretic systems, such as Markov Decision Processes (MDPs), are used for sequential decision-making under uncertainty. MDPs provide a generic framework that can be applied in various domains to compute optimal policies. This thesis presents techniques that explain optimal policies for MDPs and then refine decision-theoretic models (Bayesian networks and MDPs) based on feedback from experts. Explaining policies for sequential decision-making problems is difficult due to the presence of stochastic effects, multiple possibly competing objectives, and the long-range effects of actions. However, explanations are needed to assist experts in validating that the policy is correct and to help users develop trust in the choices recommended by the policy. A set of domain-independent templates for justifying a policy recommendation is presented, along with a process to identify the minimum number of templates that need to be populated to completely justify the policy. The rejection of an explanation by a domain expert indicates a deficiency in the model that led to the generation of the rejected policy. The thesis presents techniques to refine the model parameters so that the optimal policy computed from the refined parameters conforms with the expert feedback. The expert feedback is translated into constraints on the model parameters that are used during refinement. These constraints are non-convex for both Bayesian networks and MDPs. For Bayesian networks, the refinement approach is based on Gibbs sampling and stochastic hill climbing, and it learns a model that obeys the expert constraints. For MDPs, the parameter space is partitioned so that alternating linear optimization can be applied to learn model parameters that lead to a policy in accordance with the expert feedback. In practice, the state space of an MDP can be very large, which is an issue for real-world problems. Factored MDPs are often used to deal with this issue: state variables represent the state space and dynamic Bayesian networks model the transition functions, which avoids the exponential growth in the state space associated with large and complex problems. The approaches for explanation and refinement presented in this thesis are also extended to the factored case to demonstrate their use in real-world applications. Empirical evaluations are presented in the domains of course advising for undergraduate students, assisted hand-washing for people with dementia, and diagnostics for manufacturing.
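One concrete quantity that a domain-independent explanation template can be phrased in terms of is the expected (discounted) number of times each state is visited when the recommended policy is followed, since it links a recommendation to the outcomes it makes likely. The sketch below computes such occupancy frequencies and fills a simple, hypothetical template; it illustrates the flavour of template-based explanation rather than the thesis's exact templates or refinement machinery.

    import numpy as np

    def occupancy_frequencies(P_pi, s0, gamma=0.95):
        # Expected discounted number of visits to each state when following the
        # policy whose (S, S) transition matrix is P_pi, starting from state s0.
        n = P_pi.shape[0]
        start = np.zeros(n)
        start[s0] = 1.0
        # f satisfies f = start + gamma * P_pi^T f  (discounted probability flow).
        return np.linalg.solve(np.eye(n) - gamma * P_pi.T, start)

    def explain_recommendation(P, policy, s0, state_names, gamma=0.95, top_k=3):
        # A hypothetical, domain-independent template: report the states the
        # recommended policy is expected to visit most often from s0.
        # P[a] is an (S, S) transition matrix; policy[s] is the recommended action.
        n = len(policy)
        P_pi = np.stack([P[policy[s]][s] for s in range(n)])
        freq = occupancy_frequencies(P_pi, s0, gamma)
        top = np.argsort(freq)[::-1][:top_k]
        return "\n".join(
            f"Following the recommended policy from {state_names[s0]}, "
            f"{state_names[s]} is expected to be visited about {freq[s]:.2f} times."
            for s in top
        )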
