Spelling suggestions: "subject:"risk sensitive"" "subject:"disk sensitive""
1 |
Variable Risk Policy Search for Dynamic Robot ControlKuindersma, Scott Robert 01 September 2012 (has links)
A central goal of the robotics community is to develop general optimization algorithms for producing high-performance dynamic behaviors in robot systems. This goal is challenging because many robot control tasks are characterized by significant stochasticity, high-dimensionality, expensive evaluations, and unknown or unreliable system models. Despite these challenges, a range of algorithms exist for performing efficient optimization of parameterized control policies with respect to average cost criteria. However, other statistics of the cost may also be important. In particular, for many stochastic control problems, it can be advantageous to select policies based not only on their average cost, but also their variance (or risk).
In this thesis, I present new efficient global and local risk-sensitive stochastic optimization algorithms suitable for performing policy search in a wide variety of problems of interest to robotics researchers. These algorithms exploit new techniques in nonparameteric heteroscedastic regression to directly model the policy-dependent distribution of cost. For local search, learned cost models can be used as critics for performing risk-sensitive gradient descent. Alternatively, decision-theoretic criteria can be applied to globally select policies to balance exploration and exploitation in a principled way, or to perform greedy minimization with respect to various risk-sensitive criteria. This separation of learning and policy selection permits variable risk control, where risk sensitivity can be flexibly adjusted and appropriate policies can be selected at runtime without requiring additional policy executions.
To evaluate these algorithms and highlight the importance of risk in dynamic control tasks, I describe several experiments with the UMass uBot-5 that include learning dynamic arm motions to stabilize after large impacts, lifting heavy objects while balancing, and developing safe fall bracing behaviors. The results of these experiments suggest that the ability to select policies based on risk-sensitive criteria can lead to greater flexibility in dynamic behavior generation.
|
2 |
Estratégias para otimização do algoritmo de Iteração de Valor Sensível a Risco / Strategies for optimization of Risk Sensitive Value Iteration algorithmIgor Oliveira Borges 11 October 2018 (has links)
Processos de decisão markovianos sensíveis a risco (Risk Sensitive Markov Decision Process - RS-MDP) permitem modelar atitudes de aversão e propensão ao risco no processo de tomada de decisão usando um fator de risco para representar a atitude ao risco. Para esse modelo, existem operadores que são baseados em funções de transformação linear por partes que incluem fator de risco e fator de desconto. Nesta dissertação são formulados dois algoritmos de Iteração de Valor Sensível a Risco baseados em um desses operadores, esses algoritmos são chamados de Iteração de Valor Sensível a Risco Síncrono (Risk Sensitive Value Iteration - RSVI) e Iteração de Valor Sensível a Risco Assíncrono (Asynchronous Risk Sensitive Value Iteration- A-RSVI). Também são propostas duas heurísticas que podem ser utilizadas para inicializar os valores dos algoritmos de forma a torná-los mais eficentes. Os resultados dos experimentos no domínio de Travessia do Rio em dois cenários de recompensas distintos mostram que: (i) o custo de processamento de políticas extremas a risco, tanto de aversão quanto de propensão, é elevado; (ii) um desconto elevado aumenta o tempo de convergência do algoritmo e reforça a sensibilidade ao risco adotada; (iii) políticas com valores para o fator de risco intermediários possuem custo computacional baixo e já possuem certa sensibilidade ao risco dependendo do fator de desconto utilizado; e (iv) o algoritmo A-RSVI com a heurística baseada no fator de risco pode reduzir o tempo para o algoritmo convergir, especialmente para valores extremos do fator de risco / Risk Sensitive Markov Decision Process (RS-MDP) allows modeling risk-averse and risk-prone attitudes in decision-making process using a risk factor to represent the risk-attitude. For this model, there are operators that are based on a piecewise linear transformation function that includes a risk factor and a discount factor. In this dissertation we formulate two Risk Sensitive Value Iteration algorithms based on one of these operators, these algorithms are called Synchronous Risk Sensitive Value Iteration (RSVI) and Asynchronous Risk Sensitive Value Iteration (A-RSVI). We also propose two heuristics that can be used to initialize the value of the RSVI or A-RSVI algorithms in order to make them more efficient. The results of experiments with the River domain in two distinct rewards scenarios show that: (i) the processing cost in extreme risk policies, for both risk-averse and risk-prone, is high; (ii) a high discount value increases the convergence time and reinforces the chosen risk attitude; (iii) policies with intermediate risk factor values have a low computational cost and show a certain sensitivity to risk based on the discount factor; and (iv) the A-RSVI algorithm with the heuristic based on the risk factor can decrease the convergence time of the algorithm, especially when we need a solution for extreme values of the risk factor
|
3 |
Estratégias para otimização do algoritmo de Iteração de Valor Sensível a Risco / Strategies for optimization of Risk Sensitive Value Iteration algorithmBorges, Igor Oliveira 11 October 2018 (has links)
Processos de decisão markovianos sensíveis a risco (Risk Sensitive Markov Decision Process - RS-MDP) permitem modelar atitudes de aversão e propensão ao risco no processo de tomada de decisão usando um fator de risco para representar a atitude ao risco. Para esse modelo, existem operadores que são baseados em funções de transformação linear por partes que incluem fator de risco e fator de desconto. Nesta dissertação são formulados dois algoritmos de Iteração de Valor Sensível a Risco baseados em um desses operadores, esses algoritmos são chamados de Iteração de Valor Sensível a Risco Síncrono (Risk Sensitive Value Iteration - RSVI) e Iteração de Valor Sensível a Risco Assíncrono (Asynchronous Risk Sensitive Value Iteration- A-RSVI). Também são propostas duas heurísticas que podem ser utilizadas para inicializar os valores dos algoritmos de forma a torná-los mais eficentes. Os resultados dos experimentos no domínio de Travessia do Rio em dois cenários de recompensas distintos mostram que: (i) o custo de processamento de políticas extremas a risco, tanto de aversão quanto de propensão, é elevado; (ii) um desconto elevado aumenta o tempo de convergência do algoritmo e reforça a sensibilidade ao risco adotada; (iii) políticas com valores para o fator de risco intermediários possuem custo computacional baixo e já possuem certa sensibilidade ao risco dependendo do fator de desconto utilizado; e (iv) o algoritmo A-RSVI com a heurística baseada no fator de risco pode reduzir o tempo para o algoritmo convergir, especialmente para valores extremos do fator de risco / Risk Sensitive Markov Decision Process (RS-MDP) allows modeling risk-averse and risk-prone attitudes in decision-making process using a risk factor to represent the risk-attitude. For this model, there are operators that are based on a piecewise linear transformation function that includes a risk factor and a discount factor. In this dissertation we formulate two Risk Sensitive Value Iteration algorithms based on one of these operators, these algorithms are called Synchronous Risk Sensitive Value Iteration (RSVI) and Asynchronous Risk Sensitive Value Iteration (A-RSVI). We also propose two heuristics that can be used to initialize the value of the RSVI or A-RSVI algorithms in order to make them more efficient. The results of experiments with the River domain in two distinct rewards scenarios show that: (i) the processing cost in extreme risk policies, for both risk-averse and risk-prone, is high; (ii) a high discount value increases the convergence time and reinforces the chosen risk attitude; (iii) policies with intermediate risk factor values have a low computational cost and show a certain sensitivity to risk based on the discount factor; and (iv) the A-RSVI algorithm with the heuristic based on the risk factor can decrease the convergence time of the algorithm, especially when we need a solution for extreme values of the risk factor
|
4 |
Algoritmos eficientes para o problema do orçamento mínimo em processos de decisão Markovianos sensíveis ao risco / Efficient algorithms for the minimum budget problem in risk-sensitive Markov decision processesMoreira, Daniel Augusto de Melo 06 November 2018 (has links)
O principal critério de otimização utilizado em Processos de Decisão Markovianos (mdps) é minimizar o custo acumulado esperado. Embora esse critério de otimização seja útil, em algumas aplicações, o custo gerado por algumas execuções pode exceder um limite aceitável. Para lidar com esse problema foram propostos os Processos de Decisão Markovianos Sensíveis ao Risco (rs-mdps) cujo critério de otimização é maximizar a probabilidade do custo acumulado não ser maior que um orçamento limite definido pelo usuário, portanto garantindo que execuções custosas de um mdp ocorram com menos probabilidade. Algoritmos para rs-mdps possuem problemas de escalabilidade quando lidam com intervalos de custo amplos, uma vez que operam no espaço aumentado que enumera todos os possíveis orçamentos restantes. Neste trabalho é proposto um novo problema que é encontrar o orçamento mínimo para o qual a probabilidade de que o custo acumulado não exceda esse orçamento converge para um máximo. Para resolver esse problema são propostas duas abordagens: (i) uma melhoria no algoritmo tvi-dp (uma solução previamente proposta para rsmdps) e (ii) o primeiro algoritmo de programação dinâmica simbólica para rs-mdps que explora as independências condicionais da função de transição no espaço de estados aumentado. Os algoritmos propostos eliminam estados inválidos e adicionam uma nova condição de parada. Resultados empíricos mostram que o algoritmo rs-spudd é capaz de resolver problemas até 103 vezes maior que o algoritmo tvi-dp e é até 26.2 vezes mais rápido que tvi-dp (nas instâncias que o algoritmo tvi-dp conseguiu resolver). De fato, é mostrado que o algoritmo rs-spudd é o único que consegue resolver instâncias grandes dos domínios analisados. Outro grande desafio em rs-mdps é lidar com custos contínuos. Para resolver esse problema são definidos os rs-mdps híbridos que incluem variáveis contínuas e discretas, além do orçamento limite definido pelo usuário. É mostrado que o algoritmo de programação dinâmica simbólica (sdp), existente na literatura, pode ser usado para resolver esse tipo de mdps. Esse algoritmo foi empiricamente testado de duas maneiras diferentes: (i) comparado com os demais algoritmos propostos em um domínio em que todos são capazes de resolver e (ii) testado em um domínio que somente ele é capaz de resolver. Os resultados mostram que o algoritmo sdp para rs-mdp híbridos é capaz de resolver domínios com custos contínuos sem a necessidade de enumeração de estados, porém em troca do aumento do custo computacional. / The main optimization criterion used in Markovian Decision Processes (mdps) is to minimize the expected cumulative cost. Although this optimization criterion is useful, in some applications the cost generated by some executions may exceed an acceptable threshold. In order to deal with this problem, the Risk-Sensitive Markov Decision Processes (rs-mdps) were proposed whose optimization criterion is to maximize the probability of the cumulative cost not to be greater than an user-defined budget, thus guaranteeing that costly executions of an mdp occur with least probability. Algorithms for rs-mdps face scalability issues when handling large cost intervals, since they operate in an augmented state space which enumerates the possible remaining budgets. In this work, we propose a new challenging problem of finding the minimum budget for which the probability that the cumulative cost does not exceed this budget converges to a maximum. To solve this problem, we propose: (i) an improved version of tvi-dp (a previous solution for rs-mdps) and (ii) the first symbolic dynamic programming algorithm for rs-mdps that explores conditional independence of the transition function in the augmented state space. The proposed algorithms prune invalid states and perform early termination. Empirical results show that rs-spudd is able to solve problems up to 103 times larger than tvi-dp and is up to 26.2 times faster than tvi-dp (in the instances tvi-dp was able to solve). In fact, we show that rs-spudd is the only one that can solve large instances of the analyzed domains. Another challenging problem for rs-mdps is handle continous costs. To solve this problem, we define Hybrid rs-mdps which include continous and discrete variables, and the user-defined budget. In this work, we show that Symbolic Dynamic Programming (sdp) algorithm can be used to solve this kind of mdps. We empirically evaluated the sdp algorithm: (i) in a domain that can be solved with the previously proposed algorithms and (ii) in a domain that only sdp can solve. Results shown that sdp algorithm for Hybrid rs-mdps is capable of solving domains with continous costs, but with a higher computational cost.
|
5 |
Decision-Theoretic Planning under Risk-Sensitive Planning ObjectivesLiu, Yaxin 18 April 2005 (has links)
Risk attitudes are important for human decision making, especially in scenarios where huge wins or losses are possible, as exemplified by planetary rover navigation, oilspill response, and business applications. Decision-theoretic planners therefore need to take risk aspects into account to serve their users better. However, most existing decision-theoretic planners use simplistic planning objectives that are risk-neutral. The thesis research is the first comprehensive study of how to incorporate risk attitudes into decision-theoretic planners and solve large-scale planning problems represented as Markov decision process models. The thesis consists of three parts.
The first part of the thesis work studies risk-sensitive planning in case where exponential utility functions are used to model risk attitudes. I show that existing decision-theoretic planners can be transformed to take risk attitudes into account. Moreover, different versions of the transformation are needed if the transition probabilities are implicitly given, namely, temporally extended probabilities and probabilities given in a factored form.
The second part of the thesis work studies risk-sensitive planning in case where general nonlinear utility functions are used to model risk attitudes. I show that a state-augmentation approach can be used to reduce a risk-sensitive planning problem to a risk-neutral planning problem with an augmented state space. I further use a functional interpretation of value functions and approximation methods to solve the planning problems efficiently with value iteration. I also show an exact method for solving risk-sensitive planning problems where one-switch utility functions are used to model risk attitudes.
The third part of the thesis work studies risk sensitive planning in case where arbitrary rewards are used. I propose a spectrum of conditions that can be used to constrain the utility function and the planning problem so that the optimal expected utilities exist and are finite. I prove that the existence and finiteness properties hold for stationary plans, where the action to perform in each state does not change over time, under different sets of conditions.
|
6 |
Risk-Prone and Risk-Averse Foraging Strategies Enable Niche Partitioning by Two Diurnal Orb-Weaving Spider SpeciesLong, Mitchell D 01 May 2022 (has links)
Niche partitioning is a major component in understanding community ecology and how ecologically similar species coexist. Temporal and spatial partitioning and differences in foraging strategy, including sensitivity to risk (variance), likely contribute to partitioning as well. Here, we approach this partitioning with fine resolution to investigate differences in overall strategy between two species of diurnal, orb-weaving spiders, Verrucosa arenata and Micrathena gracilis (Araneae: Araneidae), that share similar spatial positioning, temporal foraging window, and prey. Through field observation, we found that V. arenata individuals appear to increase spatial and temporal sampling to compensate for an overall risk-prone strategy that depends on the interception and active capture of rare, large prey. Conversely, M. gracilis individuals employ a risk-averse strategy relying on passive capture of small but abundant prey consumed alongside the orb. We have thus identified how differing risk-sensitive foraging strategies may contribute to niche partitioning between otherwise similar species.
|
7 |
Enhancing Coastal Community's Disaster and Climate Resilience in the Mangrove Rich Indian Sundarban / インド・スンダルバン マングローブ豊穣地域における沿岸域コミュニティの気象災害対応力向上に関する研究Rajarshi, Dasgupta 23 March 2016 (has links)
京都大学 / 0048 / 新制・課程博士 / 博士(地球環境学) / 甲第19875号 / 地環博第149号 / 新制||地環||30(附属図書館) / 32911 / 京都大学大学院地球環境学舎地球環境学専攻 / (主査)教授 藤井 滋穂, 教授 岡﨑 健二, 准教授 西前 出 / 学位規則第4条第1項該当 / Doctor of Global Environmental Studies / Kyoto University / DFAM
|
8 |
Essays on model uncertainty in macroeconomicsZhao, Mingjun 12 September 2006 (has links)
No description available.
|
9 |
Essays on the Term Structure of Interest Rates and Long Run Variance of Stock ReturnsWu, Ting 15 September 2010 (has links)
No description available.
|
10 |
Effects of Timber Harvesting on Terrestrial Salamander Abundance and BehaviorKnapp, Shannon Michele 04 June 1999 (has links)
We examined the short-term (1 - 4 years postharvest) effects of 7 silvicultural treatments on terrestrial salamander populations at 4 sites in southwest Virginia and West Virginia. The 3 silvicultural treatments with the most canopy removal (4-7 m2 basal area Shelterwood, Leavetree, Clearcut) had significantly fewer salamanders than the control (p < 0.10) postharvest. No differences were found among treatments in age class distribution, the percent of females that were gravid, or average clutch size.
We tested the nighttime, surface-count census method for visibility and behavior-induced bias among silviculture treatments and estimated the proportion of a salamander population that is active on the surface in harvested and control habitats. Instantaneous rates of salamander activity ranged from 1.3 to 11.7% of the population for redback (Plethodon cinereus) and slimy salamanders (P. glutinosus). Timber harvest caused up to a 2-fold increase or decrease in activity rates. There was evidence for bias in the night census method, but differences were not consistent enough to suggest general bias corrections.
We also tested whether poorly fed salamanders exhibited risk-sensitive foraging in a dry environment in a laboratory experiment. Poorly fed salamanders were observed out of their simulated burrows less than well fed salamanders suggesting salamanders, particularly females and small adults, are risk-averse. / Master of Science
|
Page generated in 0.0855 seconds