1. Direct Policy Search for Adaptive Management of Flood Risk

Jingya Wang, 29 April 2023
Direct policy search (DPS) has been shown to be an efficient method for identifying optimal rules (i.e., policies) for adapting a system in response to changing conditions. This dissertation describes three major advances in the use of DPS for long-range infrastructure planning, using flood risk management as the application domain. We first introduce a new adaptive way to incorporate learning into DPS. The standard approach identifies policies by optimizing their average performance over a large ensemble of future states of the world (SOWs). Our approach exploits information gained over time about which kind of SOW is being experienced to further improve performance, via adaptive meta-policies that define how control of the system should switch between policies identified by a standard DPS approach but trained on different SOWs. We outline the general method and illustrate it with a case study of optimal dike heightening that extends the work of Garner and Keller (2018). The meta-policies identified by the adaptive algorithm Pareto-dominate the standard DPS policies in two objectives, with an overall 68% improvement in hypervolume. Performance also improves across three groups of SOWs defined by future extreme water levels, with hypervolume improvements of 90%, 46%, and 35% for low, medium, and high water-level SOWs, respectively. Additionally, we evaluate the degree of improvement achieved by different ways of implementing the algorithm (i.e., different hyperparameter values). This provides guidance for decision makers with different degrees of risk aversion and different computational budgets.

Because of simplifying assumptions and limitations of the adaptive DPS model used in that chapter, such as uniform levee design heights, the Surge and Waves Model for Protection Systems (SWaMPS) is presented as a more realistic application of the DPS framework. SWaMPS is a process-based model of surge-driven flood risk, and this chapter marks the first implementation of DPS using a realistic process-based risk model. The physical processes of storm surge and rainfall are simulated independently over multiple reaches, and different frequencies are explored for managing the protection system in SWaMPS. The performance of the DPS algorithm is evaluated against a static intertemporal optimization.

The computational burden of evaluating a large ensemble of SOWs to cover possible future events in DPS motivates us to apply scenario-reduction methods that select representative scenarios spanning the uncertain parameter space more efficiently, which reduces the runtime of the optimization. We explore a range of data-mining tools, including principal component analysis (PCA) and clustering, to reduce the scenario set, and we compare the computational efficiency and the quality of the resulting policies when the optimization is run with the reduced ensembles of SOWs.
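To make the meta-policy idea in the first paragraph of this abstract concrete, here is a minimal, hypothetical sketch: three stand-in policies (imagined as the output of standard DPS runs trained on low, medium, and high water-level SOWs) and a meta-policy that switches between them based on the extreme water levels observed so far. The policy forms, thresholds, and observation structure are all illustrative assumptions, not the dissertation's actual formulation.

```python
# Sketch of adaptive meta-policy switching between pre-trained DPS policies.
# Everything here (policy shapes, thresholds, Gumbel water levels) is illustrative.
import numpy as np

def low_policy(obs):    # stand-in for a DPS policy trained on low water-level SOWs
    return 0.1 * max(obs["water_level"] - 1.0, 0.0)

def med_policy(obs):    # stand-in for a policy trained on medium SOWs
    return 0.2 * max(obs["water_level"] - 0.8, 0.0)

def high_policy(obs):   # stand-in for a policy trained on high SOWs
    return 0.4 * max(obs["water_level"] - 0.5, 0.0)

POLICIES = {"low": low_policy, "med": med_policy, "high": high_policy}

def meta_policy(history, thresholds=(1.2, 1.8)):
    """Infer which kind of SOW is being experienced from observed extremes,
    then delegate to the DPS policy trained on that class of SOWs."""
    peak = max(h["water_level"] for h in history)
    if peak < thresholds[0]:
        return POLICIES["low"]
    if peak < thresholds[1]:
        return POLICIES["med"]
    return POLICIES["high"]

# Simulate one SOW: each year, observe, select the active policy, apply heightening.
rng = np.random.default_rng(0)
history, total_heightening = [], 0.0
for year in range(2025, 2035):
    obs = {"year": year, "water_level": rng.gumbel(1.0, 0.3)}
    history.append(obs)
    total_heightening += meta_policy(history)(obs)

print(f"total dike heightening over the horizon: {total_heightening:.2f}")
```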
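The scenario-reduction step described in the final paragraph can be sketched with standard tools: project the SOW ensemble onto its principal components, cluster in that reduced space, and keep the SOW nearest each cluster centroid as a representative. The ensemble below is synthetic, and the feature count and cluster count are placeholder choices, not the dissertation's.

```python
# Hedged sketch of scenario reduction via PCA + k-means on a synthetic SOW ensemble.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
ensemble = rng.normal(size=(10_000, 50))   # 10k SOWs, 50 uncertain parameters each

# Project onto enough principal components to explain 95% of the variance.
pca = PCA(n_components=0.95)
scores = pca.fit_transform(ensemble)

# Cluster in PC space; keep the SOW nearest each centroid as the representative.
k = 100                                    # size of the reduced ensemble
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scores)
reps = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(scores[members] - km.cluster_centers_[c], axis=1)
    reps.append(members[np.argmin(dists)])

reduced_ensemble = ensemble[reps]          # optimize DPS policies over this subset
print(f"reduced {ensemble.shape[0]} SOWs to {reduced_ensemble.shape[0]}")
```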
2. Evolution of reward functions for reinforcement learning applied to stealth games

Mendonça, Matheus Ribeiro Furtado de, January 2016
Funded by CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior).

Many modern games include stealth elements that allow the player to accomplish an objective without being spotted by enemy patrols. This gave rise to a new genre, stealth games, in which covertness plays a major role. Although quite popular in modern games, stealthy behavior has not been extensively studied. In this work, we tackle three problems: (i) how to use a machine learning approach that allows a stealthy agent to learn good behaviors for any environment, (ii) how to create an efficient stealthy path-planning method that can be coupled with our machine learning formulation, and (iii) how to use evolutionary computing to define specific parameters for our machine learning approach without any prior knowledge of the problem. We use reinforcement learning to learn covert behavior capable of achieving a high success rate in random trials of a stealth game. We also propose an evolutionary approach that automatically defines a good reward function for the reinforcement learning approach.
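As a rough illustration of the evolutionary reward-design loop described in this abstract, the sketch below evolves the weights of a parameterized reward function with a simple (1+λ) evolution strategy, scoring each candidate by the success rate it yields in random trials. The reward terms and the evaluation function are placeholders; in the thesis, the score would come from actually training the reinforcement learning agent with each candidate reward.

```python
# Hedged sketch: evolve reward-function weights, scoring by trial success rate.
import numpy as np

rng = np.random.default_rng(42)

def reward(weights, spotted, reached_goal, step_cost=1.0):
    """Candidate reward: weighted penalty for being spotted, bonus for the goal,
    and a per-step cost. The three terms are illustrative, not the thesis's."""
    return -weights[0] * spotted + weights[1] * reached_goal - weights[2] * step_cost

def evaluate(weights, trials=20):
    """Placeholder for 'train an RL agent with this reward and measure its
    success rate on random trials'. Here the success probability simply grows
    with how strongly stealth and goal-reaching are rewarded."""
    p = 1.0 / (1.0 + np.exp(-(weights[0] + weights[1] - weights[2])))
    return rng.binomial(trials, p) / trials

# (1 + lambda) evolution strategy over the three reward weights.
best = rng.uniform(0, 1, size=3)
best_fit = evaluate(best)
for generation in range(50):
    offspring = [np.clip(best + rng.normal(0, 0.2, size=3), 0, 5) for _ in range(8)]
    fits = [evaluate(o) for o in offspring]
    if max(fits) >= best_fit:                      # keep the parent unless beaten
        best, best_fit = offspring[int(np.argmax(fits))], max(fits)

print("evolved reward weights:", best, "| success rate:", best_fit)
```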
