• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 1
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Game Players Using Distributional Reinforcement Learning

Pettersson, Adam, Pei Purroy, Francesc January 2024 (has links)
Reinforcement learning (RL) algorithms aim to identify optimal action sequences for an agent in a given environment, traditionally maximizing the expected rewards received from the environment by taking each action and transitioning between states. This thesis explores approaching RL distributionally, replacing the expected reward function by the full distribution over the possible rewards received, known as the value distribution. We focus on the quantile regression distributional RL (QR-DQN) algorithm introduced by Dabney et al. (2017), which models the value distribution by representing its quantiles. With such information of the value distribution, we modify the QR-DQN algorithm to enhance the agent's risk sensitivity. Our risk-averse algorithm is evaluated against the original QR-DQN in the Atari 2600 and in the Gymnasium environment, specifically in the games Breakout, Pong, Lunar Lander and Cartpole. Results indicate that the risk-averse variant performs comparably in terms of rewards while exhibiting increased robustness and risk aversion. Potential refinements of the risk-averse algorithm are presented.

Page generated in 0.1712 seconds