On Hierarchical Goal Based Reinforcement Learning

Discrete-time sequential decision processes require an agent to select an action
at each time step. Humans, by contrast, plan over long time horizons and exploit
temporal abstraction, selecting temporally extended actions such as "make lunch" or
"get a master's degree", each of which comprises more granular actions. This thesis
concerns itself with such hierarchical temporal abstractions, in the form of macro
actions and options, as they apply to goal-based Markov Decision Processes (MDPs). It
introduces a novel algorithm for discovering hierarchical macro actions in goal-based
MDPs, as well as a novel algorithm that uses landmark options for transfer learning in
multi-task goal-based reinforcement learning settings. Theoretical properties regarding
the lifelong regret of an agent executing the latter algorithm are also discussed.
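The options mentioned in the abstract are temporally extended actions in the sense of the standard options framework: a triple of an initiation set, an intra-option policy, and a termination condition. The following is a minimal illustrative sketch of executing one such option in a toy goal-based MDP; the corridor environment, the `Option` class, and the landmark at state 5 are all hypothetical examples, not the thesis's algorithms.

```python
from dataclasses import dataclass
from typing import Callable, Set

# Illustrative sketch of an option (I, pi, beta) in a goal-based MDP.
# All names and the toy environment below are assumptions for exposition.

@dataclass
class Option:
    initiation: Set[int]               # I: states where the option may be invoked
    policy: Callable[[int], int]       # pi: intra-option policy, state -> action
    terminates: Callable[[int], bool]  # beta: termination condition on states

def run_option(opt: Option, state: int,
               step: Callable[[int, int], int]) -> int:
    """Execute the option's policy until its termination condition fires."""
    assert state in opt.initiation, "option not available in this state"
    while not opt.terminates(state):
        state = step(state, opt.policy(state))
    return state

# Toy 1-D corridor: states 0..10, actions move +1/-1, landmark at state 5.
step = lambda s, a: max(0, min(10, s + a))
to_landmark = Option(
    initiation=set(range(11)),
    policy=lambda s: 1 if s < 5 else -1,
    terminates=lambda s: s == 5,
)

print(run_option(to_landmark, 0, step))  # reaches the landmark state: 5
```

A landmark option of this kind drives the agent to a designated subgoal state from anywhere in its initiation set, which is what makes such options reusable across tasks in a multi-task setting.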

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/39552
Date: 27 August 2019
Creators: Denis, Nicholas
Contributors: Fraser, Maia
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d'Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf