<p dir="ltr">Reinforcement learning algorithms have traditionally been implemented with the goal of maximizing a reward signal. By contrast, Decision Transformer (DT) uses a transformer model to predict the next action in a sequence. The transformer model is trained on datasets of trajectories consisting of states, actions, and returns. The original DT paper examined a small number of environments: five from the Atari domain, three from continuous control, and one that examined credit assignment. While this gives an idea of what Decision Transformer can do, the variety of environments in the Atari domain is limited. In this work, we propose an extension of the environments that Decision Transformer can be trained on by adding support for the VizDoom environment. We also developed a faster method for offline RL dataset generation, using Sample Factory, a library focused on high throughput, to generate a dataset comparable in quality to those produced by existing methods in significantly less time.</p>
Identifier | oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/24725112 |
Date | 06 December 2023 |
Creators | Mark R Trovinger (17549493) |
Source Sets | Purdue University |
Detected Language | English |
Type | Text, Thesis |
Rights | CC BY 4.0 |
Relation | https://figshare.com/articles/thesis/FAST_ER_DATA_GENERATION_FOR_OFFLINE_RL_AND_FPS_ENVIRONMENTS_FOR_DECISION_TRANSFORMERS/24725112 |