1. FAST(ER) DATA GENERATION FOR OFFLINE RL AND FPS ENVIRONMENTS FOR DECISION TRANSFORMERS

Mark R Trovinger (17549493), 06 December 2023
<p dir="ltr">Reinforcement learning algorithms have traditionally been implemented with the goal</p><p dir="ltr">of maximizing a reward signal. By contrast, Decision Transformer (DT) uses a transformer</p><p dir="ltr">model to predict the next action in a sequence. The transformer model is trained on datasets</p><p dir="ltr">consisting of state, action, return trajectories. The original DT paper examined a small</p><p dir="ltr">number of environments, five from the Atari domain, and three from continuous control,</p><p dir="ltr">and one that examined credit assignment. While this gives an idea of what the decision</p><p dir="ltr">transformer can do, the variety of environments in the Atari domain are limited. In this</p><p dir="ltr">work, we propose an extension of the environments that decision transformer can be trained</p><p dir="ltr">on by adding support for the VizDoom environment. We also developed a faster method for</p><p dir="ltr">offline RL dataset generation, using Sample Factory, a library focused on high throughput,</p><p dir="ltr">to generate a dataset comparable in quality to existing methods using significantly less time.</p><p dir="ltr"><br></p>
2. Transformer Offline Reinforcement Learning for Downlink Link Adaptation

Mo, Alexander, January 2023
Recent advancements in Transformers have unlocked a new relational analysis technique for Reinforcement Learning (RL). This thesis investigates such models for DownLink Link Adaptation (DLLA). Radio resource management methods such as DLLA are a critical facet of radio-access networks, where intricate optimization problems are continuously resolved under strict latency constraints on the order of milliseconds. Although previous work has showcased improved downlink throughput with an online RL approach, the time dependence of DLLA obstructs its wider adoption. Consequently, this thesis ventures into uncharted territory by extending the DLLA framework with sequence modelling to fit the Transformer architecture. The objective is to assess the efficacy of an autoregressive, sequence-modelling-based offline RL Transformer for DLLA, using a Decision Transformer. Experimentally, the thesis demonstrates that the attention mechanism models the environment dynamics effectively. However, the Decision Transformer framework underperforms the baseline, calling for a different Transformer model.
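To make the sequence-modelling framing concrete, the following is a minimal sketch of how logged link-adaptation transitions could be packed into the fixed-length (return-to-go, state, action) windows a Decision Transformer consumes. All field names and dimensions here (channel-quality features, MCS indices, throughput rewards) are hypothetical placeholders, not details from the thesis.

```python
import numpy as np

def pack_episode(states, actions, rewards, context_len=20):
    """Slice one logged episode into fixed-length (rtg, state, action) windows."""
    rtg = np.cumsum(rewards[::-1])[::-1]  # return-to-go at each timestep
    windows = []
    for t in range(len(states) - context_len + 1):
        sl = slice(t, t + context_len)
        windows.append((rtg[sl], states[sl], actions[sl]))
    return windows

# Synthetic stand-in data: 100 steps of an 8-dim channel state (e.g. CQI
# history), a discrete MCS index as the action, and per-step throughput
# as the reward. Real DLLA logs would supply these fields instead.
states = np.random.randn(100, 8).astype(np.float32)
actions = np.random.randint(0, 28, size=100)        # hypothetical MCS indices
rewards = np.random.rand(100).astype(np.float32)    # hypothetical throughput
batch = pack_episode(states, actions, rewards)
print(len(batch), batch[0][0].shape)                # 81 windows of length 20
```

Framing the logged data this way turns the online control problem into supervised sequence prediction, which is what allows purely offline training under the latency constraints the abstract describes.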
