Global ETD Search

Asynchronous Advantage Actor-Critic with Adam Optimization and a Layer Normalized Recurrent Network

State-of-the-art deep reinforcement learning models rely on asynchronous training, in which multiple learner agents send their collective updates to a central neural network. This thesis examines one of the most recent asynchronous policy-gradient-based reinforcement learning methods, asynchronous advantage actor-critic (A3C), and aims to improve its performance using prior research from the machine learning community. Applying the Adam optimization method and adding a long short-term memory (LSTM) network with layer normalization is shown to increase the performance of A3C.

http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-220698

Computational Mathematics

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-220698
Date	January 2017
Creators	Bergdahl, Joakim
Publisher	KTH, Optimeringslära och systemteori
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	English
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
Relation	TRITA-MAT-E ; 2017:81
