
Using Reinforcement Learning for Games with Nondeterministic State Transitions

Recent advances in reinforcement learning, a subfield of machine learning, have shown that it is possible to create self-learning agents that take actions and pursue strategies in complex environments without any prior knowledge. This thesis investigates the performance of the state-of-the-art reinforcement learning algorithm proximal policy optimization (PPO) when trained on a task with nondeterministic state transitions. The agent's policy was represented by a convolutional neural network, and the game Candy Crush Friends Saga, a single-player match-three tile game, was used as the environment. The purpose of this research was to evaluate whether the agent could achieve a higher win rate than average human performance in Candy Crush Friends Saga; the algorithm's generalization capabilities on this task were also analyzed. The results showed that all trained models performed better than a random-policy baseline, demonstrating that proximal policy optimization can learn tasks in an environment with nondeterministic state transitions. However, with the hyperparameters chosen, the agent was not able to exceed average human performance.
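For orientation, the defining component of proximal policy optimization is its clipped surrogate objective, which limits how far each policy update can move away from the policy that collected the data. The sketch below is a minimal, hypothetical illustration of that loss in Python; it is not the thesis's implementation, and all variable names and values are assumptions.

    # Minimal sketch of PPO's clipped surrogate objective (Schulman et al., 2017).
    # Hypothetical illustration only; the thesis does not publish its code.
    import numpy as np

    def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
        """Clipped surrogate loss over a batch of transitions (to be minimized)."""
        # Probability ratio pi_theta(a|s) / pi_theta_old(a|s).
        ratio = np.exp(log_probs_new - log_probs_old)
        clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        # Take the pessimistic (minimum) bound, then negate for gradient descent.
        return -np.mean(np.minimum(ratio * advantages, clipped * advantages))

    # Hypothetical usage with a small batch of three transitions:
    new = np.array([-0.9, -1.2, -0.3])
    old = np.array([-1.0, -1.0, -0.5])
    adv = np.array([0.5, -0.2, 1.1])
    print(ppo_clip_loss(new, old, adv))

The clipping keeps the update conservative even when the advantage estimates are noisy, which is one reason PPO is a common choice for environments with nondeterministic state transitions such as the one studied here.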

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-158523
Date January 2019
Creators Fischer, Max
Publisher Linköpings universitet, Statistik och maskininlärning
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
