Given recent advances within reinforcement learning, a subfield of machine learning, several papers have shown that it is possible to create self-learning digital agents: agents that take actions and pursue strategies in complex environments without any prior knowledge. This thesis investigates the performance of the state-of-the-art reinforcement learning algorithm proximal policy optimization when trained on a task with nondeterministic state transitions. The agent's policy was constructed using a convolutional neural network, and the game Candy Crush Friends Saga, a single-player match-three tile game, was used as the environment. The purpose of this research was to evaluate whether the described agent could achieve a higher win rate than average human performance when playing Candy Crush Friends Saga; the research also analyzed the algorithm's generalization capabilities on this task. The results showed that all trained models performed better than a random-policy baseline, demonstrating that it is possible to use the proximal policy optimization algorithm to learn tasks in an environment with nondeterministic state transitions. They also showed that, given the chosen hyperparameters, the agent was not able to outperform average human performance.
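For context, the core of proximal policy optimization is its clipped surrogate objective, which the trained policy maximizes. The sketch below is a minimal illustration of that objective only, assuming PyTorch as the framework and a clip range of 0.2; neither the framework nor the clip value is taken from the thesis.

```python
# Minimal sketch of the PPO clipped surrogate objective (assumed PyTorch;
# the thesis does not specify a framework, and clip_range=0.2 is a common
# default, not a value reported in the thesis).
import torch

def ppo_clip_loss(log_probs_new: torch.Tensor,
                  log_probs_old: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_range: float = 0.2) -> torch.Tensor:
    """Clipped policy loss from Schulman et al. (2017), returned as a minimization target."""
    # Probability ratio between the updated policy and the data-collecting policy.
    ratio = torch.exp(log_probs_new - log_probs_old)
    # Unclipped and clipped surrogate objectives.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_range, 1.0 + clip_range) * advantages
    # PPO maximizes the element-wise minimum of the two; negate to obtain a loss.
    return -torch.min(unclipped, clipped).mean()

# Example with dummy data for a batch of four transitions.
if __name__ == "__main__":
    new_lp = torch.tensor([-0.9, -1.1, -0.5, -2.0])
    old_lp = torch.tensor([-1.0, -1.0, -0.7, -1.8])
    adv = torch.tensor([0.5, -0.3, 1.2, -0.1])
    print(ppo_clip_loss(new_lp, old_lp, adv))
```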
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-158523 |
Date | January 2019 |
Creators | Fischer, Max |
Publisher | Linköpings universitet, Statistik och maskininlärning |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |