
Zero-Knowledge Agent Trained for the Game of Risk

Recent developments in deep reinforcement learning applied to abstract strategy games such as Go, chess, and Hex have sparked interest within military planning. This Master's thesis explores whether an algorithm similar to Expert Iteration and AlphaZero can be applied to wargames. The studied wargame is Risk, a turn-based multiplayer game played on a simplified political map of the world. The algorithm consists of an expert, in the form of a Monte Carlo tree search, and an apprentice, implemented as a neural network. The neural network is trained by imitation learning to mimic expert decisions generated through self-play reinforcement learning, and the apprentice is then used as a heuristic in subsequent tree searches. The results demonstrate that a Monte Carlo tree search algorithm can, to some degree, be employed on a strategy game such as Risk, dominating a randomly playing agent. The neural network, fed a state representation in the form of a vector, had difficulty learning the expert decisions and could not beat a randomly playing agent, which halted the expert/apprentice learning process. Possible solutions are outlined as future work.
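To make the expert/apprentice loop described above concrete, the following is a minimal, self-contained Python sketch of Expert Iteration: a search-based expert labels states visited during self-play, the apprentice is trained by imitation on those labels, and the updated apprentice then guides the next round of searches. Everything here is a hypothetical simplification, not the thesis implementation: ToyGame stands in for Risk, a flat Monte Carlo search stands in for full MCTS, and a softmax regression stands in for the neural network.

import numpy as np

rng = np.random.default_rng(0)

class ToyGame:
    """Tiny stand-in for Risk: reach state 5 within 10 moves to win."""
    def __init__(self):
        self.state, self.moves = 0, 0
    def legal_actions(self):
        return [0, 1]                      # 0: step back, 1: step forward
    def step(self, action):
        self.state += 1 if action == 1 else -1
        self.moves += 1
    def done(self):
        return self.state == 5 or self.moves >= 10
    def reward(self):
        return 1.0 if self.state == 5 else 0.0
    def features(self):
        return np.array([self.state, self.moves], dtype=float)
    def clone(self):
        g = ToyGame()
        g.state, g.moves = self.state, self.moves
        return g

def rollout(game, apprentice):
    """Play to the end, sampling actions from the apprentice policy."""
    while not game.done():
        probs = apprentice.policy(game.features())
        game.step(int(rng.choice(2, p=probs)))
    return game.reward()

def expert_policy(game, apprentice, n_sims=30):
    """Flat Monte Carlo search (a simplification of full MCTS): estimate
    each action's value by apprentice-guided rollouts, then return the
    expert's decision as a one-hot imitation target."""
    values = []
    for a in game.legal_actions():
        total = 0.0
        for _ in range(n_sims):
            g = game.clone()
            g.step(a)
            total += rollout(g, apprentice)
        values.append(total / n_sims)
    target = np.zeros(2)
    target[int(np.argmax(values))] = 1.0
    return target

class Apprentice:
    """Softmax regression standing in for the thesis's neural network."""
    def __init__(self):
        self.W, self.b = np.zeros((2, 2)), np.zeros(2)
    def policy(self, x):
        z = self.W @ x + self.b
        e = np.exp(z - z.max())
        return e / e.sum()
    def train(self, xs, targets, lr=0.1):
        for x, t in zip(xs, targets):      # one SGD pass on cross-entropy
            p = self.policy(x)
            self.W -= lr * np.outer(p - t, x)
            self.b -= lr * (p - t)

apprentice = Apprentice()
for iteration in range(5):                 # Expert Iteration: search, imitate, repeat
    xs, targets = [], []
    for _ in range(20):                    # self-play data generation
        game = ToyGame()
        while not game.done():
            pi = expert_policy(game, apprentice)
            xs.append(game.features())
            targets.append(pi)
            game.step(int(np.argmax(pi)))
    apprentice.train(xs, targets)
    wins = np.mean([rollout(ToyGame(), apprentice) for _ in range(50)])
    print(f"iteration {iteration}: apprentice win rate {wins:.2f}")

Each iteration regenerates self-play data with the current apprentice steering the rollouts, which mirrors how the trained network is meant to serve as a heuristic in subsequent tree searches; when the apprentice fails to learn the expert's decisions, as the abstract reports, this feedback loop stalls.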

Identifier oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-429214
Date January 2020
Creators Bethdavid, Simon
Publisher Uppsala universitet, Signaler och system
Source Sets DiVA Archive at Upsalla University
Language English
Detected Language English
Type Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format application/pdf
Rights info:eu-repo/semantics/openAccess
Relation UPTEC F, 1401-5757 ; 20063
