Exploring the Effect of Different Numbers of Convolutional Filters and Training Loops on the Performance of AlphaZero

In this work, the algorithm used by AlphaZero is adapted for dots and boxes, a two-player game. The algorithm is explored using different numbers of convolutional filters and training loops in order to better understand the effect these parameters have on the player's learning. Different board sizes are also tested to compare these parameters in relation to game complexity. AlphaZero originated as a Go player using an algorithm that combines Monte Carlo tree search (MCTS) and convolutional neural networks. This novel approach, integrating a reinforcement learning method previously applied to Go (MCTS) with a supervised learning method (neural networks), led to a player that beat all its competitors.

Monte Carlo tree search

neural network

dots and boxes

Other Computer Sciences

Robotics

Theory and Algorithms

Identifier	oai:union.ndltd.org:WKU/oai:digitalcommons.wku.edu:theses-4090
Date	01 October 2018
Creators	Prince, Jared
Publisher	TopSCHOLAR®
Source Sets	Western Kentucky University Theses
Detected Language	English
Type	text
Format	application/pdf
Source	Masters Theses & Specialist Projects
