Return to search

Decision forests for computer Go feature learning

Thesis (MSc)--Stellenbosch University, 2014. / ENGLISH ABSTRACT: In computer Go, moves are typically selected with the aid of a tree search
algorithm. Monte-Carlo tree search (MCTS) is currently the dominant algorithm
in computer Go. It has been shown that the inclusion of domain
knowledge in MCTS is able to vastly improve the strength of MCTS engines.
A successful approach to representing domain knowledge in computer Go
is the use of appropriately weighted tactical features and pattern features,
which are comprised of a number of hand-crafted heuristics and a collection
of patterns respectively. However, tactical features are hand-crafted specifically
for Go, and pattern features are Go-specific, making it unclear how
they can be easily transferred to other domains.
As such, this work proposes a new approach to representing domain
knowledge, decision tree features. These features evaluate a state-action
pair by descending a decision tree, with queries recursively partitioning the
state-action pair input space, and returning a weight corresponding to the
partition element represented by the resultant leaf node. In this work, decision
tree features are applied to computer Go, in order to determine their
feasibility in comparison to state-of-the-art use of tactical and pattern features.
In this application of decision tree features, each query in the decision
tree descent path refines information about the board position surrounding
a candidate move.
The results of this work showed that a feature instance with decision tree
features is a feasible alternative to the state-of-the-art use of tactical and
pattern features in computer Go, in terms of move prediction and playing
strength, even though computer Go is a relatively well-developed research
area. A move prediction rate of 35.9% was achieved with tactical and decision
tree features, and they showed comparable performance to the state of the
art when integrated into an MCTS engine with progressive widening.
We conclude that the decision tree feature approach shows potential as
a method for automatically extracting domain knowledge in new domains.
These features can be used to evaluate state-action pairs for guiding searchbased
techniques, such as MCTS, or for action-prediction tasks. / AFRIKAANSE OPSOMMING: In rekenaar Go, word skuiwe gewoonlik geselekteer met behulp van ’n boomsoektogalgoritme.
Monte-Carlo boomsoektog (MCTS) is tans die dominante
algoritme in rekenaar Go. Dit is bekend dat die insluiting van gebiedskennis
in MCTS in staat is om die krag van MCTS enjins aansienlik te verbeter.
’n Suksesvolle benadering tot die voorstelling van gebiedskennis in rekenaar
Go is taktiek- en patroonkenmerke met geskikte gewigte. Hierdie behels ’n
aantal handgemaakte heuristieke en ’n versameling van patrone onderskeidelik.
Omdat taktiekkenmerke spesifiek vir Go met die hand gemaak is, en dat
patroonkenmerke Go-spesifiek is, is dit nie duidelik hoe hulle maklik oorgedra
kan word na ander velde toe nie.
Hierdie werk stel dus ’n nuwe verteenwoordiging van gebiedskennis voor,
naamlik besluitboomkenmerke. Hierdie kenmerke evalueer ’n toestand-aksie
paar deur rekursief die toevoerruimte van toestand-aksie pare te verdeel deur
middel van die keuses in die besluitboom, en dan die gewig terug te keer
wat ooreenstem met die verdelingselement wat die ooreenstemmende blaarnodus
verteenwoordig. In hierdie werk, is besluitboomkenmerke geëvalueer
op rekenaar Go, om hul lewensvatbaarheid in vergelyking met veldleidende
gebruik van taktiek- en patroonkenmerke te bepaal. In hierdie toepassing
van besluitboomkenmerke, verfyn elke navraag in die pad na onder van die
besluitboom inligting oor die posisie rondom ’n kandidaatskuif.
Die resultate van hierdie werk het getoon dat ’n kenmerkentiteit met
besluitboomkenmerke ’n haalbare alternatief is vir die veldleidende gebruik
van taktiek- en patroonkenmerke in rekenaar Go in terme van skuifvoorspelling
as ook speelkrag, ondanks die feit dat rekenaar Go ’n relatief goedontwikkelde
navorsingsgebied is. ’n Skuifvoorspellingskoers van 35.9% is
behaal met taktiek- en besluitboomkenmerke, en hulle het vergelykbaar met
veldleidende tegnieke presteer wanneer hulle in ’n MCTS enjin met progressiewe
uitbreiding geïntegreer is.
Ons lei af dat ons voorgestelde besluitboomkenmerke potensiaal toon as ’n
metode vir die outomaties onttrek van gebiedskennis in nuwe velde. Hierdie
eienskappe kan gebruik word om toestand-aksie pare te evalueer vir die leiding
van soektog-gebaseerde tegnieke, soos MCTS, of vir aksie-voorspelling.

Identiferoai:union.ndltd.org:netd.ac.za/oai:union.ndltd.org:sun/oai:scholar.sun.ac.za:10019.1/86280
Date04 1900
CreatorsVan Niekerk, Francois
ContributorsKroon, Steve, Stellenbosch University. Faculty of Science. Dept. of Mathematical Sciences.
PublisherStellenbosch : Stellenbosch University
Source SetsSouth African National ETD Portal
Languageen_ZA
Detected LanguageUnknown
TypeThesis
Format95 p. : ill.
RightsStellenbosch University

Page generated in 0.0029 seconds