
Application of temporal difference learning and supervised learning in the game of Go.

January 1996
by Horace Wai-Kit, Chan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 109-112).

Contents:

Acknowledgement
Abstract
Chapter 1 --- Introduction
  1.1 --- Overview
  1.2 --- Objective
  1.3 --- Organization of This Thesis
Chapter 2 --- Background
  2.1 --- Definitions
    2.1.1 --- Theoretical Definition of Solving a Game
    2.1.2 --- Definition of Computer Go
  2.2 --- State of the Art of Computer Go
  2.3 --- A Framework for Computer Go
    2.3.1 --- Evaluation Function
    2.3.2 --- Plausible Move Generator
  2.4 --- Problems Tackled in this Research
Chapter 3 --- Application of TD in Game Playing
  3.1 --- Introduction
  3.2 --- Reinforcement Learning and TD Learning
    3.2.1 --- Models of Learning
    3.2.2 --- Temporal Difference Learning
  3.3 --- TD Learning and Game-playing
    3.3.1 --- Game-Playing as a Delay-reward Prediction Problem
    3.3.2 --- Previous Work of TD Learning in Backgammon
    3.3.3 --- Previous Works of TD Learning in Go
  3.4 --- Design of this Research
    3.4.1 --- Limitations in the Previous Researches
    3.4.2 --- Motivation
    3.4.3 --- Objective and Methodology
Chapter 4 --- Deriving a New Updating Rule to Apply TD Learning in Multi-layer Perceptron
  4.1 --- Multi-layer Perceptron (MLP)
  4.2 --- Derivation of TD(λ) Learning Rule for MLP
    4.2.1 --- Notations
    4.2.2 --- A New Generalized Delta Rule
    4.2.3 --- Updating rule for TD(λ) Learning
  4.3 --- Algorithm of Training MLP using TD(λ)
    4.3.1 --- Definitions of Variables in the Algorithm
    4.3.2 --- Training Algorithm
    4.3.3 --- Description of the Algorithm
Chapter 5 --- Experiments
  5.1 --- Introduction
  5.2 --- Experiment 1: Training Evaluation Function for 7 x 7 Go Games by TD(λ) with Self-playing
    5.2.1 --- Introduction
    5.2.2 --- 7 x 7 Go
    5.2.3 --- Experimental Designs
    5.2.4 --- Performance Testing for Trained Networks
    5.2.5 --- Results
    5.2.6 --- Discussions
    5.2.7 --- Limitations
  5.3 --- Experiment 2: Training Evaluation Function for 9 x 9 Go Games by TD(λ) Learning from Human Games
    5.3.1 --- Introduction
    5.3.2 --- 9 x 9 Go game
    5.3.3 --- Training Data Preparation
    5.3.4 --- Experimental Designs
    5.3.5 --- Results
    5.3.6 --- Discussion
    5.3.7 --- Limitations
  5.4 --- Experiment 3: Life Status Determination in the Go Endgame
    5.4.1 --- Introduction
    5.4.2 --- Training Data Preparation
    5.4.3 --- Experimental Designs
    5.4.4 --- Results
    5.4.5 --- Discussion
    5.4.6 --- Limitations
  5.5 --- A Postulated Model
Chapter 6 --- Conclusions
  6.1 --- Future Direction of Research
Chapter A --- An Introduction to Go
  A.1 --- A Brief Introduction
    A.1.1 --- What is Go?
    A.1.2 --- History of Go
    A.1.3 --- Equipment used in a Go game
  A.2 --- Basic Rules in Go
    A.2.1 --- A Go game
    A.2.2 --- Liberty and Capture
    A.2.3 --- Ko
    A.2.4 --- "Eyes, Live and Death"
    A.2.5 --- Seki
    A.2.6 --- Endgame and Scoring
    A.2.7 --- Rank and Handicap Games
  A.3 --- Strategies and Tactics in Go
    A.3.1 --- Strategy vs Tactics
    A.3.2 --- Open-game
    A.3.3 --- Middle-game
    A.3.4 --- End-game
Chapter B --- Mathematical Model of Connectivity
  B.1 --- Introduction
  B.2 --- Basic Definitions
  B.3 --- Adjacency and Connectivity
  B.4 --- String and Link
    B.4.1 --- String
    B.4.2 --- Link
  B.5 --- Liberty and Atari
    B.5.1 --- Liberty
    B.5.2 --- Atari
  B.6 --- Ko
  B.7 --- Prohibited Move
  B.8 --- Path and Distance
Bibliography
