21 |
Playing and solving the game of Hex. Henderson, Philip.
The game of Hex is of interest to the mathematics, algorithms, and artificial intelligence communities. It is a classical PSPACE-complete problem, and its invention is intrinsically tied to the Four Colour Theorem and the well-known strategy-stealing argument. Nash, Shannon, Tarjan, and Berge are among the mathematicians who have researched and published about this game.
In this thesis we expand on previous research, further developing the mathematical theory and algorithmic techniques relating to Hex. In particular, we identify new classes of moves that can be pruned from consideration, and devise new algorithms to identify connection strategies efficiently.
As a result of these theoretical improvements, we produce an automated solver capable of solving all 8 x 8 Hex openings and most 9 x 9 Hex openings; this marks the first time that computers have solved every Hex opening that humans have solved. We also produce the two strongest automated Hex players in the world, Wolve and MoHex, and obtain both the gold and silver medals at the 2008 and 2009 International Computer Olympiads.
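For readers unfamiliar with the mechanics, here is a minimal sketch of the primitive that Hex connection algorithms build on: a union-find test of whether one player has completed a chain between their two sides. This is an illustration only, not the thesis's solver, which reasons about virtual connections (connections guaranteed against any opponent play) rather than only completed ones.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def black_has_won(board):
    """board: n-by-n nested lists of 'B', 'W', or '.'.
    Black aims to connect the top row to the bottom row."""
    n = len(board)
    top, bottom = n * n, n * n + 1           # two virtual edge nodes
    uf = UnionFind(n * n + 2)
    neighbours = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)]
    for r in range(n):
        for c in range(n):
            if board[r][c] != 'B':
                continue
            cell = r * n + c
            if r == 0:
                uf.union(cell, top)
            if r == n - 1:
                uf.union(cell, bottom)
            for dr, dc in neighbours:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n and board[rr][cc] == 'B':
                    uf.union(cell, rr * n + cc)
    return uf.find(top) == uf.find(bottom)
```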
|
22 |
Playing and solving the game of Hex. Henderson, Philip. Unknown date.
No description available.
|
23 |
Playing and Solving Havannah. Ewalds, Timo V. Unknown date.
No description available.
|
24 |
Introduction of statistics in optimization. Teytaud, Fabien. 08 December 2011.
In this thesis we study two optimization fields. In the first part, we study the use of evolutionary algorithms for solving derivative-free optimization problems in continuous spaces. In the second part we are interested in multistage optimization, where decisions must be made in a discrete environment with a finite horizon and a large number of states; there we rely in particular on Monte-Carlo Tree Search algorithms. In the first part, we work on evolutionary algorithms in a parallel context, when a large number of processors are available. We begin by presenting some state-of-the-art evolutionary algorithms, and then show that these algorithms are not well designed for parallel optimization. Because these algorithms are population-based, they should be well suited to parallelization, yet experiments show that the results are far from the theoretical bounds. To address this discrepancy, we propose rules (such as a new selection ratio and a faster decrease of the step-size) to improve the evolutionary algorithms. Experiments on several evolutionary algorithms show that, with these new rules, the algorithms reach the theoretical speedup.

Concerning the work on multistage optimization, we start by presenting some state-of-the-art algorithms (Min-Max, Alpha-Beta, Monte-Carlo Tree Search, Nested Monte-Carlo). We then show the generality of the Monte-Carlo Tree Search algorithm by successfully applying it to the game of Havannah. The application was a real success: today, every Havannah program uses Monte-Carlo Tree Search algorithms instead of the classical Alpha-Beta. Next, we study the Monte-Carlo part of the Monte-Carlo Tree Search algorithm more closely. Three generic rules are proposed to improve this Monte-Carlo policy, and experiments demonstrate the efficiency of these rules.
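As a hedged illustration of the parallel setting, the sketch below shows a minimal (mu/mu, lambda) evolution strategy on the sphere function: the lambda offspring are independent, so in the parallel case they would be evaluated on lambda processors. The selection ratio and step-size decay constants here are illustrative placeholders, not the thesis's derived rules.

```python
import numpy as np

def sphere(x):
    return float(np.dot(x, x))

def mu_comma_lambda_es(f, dim=10, lam=256, iters=200, seed=0):
    """Minimal (mu/mu, lambda)-ES. In the parallel setting, the lam offspring
    evaluations below are independent and could run on lam processors."""
    rng = np.random.default_rng(seed)
    mu = max(1, lam // 4)                 # selection ratio: illustrative choice
    mean = rng.normal(size=dim)
    sigma = 1.0
    for _ in range(iters):
        offspring = mean + sigma * rng.normal(size=(lam, dim))
        fitness = np.array([f(x) for x in offspring])
        mean = offspring[np.argsort(fitness)[:mu]].mean(axis=0)  # recombine best mu
        sigma *= 0.95                     # crude, illustrative step-size decrease
    return mean, f(mean)

best, value = mu_comma_lambda_es(sphere)
```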
|
25 |
Introduction of statistics in optimization / Introduction de statistiques en optimisation. Teytaud, Fabien. 08 December 2011.
In this thesis we study two optimization fields. In the first part, we study the use of evolutionary algorithms for solving derivative-free optimization problems in continuous spaces. In the second part we are interested in multistage optimization, where decisions must be made in a discrete environment with a finite horizon and a large number of states; there we rely in particular on Monte-Carlo Tree Search algorithms. In the first part, we work on evolutionary algorithms in a parallel context, when a large number of processors are available. We begin by presenting some state-of-the-art evolutionary algorithms, and then show that these algorithms are not well designed for parallel optimization. Because these algorithms are population-based, they should be well suited to parallelization, yet experiments show that the results are far from the theoretical bounds. To address this discrepancy, we propose rules (such as a new selection ratio and a faster decrease of the step-size) to improve the evolutionary algorithms. Experiments on several evolutionary algorithms show that, with these new rules, the algorithms reach the theoretical speedup.

Concerning the work on multistage optimization, we start by presenting some state-of-the-art algorithms (Min-Max, Alpha-Beta, Monte-Carlo Tree Search, Nested Monte-Carlo). We then show the generality of the Monte-Carlo Tree Search algorithm by successfully applying it to the game of Havannah. The application was a real success: today, every Havannah program uses Monte-Carlo Tree Search algorithms instead of the classical Alpha-Beta. Next, we study the Monte-Carlo part of the Monte-Carlo Tree Search algorithm more closely. Three generic rules are proposed to improve this Monte-Carlo policy, and experiments demonstrate the efficiency of these rules.
|
26 |
Monte-Carlo Tree Search in Continuous Action Spaces for Autonomous Racing: F1-tenth. Jönsson, Jonatan; Stenbäck, Felix. January 2020.
Autonomous cars involve problems of control and planning. In this paper, we implement and evaluate an autonomous agent based on Monte-Carlo Tree Search in a continuous action space. To facilitate the algorithm, we extend an existing simulation framework and use a GPU for faster calculations. We compare three action generators and two reward functions. The results show that MCTS converges to an effective driving agent in static environments; however, in real time it only succeeds at driving at slow speeds. We discuss the problems that arise in dynamic and static environments and point to future work on improving the simulation tool and the MCTS algorithm. See the code at https://github.com/felrock/PyRacecarSimulator
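The abstract does not name its action generators; one common device for MCTS in continuous action spaces, sketched below under that assumption, is progressive widening: a node may hold at most ceil(C * n^alpha) sampled actions, so fresh continuous actions are only drawn as visits accumulate.

```python
import math
import random

class Node:
    def __init__(self):
        self.n = 0            # visit count
        self.children = {}    # action -> [child Node, total reward]

def select_action(node, sample_action, c=1.0, alpha=0.5, ucb_c=1.4):
    """Progressive widening: allow at most ceil(c * n^alpha) sampled actions
    at a node; past that budget, pick among existing children by UCB1."""
    limit = math.ceil(c * max(1, node.n) ** alpha)
    if len(node.children) < limit:
        action = sample_action()              # draw a fresh continuous action
        node.children.setdefault(action, [Node(), 0.0])
        return action
    def ucb(item):
        _, (child, total) = item
        if child.n == 0:
            return float('inf')
        return total / child.n + ucb_c * math.sqrt(math.log(node.n + 1) / child.n)
    return max(node.children.items(), key=ucb)[0]

# Hypothetical racing action: (steering in [-1, 1], throttle in [0, 1]).
sample = lambda: (random.uniform(-1.0, 1.0), random.uniform(0.0, 1.0))
root = Node()
first_action = select_action(root, sample)
```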
|
27 |
Monte-Carlo Tree Search for Fox Game. Janshagen, Anton; Mattsson, Olof. January 2022.
This report explores whether Monte-Carlo Tree Search (MCTS) can perform well in Fox Game, a classic Scandinavian strategy game. MCTS is implemented with a cutoff in the simulation phase: the game state at the cutoff is evaluated by a heuristic function formulated using theoretical arguments from the game's chess counterpart. MCTS is shown to perform at the level of highly experienced human players while using limited computational resources. The method is then used to explore how the imbalance in Fox Game (which favours the sheep) can be mended by reducing the number of sheep pieces from 20 to 18. / Bachelor's degree project in electrical engineering 2022, KTH, Stockholm.
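The simulation-phase cutoff reads roughly as follows in code. This is a sketch of the general technique, not the report's implementation; the game-state interface, accessor names, and the heuristic's weights are all assumptions.

```python
import random

def rollout_with_cutoff(state, cutoff=20):
    """Simulation phase with a cutoff: play random moves for at most `cutoff`
    plies, then score the position heuristically instead of playing out.
    The state interface (is_terminal, legal_moves, apply, winner_value) is
    assumed, not taken from the report."""
    for _ in range(cutoff):
        if state.is_terminal():
            return state.winner_value()    # exact game result, e.g. +1 / 0 / -1
        state = state.apply(random.choice(state.legal_moves()))
    if state.is_terminal():
        return state.winner_value()
    return heuristic(state)

def heuristic(state):
    """Hypothetical Fox Game evaluation in [-1, 1] from the sheep's point of
    view, mixing material and mobility in the spirit of chess evaluations."""
    material = (state.sheep_count() - 9) / 9.0     # assumed accessor
    mobility = -state.fox_move_count() / 20.0      # assumed accessor
    return max(-1.0, min(1.0, 0.8 * material + 0.2 * mobility))
```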
|
28 |
Monte Carlo Tree Search for Risk. Limér, Christoffer; Kalmér, Erik. January 2020.
The idea of using artificial intelligence to evaluate military strategies is relevant to a large number of governments today. With programs like AlphaZero beating world champions in games of ever-increasing complexity, military adaptations are probably not far away, if they are not in use already. Part of these programs' recent success is due to a heuristic search algorithm called Monte Carlo Tree Search. In this project, we explored the possibility of using this algorithm to build a program capable of playing the strategy board game Risk at a high level. The complexity and stochastic dynamics of the game demanded the use of chance nodes and aggressive gameplay limitations known as action-pruning. By changing the conditions and game environment of the algorithm, we observed performance differences; in particular, changes to the simulation length considerably improved convergence. We suggest that the resulting program, optimized with correct algorithm parameters, has the potential to play Risk at a high level. / Bachelor's degree project in electrical engineering 2020, KTH, Stockholm.
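A chance node replaces one sampled dice roll with an expectation over the possible battle outcomes. The sketch below uses the standard probabilities of a 3-attacker-versus-2-defender dice round in Risk; the state and value interfaces are assumed, not taken from the project.

```python
import random

# Standard dice probabilities for one Risk battle round with 3 attacking vs 2
# defending dice: (attacker losses, defender losses) -> probability.
OUTCOMES = [
    ((0, 2), 2890 / 7776),   # defender loses two armies
    ((1, 1), 2611 / 7776),   # one army each
    ((2, 0), 2275 / 7776),   # attacker loses two armies
]

def chance_node_value(state, attack, value):
    """Expected value of an attack through a chance node: weight each dice
    outcome by its probability instead of sampling a single roll. `state`
    and `value` are assumed interfaces."""
    return sum(p * value(state.resolve(attack, att_loss, def_loss))
               for (att_loss, def_loss), p in OUTCOMES)

def sample_outcome(rng=random):
    """During rollouts, the same node can instead be sampled proportionally."""
    r, acc = rng.random(), 0.0
    for losses, p in OUTCOMES:
        acc += p
        if r <= acc:
            return losses
    return OUTCOMES[-1][0]
```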
|
29 |
Monte-Carlo Tree Search Used for Fortification in the Game of Risk. Bolin, Jakob; Palmroos, Nico. January 2020.
The strategy game Risk is a very popular board game, requiring little effort to learn but much skill to master. The aim of this project is to explore the fortification phase of the game, in which the player's troops are moved between territories. Our method is based on adapting Monte Carlo tree search (MCTS) to Risk. To improve the troop movements, we propose two techniques, hierarchical search and progressive bias. These methods, combined with other extensions of MCTS, are then compared against a baseline player of the game. Our results show that hierarchical search improved the MCTS agent's playing strength, and that progressive bias has the potential to improve the agent but needs further investigation. / Bachelor's degree project in electrical engineering 2020, KTH, Stockholm.
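Of the two techniques, progressive bias has a standard formulation (Chaslot et al.): a heuristic prior is added to the UCT selection score and fades as a child accumulates visits. A minimal sketch, with the node fields and the move heuristic assumed:

```python
import math

def select_child(node, heuristic, c=1.4, w=1.0):
    """UCT selection with a progressive bias term:
    exploitation + exploration + w * H(move) / (visits + 1), so the heuristic
    guides early visits and fades out. Node fields (n, q, children) and the
    move heuristic are illustrative assumptions."""
    def score(move, child):
        if child.n == 0:
            return float('inf')                       # visit untried moves first
        exploit = child.q                             # mean reward so far
        explore = c * math.sqrt(math.log(max(1, node.n)) / child.n)
        bias = w * heuristic(move) / (child.n + 1)    # progressive bias term
        return exploit + explore + bias
    return max(node.children.items(), key=lambda mc: score(*mc))
```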
|
30 |
Multi-objective sequential decision making / La prise de décisions séquentielles multi-objectif. Wang, Weijia. 11 July 2014.
This thesis is concerned with multi-objective sequential decision making (MOSDM). The motivation is twofold. On the one hand, many decision problems in domains such as robotics, scheduling, or games involve the optimization of sequences of decisions. On the other hand, many real-world applications are most naturally formulated in terms of multi-objective optimization (MOO). The proposed approach extends the well-known Monte-Carlo tree search (MCTS) framework to the MOO setting, with the goal of discovering several optimal sequences of decisions by growing a single search tree. The main challenge is to propose a new reward, able to guide the exploration of the tree even though the MOO setting does not enforce a total order among solutions. The main contribution of the thesis is to propose and experimentally study two such rewards, inspired by the MOO literature and assessing a solution with respect to the archive of previous solutions (the Pareto archive): the hypervolume indicator and the Pareto dominance reward. The study shows the complementarity of these two criteria. The hypervolume indicator suffers from its known computational complexity; however, the proposed extension of it provides fine-grained information about the quality of solutions with respect to the current archive. Quite the contrary, the Pareto dominance reward is linear to compute, but it provides increasingly rare information as the search progresses. Proofs of principle of the approach are given on artificial problems and international challenges, and confirm the merits of the approach. In particular, MOMCTS is able to discover policies lying in non-convex regions of the Pareto front, in contrast with the state of the art: existing multi-objective reinforcement learning algorithms are based on linear scalarization and thus fail to sample such non-convex regions. Finally, MOMCTS competed honorably with the state of the art in the 2013 MOPTSP competition.
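Both rewards can be stated compactly for two maximised objectives, as in the hedged sketch below: the Pareto dominance reward checks whether a new solution is non-dominated with respect to the archive, and the hypervolume reward measures how much the candidate grows the hypervolume dominated by the archive. The thesis's exact reward shaping differs in detail.

```python
def dominates(a, b):
    """Pareto dominance for maximised objectives: a dominates b if it is at
    least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def hypervolume_2d(points, ref):
    """Hypervolume (two maximised objectives) dominated by `points` above the
    reference point `ref`, computed by a sweep over the non-dominated front."""
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    front = sorted(p for p in pts if not any(dominates(q, p) for q in pts))
    hv = 0.0
    for i, (x, y) in enumerate(front):
        next_y = front[i + 1][1] if i + 1 < len(front) else ref[1]
        hv += (x - ref[0]) * (y - next_y)
    return hv

# Rewards for a new solution against a Pareto archive (illustrative values).
archive = [(2.0, 7.0), (5.0, 5.0), (8.0, 2.0)]
candidate, ref = (6.0, 6.0), (0.0, 0.0)
dominance_reward = float(not any(dominates(p, candidate) for p in archive))
hypervolume_reward = (hypervolume_2d(archive + [candidate], ref)
                      - hypervolume_2d(archive, ref))
```

This makes the complementarity concrete: the dominance reward is a cheap binary signal that becomes rarer as the archive fills, while the hypervolume reward is costlier to compute but graded.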
|