1

Dynamic opponent modelling in two-player games

Mealing, Richard Andrew January 2015
This thesis investigates decision-making in two-player imperfect information games against opponents whose actions can affect our rewards, and whose strategies may be based on memories of interaction, may be changing, or both. The focus is on modelling these dynamic opponents and using the models to learn high-reward strategies. The main contributions of this work are:
1. An approach to learning high-reward strategies in small simultaneous-move games against these opponents. A model of the opponent learnt from sequence prediction is combined with (possibly discounted) rewards learnt from reinforcement learning to look ahead using explicit tree search. Empirical results show that this gains higher average rewards per game than state-of-the-art reinforcement learning agents in three simultaneous-move games. They also show that several sequence prediction methods model these opponents effectively, supporting the idea of borrowing such methods from areas such as data compression and string matching.
2. An online expectation-maximisation algorithm that infers an agent's hidden information based on its behaviour in imperfect information games.
3. An approach to learning high-reward strategies in medium-size sequential-move poker games against these opponents. A model of the opponent learnt from sequence prediction, which requires the opponent's hidden information (inferred by the online expectation-maximisation algorithm), is used to train a state-of-the-art no-regret learning algorithm by simulating games between the algorithm and the model. Empirical results show that this improves the no-regret learning algorithm's rewards when playing against popular and state-of-the-art algorithms in two simplified poker games.
4. A demonstration that several change detection methods can effectively model changing categorical distributions, with experimental results comparing their accuracies to those of empirical distributions. These results also show that their models can be used to outperform state-of-the-art reinforcement learning agents in two simultaneous-move games. This supports the idea of modelling changing opponent strategies with change detection methods.
5. Experimental results for the self-play convergence to mixed strategy Nash equilibria of the empirical distributions of plays of sequence prediction and change detection methods. The results show that they converge faster than fictitious play and, for change detection, in more cases.
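The idea in contribution 1 of predicting an opponent's next action from the sequence of its past actions can be illustrated with a small sketch. This is not the thesis's method: it assumes rock-paper-scissors as the game, uses a simple fixed-order Markov (n-gram) counter as the sequence predictor, and replaces the tree search with a one-step best response. All names here (`MarkovOpponentModel`, `best_response`, `payoff`) are hypothetical.

```python
from collections import Counter, defaultdict

ACTIONS = ["R", "P", "S"]
BEATS = {"R": "P", "P": "S", "S": "R"}  # BEATS[x] is the action that beats x


def payoff(mine, theirs):
    """Reward for playing `mine` against `theirs`: +1 win, 0 draw, -1 loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[theirs] == mine else -1


class MarkovOpponentModel:
    """Illustrative order-k predictor: counts which action followed each
    length-k context of the opponent's previous actions."""

    def __init__(self, order=2):
        self.order = order
        self.counts = defaultdict(Counter)
        self.history = []

    def _context(self):
        return tuple(self.history[-self.order:])

    def predict(self):
        """Return a distribution over the opponent's next action."""
        seen = self.counts[self._context()]
        total = sum(seen.values())
        if total == 0:
            # Unseen context: fall back to a uniform guess.
            return {a: 1 / len(ACTIONS) for a in ACTIONS}
        return {a: seen[a] / total for a in ACTIONS}

    def update(self, action):
        """Record the opponent's observed action."""
        if len(self.history) >= self.order:
            self.counts[self._context()][action] += 1
        self.history.append(action)


def best_response(model):
    """Pick the action with the highest expected one-step reward
    under the model's predicted distribution."""
    dist = model.predict()
    return max(ACTIONS, key=lambda m: sum(p * payoff(m, t) for t, p in dist.items()))
```

Against an opponent that cycles through R, P, S, feeding the observed actions to `update` lets the model learn that R follows the context (P, S), so `best_response` then chooses P. A changing opponent would need the counts to be discounted or reset, which is where change detection would come in.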
2

[en] GAME THEORY AND MATHEMATICS IN SECONDARY EDUCATION: INTRODUCTION TO NASH EQUILIBRIUM / [pt] TEORIA DOS JOGOS E A MATEMÁTICA NO ENSINO MÉDIO: INTRODUÇÃO AO EQUILÍBRIO DE NASH

THIAGO OLIVEIRA NASCIMENTO 03 March 2015
[en] The objective of this work is to investigate the effect of game theory as a motivator for mathematics education on second year high school students in the state public schools of Rio de Janeiro, who frequently show difficulties with the discipline. To achieve this goal, we developed a didactic sequence in which the Ultimatum Game and the Prisoner's Dilemma were played in the classroom without any prior introduction to the basic concepts of game theory. After each game, we explained the results predicted by the theory, introducing the concepts of the payoff matrix, the dominant strategy and the Nash equilibrium. In addition, we explained how the game of Simplified Poker works, along with its theoretical results. At the end of the didactic sequence, we applied a simple self-evaluation test to assess the level of learning of the students involved. Finally, we compared the results obtained by the pairs of students who played the Ultimatum Game (at a point when they still had no experience of game theory) with the results obtained by Bianchi, Carter and Irons, and Castro and Ribeiro.
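The concepts named in the abstract (payoff matrix, dominant strategy, Nash equilibrium) can be made concrete with a short sketch for the Prisoner's Dilemma. The payoff values below are a common textbook choice (negative years in prison) and are an assumption, not taken from the thesis; the function names are hypothetical.

```python
from itertools import product

ACTIONS = ["cooperate", "defect"]

# Assumed payoff matrix: (row player's payoff, column player's payoff).
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-3, 0),
    ("defect", "cooperate"): (0, -3),
    ("defect", "defect"): (-2, -2),
}


def is_strictly_dominant(action, player):
    """True if `action` gives `player` (0 = row, 1 = column) a strictly
    higher payoff than every alternative, whatever the opponent plays."""
    for other in ACTIONS:
        if other == action:
            continue
        for opp in ACTIONS:
            chosen = (action, opp) if player == 0 else (opp, action)
            alt = (other, opp) if player == 0 else (opp, other)
            if PAYOFFS[chosen][player] <= PAYOFFS[alt][player]:
                return False
    return True


def pure_nash_equilibria():
    """Return the action pairs from which neither player can gain
    by changing only their own action."""
    equilibria = []
    for row, col in product(ACTIONS, repeat=2):
        row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(r, col)][0] for r in ACTIONS)
        col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, c)][1] for c in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria
```

With these assumed payoffs, defecting is strictly dominant for both players and mutual defection is the only pure-strategy Nash equilibrium, even though mutual cooperation would leave both players better off.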
