1. Predicting opponent locations in first-person shooter video games
Hladky, Stephen Michael, 11 1900
Commercial video game developers constantly strive to create intelligent humanoid characters that are controlled by computers. To ensure computer opponents are challenging to human players, these characters are often allowed to cheat. Although cheating characters appear skillful, they may not behave in a human-like manner, and players who catch them cheating may enjoy the game less. This work investigates the problem of predicting opponent positions in the video game Counter-Strike: Source without cheating. Prediction models are machine-learned from records of past matches and are informed only by game information available to a human player. Results show that the best models estimate opponent positions with accuracy similar to or better than that of human experts. Moreover, the mistakes these models make are closer to human predictions than to actual opponent locations perturbed by a corresponding amount of Gaussian noise.
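To make the evaluation described above concrete, here is a minimal sketch of how such a comparison could be computed. Everything in it is invented for illustration: synthetic map coordinates stand in for Counter-Strike: Source match logs, and simple Gaussian draws stand in for a learned model and for human experts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic ground-truth opponent positions (x, y) over 1000 game ticks.
true_pos = rng.uniform(0.0, 100.0, size=(1000, 2))

# Stand-ins for learned-model and human-expert predictions of those positions.
model_pred = true_pos + rng.normal(scale=5.0, size=true_pos.shape)
human_pred = true_pos + rng.normal(scale=6.0, size=true_pos.shape)

def mean_error(pred, truth):
    """Mean Euclidean distance between predicted and true positions."""
    return np.linalg.norm(pred - truth, axis=1).mean()

# Accuracy comparison from the abstract: model versus human experts.
print("model error:", mean_error(model_pred, true_pos))
print("human error:", mean_error(human_pred, true_pos))

# Noise baseline: true positions perturbed by Gaussian noise of a
# corresponding magnitude. The abstract's second claim compares how close
# the model's predictions sit to human predictions versus to this baseline.
noisy_pos = true_pos + rng.normal(scale=5.0, size=true_pos.shape)
print("model-to-human distance:", mean_error(model_pred, human_pred))
print("model-to-noise distance:", mean_error(model_pred, noisy_pos))
```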
2. Dynamic opponent modelling in two-player games
Mealing, Richard Andrew, January 2015
This thesis investigates decision-making in two-player imperfect-information games against opponents whose actions can affect our rewards and whose strategies may be based on memories of past interaction, may be changing over time, or both. The focus is on modelling these dynamic opponents and using the models to learn high-reward strategies. The main contributions of this work are:

1. An approach to learning high-reward strategies in small simultaneous-move games against such opponents. A model of the opponent learnt by sequence prediction, combined with (possibly discounted) rewards learnt by reinforcement learning, is used to look ahead via explicit tree search. Empirical results show that this earns higher average rewards per game than state-of-the-art reinforcement learning agents in three simultaneous-move games. They also show that several sequence prediction methods model these opponents effectively, supporting the idea of borrowing such methods from areas like data compression and string matching. (A minimal sketch of this idea appears after the list.)

2. An online expectation-maximisation algorithm that infers an agent's hidden information from its behaviour in imperfect-information games.

3. An approach to learning high-reward strategies in medium-sized sequential-move poker games against such opponents. A model of the opponent learnt by sequence prediction, which requires the opponent's hidden information (inferred by the online expectation-maximisation algorithm), is used to train a state-of-the-art no-regret learning algorithm by simulating games between the algorithm and the model. Empirical results show that this improves the no-regret algorithm's rewards when playing against popular and state-of-the-art algorithms in two simplified poker games.

4. A demonstration that several change detection methods can effectively model changing categorical distributions, with experimental results comparing their accuracy to that of empirical distributions. These results also show that such models can be used to outperform state-of-the-art reinforcement learning agents in two simultaneous-move games, supporting the idea of modelling changing opponent strategies with change detection methods. (A simple change-detector sketch also appears below.)

5. Experimental results on self-play convergence: the empirical distributions of play of sequence prediction and change detection methods converge to mixed-strategy Nash equilibria faster, and in more cases for change detection, than fictitious play. (A small fictitious-play sketch closes the examples below.)
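The core loop of contribution 1 can be sketched compactly. The following is a hedged illustration, not the thesis's implementation: rock-paper-scissors stands in for the three simultaneous-move games, an order-k n-gram model stands in for the sequence prediction methods, and one-step expected-payoff maximisation stands in for the explicit tree search over learnt rewards.

```python
from collections import Counter, defaultdict

# Rock-paper-scissors payoffs for us, indexed by (our_move, their_move).
MOVES = ["R", "P", "S"]
PAYOFF = {("R", "R"): 0, ("R", "P"): -1, ("R", "S"): 1,
          ("P", "R"): 1, ("P", "P"): 0, ("P", "S"): -1,
          ("S", "R"): -1, ("S", "P"): 1, ("S", "S"): 0}

class NGramPredictor:
    """Order-k Markov model of the opponent's move sequence."""

    def __init__(self, k=2):
        self.k = k
        self.counts = defaultdict(Counter)  # context -> next-move counts
        self.history = []

    def predict(self):
        """Distribution over the opponent's next move given recent context."""
        context = tuple(self.history[-self.k:])
        seen = self.counts[context]
        total = sum(seen.values())
        if total == 0:  # no data for this context yet: fall back to uniform
            return {m: 1.0 / len(MOVES) for m in MOVES}
        return {m: seen[m] / total for m in MOVES}

    def update(self, opponent_move):
        context = tuple(self.history[-self.k:])
        self.counts[context][opponent_move] += 1
        self.history.append(opponent_move)

def best_response(predicted):
    """One-step lookahead: pick the action maximising expected payoff."""
    return max(MOVES, key=lambda a: sum(p * PAYOFF[(a, b)]
                                        for b, p in predicted.items()))

predictor = NGramPredictor(k=2)
total = 0
for t in range(1000):
    opponent_move = MOVES[t % 3]  # a toy cyclic opponent
    our_move = best_response(predictor.predict())
    total += PAYOFF[(our_move, opponent_move)]
    predictor.update(opponent_move)
print("average payoff per game:", total / 1000)
```

Against the cyclic opponent the predictor quickly learns each two-move context determines the next move, so the average payoff approaches 1.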
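Contribution 4's setting, modelling a changing categorical distribution, can likewise be sketched. This two-window total-variation test is my own assumption for illustration; the thesis evaluates several established change detection methods, not this particular one.

```python
import random
from collections import Counter, deque

class TwoWindowDetector:
    """Flags a change in a categorical stream when the empirical
    distributions of an old and a recent sliding window drift apart."""

    def __init__(self, window=50, threshold=0.4):
        self.old = deque(maxlen=window)
        self.recent = deque(maxlen=window)
        self.threshold = threshold

    @staticmethod
    def _empirical(window, symbols):
        counts = Counter(window)
        return {s: counts[s] / len(window) for s in symbols}

    def observe(self, symbol):
        """Feed one symbol; returns True if a change is flagged."""
        if len(self.recent) == self.recent.maxlen:
            self.old.append(self.recent[0])  # evicted symbol moves to old
        self.recent.append(symbol)
        if len(self.old) < self.old.maxlen:
            return False  # not enough history yet
        symbols = set(self.old) | set(self.recent)
        p = self._empirical(self.old, symbols)
        q = self._empirical(self.recent, symbols)
        tv = 0.5 * sum(abs(p[s] - q[s]) for s in symbols)  # total variation
        if tv > self.threshold:
            self.old.clear()  # restart modelling after the change
            self.recent.clear()
            self.recent.append(symbol)
            return True
        return False

# A stream whose distribution shifts from mostly "A" to mostly "B" at t = 200.
random.seed(1)
stream = ([random.choices("AB", weights=[0.9, 0.1])[0] for _ in range(200)]
          + [random.choices("AB", weights=[0.1, 0.9])[0] for _ in range(200)])

detector = TwoWindowDetector()
for t, s in enumerate(stream):
    if detector.observe(s):
        print("change flagged at t =", t)
```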
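Finally, the fictitious-play baseline from contribution 5 is easy to state: each player repeatedly best-responds to the opponent's empirical mixture of past plays. A minimal sketch on matching pennies, a game chosen for illustration rather than taken from the thesis:

```python
import numpy as np

# Matching pennies: row player's payoff matrix; the column player receives -A.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

row_counts = np.ones(2)  # how often each action has been played
col_counts = np.ones(2)

for _ in range(100_000):
    # Each player best-responds to the opponent's empirical mixture so far.
    col_mix = col_counts / col_counts.sum()
    row_mix = row_counts / row_counts.sum()
    row_action = int(np.argmax(A @ col_mix))    # row maximises A
    col_action = int(np.argmax(-(row_mix @ A))) # column maximises -A
    row_counts[row_action] += 1
    col_counts[col_action] += 1

print("row empirical strategy:", row_counts / row_counts.sum())
print("col empirical strategy:", col_counts / col_counts.sum())
# Both drift toward the unique mixed Nash equilibrium (0.5, 0.5), though
# fictitious play is known to converge slowly; the thesis's point is that
# sequence prediction and change detection converge faster in self-play.
```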