1

A study of learning models for analyzing prisoners' dilemma game data / 囚犯困境資料分析之學習模型研究

賴宜祥, Lai, Yi Hsiang Unknown Date (has links)
How people choose strategies in repeated prisoners' dilemma games is the topic of this thesis; game learning theory predicts which strategies the players in a game will choose. The data used here come from three prisoners' dilemma experiments, each with its own experimental design and matching procedure, and all participants were undergraduate students at National Chengchi University. We use these data to compare different learning models. In addition to three common learning models, the Reinforcement Learning model, the Belief Learning model, and the Experience-Weighted Attraction model, this thesis also proposes an extended Reinforcement Learning model. The analysis is divided into training (in-sample) and testing (out-of-sample) parts, and the results are compared across experiments and across models. Although the extended Reinforcement Learning model has one more parameter than the original Reinforcement Learning model, it performs slightly better than the original in both training (in-sample) and testing (out-of-sample). / How people choose strategies in a finitely repeated prisoners' dilemma game is of central interest in game theory, and game learning theory aims to predict which strategies players will choose. The objective of this study is to find a suitable learning model for the prisoners' dilemma game data collected at National Chengchi University. The data consist of three experiments with different game and matching rules. Four learning models are considered: the Reinforcement Learning model, the Belief Learning model, the Experience-Weighted Attraction learning model, and a proposed model that extends the Reinforcement Learning model. The data analysis is divided into two parts: training (in-sample) and testing (out-of-sample). Although the proposed model adds one parameter, it predicts slightly better than the original Reinforcement Learning model in both training and testing. All of the fitted models predict better than guessing the decisions with equal probability.
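As a companion to the abstract above, here is a minimal sketch of one common form of the Reinforcement Learning model used in this literature (a Roth-Erev style attraction update). The payoff matrix, initial attractions, and the two-player loop are illustrative assumptions and are not taken from the thesis.

```python
import random

# Illustrative prisoners' dilemma payoffs (assumed, not from the thesis):
# index 0 = cooperate, 1 = defect; PAYOFF[my_action][opp_action] is my payoff.
PAYOFF = [[3, 0],
          [5, 1]]

class ReinforcementLearner:
    """Roth-Erev style reinforcement learning: each strategy carries an
    attraction, and the strategy actually played is reinforced by the
    payoff it earned."""

    def __init__(self, initial_attraction=1.0):
        self.attractions = [initial_attraction, initial_attraction]  # [C, D]

    def choose(self):
        # Choice probabilities are proportional to current attractions.
        total = sum(self.attractions)
        return 0 if random.random() < self.attractions[0] / total else 1

    def update(self, my_action, opp_action):
        # Only the played strategy is reinforced, by its realized payoff.
        self.attractions[my_action] += PAYOFF[my_action][opp_action]

# A short repeated game between two such learners.
p1, p2 = ReinforcementLearner(), ReinforcementLearner()
for _ in range(100):
    a1, a2 = p1.choose(), p2.choose()
    p1.update(a1, a2)
    p2.update(a2, a1)
print(p1.attractions, p2.attractions)
```

The Belief Learning and Experience-Weighted Attraction models differ mainly in how attractions are updated (hypothetical payoffs to unplayed strategies also receive weight); the sketch shows only the reinforcement case.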
2

Evolution and learning in games

Josephson, Jens January 2001 (has links)
This thesis contains four essays that analyze the behaviors that evolve when populations of boundedly rational individuals interact strategically over a long period of time. Individuals are boundedly rational in the sense that their strategy choices are determined by simple rules of adaptation -- learning rules. Convergence results for general finite games are first obtained in a homogeneous setting, where all populations consist either of stochastic imitators, who almost always imitate the most successful strategy in a sample from their own population's past strategy choices, or of stochastic better repliers, who almost always play a strategy that gives at least as high an expected payoff as a sample distribution of all populations' past play. Similar results are then obtained in a heterogeneous setting, where both of these learning rules are represented in each population. It is found that only strategies in certain sets are played in the limit, as time goes to infinity and the mutation rate tends to zero. Sufficient conditions for the selection of a Pareto-efficient such set are also provided. Finally, the analysis is extended to natural selection among learning rules. The question is whether there exists a learning rule that is evolutionarily stable, in the sense that a population employing this learning rule cannot be invaded by individuals using a different rule. Monte Carlo simulations for a large class of learning rules and four different games indicate that only a learning rule that takes full account of hypothetical payoffs to strategies that are not played is evolutionarily stable in almost all cases. / Diss. Stockholm : Handelshögsk., 2001
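For illustration only, the following is a minimal sketch of the stochastic imitation rule described above: an individual samples past play from its own population and, with high probability, imitates the strategy that earned the highest payoff in the sample; with a small "mutation" probability it experiments at random. The 2x2 payoff matrix, sample size, and mutation rate are assumptions of the sketch, not parameters from the thesis.

```python
import random

# Illustrative 2x2 coordination game (assumed for the sketch, not from the thesis).
PAYOFF = [[2, 0],
          [0, 1]]

def imitate_best(history, sample_size, mutation_rate):
    """Stochastic imitation: copy the strategy with the highest realized payoff
    in a random sample of own-population past play; with a small probability,
    'mutate' and pick a strategy uniformly at random."""
    if random.random() < mutation_rate:
        return random.randint(0, 1)
    sample = random.sample(history, min(sample_size, len(history)))
    return max(sample, key=lambda record: record[1])[0]  # record = (strategy, payoff)

def simulate(rounds=500, sample_size=5, mutation_rate=0.01):
    # Each population is summarized by a history of (strategy, realized payoff) records.
    hist1 = [(random.randint(0, 1), 0)]
    hist2 = [(random.randint(0, 1), 0)]
    for _ in range(rounds):
        s1 = imitate_best(hist1, sample_size, mutation_rate)
        s2 = imitate_best(hist2, sample_size, mutation_rate)
        hist1.append((s1, PAYOFF[s1][s2]))
        hist2.append((s2, PAYOFF[s2][s1]))
    return hist1[-1][0], hist2[-1][0]

print(simulate())  # strategies played in the final round
```

The thesis studies the limit as the mutation rate tends to zero; here it is kept small but positive so a finite simulation can still escape absorbing states.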
