Many environments in which an agent can use reinforcement learning techniques to learn profitable strategies are affected by other learning agents. These situations can be modeled as general-sum games. When playing repeated general-sum games with other learning agents, the goal of a self-interested learning agent is to maximize its own payoffs over time. Traditional reinforcement learning algorithms learn myopic strategies in these games. As a result, they learn strategies that produce undesirable results in many games. In this dissertation, we develop and analyze algorithms that learn non-myopic strategies when playing many important infinitely repeated general-sum games. We show that, in many of these games, these algorithms outperform existing multiagent learning algorithms. We derive performance guarantees for these algorithms (for certain learning parameters) and show that these guarantees become stronger and apply to larger classes of games as more information is observed and used by the agents. We establish these results through empirical studies and mathematical proofs.
Identifer | oai:union.ndltd.org:BGMYU2/oai:scholarsarchive.byu.edu:etd-1336 |
Date | 21 December 2005 |
Creators | Crandall, Jacob W. |
Publisher | BYU ScholarsArchive |
Source Sets | Brigham Young University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Theses and Dissertations |
Rights | http://lib.byu.edu/about/copyright/ |
Page generated in 0.002 seconds