The major objective of this dissertation is to extend the capabilities of game theoretic distributed control to more general settings, in particular settings with drifting environments, constrained communication, or both.
The first part of the dissertation concerns slowly varying dynamics, i.e., drifting environments. A standard assumption in game theoretic learning is a stationary environment, i.e., a fixed game. We investigate the case of slow variations and show that, for sufficiently slow time variations, the limiting behavior “tracks” the stochastically stable states. Since the analysis is carried out at the level of Markov processes, the results apply to a variety of game theoretic learning rules; in this work they are applied to log-linear learning. A mobile sensor coverage example is demonstrated in both simulation and laboratory experiments.
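For reference, log-linear learning is a simple noisy best-reply rule: at each step one randomly selected player revises its action according to a Gibbs (softmax) distribution over its own utility, while all other players repeat their previous actions; in potential games its stochastically stable states are the potential maximizers. The sketch below is a minimal, generic Python illustration of this update rule (the function name and interface are our own, not taken from the dissertation); in a slowly varying game, the utility functions would simply change slowly between steps.

```python
import math
import random

def log_linear_step(actions, utilities, action_sets, tau):
    """One asynchronous step of log-linear learning (illustrative interface).

    actions     : list of each player's current action
    utilities   : utilities[i](joint_actions) -> float, player i's utility
    action_sets : action_sets[i] is the list of actions available to player i
    tau         : temperature; smaller tau concentrates play on higher-utility actions
    """
    i = random.randrange(len(actions))            # one player revises at a time
    weights = []
    for a in action_sets[i]:                      # score every candidate action
        trial = list(actions)
        trial[i] = a
        weights.append(math.exp(utilities[i](trial) / tau))
    r, cum = random.uniform(0, sum(weights)), 0.0
    for a, w in zip(action_sets[i], weights):     # sample from the Gibbs distribution
        cum += w
        if r <= cum:
            actions[i] = a
            break
    return actions
```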
The second part considers the problem of coordinating team players' actions without any communication in team-based zero-sum games. In general, a global signaling device is needed to provide common randomness among the players, but communication is severely limited or impossible in many practical applications. Instead of learning a one-shot strategy, we let the players coordinate on a periodic sequence of deterministic actions and place an assumption on the opponent's rationality. Since the team players' action sequences are periodic and deterministic, common randomness is no longer required for coordination. It is proved that if the period of the action sequence is long enough, then an opponent with limited rationality cannot recognize its pattern. Because the opponent cannot tell that the players are acting deterministically, the players' behavior appears to be a correlated, randomized joint strategy whose distribution is the empirical distribution of their action sequences. Consequently, the players can coordinate their action sequences without any communication or global signals, and the resulting behavior is correlated.
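To make the idea concrete, the following minimal sketch (our own illustration, not an algorithm taken from the dissertation) shows how a target correlated strategy can be realized as a periodic deterministic schedule that each team player replays independently; the `periodic_schedule` helper and its interface are assumptions made for the example.

```python
def periodic_schedule(correlated_strategy, period):
    """Turn a target correlated strategy into a periodic deterministic
    joint-action sequence with (approximately) the same empirical distribution.
    Illustrative only; the dissertation's construction additionally ensures that
    an opponent with limited rationality cannot recognize the pattern, a concern
    this toy example ignores.

    correlated_strategy : dict mapping joint actions (tuples) to probabilities
    period              : length of one period of the schedule
    """
    counts = {a: int(round(p * period)) for a, p in correlated_strategy.items()}
    schedule = [a for a, c in counts.items() for _ in range(c)]
    return schedule[:period]

# Example: two team players targeting an even mix of the joint actions (A, A) and (B, B).
target = {("A", "A"): 0.5, ("B", "B"): 0.5}
schedule = periodic_schedule(target, period=8)
# Offline, each player memorizes its own coordinate of the agreed schedule,
# so no online communication or shared randomness is needed during play.
player1_actions = [joint[0] for joint in schedule]
player2_actions = [joint[1] for joint in schedule]
```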
Moreover, the notion of micro-players is introduced for efficient learning of long action sequences. The micro-player matching approach provides a new framework that converts the original team-based zero-sum game into a game between micro-players. By incorporating a de Bruijn sequence into micro-player matching, we decouple the level of the opponent's rationality from the size of the micro-player game. Simulation results demonstrate the performance of the micro-player matching methods.
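The de Bruijn sequence used here is a standard combinatorial object: a cyclic sequence over a k-letter alphabet in which every word of length n appears exactly once per period. A textbook construction is sketched below purely for reference (this is the standard recursive algorithm, not code from the dissertation); intuitively, an observer whose memory window is at most n sees every pattern of that length equally often and therefore cannot distinguish the sequence from random play.

```python
def de_bruijn(k, n):
    """Return a de Bruijn sequence B(k, n): a cyclic sequence over an alphabet
    of size k in which every length-n string appears exactly once as a
    (cyclic) substring. Standard construction, shown only to illustrate the
    object used in the micro-player matching scheme."""
    sequence = []
    a = [0] * k * n

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence

# Example: every length-3 binary string appears exactly once in this cyclic sequence.
print(de_bruijn(2, 3))   # [0, 0, 0, 1, 0, 1, 1, 1]
```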
Lastly, the results of the previous two topics are combined by considering the problem of coordinating actions without communication in drifting environments. More specifically, the opponent in the team-based zero-sum game is assumed to adjust its strategy within the set of bounded recall strategies. The opponent's time-varying strategy can then be treated as a drifting environment parameter in a coordination game among the team players. Additionally, we develop a human testbed program for further study of a human as an adaptive opponent in team-based zero-sum games; this testbed can serve as a starting point for studying game theoretic learning of correlated behavior against a human opponent.
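For intuition, one simple instance of a bounded recall strategy is an opponent that best-responds to the empirical distribution of the team's play over a fixed-length memory window. The sketch below is a hypothetical opponent model of this kind (the class name, `payoff` interface, and parameters are our own assumptions, not the dissertation's); as the window contents drift with the team's play, the environment seen by the team drifts as well.

```python
from collections import Counter, deque

class BoundedRecallOpponent:
    """Illustrative bounded recall opponent model (assumed interface, not code
    from the dissertation): it remembers only the last `recall` observed team
    joint actions and best-responds to their empirical distribution."""

    def __init__(self, recall, own_actions, payoff):
        self.history = deque(maxlen=recall)   # only the last `recall` observations survive
        self.own_actions = own_actions        # the opponent's action set
        self.payoff = payoff                  # payoff(own_action, team_joint_action) -> float

    def observe(self, team_joint_action):
        self.history.append(team_joint_action)

    def act(self):
        if not self.history:
            return self.own_actions[0]
        freq = Counter(self.history)          # empirical distribution over the memory window
        return max(self.own_actions,
                   key=lambda b: sum(c * self.payoff(b, a) for a, c in freq.items()))
```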
Identifier | oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/52986
Date | 12 January 2015
Creators | Lim, Yusun
Contributors | Shamma, Jeff S.
Publisher | Georgia Institute of Technology
Source Sets | Georgia Tech Electronic Thesis and Dissertation Archive
Language | en_US
Detected Language | English
Type | Dissertation
Format | application/pdf