In large multiagent games, partial observability, coordination, and credit assignment persistently plague attempts to design good learning algorithms. We provide a simple and efficient algorithm that in part uses a linear system to model the world from a single agent’s limited perspective, and takes advantage of Kalman filtering to allow an agent to construct a good training signal and effectively learn a near-optimal policy in a wide variety of settings. A sequence of increasingly complex empirical tests verifies the efficacy of this technique. / Singapore-MIT Alliance (SMA)
Identifer | oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/3851 |
Date | 01 1900 |
Creators | Chang, Yu-Han, Ho, Tracey, Kaelbling, Leslie P. |
Source Sets | M.I.T. Theses and Dissertation |
Language | en_US |
Detected Language | English |
Type | Article |
Format | 1408858 bytes, application/pdf |
Relation | Computer Science (CS); |
Page generated in 0.0016 seconds