The surge in the use of adaptive Artificial Intelligence (AI) systems has been made possible by leveraging the increasing processing and storage power that modern computers provide. These systems are designed to make quality decisions that assist in making predictions across a wide variety of application fields. When such a system is fueled by data, the foundation for a Machine Learning (ML) approach can be laid. Reinforcement Learning (RL) is an active branch of ML that goes beyond the traditional supervised and unsupervised methods: RL studies algorithms that take actions so that the expected resulting reward is optimal. This thesis investigates RL methods in a context where the reward is highly time-varying: a setup is studied in which two agents compete for a common resource. Specifically, we study a robotic setting inspired by the "cat-and-mouse" (hunter-prey) game, referring to the hunter robot as Tom and to the competing prey robot as Jerry. Two practical environments are considered. The first is based on a LEGO platform, enabling us to run a number of physical experiments. The second is based on a well-known RL simulator, enabling us to run many virtual experiments. We use these environments to explore the setting: we illustrate the non-stationarity it gives rise to, evaluate a number of de-facto standard RL approaches, and identify key avenues for future work.
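To make the hunter-prey setting concrete, the following is a minimal sketch of tabular Q-learning, one of the de-facto standard RL approaches the abstract alludes to, applied to a toy one-dimensional chase. The track size, reward values, the greedy hunter policy, and all names (`step`, `SIZE`, `ACTIONS`) are illustrative assumptions, not details taken from the thesis; note that because Tom's behaviour depends on Jerry's position, the reward Jerry observes for a given state-action pair shifts over time, which is the non-stationarity the thesis highlights.

import random
from collections import defaultdict

# Hypothetical 1-D chase: Tom (hunter) pursues Jerry (prey) on a small track.
# State: (tom_pos, jerry_pos); Jerry learns to evade with tabular Q-learning.
SIZE = 10
ACTIONS = (-1, 0, 1)  # move left, stay, move right

def step(tom, jerry, action):
    jerry = max(0, min(SIZE - 1, jerry + action))
    tom += 1 if tom < jerry else -1 if tom > jerry else 0  # Tom greedily closes in
    caught = (tom == jerry)
    reward = -10.0 if caught else 1.0  # Jerry is rewarded for surviving each step
    return tom, jerry, reward, caught

Q = defaultdict(float)          # Q[(state, action)] -> estimated return
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(2000):
    tom, jerry = 0, SIZE // 2
    for t in range(50):
        s = (tom, jerry)
        # epsilon-greedy action selection
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda a: Q[(s, a)])
        tom, jerry, r, done = step(tom, jerry, a)
        s2 = (tom, jerry)
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        # Standard Q-learning update; the target is truncated when Jerry is caught
        Q[(s, a)] += alpha * (r + gamma * (0.0 if done else best_next) - Q[(s, a)])
        if done:
            break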
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-389998 |
Date | January 2018 |
Creators | Murali, Suraj |
Publisher | Uppsala universitet, Institutionen för informationsteknologi |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |
Relation | IT ; 18068 |