
Proximal Policy Optimization in StarCraft

Deep reinforcement learning is an area of research that has blossomed tremendously in recent years and has shown remarkable potential in computer games. Real-time strategy games have been an important testbed for game-playing artificial intelligence for several years. This thesis introduces an algorithm used to train agents to fight against computer bots. Games are excellent tools for testing deep reinforcement learning algorithms because they offer valuable insight into how well an algorithm performs in an isolated environment without real-life consequences; moreover, real-time strategy games are a complex genre that challenges artificial intelligence agents in both short-term and long-term planning. In this thesis, we review the history of deep learning and reinforcement learning and then apply them to StarCraft. Proximal policy optimization (PPO) retains some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general across environments, and has better sample complexity. The StarCraft environment, the Brood War Application Programming Interface (BWAPI), is open source and serves as the test platform. The results show that PPO works well in BWAPI and trains units to defeat their opponents. The algorithm presented in this thesis is corroborated by experiments.
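To make the comparison with TRPO concrete, the sketch below shows the clipped surrogate objective that PPO optimizes in place of TRPO's constrained update, following the standard formulation by Schulman et al. (2017) rather than the thesis's own implementation. The function name `ppo_clip_loss` and the default `clip_eps=0.2` are illustrative assumptions, not details taken from this work.

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Minimal sketch of PPO's clipped surrogate objective.

    logp_new:   log-probabilities of the taken actions under the current policy
    logp_old:   log-probabilities under the policy that collected the data
    advantages: advantage estimates for the same transitions
    clip_eps:   clipping range (0.2 is a common default, assumed here)
    """
    ratio = np.exp(logp_new - logp_old)                      # probability ratio r_t(theta)
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Element-wise minimum keeps the update pessimistic whenever the ratio
    # leaves the interval [1 - eps, 1 + eps], replacing TRPO's explicit
    # KL-divergence constraint with a simple first-order penalty.
    surrogate = np.minimum(ratio * advantages, clipped * advantages)
    # PPO maximizes the surrogate; return its negative as a loss to minimize.
    return -np.mean(surrogate)
```

Because the clipping is applied per sample inside an ordinary gradient step, the same minibatch of experience can be reused for several epochs of updates, which is one source of the improved sample complexity mentioned above.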

Identifier: oai:union.ndltd.org:unt.edu/info:ark/67531/metadc1505267
Date: 05 1900
Creators: Liu, Yuefan
Contributors: Zhong, Xiangnan; Li, Xinrong; Yang, Tao
Publisher: University of North Texas
Source Sets: University of North Texas
Language: English
Detected Language: English
Type: Thesis or Dissertation
Format: vii, 70 pages, Text
Rights: Use restricted to UNT Community. Liu, Yuefan. Copyright is held by the author, unless otherwise noted. All rights reserved.
