
Quantile Regression Deep Q-Networks for Multi-Agent System Control

Training autonomous agents that are capable of performing their assigned job without fail is the ultimate goal of deep reinforcement learning. This thesis introduces a dueling Quantile Regression Deep Q-network, in which the network learns the state value quantile function and the advantage quantile function separately. With this network architecture, the agent is able to learn to control simulated robots in the Gazebo simulator. Reward functions and state spaces must be carefully designed for the agent to learn in complex non-stationary environments. When trained for only 100,000 timesteps, the agent is able to reach asymptotic performance in environments with moving and stationary obstacles using only data from the inertial measurement unit, LIDAR, and positional information. Through the use of transfer learning, the agents are also capable of formation control and flocking patterns. The performance of agents with frozen networks is improved through advice giving in Deep Q-networks by use of normalized Q-values and majority voting.
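
The dueling idea in the abstract can be illustrated with a small sketch. The abstract does not state how the thesis combines the two quantile streams, so this assumes the mean-subtraction aggregation of the standard dueling architecture, applied independently at each quantile. The function names `dueling_quantiles` and `greedy_action` and the array shapes are hypothetical and are not taken from the thesis.

```python
import numpy as np


def dueling_quantiles(value_q, adv_q):
    """Combine value and advantage quantiles into per-action return quantiles.

    value_q: shape (N,)   -- N quantile estimates of the state value
    adv_q:   shape (A, N) -- N quantile estimates of each action's advantage
    returns: shape (A, N)

    Assumption: advantages are centered by their mean over actions,
    separately for each quantile, before being added to the value stream.
    """
    value_q = np.asarray(value_q, dtype=float)
    adv_q = np.asarray(adv_q, dtype=float)
    centered = adv_q - adv_q.mean(axis=0, keepdims=True)
    return value_q[None, :] + centered


def greedy_action(quantiles):
    """Pick the action whose quantiles have the highest mean (the Q-value)."""
    return int(np.argmax(quantiles.mean(axis=1)))


# Toy example: 2 actions, 3 quantiles.
value_q = [0.0, 1.0, 2.0]
adv_q = [[1.0, 1.0, 1.0],
         [3.0, 3.0, 3.0]]
quantiles = dueling_quantiles(value_q, adv_q)
action = greedy_action(quantiles)
```

In this toy case the second action has the larger advantage at every quantile, so it is selected; in a trained agent both arrays would come from the two output heads of the network.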

Identifier: oai:union.ndltd.org:unt.edu/info:ark/67531/metadc1505241
Date: 05 1900
Creators: Howe, Dustin
Contributors: Zhong, Xiangnan, Yang, Tao, Yang, Qing
Publisher: University of North Texas
Source Sets: University of North Texas
Language: English
Detected Language: English
Type: Thesis or Dissertation
Format: viii, 80 pages, Text
Rights: Use restricted to UNT Community, Howe, Dustin, Copyright, Copyright is held by the author, unless otherwise noted. All rights Reserved.
