We propose a Model-Based Deep Reinforcement Learning (MBDRL) framework for collaborative payload transportation using Unmanned Aerial Vehicles (UAVs) in Search and Rescue (SAR) missions, enabling heavier payload conveyance while maintaining vehicle agility.
Our approach extends the single-drone application to a novel multi-drone one, using the Probabilistic Ensembles with Trajectory Sampling (PETS) algorithm to model the unknown stochastic system dynamics and their uncertainty. We use the Multi-Agent Reinforcement Learning (MARL) framework via a centralized controller in a leader-follower configuration. The agents use the approximated transition function in a Model Predictive Controller (MPC) configured to maximize the reward function for waypoint navigation, while a position-based formation controller ensures stable flight of the physically linked UAVs. We also developed an Unreal Engine (UE) simulation connected to an offboard planner and controller via a Robot Operating System (ROS) framework that is transferable to real robots. This work achieves stable waypoint navigation in a stochastic environment with sample efficiency in line with that seen in single-UAV work.
This work has been funded by the National Science Foundation (NSF) under Award No. 2046770. / Master of Science / We apply the Model-Based Deep Reinforcement Learning (MBDRL) framework to the novel application of a UAV team transporting a suspended payload during Search and Rescue missions.
Collaborating UAVs can transport heavier payloads while staying agile, reducing the need for human involvement. We use the Probabilistic Ensembles with Trajectory Sampling (PETS) algorithm to model uncertainties and build on the previously used single UAV-payload system. By utilizing the Multi-Agent Reinforcement Learning (MARL) framework via a centralized controller, our UAVs learn to transport the payload to a desired position while maintaining stable flight through effective cooperation. We also develop a simulation in Unreal Engine (UE) connected to a controller using a Robot Operating System (ROS) architecture, which can be transferred to real robots. Our method achieves stable navigation in unpredictable environments while maintaining the sample efficiency observed in single-UAV scenarios.
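The planning loop the abstract describes (an ensemble dynamics model queried inside an MPC that maximizes a waypoint reward) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the state is a single 1-D position, each ensemble member is a hand-written gain-and-noise model standing in for a learned probabilistic network, the reward is the negative distance to the waypoint, and the optimizer is plain random shooting rather than the cross-entropy method commonly paired with PETS. The function names `make_ensemble`, `predict`, `plan`, and `run_episode` are hypothetical.

```python
import random


def make_ensemble(n_models, rng):
    """Build a toy ensemble; each member is (gain, noise_std).

    Hypothetical stand-in for learned probabilistic dynamics networks.
    """
    return [(1.0 + rng.uniform(-0.1, 0.1), rng.uniform(0.0, 0.02))
            for _ in range(n_models)]


def predict(member, state, action, rng):
    """Sample the next 1-D state from one ensemble member."""
    gain, noise_std = member
    return state + gain * action + rng.gauss(0.0, noise_std)


def plan(state, waypoint, ensemble, rng,
         horizon=4, n_candidates=500, n_particles=5):
    """Random-shooting MPC: return the first action of the best sequence."""
    best_return = float("-inf")
    best_first_action = 0.0
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        total = 0.0
        for _ in range(n_particles):
            # Trajectory sampling: each particle follows one ensemble member.
            member = rng.choice(ensemble)
            s = state
            for a in actions:
                s = predict(member, s, a, rng)
                total += -abs(s - waypoint)  # assumed waypoint reward
        avg_return = total / n_particles
        if avg_return > best_return:
            best_return = avg_return
            best_first_action = actions[0]
    return best_first_action


def run_episode(start=0.0, waypoint=3.0, steps=15, seed=0):
    """Replan at every step and apply only the first planned action."""
    rng = random.Random(seed)
    ensemble = make_ensemble(5, rng)
    true_dynamics = (1.0, 0.01)  # assumed "real" system, unknown to the planner
    state = start
    for _ in range(steps):
        action = plan(state, waypoint, ensemble, rng)
        state = predict(true_dynamics, state, action, rng)
    return state
```

Calling `run_episode()` drives the toy state from 0 toward the waypoint at 3. A multi-UAV version would additionally need a formation controller for the followers, which this sketch leaves out.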
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/120971 |
Date | 20 August 2024 |
Creators | Khursheed, Shahwar Atiq |
Contributors | Electrical Engineering, Williams, Ryan K., Boker, Almuatazbellah M., Doan, Thinh Thanh |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | Creative Commons Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/ |