Optimal Control of Perimeter Patrol Using Reinforcement Learning

Unmanned Aerial Vehicles (UAVs) are being used more frequently in surveillance scenarios for both civilian and military applications. One such application involves a UAV patrolling a perimeter along which certain stations can receive alerts at random intervals. Once the UAV arrives at an alert site, it can take one of two actions:

1. Loiter and gain information about the site.
2. Move on around the perimeter.

The information gained is transmitted to an operator so that the operator can classify the alert. The information is a function of the maximum delay and of the amount of time the UAV spends at the alert site, also called the dwell time. The goal of the optimization is to classify the alert so as to maximize the expected discounted information gained by the UAV's actions at a station about an alert. This optimization problem can be readily solved using Dynamic Programming. Even though this approach generates feasible solutions, there are reasons to experiment with different approaches.

A complication for Dynamic Programming arises when the perimeter patrol problem is expanded: the number of states increases rapidly as additional stations, nodes, or UAVs are added to the perimeter. This greatly increases the computation time, making the determination of the solution intractable. This thesis attempts to alleviate the problem by implementing a Reinforcement Learning technique, specifically Q-Learning, to obtain the optimal solution. Reinforcement Learning is a simulation-based version of Dynamic Programming and requires less information to compute sub-optimal solutions. The effectiveness of the policies generated using Reinforcement Learning for the perimeter patrol problem has been corroborated numerically in this thesis.
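
To illustrate the kind of technique the abstract refers to, the following is a minimal sketch of tabular Q-Learning on a toy perimeter. It is not the model from the thesis: the ring size, alert probability, dwell cap, reward shape, learning parameters, and all function names are assumptions made for illustration, and the maximum-delay term is omitted entirely.

```python
import random
from collections import defaultdict

# All constants below are hypothetical values chosen for illustration.
N_NODES = 4        # nodes on the toy perimeter (a ring)
MAX_DWELL = 3      # cap on loiter steps per visit
ALERT_PROB = 0.3   # per-step chance that a new alert appears at a node
GAMMA = 0.9        # discount factor
ALPHA = 0.1        # learning rate
EPSILON = 0.1      # exploration rate
LOITER, MOVE = 0, 1


def step(state, action, rng):
    """Toy transition. state = (position, alert flags, dwell so far)."""
    pos, alerts, dwell = state
    alerts = list(alerts)
    reward = 0.0
    if action == LOITER and alerts[pos] and dwell < MAX_DWELL:
        dwell += 1
        # Assumed reward: each extra dwell step yields less information.
        reward = 1.0 / dwell
    else:
        # Loitering is only possible at an active alert below the dwell
        # cap; in every other case the UAV moves on and the alert clears.
        alerts[pos] = 0
        pos = (pos + 1) % N_NODES
        dwell = 0
    # New alerts arrive at random at the nodes the UAV is not occupying.
    for i in range(N_NODES):
        if i != pos and rng.random() < ALERT_PROB:
            alerts[i] = 1
    return (pos, tuple(alerts), dwell), reward


def greedy_action(q, state):
    """Action with the highest current Q-value (ties go to LOITER)."""
    return max((LOITER, MOVE), key=lambda a: q[(state, a)])


def q_learning(steps=20000, seed=0):
    """Learn a Q-table from simulated transitions only."""
    rng = random.Random(seed)
    q = defaultdict(float)
    state = (0, (0,) * N_NODES, 0)
    for _ in range(steps):
        # Epsilon-greedy exploration.
        if rng.random() < EPSILON:
            action = rng.choice((LOITER, MOVE))
        else:
            action = greedy_action(q, state)
        nxt, reward = step(state, action, rng)
        # Standard Q-Learning update toward the one-step lookahead target.
        best_next = max(q[(nxt, LOITER)], q[(nxt, MOVE)])
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt
    return q
```

Calling q_learning() returns the learned table, and greedy_action(q, state) reads off the resulting policy for a given state. The update needs only sampled transitions from step, not the transition probabilities themselves, which is the sense in which the method is simulation-based.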

http://hdl.handle.net/1969.1/ETD-TAMU-2011-05-9520

Unmanned Aerial Vehicles

Dynamic Programming

Reinforcement Learning

Identifier	oai:union.ndltd.org:tamu.edu/oai:repository.tamu.edu:1969.1/ETD-TAMU-2011-05-9520
Date	2011 May
Creators	Walton, Zachary
Source Sets	Texas A and M University
Language	en_US
Detected Language	English
Type	thesis, text
Format	application/pdf
