Global ETD Search

Return to search

Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks

Two controller performances are assessed for generalization in the path following task of autonomously backing up a tractor-trailer. Starting from random locations and orientations, paths are generated to loading docks with arbitrary pose using Dubins Curves. The combination vehicles can be varied in wheelbase, hitch length, weight distributions, and tire cornering stiffness. The closed form calculation of the gains for the Linear Quadratic Regulator (LQR) rely heavily on having an accurate model of the plant. However, real-world applications cannot expect to have an updated model for each new trailer. Finding alternative robust controllers when the trailer model is changed was the motivation of this research.
Reinforcement learning, with neural networks as their function approximators, can allow for generalized control from its learned experience that is characterized by a scalar reward value. The Linear Quadratic Regulator and the Deep Deterministic Policy Gradient (DDPG) are compared for robust control when the trailer is changed. This investigation quantifies the capabilities and limitations of both controllers in simulation using a kinematic model. The controllers are evaluated for generalization by altering the kinematic model trailer wheelbase, hitch length, and velocity from the nominal case.
In order to close the gap from simulation and reality, the control methods are also assessed with sensor noise and various controller frequencies. The root mean squared and maximum errors from the path are used as metrics, including the number of times the controllers cause the vehicle to jackknife or reach the goal. Considering the runs where the LQR did not cause the trailer to jackknife, the LQR tended to have slightly better precision. DDPG, however, controlled the trailer successfully on the paths where the LQR jackknifed. Reinforcement learning was found to sacrifice a short term reward, such as precision, to maximize the future expected reward like reaching the loading dock. The reinforcement learning agent learned a policy that imposed nonlinear constraints such that it never jackknifed, even when it wasn't the trailer it trained on.

Linear Quadratic Regulator

Deep Deterministic Policy Gradient

Identifer	oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-3540
Date	01 November 2019
Creators	McDowell, Journey
Publisher	DigitalCommons@CalPoly
Source Sets	California Polytechnic State University
Detected Language	English
Type	text
Format	application/pdf
Source	Master's Theses

Page generated in 0.002 seconds

Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks

Description

Links & Downloads

Tags

Additional Fields