This dissertation presents a series of new results of iterative learning control (ILC) that progresses from model-based ILC algorithms to data-driven ILC algorithms. ILC is a type of trial-and-error algorithm to learn by repetitions in practice to follow a pre-defined finite-time maneuver with high tracking accuracy.
Mathematically ILC constructs a contraction mapping between the tracking errors of successive iterations, and aims to converge to a tracking accuracy approaching the reproducibility level of the hardware. It produces feedforward commands based on measurements from previous iterations to eliminates tracking errors from the bandwidth limitation of these feedback controllers, transient responses, model inaccuracies, unknown repeating disturbance, etc.
Generally, ILC uses an a priori model to form the contraction mapping that guarantees monotonic decay of the tracking error. However, un-modeled high frequency dynamics may destabilize the control system. The existing infinite impulse response filtering techniques to stop the learning at such frequencies, have initial condition issues that can cause an otherwise stable ILC law to become unstable. A circulant form of zero-phase filtering for finite-time trajectories is proposed here to avoid such issues. This work addresses the problem of possible lack of stability robustness when ILC uses an imperfect a prior model.
Besides the computation of feedforward commands, measurements from previous iterations can also be used to update the dynamic model. In other words, as the learning progresses, an iterative data-driven model development is made. This leads to adaptive ILC methods.
An indirect adaptive linear ILC method to speed up the desired maneuver is presented here. The updates of the system model are realized by embedding an observer in ILC to estimate the system Markov parameters. This method can be used to increase the productivity or to produce high tracking accuracy when the desired trajectory is too fast for feedback control to be effective.
When it comes to nonlinear ILC, data is used to update a progression of models along a homotopy, i.e., the ILC method presented in this thesis uses data to repeatedly create bilinear models in a homotopy approaching the desired trajectory. The improvement here makes use of Carleman bilinearized models to capture more nonlinear dynamics, with the potential for faster convergence when compared to existing methods based on linearized models.
The last work presented here finally uses model-free reinforcement learning (RL) to eliminate the need for an a priori model. It is analogous to direct adaptive control using data to directly produce the gains in the ILC law without use of a model. An off-policy RL method is first developed by extending a model-free model predictive control method and then applied in the trial domain for ILC. Adjustments of the ILC learning law and the RL recursion equation for state-value function updates allow the collection of enough data while improving the tracking accuracy without much safety concerns. This algorithm can be seen as the first step to bridge ILC and RL aiming to address nonlinear systems.
Identifer | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8H14JX8 |
Date | January 2019 |
Creators | Song, Bing |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |
Page generated in 0.0035 seconds