
Learning to control

This thesis examines whether it is possible for a machine to incrementally build a complex model of its environment, and then use this model for control purposes. Given a sequence of noisy observations, the machine forms a piecewise linear approximation to the nonlinear dynamic equations that are assumed to describe the real world. A number of existing online system identification techniques are examined, but it is found that they all either scale poorly with dimensionality, have a number of parameters that make them difficult to apply, or do not learn sufficiently accurate approximations. Therefore a novel framework is developed for learning linear model trees in both batch and online settings. The algorithms are evaluated empirically on a number of commonly used benchmark datasets, a simple test function, and three dynamic domains ranging from a simple pendulum to a complex flight simulator. The new batch algorithm is compared with three state-of-the-art algorithms and is seen to perform favourably overall. The new incremental model tree learner also compares well with a recent online function approximator from the literature. Armed with a tool for effectively constructing piecewise linear models of the environment, a control framework is developed that learns trajectories from a demonstrator and attempts to follow these trajectories within each linear region using linear quadratic control. The induced controllers are able to swing up and balance a simple forced pendulum both in simulation and in the real world. They can also swing up and balance a real double pendulum. The induced controllers are empirically shown to perform better than the original demonstrator, and could therefore be used either to replace a human operator or to improve upon an existing automatic controller. In addition, the ability to generalise the learnt trajectories enables the system to perform novel tasks. This is demonstrated on a flight simulator where, having observed an aircraft flying several times around a circuit, the controller is able to copy the take-off procedure, fly a completely new circuit that includes new manoeuvres, and successfully land the plane.
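To make the idea of a piecewise linear approximation concrete, the following is a minimal sketch of the smallest possible linear model tree: one split on a one-dimensional input with a least-squares line in each leaf. It is not the algorithm developed in the thesis. The function names (fit_line, fit_stump, predict), the exhaustive split search, the min_leaf parameter and the test function y = |x| are all assumptions chosen for illustration.

```python
# Hypothetical illustration only: a depth-one linear model tree on 1-D data.
# The split criterion (total squared error) and all names are assumptions,
# not the method described in the thesis.

def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ a*x + b; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    b = my - a * mx
    sse = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_stump(xs, ys, min_leaf=3):
    """Try every split position and keep the one with the lowest total error."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(sx) - min_leaf + 1):
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {
                "split": (sx[i - 1] + sx[i]) / 2,  # midpoint between neighbours
                "left": left[:2],
                "right": right[:2],
                "sse": total,
            }
    return best


def predict(model, x):
    """Evaluate the linear model of whichever leaf x falls into."""
    a, b = model["left"] if x <= model["split"] else model["right"]
    return a * x + b


# A nonlinear test function that two linear pieces can represent exactly.
xs = [i / 10 for i in range(-20, 21)]
ys = [abs(x) for x in xs]
model = fit_stump(xs, ys)
```

A full model tree would apply the same split-and-fit step recursively and in several input dimensions, and an online learner would update the leaf models as observations arrive; neither refinement is shown here.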
Date January 2007
Creators Potts, Duncan, Computer Science & Engineering, Faculty of Engineering, UNSW
Source Sets Australasian Digital Theses Program
Detected Language English
