Pure reinforcement learning does not scale well to domains with many degrees of freedom, and particularly to continuous domains. In this thesis, we introduce a hybrid method in which a symbolic planner constructs an approximate solution to a control problem. A numerical optimisation algorithm then refines the qualitative plan into an operational policy. The method is demonstrated on the problem of learning a stable walking gait for a bipedal robot. The contributions of this thesis are as follows. First, the thesis proposes a novel way to generate gait patterns, using a genetic algorithm to produce walking gaits for a humanoid robot with the zero moment point as the stability criterion. This is validated on a physical robot. Second, we propose an innovative, generic learning method that utilises the trainer's domain knowledge about the task to accelerate learning and extend the capabilities of the learning algorithm. The proposed method, which combines symbolic planning and learning to reduce the search space of the learning problem and accelerate learning, is tested on a bipedal humanoid robot learning to walk. Finally, it is shown that the extended learning algorithm handles highly complex learning tasks in the physical world, with experimental verification on a physical robot.
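As a rough illustration of the gait-generation idea summarised above, the sketch below shows a minimal genetic-algorithm loop in which candidate gait-parameter vectors are scored by a ZMP-style fitness function. It is not the thesis's implementation; the parameter count, selection scheme, and the fitness stub are illustrative assumptions only.

```python
# Minimal sketch of GA-based gait-parameter search with a ZMP-style fitness.
# All names, parameter ranges and the fitness stub are illustrative assumptions,
# not the implementation described in the thesis.
import random

N_PARAMS = 8          # e.g. joint-trajectory amplitudes/phases for one gait cycle
POP_SIZE = 30
GENERATIONS = 50
MUTATION_RATE = 0.1

def simulate_zmp_error(gait_params):
    """Placeholder for a walking simulation: should return the mean distance of
    the zero moment point from the centre of the support polygon over a gait
    cycle (smaller means more stable). Here it is only a stub."""
    # Stub: pretend the "ideal" gait corresponds to the zero vector.
    return sum(p * p for p in gait_params) ** 0.5

def fitness(gait_params):
    # Higher fitness for gaits whose ZMP stays closer to the support centre.
    return -simulate_zmp_error(gait_params)

def random_individual():
    return [random.uniform(-1.0, 1.0) for _ in range(N_PARAMS)]

def crossover(a, b):
    cut = random.randint(1, N_PARAMS - 1)
    return a[:cut] + b[cut:]

def mutate(ind):
    return [p + random.gauss(0, 0.1) if random.random() < MUTATION_RATE else p
            for p in ind]

population = [random_individual() for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]          # simple truncation selection
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print("best gait parameters:", best)
```

In the hybrid approach described in the abstract, such a search would operate only within the reduced space defined by the symbolic plan, rather than over all possible gaits.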
Identifier | oai:union.ndltd.org:ADTP/258276 |
Date | January 2007 |
Creators | Yik, Tak Fai, Computer Science & Engineering, Faculty of Engineering, UNSW |
Publisher | Awarded by: University of New South Wales. Computer Science & Engineering |
Source Sets | Australasian Digital Theses Program |
Language | English |
Detected Language | English |
Rights | Copyright Yik Tak Fai., http://unsworks.unsw.edu.au/copyright |