Military, medical, exploratory, and commercial robots have much to gain from exchanging wheels for legs. However, the equations of motion of dynamic bipedal walker models are highly coupled and non-linear, making the selection of an appropriate control scheme difficult. A temporal-difference reinforcement learning method known as Q-learning develops complex control policies through environmental exploration and exploitation. As a proof of concept, Q-learning was applied in simulation to a benchmark single-pendulum swing-up/balance task; the value function was approximated first with a look-up table and then with an artificial neural network. We then applied Evolutionary Function Approximation for Reinforcement Learning to control the swing leg and torso of a 3-degree-of-freedom active dynamic bipedal walker in simulation. The model began each episode in a stationary vertical configuration. At each time-step the learning agent was rewarded for horizontal hip displacement scaled by torso altitude, which promoted faster walking while maintaining an upright posture, and one of six coupled torque activations was applied through two first-order filters. Over the course of 23 generations, an approximation of the value function was evolved that enabled walking at an average speed of 0.36 m/s. The agent oscillated the torso forward then backward at each step, driving the walker forward for forty-two steps in thirty seconds without falling over. This work represents the foundation for improvements in anthropomorphic bipedal robots, exoskeleton mechanisms to assist in walking, and smart prosthetics. / Master of Science
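To make the look-up-table form of Q-learning mentioned in the abstract concrete, here is a minimal sketch of the standard tabular update with epsilon-greedy exploration. It is not the thesis code, which this record does not include. The toy five-state chain task, the function names (`q_learning_update`, `epsilon_greedy`, `train_chain`), and the values of `alpha`, `gamma`, and `epsilon` are all assumptions chosen for illustration. The pendulum and walker simulations are not reproduced.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one temporal-difference (Q-learning) update to a look-up table.

    q is a list of rows, one per state, each holding one value per action.
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    best_next = max(q[next_state])  # greedy estimate of the next state's value
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    row = q[state]
    if rng.random() < epsilon:
        return rng.randrange(len(row))
    best = max(row)
    # Break ties at random so an untrained table does not favor one action.
    return rng.choice([a for a, v in enumerate(row) if v == best])


def train_chain(n_states=5, episodes=500, epsilon=0.2, seed=0):
    """Hypothetical toy task: action 1 moves right, action 0 moves left.

    Reaching the right-most state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            action = epsilon_greedy(q, state, epsilon, rng)
            if action == 1:
                next_state = min(state + 1, n_states - 1)
            else:
                next_state = max(state - 1, 0)
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            q_learning_update(q, state, action, reward, next_state)
            state = next_state
            if done:
                break
    return q


if __name__ == "__main__":
    for s, row in enumerate(train_chain()):
        print(s, [round(v, 3) for v in row])
```

In this toy setting the learned table should prefer the "right" action in every non-terminal state. Replacing the table with a neural network, or evolving the approximator as the abstract describes, changes how the value function is represented but not the form of the update target.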
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/34602 |
Date | 22 August 2007 |
Creators | Renner, Michael Robert |
Contributors | Mechanical Engineering, Granata, Kevin P., Hong, Dennis W., Kasarda, Mary E., Reinholtz, Charles F., Sandu, Corina |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Detected Language | English |
Type | Thesis |
Format | application/pdf, application/pdf, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |
Relation | etd_part3.pdf, etd_part2.pdf, etd_part1.pdf |