Reinforcement learning and convergence analysis with applications to agent-based systems

Agent-based systems usually operate in real-time, stochastic and dynamic environments. Many theoretical and applied techniques have been used to investigate agent architecture with respect to communication, cooperation, and learning, in order to provide a framework for implementing artificial intelligence and computing techniques. Intelligent agents are required to adapt and learn in uncertain environments via communication and collaboration, in both competitive and cooperative situations. The ability to reason and learn is a fundamental feature of intelligent agents. Because of their inherent complexity, however, it is difficult to verify the properties of complex and dynamic environments a priori. Since analytic techniques are inadequate for solving these problems, reinforcement learning (RL) has emerged as a popular approach that learns a mapping from states to actions so as to maximise the long-term reward. Computer simulation is needed to replicate experiments for testing and verifying the efficiency of simulation-based optimisation techniques. To this end, a simulation testbed called robot soccer is used to test the learning algorithms in the specified scenarios.
This research investigates simulation-based optimisation techniques in agent-based systems. Firstly, a hybrid agent teaming framework is presented for investigating agent team architecture, learning abilities, and other specific behaviours. Secondly, novel reinforcement learning algorithms are developed to verify the competitive and cooperative learning abilities of goal-oriented agents for decision-making. In addition, the function approximation technique known as tile coding (TC) is used to avoid the exponential growth of the state space caused by the curse of dimensionality. Thirdly, the underlying mechanism of eligibility traces is analysed in terms of on-policy and off-policy algorithms, and of accumulating and replacing traces. Fourthly, "design of experiment" techniques, such as the simulated annealing method and response surface methodology, are integrated with reinforcement learning techniques to enhance performance. Fifthly, a methodology is proposed for finding the optimal parameter values to improve the convergence and efficiency of the learning algorithms. Finally, this thesis provides a full-fledged numerical analysis of the efficiency of various RL techniques.
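The distinction the abstract draws between accumulating and replacing eligibility traces can be illustrated with a minimal tabular sketch. This is not the thesis's implementation: the function names `update_traces` and `sarsa_lambda_step`, the table shapes, and the parameter values are all assumptions chosen for illustration, and the update shown is the textbook on-policy Sarsa(λ) rule rather than anything specific to the robot soccer testbed.

```python
import numpy as np


def update_traces(e, state, action, gamma, lam, kind="accumulating"):
    """Decay all traces by gamma * lam, then mark the visited pair.

    Hypothetical helper: `e` is a (n_states, n_actions) trace table.
    """
    e = e * gamma * lam  # decay every trace (returns a new array)
    if kind == "accumulating":
        e[state, action] += 1.0  # repeated visits keep adding to the trace
    elif kind == "replacing":
        e[state, action] = 1.0  # repeated visits reset the trace to 1
    else:
        raise ValueError(f"unknown trace kind: {kind}")
    return e


def sarsa_lambda_step(q, e, s, a, r, s_next, a_next,
                      alpha, gamma, lam, kind, done=False):
    """One on-policy Sarsa(lambda) update on tabular q and traces e."""
    target = r if done else r + gamma * q[s_next, a_next]
    delta = target - q[s, a]  # temporal-difference error
    e = update_traces(e, s, a, gamma, lam, kind)
    q = q + alpha * delta * e  # credit all recently visited pairs
    return q, e


if __name__ == "__main__":
    # Visit the same (state, action) pair three times in a row.
    for kind in ("accumulating", "replacing"):
        e = np.zeros((2, 2))
        for _ in range(3):
            e = update_traces(e, 0, 1, gamma=0.9, lam=0.8, kind=kind)
        print(kind, e[0, 1])
```

With these assumed values, the accumulating trace for a pair visited on three consecutive steps grows above 1, while the replacing trace stays capped at 1. That cap is the practical difference between the two schemes when a state is revisited frequently.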

Identifier oai:union.ndltd.org:ADTP/284049
Date January 2008
Creators Leng, Jinsong
Source Sets Australasian Digital Theses Program
Language EN-AUS
Detected Language English
Rights Copyright Jinsong 2008
