ENHANCING POLICY OPTIMIZATION FOR IMPROVED SAMPLE EFFICIENCY AND GENERALIZATION IN DEEP REINFORCEMENT LEARNING

<p dir="ltr">Reinforcement learning (RL) has made significant progress in recent years, driven largely by deep RL. However, training RL algorithms effectively remains challenging, particularly with respect to sample efficiency and generalization. This thesis addresses these challenges by developing RL algorithms that generalize to unseen environments and adapt to dynamic conditions, thereby expanding the practical applicability of RL to real-world tasks. The first contribution is a set of novel policy optimization techniques that enhance the generalization capabilities of RL agents. These techniques include the Thinker method, which employs style transfer to diversify observation trajectories, and Bootstrap Advantage Estimation, which improves policy and value function learning through augmented data. Both methods outperform existing data augmentation and policy optimization techniques on standard benchmarks. The thesis also introduces Robust Policy Optimization, a method that enhances exploration in policy gradient-based RL by perturbing action distributions. It addresses limitations of traditional methods, such as entropy collapse and primacy bias, and improves sample efficiency and adaptability in continuous action spaces. The thesis further explores natural language descriptions as an alternative to image-based state representations in RL. By leveraging large language models, this approach enhances interpretability and generalization in tasks involving complex visual observations. Finally, this work contributes to semi-autonomous teleoperated robotic surgery by developing systems capable of performing complex surgical tasks remotely, even under challenging conditions such as communication delays and data scarcity. The DESK dataset, created as part of this work, supports knowledge transfer across different robotic platforms, further enhancing the capabilities of these systems. Overall, the advancements presented in this thesis are significant steps toward more robust, adaptable, and efficient autonomous agents, with broad implications for real-world applications including autonomous systems, robotics, and safety-critical tasks such as medical surgery.</p>

10.25394/pgs.27188148.v1

Applications in health

Intelligent robotics

Planning and decision making

Reinforcement learning

Deep Reinforcement Learning

AI in Medical Surgery

Identifier	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/27188148
Date	08 October 2024
Creators	Md Masudur Rahman (19818171)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/ENHANCING_POLICY_OPTIMIZATION_FOR_IMPROVED_SAMPLE_EFFICIENCY_AND_GENERALIZATION_IN_DEEP_REINFORCEMENT_LEARNING/27188148
