In the rapidly evolving field of autonomous driving, the integration of Deep Reinforcement Learning (DRL) promises significant advances toward reliable and efficient vehicle control. This study presents a comprehensive examination of DRL’s application in a simulated autonomous driving context, focusing on the impact of representation learning parameters on the performance of end-to-end models. An overview of the theoretical underpinnings of machine learning, deep learning, and reinforcement learning lays the groundwork for their application to autonomous driving. The methodology describes a detailed framework for training autonomous vehicles in the Duckietown simulation environment, employing both non-end-to-end and end-to-end models to investigate the effectiveness of various reinforcement learning algorithms and representation learning techniques.
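The methodology itself is detailed in the chapters listed below; as a rough, illustrative sketch of the kind of observation preprocessing such a pipeline typically relies on (the crop fraction, target resolution, and wrapper interface here are assumptions, not values taken from this thesis), a gym-style wrapper might crop, downscale, and normalize the simulator's camera frames before they reach the representation learning model or the end-to-end policy:

```python
# Illustrative observation-preprocessing wrapper for a Duckietown-style camera
# observation. Crop fraction, target resolution, and value range are assumed
# defaults for this sketch, not settings reported in the thesis.
import cv2
import gym
import numpy as np


class PreprocessObservation(gym.ObservationWrapper):
    """Crop away the sky, downscale, and normalize RGB camera frames."""

    def __init__(self, env, height=64, width=64, crop_top=0.4):
        super().__init__(env)
        self.height = height
        self.width = width
        self.crop_top = crop_top  # fraction of the frame (mostly sky) to discard
        self.observation_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=(height, width, 3), dtype=np.float32
        )

    def observation(self, obs):
        # Drop the upper part of the frame, which carries little lane information.
        top = int(obs.shape[0] * self.crop_top)
        cropped = obs[top:, :, :]
        # Downscale to a small, fixed resolution for the encoder or policy network.
        resized = cv2.resize(cropped, (self.width, self.height),
                             interpolation=cv2.INTER_AREA)
        # Scale pixel values from [0, 255] to [0, 1].
        return resized.astype(np.float32) / 255.0
```

A wrapper of this kind would be applied to the simulation environment before it is handed to the RL algorithm; the assumption here is that the simulator emits RGB image observations, as gym-duckietown does.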
At the heart of this research are extensive simulation experiments designed to evaluate the effectiveness of the Proximal Policy Optimization (PPO) algorithm within the established framework. The study examines reward structures and how representation learning parameters shape end-to-end model performance. A critical comparison of the models in the validation chapter highlights the significant role these parameters play in the outcomes of DRL-based autonomous driving systems.
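For orientation, PPO's central ingredient is the clipped surrogate objective; the following is a minimal PyTorch-style sketch of that loss term (the clip range, advantage normalization, and function name are illustrative choices, not the exact settings used in the experiments):

```python
# Minimal sketch of PPO's clipped surrogate policy loss (Schulman et al., 2017).
# Clip range and advantage normalization are illustrative, not thesis settings.
import torch


def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective, returned as a loss to be minimized."""
    # Normalizing advantages is a common stabilization choice (optional).
    advantages = (advantages - advantages.mean()) / (advantages.std() + 1e-8)
    # Probability ratio between the current policy and the data-collecting policy.
    ratio = torch.exp(log_probs - old_log_probs)
    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO maximizes the elementwise minimum; negate its mean to obtain a loss.
    return -torch.min(unclipped, clipped).mean()
```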
The findings reveal that meticulous adjustment of representation learning parameters markedly influences the end-to-end training process. Notably, image segmentation techniques significantly enhance feature recognizability and model performance.

Contents
List of Figures
List of Tables
List of Abbreviations
List of Symbols
1 Introduction
1.1 Autonomous Driving Overview
1.2 Problem Description
1.3 Research Structure
2 Research Background
2.1 Theoretical Basis
2.1.1 Machine Learning
2.1.2 Deep Learning
2.1.3 Reinforcement Learning
2.2 Related Work
3 Methodology
3.1 Problem Definition
3.2 Simulation Platform
3.3 Observation Space
3.3.1 Observation Space of the Non-End-to-end Model
3.3.2 Observation Space of the End-to-end Model
3.4 Action Space
3.5 Reward Shaping
3.5.1 Speed Penalty
3.5.2 Position Reward
3.6 Map and Training Dataset
3.6.1 Map Design
3.6.2 Training Dataset
3.7 Variational Autoencoder Structure
3.7.1 Mathematical Foundation of the VAE
3.8 Reinforcement Learning Framework
3.8.1 Actor-Critic Method
3.8.2 Policy Gradient
3.8.3 Trust Region Policy Optimization
3.8.4 Proximal Policy Optimization
4 Simulation Experiments
4.1 Experimental Setup
4.2 Representation Learning Model
4.3 End-to-end Model
5 Results
6 Validation and Evaluation
6.1 Validation of End-to-end Model
6.2 Evaluation of End-to-end Model
6.2.1 Comparison with Baselines
6.2.2 Comparison with Different Representation Learning Models
7 Conclusion and Future Work
7.1 Summary
7.2 Future Research
Identifier: oai:union.ndltd.org:DRESDEN/oai:qucosa:de:qucosa:90752
Date: 10 April 2024
Creators: Wang, Bingyu
Contributors: Okhrin, Ostap; Li, Dianzhao; Technische Universität Dresden
Source Sets: Hochschulschriftenserver (HSSS) der SLUB Dresden
Language: English
Type: info:eu-repo/semantics/publishedVersion, doc-type:StudyThesis, info:eu-repo/semantics/StudyThesis, doc-type:Text
Rights: info:eu-repo/semantics/openAccess