21

Game AI of StarCraft II based on Deep Reinforcement Learning

Junjie Luo (8786552) 30 April 2020 (has links)
The research problem of this article is a Game AI agent for StarCraft II based on Deep Reinforcement Learning (DRL). StarCraft II is currently viewed as the most challenging Real-Time Strategy (RTS) game, and it is also the most popular game in which researchers develop and improve AI agents. Building AI agents for StarCraft II can help machine learning researchers identify the weaknesses of DRL and improve this family of algorithms. In 2018, DeepMind and Blizzard released the StarCraft II Learning Environment (PySC2) to enable researchers to advance the development of AI agents. After AlphaGo, DeepMind began a new DRL-based project called AlphaStar, and several other laboratories have also published articles on StarCraft II AI agents. Most of this work focuses on agents for Terran and Zerg, two of the three races in StarCraft II. These agents show high-level performance compared with most StarCraft II players, but their performance is far from defeating e-sports players because Game AI for StarCraft II has a large observation space and a large action space. However, there is no publication on Protoss, the remaining race, which is the most complicated for AI agents to handle because of its characteristics (a larger action space and a larger observation space). Thus, the research question of this paper is whether a Protoss AI agent, developed with a DRL-based model, can defeat the high-level built-in cheating AI in a full-length game on a particular map. The population of this research design is the StarCraft II AI agents that researchers build based on their DRL models, while the sample is the Protoss AI agent in this paper. The raw data come from matches between the Protoss AI agent and built-in AI agents; PySC2 captures features and numerical variables in each match to produce the training data. The expected outcome is a DRL-based model that can train a Protoss AI agent to defeat high-level built-in AI agents, evaluated by win rate. The model comprises the action space of Protoss, the observation space, and the realization of the DRL algorithms. Meanwhile, the model is built on PySC2 v2.0, which provides additional action functions. Due to the complexity and the unique characteristics of Protoss in StarCraft II, the model cannot be applied to other games or platforms. However, how the model trains a Protoss AI agent can reveal the limitations of DRL and push DRL algorithms a little further forward.
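As an illustrative aside (not part of the abstract), the following is a minimal sketch of how a full-length PySC2 match between a Protoss agent and a built-in cheating AI might be set up. The map name, interface sizes, and difficulty level below are assumptions for illustration, not details taken from the thesis.

```python
from pysc2.env import sc2_env
from pysc2.lib import actions, features

def make_env():
    # Full-length game: Protoss agent vs. a built-in cheating Terran bot.
    return sc2_env.SC2Env(
        map_name="Simple64",  # assumed map, not specified in the abstract
        players=[
            sc2_env.Agent(sc2_env.Race.protoss),
            sc2_env.Bot(sc2_env.Race.terran, sc2_env.Difficulty.cheat_vision),
        ],
        agent_interface_format=features.AgentInterfaceFormat(
            feature_dimensions=features.Dimensions(screen=84, minimap=64),
            use_feature_units=True,
        ),
        step_mul=8,                 # game steps per agent step
        game_steps_per_episode=0,   # 0 = play the full-length game
    )

def run_episode():
    with make_env() as env:
        timesteps = env.reset()
        while not timesteps[0].last():
            # A trained DRL policy would select from the Protoss action space
            # here; no_op is only a placeholder for this sketch.
            timesteps = env.step([actions.FUNCTIONS.no_op()])
        return timesteps[0].reward  # +1 win, -1 loss, 0 tie

if __name__ == "__main__":
    print("Episode outcome:", run_episode())
```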
22

Radio Resource Allocation and Beam Management under Location Uncertainty in 5G mmWave Networks

Yao, Yujie 16 June 2022 (has links)
Millimeter wave (mmWave) plays a critical role in Fifth-generation (5G) new radio due to the rich bandwidth it provides. However, one shortcoming of mmWave is the substantial path loss caused by poor diffraction at high frequencies, so highly directional beams are applied to mitigate this problem. A typical approach to beam management is to cluster users based on their locations. However, localization uncertainty is unavoidable due to measurement accuracy, system performance fluctuations, and so on. Meanwhile, traffic demand may change dynamically in wireless environments, which increases the complexity of network management. Therefore, a scheme that can handle both localization uncertainty and dynamic radio resource allocation is required. Moreover, since localization uncertainty influences network performance, more advanced localization methods, such as vision-aided localization, are expected to reduce the localization error. In this thesis, we propose two algorithms for joint radio resource allocation and beam management in 5G mmWave networks, namely UK-means-based Clustering and Deep Reinforcement Learning-based resource allocation (UK-DRL) and UK-medoids-based Clustering and Deep Reinforcement Learning-based resource allocation (UKM-DRL). Specifically, we deploy the UK-means and UK-medoids clustering methods in UK-DRL and UKM-DRL, respectively, which are designed to handle clustering under location uncertainty. Meanwhile, we apply Deep Reinforcement Learning (DRL) for intra-beam radio resource allocation in UK-DRL and UKM-DRL. Moreover, to improve localization accuracy, we develop a vision-aided localization scheme in which pixel-characteristics-based features are extracted from satellite images as additional input features for location prediction. The simulations show that UK-DRL and UKM-DRL improve network performance in data rate and delay compared with baseline algorithms. When the traffic load is 4 Mbps, UK-DRL achieves a 172.4% improvement in sum rate and a 64.1% improvement in latency over K-means-based Clustering and Deep Reinforcement Learning-based resource allocation (K-DRL). UKM-DRL has 17.2% higher throughput and 7.7% lower latency than UK-DRL, and 231% higher throughput and 55.8% lower latency than K-DRL. On the other hand, the vision-aided localization scheme significantly reduces the localization error, from 17.11 meters to 3.6 meters.
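As an illustrative aside (not from the thesis), the sketch below shows the general idea of UK-means-style clustering under location uncertainty: each user's uncertain position is represented by Monte Carlo samples, and users are assigned to the beam centroid with the smallest expected squared distance. The user count, sample count, and noise level are assumptions made for this example.

```python
import numpy as np

def uk_means(user_samples, n_beams, n_iters=20, seed=0):
    """user_samples: (n_users, n_samples, 2) array of sampled 2D positions."""
    rng = np.random.default_rng(seed)
    expected_pos = user_samples.mean(axis=1)                       # (n_users, 2)
    centroids = expected_pos[rng.choice(len(expected_pos), n_beams, replace=False)]
    for _ in range(n_iters):
        # Expected squared distance from each uncertain user to each centroid.
        diff = user_samples[:, :, None, :] - centroids[None, None, :, :]
        exp_dist = (diff ** 2).sum(-1).mean(axis=1)                # (n_users, n_beams)
        labels = exp_dist.argmin(axis=1)
        for k in range(n_beams):
            if np.any(labels == k):
                centroids[k] = expected_pos[labels == k].mean(axis=0)
    return labels, centroids

# Usage: 30 users with Gaussian location uncertainty, grouped into 4 beams.
rng = np.random.default_rng(1)
true_pos = rng.uniform(0, 100, size=(30, 2))
samples = true_pos[:, None, :] + rng.normal(0, 5.0, size=(30, 50, 2))
labels, centroids = uk_means(samples, n_beams=4)
print(labels)
```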
23

Navigation of Mobile Robots in Human Environments with Deep Reinforcement Learning / Navigering av mobila robotar i mänskliga miljöer med deep reinforcement learning

Coors, Benjamin January 2016 (has links)
For mobile robots which operate in human environments, it is not sufficient to simply travel to their target destination as quickly as possible. Instead, mobile robots in human environments need to travel to their destination safely, keeping a comfortable distance from humans and not colliding with any obstacles along the way. As the number of possible human-robot interactions is very large, defining a rule-based navigation approach is difficult in such highly dynamic environments. Current approaches solve this task by predicting the trajectories of humans in the scene and then planning a collision-free path. However, this requires separate components for detecting and predicting human motion and does not scale well to densely populated environments. Therefore, this work investigates the use of deep reinforcement learning for the navigation of mobile robots in human environments. This approach is based on recent research on utilizing deep neural networks in reinforcement learning to successfully play Atari 2600 video games at a human level. A deep convolutional neural network is trained end-to-end from one-dimensional laser scan data to command velocities. Discrete and continuous action space implementations are evaluated in a simulation and are shown to outperform a Social Force Model baseline approach on the navigation problem for mobile robots in human environments.
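As an illustrative aside (not from the thesis), the following minimal sketch shows the kind of network described above for the discrete-action case: a 1D convolutional net that maps a laser scan directly to Q-values over a small set of command velocities. The layer sizes and the action set are assumptions.

```python
import torch
import torch.nn as nn

class LaserDQN(nn.Module):
    """1D conv net: laser scan -> Q-values over discrete command velocities."""
    def __init__(self, n_beams=180, n_actions=5):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
        )
        with torch.no_grad():
            flat = self.conv(torch.zeros(1, 1, n_beams)).numel()
        self.head = nn.Sequential(
            nn.Flatten(), nn.Linear(flat, 128), nn.ReLU(), nn.Linear(128, n_actions)
        )

    def forward(self, scan):               # scan: (batch, n_beams)
        return self.head(self.conv(scan.unsqueeze(1)))

# Each discrete action indexes an assumed (linear, angular) velocity pair.
ACTIONS = [(0.5, 0.0), (0.3, 0.5), (0.3, -0.5), (0.0, 1.0), (0.0, -1.0)]
q_net = LaserDQN()
scan = torch.rand(1, 180)                  # one simulated laser scan
v, w = ACTIONS[q_net(scan).argmax(dim=1).item()]
print("command velocity:", v, w)
```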
24

Exploiting Multi-Modal Fusion for Urban Autonomous Driving Using Latent Deep Reinforcement Learning

Khalil, Yasser 29 April 2022 (has links)
Human driving decisions are the leading cause of road fatalities. Autonomous driving naturally eliminates such incompetent decisions and can thus improve traffic safety and efficiency. Deep reinforcement learning (DRL) has shown great potential in learning complex tasks, and researchers have recently investigated various DRL-based approaches for autonomous driving. However, exploiting multi-modal fusion to generate pixel-wise perception and motion prediction, and then leveraging these predictions to train a latent DRL algorithm, has not been targeted yet. Unlike other DRL algorithms, a latent DRL algorithm separates representation learning from task learning, enhancing sampling efficiency for reinforcement learning. In addition, supplying the latent DRL algorithm with accurate perception and motion prediction simplifies the surrounding urban scenes, improving training and thus learning a better driving policy. To that end, this Ph.D. research initially develops LiCaNext, a novel real-time multi-modal fusion network that produces accurate joint perception and motion prediction at the pixel level. Our proposed approach relies solely on a LIDAR sensor, whose multi-modal input is composed of bird's-eye view (BEV), range view (RV), and range residual images. Further, this Ph.D. thesis proposes leveraging these predictions, together with another simple BEV image, to train a sequential latent maximum entropy reinforcement learning (MaxEnt RL) algorithm. A sequential latent model is deployed to learn a more compact latent representation from high-dimensional inputs. Subsequently, the MaxEnt RL model trains on this latent space to learn a driving policy. The proposed LiCaNext is trained on the public nuScenes dataset. Results demonstrate that LiCaNext operates in real time and performs better than the state of the art in perception and motion prediction, especially for small and distant objects. Furthermore, simulation experiments are conducted on CARLA to evaluate the performance of our proposed approach, which exploits LiCaNext predictions to train the sequential latent MaxEnt RL algorithm. The simulation experiments show that our proposed approach learns a better driving policy, outperforming other prevalent DRL-based algorithms. The learned driving policy achieves the objectives of safety, efficiency, and comfort. Experiments also reveal that the learned policy maintains its effectiveness under different environments and varying weather conditions.
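As an illustrative aside (not from the thesis), the compact sketch below shows the two-stage idea emphasized in this abstract: an encoder learns a low-dimensional latent representation from high-dimensional image-like inputs (here a stack of BEV-style channels), and a separate MaxEnt-style stochastic policy is trained on that latent space. Channel counts, latent size, and action dimension are assumptions.

```python
import torch
import torch.nn as nn

class LatentEncoder(nn.Module):
    """Representation learning: high-dimensional input -> compact latent."""
    def __init__(self, in_channels=4, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim),
        )
    def forward(self, x):
        return self.net(x)

class GaussianPolicy(nn.Module):
    """Task learning: MaxEnt-style stochastic policy over continuous actions."""
    def __init__(self, latent_dim=64, action_dim=2):
        super().__init__()
        self.mu = nn.Linear(latent_dim, action_dim)
        self.log_std = nn.Linear(latent_dim, action_dim)
    def forward(self, z):
        mu, log_std = self.mu(z), self.log_std(z).clamp(-5, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        action = torch.tanh(dist.rsample())   # squashed continuous driving action
        return action, dist

encoder, policy = LatentEncoder(), GaussianPolicy()
obs = torch.rand(1, 4, 96, 96)                # fused perception channels (assumed)
z = encoder(obs)                              # representation learning output
action, _ = policy(z)                         # task learning on the latent space
print(action)
```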
25

Neural Approaches for Syntactic and Semantic Analysis / 構文・意味解析に対するニューラルネットワークを利用した手法

Kurita, Shuhei 25 March 2019 (has links)
Kyoto University / 0048 / New-system doctoral program / Doctor of Informatics / Kou No. 21911 / Joho No. 694 / Shinsei||Jo||119 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Sadao Kurohashi, Professor Hisashi Kashima, Associate Professor Daisuke Kawahara / Meets Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM
26

Training an Artificial Bat: Modeling Sonar-based Obstacle Avoidance using Deep-reinforcement Learning

Mohan, Adithya Venkatesh January 2020 (has links)
No description available.
27

Connected and Automated Traffic Control at Signalized Intersections under Mixed-autonomy Environments

Guo, Yi January 2020 (has links)
No description available.
28

Performance Enhancement of Aerial Base Stations via Reinforcement Learning-Based 3D Placement Techniques

Parvaresh, Nahid 21 December 2022 (has links)
Deploying unmanned aerial vehicles (UAVs) as aerial base stations (BSs) to assist terrestrial connectivity has drawn significant attention in recent years. UAV-BSs can quickly take over as service providers during natural disasters and many other emergency situations when ground BSs fail in an unanticipated manner. UAV-BSs can also provide a cost-effective Internet connection to users who are outside infrastructure coverage. UAV-BSs benefit from their inherent mobility, which enables them to change their 3D locations as the demand of ground users changes. To make effective use of this mobility in a dynamic network and maximize performance, the 3D location of UAV-BSs should be continuously optimized. However, the location optimization problem for UAV-BSs is NP-hard, with no optimal solution obtainable in polynomial time, so near-optimal solutions have to be exploited. Besides conventional heuristic solutions, machine learning (ML), specifically reinforcement learning (RL), has emerged as a promising approach to the positioning problem of UAV-BSs. The common practice when optimizing the 3D location of UAV-BSs with RL algorithms is to assume fixed locations for the ground users (UEs) and to define a discrete, limited action space for the RL agent. In this thesis, we improve the location optimization of UAV-BSs in two ways: (1) taking user mobility into account in the design of the RL algorithm, and (2) extending the RL action space to a continuous action space so that the UAV-BS agent can flexibly change its location by any distance (limited by the maximum speed of the UAV-BS). Three types of RL algorithms, namely Q-learning (QL), deep Q-learning (DQL), and actor-critic deep Q-learning (ACDQL), are employed in this thesis to progressively improve the performance results of a UAV-assisted cellular network. QL is the first algorithm we use for the autonomous movement of the UAV-BS in the presence of mobile users. As a solution to the limitations of QL, we next propose a DQL-based strategy for the location optimization of the UAV-BS, which largely improves the performance of the network compared to the QL-based model. Third, we propose an ACDQL-based solution for autonomously moving the UAV-BS in a continuous action space, whose performance significantly outperforms both the QL and DQL strategies.
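As an illustrative aside (not from the thesis), the toy sketch below shows the discrete Q-learning step described above: a UAV-BS agent moves on a grid in the presence of mobile users and updates a tabular Q-function. The grid size, the reward proxy (mean distance to users as a stand-in for network performance), and the user mobility model are assumptions.

```python
import numpy as np

GRID, ACTIONS = 10, [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # N, S, E, W, hover
rng = np.random.default_rng(0)
Q = np.zeros((GRID, GRID, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.95, 0.1

def reward(pos, users):
    # Illustrative proxy for network performance: closer users, higher reward.
    return -np.linalg.norm(users - pos, axis=1).mean()

users = rng.uniform(0, GRID, size=(5, 2))
pos = np.array([GRID // 2, GRID // 2])
for step in range(20000):
    s = tuple(pos)
    a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(Q[s].argmax())
    pos = np.clip(pos + ACTIONS[a], 0, GRID - 1)                   # move the UAV-BS
    users = np.clip(users + rng.normal(0, 0.05, users.shape), 0, GRID - 1)  # user mobility
    r = reward(pos, users)
    Q[s][a] += alpha * (r + gamma * Q[tuple(pos)].max() - Q[s][a]) # Q-learning update

print("best learned cell:", np.unravel_index(Q.max(axis=2).argmax(), (GRID, GRID)))
```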
29

A Deep Reinforcement Learning Framework for Optimizing Fuel Economy of Vehicles

Toor, Omar Zia January 2022 (has links)
Machine learning (ML) has experienced immense growth in the last decade; applications of ML are found in nearly every sphere of society. We gather vast amounts of data during our day-to-day operations, and machines use this data to make intelligent decisions for us. Transport is an essential part of our lives: it has been a critical human need since early times, and its importance has grown tremendously as we have progressed. Nowadays, we cannot imagine life without some means of transportation. The vehicle is the primary means of transportation in the modern industrial world, and most of the world's population uses a vehicle for its transportation needs. The internal combustion engine (ICE) has been the vehicle's primary power source since the early days; approximately 99% of world transport uses internal-combustion-engine-based vehicles. As with every other device, internal combustion technology has seen enormous improvement during the past half century, and these gradual improvements have made ICEs better than ever before. But as we have progressed, we have also become aware of the harms of combustion-based power sources like ICEs: their effect on the environment has been damaging. The way forward is to limit the use of hydrocarbon-based fuels for combustion. Since it is nearly impossible to eradicate the use of these fuels instantly, as most of the population depends heavily on them, the best way to proceed is to find alternative means of transportation such as EVs. But while these technologies develop, it is important to use modern ML-based techniques to make current ICE-based vehicles as fuel-efficient as possible, so that the transition to electric vehicles is smooth and the environmental effect of ICEs is minimized. This thesis presents a deep reinforcement learning (DRL)-based technique that can develop an optimal policy to effectively reduce vehicle fuel consumption by providing intelligent driving inputs to the vehicle. The study shows that DRL methods can be used to improve fuel efficiency in a simulated vehicular environment. Moreover, the models appear to develop new types of operational vehicle policies that are very unconventional but result in improved fuel efficiency in simulated environments. Although these policies cannot be implemented in their current form in the real environment, modified versions may be able to work in real-life scenarios.
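As an illustrative aside (not from the thesis), a minimal, hypothetical reward-shaping sketch for the kind of objective described above: the agent is rewarded for distance covered and penalized for fuel burned in each control step. The weighting and signal names are assumptions, not values from the thesis.

```python
def fuel_economy_reward(distance_m, fuel_ml, fuel_weight=0.5):
    """Trade off progress against fuel use over one control step (illustrative only)."""
    return distance_m - fuel_weight * fuel_ml

# Example: 12 m travelled while burning 4 ml of fuel in one step.
print(fuel_economy_reward(12.0, 4.0))   # 10.0
```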
30

Deep Reinforcement Learning for Dynamic Grasping

Ström, Andreas January 2022 (has links)
Dynamic grasping is the task of manipulating the position of a moving object in space using only contact force. Doing so with a robot is a complex task in itself, but one with wide-ranging applications. Today, many processes in society are being rapidly automated, and in many of these processes the grasping of objects has a natural place. This work explores the use of deep reinforcement learning techniques for dynamic grasping in a simulated environment. Deep Deterministic Policy Gradient was chosen and evaluated for the task, both by itself and in combination with the Hindsight Experience Replay buffer. The reinforcement learning agent observed the initial state of the target object and the robot in the environment, simulated using AGX Dynamics, and then determined which position to move to and at what speed. The agent's chosen action was relayed to ABB's virtual controller, which controlled the robot in the simulation. This meant that the agent was tasked with parametrizing, in advance, a predefined set of instructions to the robot in such a way that the moving target object would be grasped and picked up. Doing it in this manner, as opposed to having the agent continuously control the robot, was a necessary challenge that made it possible to utilize the intelligence already created for the virtual controller. It also means that transferring what an agent learns in a simulated environment to a real-world environment becomes easier. The accuracy of the target policy for the simpler agent was 99.07%, while the accuracy of the agent with the more advanced replay buffer reached 99.30%. These results show promise for the future, both because we expect further fine-tuning to raise them even more and because they indicate that deep reinforcement learning methods can be highly applicable to the robotics systems of today.
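As an illustrative aside (not from the thesis), the sketch below shows the one-shot parametrization described above in the style of a DDPG actor: from the initial state of the target object and the robot, the network outputs the parameters of a predefined grasp instruction (a goal position plus an approach speed) in a single step. The observation and action dimensions and the scaling bounds are assumptions.

```python
import torch
import torch.nn as nn

class GraspActor(nn.Module):
    """DDPG-style actor: initial state -> (x, y, z, speed) grasp parameters."""
    def __init__(self, obs_dim=12, act_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, act_dim), nn.Tanh(),       # normalized actions in [-1, 1]
        )
        # Scale normalized outputs to assumed workspace coordinates (m) and speed (m/s).
        self.scale = torch.tensor([0.5, 0.5, 0.3, 1.0])
        self.offset = torch.tensor([0.0, 0.0, 0.3, 1.0])

    def forward(self, obs):
        return self.net(obs) * self.scale + self.offset

actor = GraspActor()
obs = torch.rand(1, 12)            # initial object pose + robot state (simulated)
x, y, z, speed = actor(obs).squeeze(0).tolist()
print(f"move to ({x:.2f}, {y:.2f}, {z:.2f}) at {speed:.2f} m/s")
```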
