91

Learning from Immediate and Delayed Rewards

Cotet, Miruna Gabriela January 2021 (has links)
No description available.
92

Mutual Reinforcement Learning

Reid, Cameron 05 1900 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Mutual learning is an emerging field in intelligent systems that takes inspiration from naturally intelligent agents and explores how agents can communicate and cooperate to share information and learn more quickly. While agents in many biological systems have little trouble learning from one another, it is not immediately obvious how artificial agents would achieve similar learning. In this thesis, I explore how agents learn to interact with complex systems. I further explore how these complex learning agents may be able to transfer knowledge to one another to improve their learning performance when they are learning together and have the power of communication. While significant research has been done on the problem of knowledge transfer, the existing literature is concerned either with supervised learning tasks or with relatively simple discrete reinforcement learning. The work presented here is, to my knowledge, the first which admits continuous state spaces and deep reinforcement learning techniques. The first contribution of this thesis, presented in Chapter 2, is a modified version of deep Q-learning which demonstrates improved learning performance due to the addition of a mutual learning term that penalizes disagreement between mutually learning agents. The second contribution, presented in Chapter 3, describes effective communication between agents that use fundamentally different knowledge representations and systems of learning (model-free deep Q-learning and model-based adaptive dynamic programming), and discusses how the agents can mathematically negotiate their trust in one another to achieve superior learning performance. I conclude with a discussion of the promise shown by this area of research and of problems which I believe are exciting directions for future research.
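
The abstract does not spell out the disagreement term; the following is a minimal sketch of how such a penalty could be added to a standard deep Q-learning loss, assuming a PyTorch setup in which each agent holds its own Q-network. The weighting coefficient `beta` and all names here are illustrative, not taken from the thesis.

```python
import torch
import torch.nn.functional as F

def mutual_dqn_loss(q_net_a, q_net_b, target_net_a, batch, gamma=0.99, beta=0.1):
    """TD loss for agent A plus a penalty on disagreement with peer agent B.

    `beta` weights the mutual-learning term; both it and the interfaces are
    assumptions for illustration, not the thesis's exact formulation.
    """
    states, actions, rewards, next_states, dones = batch
    q_a = q_net_a(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Standard target-network bootstrap for the TD target.
        td_target = rewards + gamma * (1 - dones) * target_net_a(next_states).max(dim=1).values
        q_b = q_net_b(states)  # peer estimates are held fixed during A's update
    td_loss = F.mse_loss(q_a, td_target)
    disagreement = F.mse_loss(q_net_a(states), q_b)  # mutual-learning penalty
    return td_loss + beta * disagreement
```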
93

A Defender-Aware Attacking Guidance Policy for the TAD Differential Game

English, Jacob T. January 2020 (has links)
No description available.
94

Reinforcement learning in the presence of rare events

Frank, Jordan William, 1980- January 2009 (has links)
No description available.
95

On-policy Object Goal Navigation with Exploration Bonuses

Maia, Eric 15 August 2023 (has links)
Machine learning developments have helped overcome a wide range of issues, including robotic motion, autonomous navigation, and natural language processing. Of note are the advancements of reinforcement learning in the area of object goal navigation — the task of autonomously traveling to target objects with minimal a priori knowledge of the environment. Given the sparse placement of goals in unknown scenes, exploration is essential for reaching remote objects of interest that are not immediately visible to autonomous agents. Sparse rewards are a crucial problem in reinforcement learning that arises in object goal navigation, as positive rewards are only attained when targets are found at the end of an agent’s trajectory. As such, this work explores object goal navigation and the challenges it presents, along with the relevant reinforcement learning techniques applied to the task. An ablation study of the baseline approach for the RoboTHOR 2021 object goal navigation challenge is presented and used to guide the development of an on-policy agent that is computationally less expensive and obtains greater success in unseen environments. Then, original object goal navigation reward schemes that aggregate episodic and long-term novelty bonuses are proposed; they obtain success rates comparable to the respective object goal navigation benchmark at a fraction of the training interactions with the environment.
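
The exact reward scheme is not given in the abstract; below is a hedged sketch of one way to aggregate an episodic novelty bonus with a long-term (lifetime) one, assuming hashable (e.g. discretized) observations. The multiplicative combination and the scale constants are assumptions.

```python
from collections import Counter

class NoveltyBonus:
    """Aggregates an episodic and a long-term (lifetime) novelty bonus.

    Assumes hashable observations (e.g. discretized positions). The
    multiplicative aggregation echoes schemes like Never Give Up; whether
    the thesis combines its bonuses this way is an assumption.
    """

    def __init__(self, episodic_scale=1.0, lifetime_scale=0.1):
        self.episodic_scale = episodic_scale
        self.lifetime_scale = lifetime_scale
        self.lifetime_counts = Counter()  # persists across all episodes
        self.episode_counts = Counter()   # cleared at every episode start

    def reset_episode(self):
        self.episode_counts.clear()

    def bonus(self, obs):
        self.episode_counts[obs] += 1
        self.lifetime_counts[obs] += 1
        # Count-based bonuses decay as a state is revisited.
        episodic = self.episodic_scale / self.episode_counts[obs] ** 0.5
        lifetime = 1.0 + self.lifetime_scale / self.lifetime_counts[obs] ** 0.5
        return episodic * lifetime  # added to the sparse task reward each step
```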
96

Learning-based Optimal Control of Time-Varying Linear Systems Over Large Time Intervals

Baddam, Vasanth Reddy January 2023 (has links)
We solve the problem of two-point boundary optimal control of linear time-varying systems with unknown model dynamics using reinforcement learning. Leveraging singular perturbation theory techniques, we transform the time-varying optimal control problem into two time-invariant subproblems. This allows the utilization of an off-policy iteration method to learn the controller gains. We show that the performance of the learning-based controller approximates that of the model-based optimal controller, and that the approximation accuracy improves as the control problem’s time horizon increases. We also provide a simulation example to verify the results. / M.S. / We use reinforcement learning to find two-point boundary optimal controls for linear time-varying systems with unknown model dynamics. We divide the LTV control problem into two LTI subproblems using singular perturbation theory techniques. As a result, it is possible to identify the controller gains via a learning technique. We show that the learning-based controller’s performance approaches that of the model-based optimal controller, with approximation accuracy growing with the time horizon of the control problem. In addition, we provide a simulated scenario to back up our findings.
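
The abstract describes learning controller gains via off-policy iteration on the two time-invariant subproblems. As a point of reference, here is a minimal model-based sketch of the policy iteration that such a method approximates for a single discrete-time LTI subproblem (Hewer's algorithm); the thesis's data-driven variant learns the gains from trajectories without using A and B directly.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_iteration(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration (Hewer's method) for discrete-time LQR.

    K0 must stabilize (A, B). This is a reference sketch of the iteration
    the learning-based method approximates, not the thesis's algorithm.
    """
    K = K0
    for _ in range(iters):
        Acl = A - B @ K  # closed-loop dynamics under the current gain
        # Policy evaluation: P solves Acl' P Acl - P + (Q + K' R K) = 0.
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # Policy improvement: greedy gain with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```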
97

Robot Navigation in Cluttered Environments with Deep Reinforcement Learning

Weideman, Ryan 01 June 2019 (has links) (PDF)
The application of robotics in cluttered and dynamic environments provides a wealth of challenges. This thesis proposes a deep reinforcement learning based system that determines collision-free robot navigation velocities directly from a sequence of depth images and a desired direction of travel. The system is designed such that a real robot could be placed in an unmapped, cluttered environment and be able to navigate in a desired direction with no prior knowledge. Deep Q-learning, coupled with the innovations of double Q-learning and dueling Q-networks, is applied. Two modifications of this architecture are presented to incorporate direction heading information that the reinforcement learning agent can utilize to learn how to navigate to target locations while avoiding obstacles. The performance of these two extensions of the D3QN architecture is evaluated in simulation in simple and complex environments with a variety of common obstacles. Results show that both modifications enable the agent to successfully navigate to target locations, reaching 88% and 67% of goals in a cluttered environment, respectively.
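
The two heading-aware D3QN variants are not specified in the abstract; the sketch below shows a generic dueling Q-network head that fuses depth-image features with a goal-heading vector, assuming PyTorch. Layer sizes and the fusion point are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DuelingNavQNet(nn.Module):
    """Dueling Q-network over depth-image features plus a goal heading.

    The concatenation point for the heading input and all layer sizes are
    assumptions; the thesis evaluates two different fusion variants.
    """

    def __init__(self, feat_dim=512, heading_dim=2, n_actions=5):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(feat_dim + heading_dim, 256), nn.ReLU())
        self.value = nn.Linear(256, 1)               # state value V(s)
        self.advantage = nn.Linear(256, n_actions)   # action advantages A(s, a)

    def forward(self, image_features, heading):
        h = self.fc(torch.cat([image_features, heading], dim=1))
        v, a = self.value(h), self.advantage(h)
        # Dueling aggregation: Q = V + (A - mean(A)) for identifiability.
        return v + a - a.mean(dim=1, keepdim=True)
```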
98

Influencing Exploration in Actor-Critic Reinforcement Learning Algorithms

Gough, Andrew R 01 June 2018 (has links) (PDF)
Reinforcement Learning (RL) is a subset of machine learning primarily concerned with goal-directed learning and optimal decision making. RL agents learn from a reward signal discovered through trial and error in complex, uncertain environments, with the goal of maximizing positive reward signals. RL approaches need to scale up as they are applied to more complex environments with extremely large state spaces. Inefficient exploration methods cannot sufficiently explore complex environments in a reasonable amount of time; optimal policies will go unrealized, and RL agents will fail to solve an environment. This thesis proposes a novel variant of the Advantage Actor-Critic (A2C) algorithm. The variant is validated against two state-of-the-art RL algorithms, Deep Q-Network (DQN) and A2C, across six Atari 2600 games of varying difficulty. The experimental results are competitive with the state of the art and achieve lower variance and quicker learning. Additionally, the thesis introduces a metric to objectively quantify the difficulty of any Markovian environment with respect to the exploratory capacity of RL agents.
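
The thesis's exploration mechanism is not described in the abstract. For context, this sketch shows the standard A2C loss with the usual entropy bonus, which is the baseline knob for influencing exploration in actor-critic methods; the coefficients are conventional defaults, not the thesis's values or its proposed variant.

```python
import torch
import torch.nn.functional as F

def a2c_loss(logits, values, actions, returns, entropy_coef=0.01, value_coef=0.5):
    """Standard A2C loss with an entropy bonus encouraging exploration.

    `entropy_coef` is the usual exploration knob; the thesis's variant
    modifies exploration beyond this baseline formulation.
    """
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()             # advantage estimates
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = F.mse_loss(values, returns)           # critic regression
    entropy = dist.entropy().mean()                    # higher entropy = more exploration
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```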
99

Machine Translation For Machines

Tebbifakhr, Amirhossein 25 October 2021 (has links)
Traditionally, Machine Translation (MT) systems are developed by targeting fluency (i.e. output grammaticality) and adequacy (i.e. semantic equivalence with the source text), criteria that reflect the needs of human end-users. However, recent advancements in Natural Language Processing (NLP) and the introduction of NLP tools in commercial services have opened new opportunities for MT. A particularly relevant one is the application of NLP technologies in low-resource language settings, for which the paucity of training data reduces the possibility of training reliable services. In this specific condition, MT can come into play by enabling the so-called “translation-based” workarounds. The idea is simple: first, input texts in the low-resource language are translated into a resource-rich target language; then, the machine-translated text is processed by well-trained NLP tools in the target language; finally, the output of these downstream components is projected back to the source language. This results in a new scenario, in which the end-user of MT technology is no longer a human but another machine. We hypothesize that current MT training approaches are not optimal for this setting, in which the objective is to maximize the performance of a downstream tool fed with machine-translated text rather than human comprehension. Under this hypothesis, this thesis introduces a new research paradigm, which we named “MT for machines”, addressing a number of questions that arise from this novel view of the MT problem. Are there different quality criteria for humans and machines? What makes a good translation from the machine standpoint? What are the trade-offs between the two notions of quality? How to pursue machine-oriented objectives? How to serve different downstream components with a single MT system? How to exploit knowledge transfer to operate in different language settings with a single MT system? Elaborating on these questions, this thesis: i) introduces a novel and challenging MT paradigm, ii) proposes an effective method based on Reinforcement Learning, analysing its possible variants, iii) extends the proposed method to multitask and multilingual settings so as to serve different downstream applications and languages with a single MT system, iv) studies the trade-off between machine-oriented and human-oriented criteria, and v) discusses the successful application of the approach in two real-world scenarios.
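
As a rough illustration of the machine-oriented objective, the sketch below rewards sampled translations by the accuracy of a downstream target-language classifier and applies a REINFORCE update, assuming a PyTorch MT model with a hypothetical sampling API. The reward definition and all interfaces are assumptions, not the thesis's exact method.

```python
def machine_oriented_reward(translation, gold_label, downstream_model):
    """Score a translation by downstream utility rather than human adequacy.

    `downstream_model` stands for any well-trained target-language NLP tool
    (e.g. a sentiment classifier); the 0/1 task-accuracy reward used here is
    an illustrative assumption.
    """
    return 1.0 if downstream_model(translation) == gold_label else 0.0

def reinforce_step(mt_model, optimizer, source, gold_label, downstream_model):
    """One REINFORCE update: sample a translation, score it, scale log-probs.

    Assumes a PyTorch MT model exposing a hypothetical sample() method that
    returns the sampled translation and its total log-probability tensor.
    """
    translation, log_prob = mt_model.sample(source)
    reward = machine_oriented_reward(translation, gold_label, downstream_model)
    loss = -reward * log_prob  # policy-gradient objective, no baseline
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```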
100

Evolutionary Optimization of Decision Trees for Interpretable Reinforcement Learning

Custode, Leonardo Lucio 27 April 2023 (has links)
While Artificial Intelligence (AI) is making giant steps, it is also raising concerns about its trustworthiness, because the widely-used black-box models cannot be exactly understood by humans. One of the ways to improve humans’ trust in AI is to use interpretable AI models, i.e., models that can be thoroughly understood by humans and thus trusted. However, interpretable AI models are not typically used in practice, as they are thought to perform worse than black-box models. This is especially evident in Reinforcement Learning, where relatively little work addresses the problem of performing Reinforcement Learning with interpretable models. In this thesis, we address this gap, proposing methods for Interpretable Reinforcement Learning. For this purpose, we optimize Decision Trees by combining Reinforcement Learning with Evolutionary Computation techniques, which allows us to overcome some of the challenges tied to optimizing Decision Trees in Reinforcement Learning scenarios. The experimental results show that these approaches are competitive with state-of-the-art scores while being far easier to interpret. Finally, we show the practical importance of Interpretable AI by digging into the inner workings of the solutions obtained.
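
A minimal sketch of the general recipe, scoring decision-tree policies by episodic return and evolving them, is shown below. The environment interface, the mutation operator, and the elitist selection scheme are assumptions and differ from the thesis's specific evolutionary operators.

```python
import random

def evaluate(tree_policy, env, episodes=5):
    """Fitness = mean episodic return of a decision-tree policy.

    Assumes a minimal env interface where reset() returns an observation and
    step(action) returns (observation, reward, done), and a tree_policy with
    an act(observation) method; both interfaces are illustrative.
    """
    total = 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done = env.step(tree_policy.act(obs))
            total += reward
    return total / episodes

def evolve(population, env, generations=50, elite_frac=0.2):
    """Elitist evolutionary loop: keep the best trees, refill by mutation.

    A generic sketch of coupling evolutionary search with RL episode returns;
    assumes each tree exposes a hypothetical mutate() method returning a
    modified copy.
    """
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: evaluate(t, env), reverse=True)
        n_elite = max(1, int(elite_frac * len(ranked)))
        elites = ranked[:n_elite]
        offspring = [random.choice(elites).mutate() for _ in range(len(ranked) - n_elite)]
        population = elites + offspring
    return population[0]
```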
