About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Cost and Power Loss Aware Coalitions under Uncertainty in Transactive Energy Systems

Sadeghi, Mohammad 02 June 2022 (has links)
The need to cope with the rapid transformation of the conventional electrical grid into the future smart grid, with multiple connected microgrids, has led to the investigation of optimal smart grid architectures. The main components of the future smart grid, such as generators, substations, controllers, smart meters and collector nodes, are evolving; however, truly effective integration of these elements into the microgrid context, so as to guarantee intelligent and dynamic functionality across the whole smart grid, remains an open issue. Energy trading is a significant part of this integration. In microgrids, energy trading refers to the use of surplus energy in one microgrid to satisfy the demand of another microgrid or a group of microgrids that form a microgrid community. Different techniques are employed to manage the energy trading process, such as optimization-based and conventional game-theoretical methods, which bring about several challenges including complexity, scalability and the ability to learn in dynamic environments. A common challenge among all of these methods is adapting to changing circumstances. Optimization methods, for example, show promising performance in static scenarios, where the optimal solution is computed for a specific snapshot of the system. To use such a technique in a dynamic environment, however, an optimal solution must be found for every time slot, which imposes significant complexity. Challenges such as this are best addressed using game theory techniques empowered with machine learning methods across grid infrastructure and microgrid communities. In this thesis, novel Bayesian coalitional game theory-based and Bayesian reinforcement learning-based coalition formation algorithms are proposed, which allow the microgrids to exchange energy with their coalition members while minimizing the associated cost and power loss. In addition, a deep reinforcement learning scheme is developed to address the long convergence times caused by the sizeable state-action space of the methods mentioned above. The proposed algorithms are designed to cope with the uncertainty in the system. The advantages of the proposed methods are highlighted by comparing them with conventional coalitional game theory-based techniques, a Q-learning-based technique, random coalition formation, and the case with no coalitions. The results show the superiority of the proposed methods in terms of power loss and cost minimization in dynamic environments.
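As a rough illustration of the Q-learning baseline this abstract compares against (not the thesis's Bayesian coalitional formulation), a microgrid agent can treat the choice of which coalition to join as a discrete action and update a tabular Q-function from observed cost and power-loss feedback. All names, the reward shape, and the hyperparameters below are illustrative assumptions.

```python
import random
from collections import defaultdict

# Sketch of a tabular Q-learning coalition chooser (assumed setup, not the
# thesis's Bayesian method). Reward is assumed to combine -cost and -power_loss.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

def choose_coalition(q, state, coalitions):
    """Epsilon-greedy choice among candidate coalitions."""
    if random.random() < EPSILON:
        return random.choice(coalitions)
    return max(coalitions, key=lambda c: q[(state, c)])

def update(q, state, coalition, reward, next_state, coalitions):
    """Standard Q-learning update toward the best next-state value."""
    best_next = max(q[(next_state, c)] for c in coalitions)
    q[(state, coalition)] += ALPHA * (reward + GAMMA * best_next - q[(state, coalition)])

q = defaultdict(float)  # Q-values indexed by (state, coalition) pairs
```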
32

Corporate classifier systems

Tomlinson, Andrew Stephen January 1999 (has links)
No description available.
33

QoE-Fair Video Streaming over DASH

Altamimi, Sadi 19 December 2018 (has links)
Video streaming has become, and is expected to remain, the dominant type of traffic over the Internet. With this high demand for multimedia streaming, there is always a question of how to provide acceptable and fair Quality of Experience (QoE) to consumers of over-the-top video services, despite the best-effort nature of the Internet and the limited network resources shared by concurrent users. MPEG-DASH, one of the most widely used standards for HTTP-based adaptive streaming, relies on client-side rate adaptation algorithms, which are known to suffer from two practical challenges. On the one hand, clients use fixed heuristics that have been fine-tuned under strict assumptions about deployment environments, which limits their ability to generalize across network conditions. On the other hand, the absence of collaboration among DASH clients leads to unfair bandwidth allocation and typically ends up in an unbalanced equilibrium point. We believe that adding server-side rate adaptation significantly improves the fairness of network bandwidth allocation among concurrent users. We formulated the problem as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP) and used reinforcement learning (RL) to train two neural networks that find an optimal solution to the proposed Dec-POMDP problem in a distributed way. We showed that our proposed client-server collaboration outperforms state-of-the-art schemes in terms of QoE-efficiency, QoE-fairness, and social welfare by as much as 16%, 21%, and 24%, respectively.
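The QoE-fairness figure reported above is computed over per-client QoE scores; a common metric for this purpose is Jain's fairness index, sketched below under the assumption (not stated in the abstract) that a metric of this kind is used.

```python
def jains_fairness(qoe_scores):
    """Jain's fairness index over per-client QoE: 1.0 means perfectly fair."""
    n = len(qoe_scores)
    total = sum(qoe_scores)
    return total * total / (n * sum(q * q for q in qoe_scores))

# e.g. three concurrent DASH clients with unequal QoE
print(jains_fairness([4.1, 2.3, 3.8]))  # < 1.0 indicates unfairness
```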
34

Physics-based reinforcement learning for autonomous manipulation

Scholz, Jonathan 07 January 2016 (has links)
With recent research advances, the dream of bringing domestic robots into our everyday lives has become more plausible than ever. Domestic robotics has grown dramatically in the past decade, with applications ranging from house cleaning to food service to health care. To date, the majority of the planning and control machinery for these systems is carefully designed by human engineers. A large portion of this effort goes into selecting the appropriate models and control techniques for each application, and these skills take years to master. Relieving the burden on human experts is therefore a central challenge for bringing robot technology to the masses. This work addresses this challenge by introducing a physics engine as a model space for an autonomous robot, and defining procedures for enabling robots to decide when and how to learn these models. We also present an appropriate space of motor controllers for these models, and introduce ways to intelligently select when to use each controller based on the estimated model parameters. We integrate these components into a framework called Physics-Based Reinforcement Learning, which features a stochastic physics engine as the core model structure. Together these methods enable a robot to adapt to unfamiliar environments without human intervention. The central focus of this thesis is on fast online model learning for objects with under-specified dynamics. We develop our approach across a diverse range of domestic tasks, starting with a simple table-top manipulation task, followed by a mobile manipulation task involving a single utility cart, and finally an open-ended navigation task with multiple obstacles impeding robot progress. We also present simulation results illustrating the efficiency of our method compared to existing approaches in the learning literature.
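A core step in this style of physics-based RL is fitting the free parameters of a physics model (e.g., friction, mass) to observed transitions. Below is a minimal sketch assuming a toy 1-D sliding object and a least-squares fit; the thesis itself uses a stochastic physics engine, not this toy model.

```python
import numpy as np

# Toy stand-in for online model learning: estimate a friction coefficient mu
# from observed (v, v') pairs, assuming v' = v - mu * g * dt.
G, DT = 9.81, 0.05

def fit_friction(velocities, next_velocities):
    """Least-squares fit of mu from v - v' = mu * g * dt."""
    decel = np.asarray(velocities) - np.asarray(next_velocities)
    return float(np.mean(decel) / (G * DT))

# Simulated noisy observations with true mu = 0.3
v = np.random.uniform(0.5, 2.0, size=100)
v_next = v - 0.3 * G * DT + np.random.normal(0, 0.01, size=100)
print(fit_friction(v, v_next))  # approximately 0.3
```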
35

Counter Autonomy Defense for Aerial Autonomous Systems

Mark E Duntz (8747724) 22 April 2020 (has links)
Here, we explore methods of counter autonomy defense for aerial autonomous multi-agent systems. First, we make the case for the vast capabilities these systems enable. Recognizing that widespread use is likely on the horizon, we assert that system designers must give appropriate attention to the security and vulnerabilities of such systems. We propose a method of learning-based resilient control for the multi-agent formation tracking problem, which uses reinforcement learning and neural networks to attenuate adversarial inputs and ensure proper operation. We also devise a learning-based method of cyber-physical attack detection for UAVs, which requires no formal system dynamics model yet learns to recognize abnormal behavior. We also utilize similar techniques for time signal analysis to achieve epileptic seizure prediction. Finally, a blockchain-based method for network security in the presence of Byzantine agents is explored.
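As a rough sketch of the model-free attack-detection idea mentioned above (the thesis's actual detector architecture is not described in this abstract), abnormal behavior in a telemetry signal can be flagged by thresholding the residuals of a one-step predictor; a moving average stands in for the learned predictor here.

```python
import numpy as np

def detect_anomalies(signal, window=5, k=3.0):
    """Flag samples whose deviation from a one-step prediction exceeds
    k standard deviations of the residuals seen so far."""
    x = np.asarray(signal, dtype=float)
    flags = np.zeros(len(x), dtype=bool)
    residuals = []
    for t in range(window, len(x)):
        pred = x[t - window:t].mean()  # stand-in for a learned predictor
        res = abs(x[t] - pred)
        if len(residuals) > 10 and res > k * (np.std(residuals) + 1e-9):
            flags[t] = True
        residuals.append(res)
    return flags
```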
36

Regret Minimization in Structured Reinforcement Learning

Tranos, Damianos January 2021 (has links)
We consider a class of sequential decision making problems in the presence of uncertainty, which belongs to the field of Reinforcement Learning (RL). Specifically, we study discrete Markov Decision Processes (MDPs), which model a decision maker or agent that interacts with a stochastic and dynamic environment and receives feedback from it in the form of a reward. The agent seeks to maximize a notion of cumulative reward. Because the environment (both the system dynamics and the reward function) is unknown, the agent faces an exploration-exploitation dilemma, where it must balance exploring its available actions and exploiting what it believes to be the best one. This dilemma is captured by the notion of regret, which compares the rewards that the agent has accumulated thus far with those that would have been obtained by an optimal policy. The agent is then said to behave optimally if it minimizes its regret. This thesis investigates the fundamental regret limits that can be achieved by any agent. We derive general asymptotic and problem-specific regret lower bounds for the cases of ergodic and deterministic MDPs. We make these explicit for ergodic MDPs that are unstructured, for MDPs with Lipschitz transitions and rewards, as well as for deterministic MDPs that satisfy a decoupling property. Furthermore, we propose DEL, an algorithm that is valid for any ergodic MDP with any structure and whose regret upper bound matches the associated regret lower bounds, thus being truly optimal. For this algorithm, we present theoretical regret guarantees as well as a numerical demonstration that verifies its ability to exploit the underlying structure.
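For concreteness, the regret referred to above is conventionally defined (in standard notation, not copied from the thesis) as the gap between the reward an optimal policy would accumulate and the reward the agent actually collects:

```latex
R(T) \;=\; T\,g^{*} \;-\; \mathbb{E}\!\left[\sum_{t=1}^{T} r(s_t, a_t)\right]
```

where \(g^{*}\) is the optimal long-run average reward (gain) of the MDP, and the lower bounds discussed above characterize how slowly \(R(T)\) can be made to grow.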
37

Reinforcement-learning-based autonomous vehicle navigation in a dynamically changing environment

Ngai, Chi-kit. January 2007 (has links)
Thesis (Ph. D.)--University of Hong Kong, 2008. / Also available in print.
38

Learning successful strategies in repeated general-sum games

Crandall, Jacob W., January 2005 (has links) (PDF)
Thesis (Ph.D.)--Brigham Young University. Dept. of Computer Science, 2005. / Includes bibliographical references (p. 163-168).
39

Sequential frameworks for statistics-based value function representation in approximate dynamic programming

Fan, Huiyuan. January 2008 (has links)
Thesis (Ph.D.) -- University of Texas at Arlington, 2008.
40

Constructing Neuro-Fuzzy Control Systems Based on Reinforcement Learning Scheme

Pei, Shan-cheng 10 September 2007 (has links)
Traditionally, the fuzzy rules for a fuzzy controller are provided by experts. They cannot be learned from a set of input-output training examples, because the correct response of the plant being controlled is delayed and cannot be obtained immediately. In this paper, we propose a novel approach to constructing fuzzy rules for a fuzzy controller based on reinforcement learning. Our task is to learn from the delayed reward to choose sequences of actions that result in the best control. A neural network with delays is used to model the evaluation function Q. Fuzzy rules are constructed and added as the learning proceeds. Both the weights of the Q-learning network and the parameters of the fuzzy rules are tuned by gradient descent. Experimental results show that the fuzzy rules obtained perform effectively for control.
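As a rough sketch of the gradient-based Q update described above (the network architecture and fuzzy-rule parameterization here are illustrative assumptions, not the paper's), a single TD(0) step on a linear Q-function looks as follows.

```python
import numpy as np

# Illustrative TD(0) update of a tiny linear Q-function from delayed reward,
# in the spirit of the scheme above (not the paper's neuro-fuzzy network).
LR, GAMMA = 0.01, 0.95

def q_value(w, features):
    """Q estimate as a dot product of weights and state features."""
    return float(w @ features)

def td_update(w, s, r, s_next):
    """One step of gradient descent toward the target r + gamma * Q(s')."""
    target = r + GAMMA * q_value(w, s_next)
    error = target - q_value(w, s)
    return w + LR * error * s  # gradient of a linear Q wrt w is the features

w = np.zeros(4)  # toy weight vector over 4 state features
```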
