201

An integrated data- and capability-driven approach to the reconfiguration of agent-based production systems

Scrimieri, Daniele, Adalat, Omar, Afazov, S., Ratchev, S. 13 December 2022
Yes / Industry 4.0 promotes highly automated mechanisms for setting up and operating flexible manufacturing systems, using distributed control and data-driven machine intelligence. This paper presents an approach to reconfiguring distributed production systems based on complex product requirements, combining the capabilities of the available production resources. A method for both checking the “realisability” of a product by matching required operations and capabilities, and adapting resources is introduced. The reconfiguration is handled by a multi-agent system, which reflects the distributed nature of the production system and provides an intelligent interface to the user. This is all integrated with a self-adaptation technique for learning how to improve the performance of the production system as part of a reconfiguration. This technique is based on a machine learning algorithm that generalises from past experience on adjustments. The mechanisms of the proposed approach have been evaluated on a distributed robotic manufacturing system, demonstrating their efficacy. Nevertheless, the approach is general and it can be applied to other scenarios. / This work was supported by the SURE Research Projects Fund of the University of Bradford and the European Commission (grant agreement no. 314762). / Research Development Fund Publication Prize Award winner, Nov 2022
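The "realisability" check mentioned in this abstract can be pictured as a matching test: a product is realisable when every required operation is offered by at least one available resource. The sketch below is illustrative only. The function name `check_realisability`, the dictionary layout, and the example robots are assumptions made for this example and are not taken from the paper.

```python
def check_realisability(required_ops, resource_capabilities):
    """Match required operations against resource capabilities.

    required_ops: list of operation names the product needs (assumed format).
    resource_capabilities: dict mapping resource name -> set of operations
    it can perform (assumed format).
    Returns (realisable, assignment, missing).
    """
    assignment = {}  # operation -> sorted list of resources able to perform it
    missing = []     # operations that no resource can perform
    for op in required_ops:
        providers = sorted(
            name for name, caps in resource_capabilities.items() if op in caps
        )
        if providers:
            assignment[op] = providers
        else:
            missing.append(op)
    return (not missing, assignment, missing)


# Hypothetical production cell with two robots.
resources = {
    "robot_a": {"pick", "place"},
    "robot_b": {"screw", "place"},
}
ok, assignment, missing = check_realisability(["pick", "screw", "glue"], resources)
```

Here `missing` lists the operations that would call for adapting or adding resources before the product could be made.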
202

Exploiting Structure in Coordinating Multiple Decision Makers

Mostafa, Hala 01 September 2011
This thesis is concerned with sequential decision making by multiple agents, whether they are acting cooperatively to maximize team reward or selfishly trying to maximize their individual rewards. The practical intractability of this general problem led to efforts in identifying special cases that admit efficient computation, yet still represent a wide enough range of problems. In our work, we identify the class of problems with structured interactions, where actions of one agent can have non-local effects on the transitions and/or rewards of another agent. We addressed the following research questions: 1) How can we compactly represent this class of problems? 2) How can we efficiently calculate agent policies that maximize team reward (for cooperative agents) or achieve equilibrium (for self-interested agents)? 3) How can we exploit structured interactions to make reasoning about communication offline tractable? For representing our class of problems, we developed a new decision-theoretic model, Event-Driven Interactions with Complex Rewards (EDI-CR), that explicitly represents structured interactions. EDI-CR is a compact yet general representation capable of capturing problems where the degree of coupling among agents ranges from complete independence to complete dependence. For calculating agent policies, we draw on several techniques from the field of mathematical optimization and adapt them to exploit the special structure in EDI-CR. We developed a Mixed Integer Linear Program formulation of EDI-CR with cooperative agents that results in programs much more compact and faster to solve than formulations ignoring structure. We also investigated the use of homotopy methods as an optimization technique, as well as the formulation of self-interested EDI-CR as a system of non-linear equations. We looked at the issue of communication in both cooperative and self-interested settings.
For the cooperative setting, we developed heuristics that assess the impact of potential communication points and add the ones with the highest impact to the agents' decision problems. Our heuristics successfully pick communication points that improve team reward while keeping problem size manageable. Also, by controlling the amount of communication introduced by a heuristic, our approach allows us to control the tradeoff between solution quality and problem size. For self-interested agents, we look at an example setting where communication is an integral part of problem solving, but where the self-interested agents have a reason to be reticent (e.g. privacy concerns). We formulate this problem as a game of incomplete information and present a general algorithm for calculating an approximate equilibrium profile in this class of games.
203

Scaling Multi-Agent Learning in Complex Environments

Zhang, Chongjie 01 September 2011
Cooperative multi-agent systems (MAS) are finding applications in a wide variety of domains, including sensor networks, robotics, distributed control, collaborative decision support systems, and data mining. A cooperative MAS consists of a group of autonomous agents that interact with one another in order to optimize a global performance measure. A central challenge in cooperative MAS research is to design distributed coordination policies. Designing optimal distributed coordination policies offline is usually not feasible for large-scale complex multi-agent systems, where 10s to 1000s of agents are involved, there is limited communication bandwidth and communication delay between agents, agents have only limited partial views of the whole system, etc. This infeasibility is either due to a prohibitive cost to build an accurate decision model, or a dynamically evolving environment, or the intractable computation complexity. This thesis develops a multi-agent reinforcement learning paradigm to allow agents to effectively learn and adapt coordination policies in complex cooperative domains without explicitly building the complete decision models. With multi-agent reinforcement learning (MARL), agents explore the environment through trial and error, adapt their behaviors to the dynamics of the uncertain and evolving environment, and improve their performance through experiences. To achieve the scalability of MARL and ensure the global performance, the MARL paradigm developed in this thesis restricts the learning of each agent to using information locally observed or received from local interactions with a limited number of agents (i.e., neighbors) in the system and exploits non-local interaction information to coordinate the learning processes of agents. 
This thesis develops new MARL algorithms for agents to learn effectively with limited observations in multi-agent settings and introduces a low-overhead supervisory control framework to collect and integrate non-local information into the learning process of agents to coordinate their learning. More specifically, the contributions of already completed aspects of this thesis are as follows: Multi-Agent Learning with Policy Prediction: This thesis introduces the concept of policy prediction and augments the basic gradient-based learning algorithm to achieve two properties: best-response learning and convergence. The convergence property of multi-agent learning with policy prediction is proven for a class of static games under the assumption of full observability. MARL Algorithm with Limited Observability: This thesis develops PGA-APP, a practical multi-agent learning algorithm that extends Q-learning to learn stochastic policies. PGA-APP combines the policy gradient technique with the idea of policy prediction. It allows an agent to learn effectively with limited observability in complex domains in the presence of other learning agents. The empirical results demonstrate that PGA-APP outperforms state-of-the-art MARL techniques in both benchmark games. MARL Application in Cloud Computing: This thesis illustrates how MARL can be applied to optimizing online distributed resource allocation in cloud computing. Empirical results show that the MARL approach performs reasonably well, compared to an optimal solution, and better than a centralized myopic allocation approach in some cases. A General Paradigm for Coordinating MARL: This thesis presents a multi-level supervisory control framework to coordinate and guide the agents' learning process. This framework exploits non-local information and introduces a more global view to coordinate the learning process of individual agents without incurring significant overhead and exploding their policy space.
Empirical results demonstrate that this coordination significantly improves the speed, quality and likelihood of MARL convergence in large-scale, complex cooperative multi-agent systems. An Agent Interaction Model: This thesis proposes a new general agent interaction model. This interaction model formalizes a type of interaction among agents, called joint-event-driven interactions, and defines a measure for capturing the strength of such interactions. Formal analysis reveals the relationship between interactions between agents and the performance of individual agents and the whole system. Self-Organization for Nearly-Decomposable Hierarchy: This thesis develops a distributed self-organization approach, based on the agent interaction model, that dynamically forms a nearly decomposable hierarchy for large-scale multi-agent systems. This self-organization approach is integrated into the supervisory control framework to automatically evolve supervisory organizations that better coordinate MARL during the learning process. Empirical results show that dynamically evolving supervisory organizations can perform better than static ones. Automating Coordination for Multi-Agent Learning: We tailor our supervision framework for coordinating MARL in ND-POMDPs. By exploiting structured interaction in ND-POMDPs, this tailored approach distributes the learning of the global joint policy among supervisors and employs DCOP techniques to automatically coordinate distributed learning to ensure the global learning performance. We prove that this approach can learn a globally optimal policy for ND-POMDPs with a property called groupwise observability.
204

Multi-Agent Reinforcement Learning for Cooperative Edge Cloud Computing / 協調的エッジクラウドコンピューティングのためのマルチエージェント強化学習

Ding, Shiyao 26 September 2022
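Kyoto University / Doctorate by coursework (new system) / Doctor of Informatics / Kō No. 24261 / Jōhaku No. 805 / 新制||情||136 (University Library) / Department of Social Informatics, Graduate School of Informatics, Kyoto University / (Chief examiner) Professor Takayuki Ito, Professor Masatoshi Yoshikawa, Professor Takayuki Kanda, Program-Specific Associate Professor LIN Donghui / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM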
205

Learning Based Methods for Resilient and Enhanced Operation of Intelligent Transportation Systems

Khanapuri, Eshaan January 2022
No description available.
206

Human-in-the-Loop Control Synthesis for Multi-Agent Systems under Metric Interval Temporal Logic Specifications

Ahlberg, Sofie January 2019
With the increase of robotic presence in our homes and work environments, it has become imperative to consider human-in-the-loop systems when designing robotic controllers. This includes both the physical presence of humans and interaction on a decision and control level. One important aspect of this is to design controllers which are guaranteed to satisfy specified safety constraints. At the same time we must minimize the risk of not finding solutions, which would force the system to stop. This requires some room for relaxation of the specifications. Another aspect is to design the system to be adaptive to the human and its environment. In this thesis we approach the problem by considering control synthesis for multi-agent systems under hard and soft constraints, where the human has direct impact on how the soft constraint is violated. To handle the multi-agent structure we consider both a classical centralized automata-based framework and a decentralized approach with collision avoidance. To handle soft constraints we introduce a novel metric, the hybrid distance, which quantifies the violation. The hybrid distance consists of two types of violation: continuous distance, or missing deadlines, and discrete distance, or spatial violation. These distances are weighed against each other with a weight constant that we denote the human preference constant. For the human impact we consider two types of feedback: direct feedback on the violation in the form of determining the human preference constant, and direct control input through mixed-initiative control, where the human preference constant is determined through an inverse reinforcement learning algorithm based on the suggested and followed paths. The methods are validated through simulations. / QC20190517
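The trade-off between the two violation types in this abstract can be illustrated with a simple weighted sum. This is a sketch under a stated assumption: the abstract only says the distances are weighed against each other by the human preference constant, so the convex-combination form, the function name `hybrid_distance`, and the example numbers below are hypothetical and may differ from the thesis's actual definition.

```python
def hybrid_distance(continuous_violation, discrete_violation, human_preference):
    """Combine two violation measures into one score (assumed form).

    continuous_violation: amount by which deadlines are missed.
    discrete_violation: amount of spatial violation.
    human_preference: weight in [0, 1]; higher values penalise missed
    deadlines more heavily relative to spatial violations.
    """
    if not 0.0 <= human_preference <= 1.0:
        raise ValueError("human_preference must lie in [0, 1]")
    return (human_preference * continuous_violation
            + (1.0 - human_preference) * discrete_violation)


# Two hypothetical candidate paths as (continuous, discrete) violations.
candidates = {"late_but_clean": (4.0, 0.0), "on_time_but_trespassing": (0.0, 2.0)}
preference = 0.25
best = min(candidates, key=lambda k: hybrid_distance(*candidates[k], preference))
```

Under this assumed form, changing `preference` changes which candidate path is selected, which is the kind of human influence on soft-constraint violation the abstract describes.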
207

Exploring Agent-Based Simulation of Causal Maps: Toward a Strategic Decision Support Tool

Druckenmiller, Douglas Allen 31 March 2005
No description available.
208

Real-time Monitoring and Estimation of Spatio-Temporal Processes Using Co-operative Multi-Agent Systems for Improved Situational Awareness

Sharma, Balaji R. January 2013
No description available.
209

Distributed Decision Tree Induction Using Multi-agent Based Negotiation Protocol

Chattopadhyay, Dipayan 10 October 2014
No description available.
210

MANILA: A Multi-Agent Framework for Emergent Associative Learning and Creativity in Social Networks

Shekfeh, Marwa January 2017
No description available.
