  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
51

Multi-Agent Reinforcement Learning: Analysis and Application

Paulo Cesar Heredia (12428121) 20 April 2022
<p>With the increasing availability of data and the rise of networked systems such as autonomous vehicles, drones, and smart grids, the application of data-driven machine learning methods to multi-agent systems has become an important topic. In particular, reinforcement learning has gained popularity due to its similarities with optimal control, with the potential to let us develop optimal control systems using only observed data, without a model of the system's state dynamics. In this thesis, we explore the application of reinforcement learning to multi-agent systems, which is known as multi-agent reinforcement learning (MARL). We have developed algorithms that address some challenges in the cooperative setting of MARL. We have also worked on better understanding the convergence guarantees of some known multi-agent reinforcement learning algorithms, which combine reinforcement learning with distributed consensus methods. And, with the aim of making MARL better suited to real-world problems, we have also developed algorithms to address some practical challenges with MARL and have applied MARL to a real-world problem.</p> <p>In the first part of this thesis, we focus on developing algorithms to address some open problems in MARL. One of these challenges is learning with output feedback, which is known as partial observability in the reinforcement learning literature. One of the main assumptions of reinforcement learning in the single-agent case is that the agent can fully observe the state of the plant it is controlling (the "plant" is often referred to as the "environment" in the reinforcement learning literature; we use these terms interchangeably). In the single-agent case this assumption can be reasonable, since it only requires one agent to fully observe its environment. 
In the multi-agent setting, however, this assumption would require all agents to fully observe the state and, furthermore, since each agent could affect the plant (or environment) with its actions, it would also require that agents know the actions of other agents. We have also developed algorithms to address practical issues that may arise when applying reinforcement learning (RL) or MARL to large-scale real-world systems. One such algorithm is a distributed reinforcement learning algorithm that allows us to learn in cases where the states and actions are both continuous and high-dimensional, which is the case for many real-world applications. Without the ability to handle continuous states and actions, many algorithms require discretization, which can become impractical for high-dimensional systems. We have also developed a distributed reinforcement learning algorithm that addresses the data scalability of RL. By data scalability we mean how to learn from a very large dataset that cannot be efficiently processed by a single agent with limited resources.</p> <p>In the second part of this thesis, we provide a finite-sample analysis of some distributed reinforcement learning algorithms. By finite-sample analysis, we mean that we provide an upper bound on the squared error of the algorithm at a given iteration. Equivalently, since each iteration uses one data sample, we provide an upper bound on the squared error for a given number of data samples used. This type of analysis had been missing in the MARL literature, where most works have only provided asymptotic results for their proposed algorithms, which only tell us how the algorithmic error behaves as the number of samples used goes to infinity. </p> <p>The third part of this thesis focuses on applications to real-world systems. We have explored a real-world problem, namely transactive energy systems (TES), which can be represented as a multi-agent system. 
We have applied various reinforcement learning algorithms with the aim of learning an optimal control policy for this system. Through simulations, we have compared the performance of these algorithms and have illustrated the effect of partial observability (output feedback) when compared to full state feedback.</p> <p>In the last part we present some other work: a distributed observer that aims to address learning with output feedback by estimating the state. The proposed algorithm is designed so that it does not require a complete model of the state dynamics; instead, it uses a parameterized model whose parameters are estimated along with the state.</p>
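The abstract describes algorithms that combine reinforcement learning with distributed consensus methods. The following is a minimal sketch of that general pattern, not the thesis's algorithm: each agent performs a local TD(0) update from its own reward and then averages its parameters with its neighbours. The ring topology, the mixing matrix `MIXING`, the two-state toy chain, the reward rule, and all function names are assumptions made for illustration.

```python
import random

def td_step(w, phi, phi_next, reward, alpha=0.1, gamma=0.9):
    """One local TD(0) update for a linear value estimate V(s) = w . phi(s)."""
    v = sum(wi * fi for wi, fi in zip(w, phi))
    v_next = sum(wi * fi for wi, fi in zip(w, phi_next))
    delta = reward + gamma * v_next - v  # temporal-difference error
    return [wi + alpha * delta * fi for wi, fi in zip(w, phi)]

def consensus_step(weights, mixing):
    """Each agent replaces its parameters with a weighted average over its neighbours."""
    n, d = len(weights), len(weights[0])
    return [[sum(mixing[i][j] * weights[j][k] for j in range(n)) for k in range(d)]
            for i in range(n)]

def disagreement(weights):
    """Largest gap between any two agents' values of the same parameter."""
    return max(max(col) - min(col) for col in zip(*weights))

# Hypothetical setup: 4 agents on a ring with a doubly stochastic mixing matrix.
MIXING = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
FEATURES = [[1.0, 0.0], [0.0, 1.0]]  # one-hot features for a two-state toy chain

def run(steps=2000, seed=0):
    """Alternate local TD updates and consensus averaging; return all agents' weights."""
    rng = random.Random(seed)
    weights = [[0.0, 0.0] for _ in range(len(MIXING))]
    state = 0
    for _ in range(steps):
        next_state = rng.randrange(2)  # toy dynamics: uniform random transitions
        # Agent i only sees its own local reward (assumed: state plus an agent offset).
        weights = [td_step(w, FEATURES[state], FEATURES[next_state],
                           reward=float(state) + 0.1 * i)
                   for i, w in enumerate(weights)]
        weights = consensus_step(weights, MIXING)
        state = next_state
    return weights
```

Calling `disagreement(run())` gives a quick check of how closely the agents agree after training. With a constant step size the agents do not agree exactly; the averaging step only keeps their estimates close to one another.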
52

Cooperative control for multi-agent persistent monitoring problems

Zhou, Nan 04 June 2019
In persistent monitoring tasks, cooperating mobile agents are used to monitor a dynamically changing environment that cannot be fully covered by a stationary team of agents. The exploration process leads to the discovery of various "points of interest" (targets) to be perpetually monitored. Through an optimal control approach, the first part of this dissertation shows that in a one-dimensional mission space the solution can be reduced to a simpler parametric problem. The behavior of agents under optimal control is described by a hybrid system which can be analyzed using Infinitesimal Perturbation Analysis (IPA) to obtain an on-line solution. IPA allows the modeling of virtually arbitrary stochastic effects in target uncertainty and its event-driven nature renders the solution scalable in the number of events rather than the state space. The second part of this work extends the results of the one-dimensional persistent monitoring problem to a two-dimensional space with constrained agent mobility. Under a general graph setting, the properties of the one-dimensional optimal control solution are largely inherited. The solution involves the design of agent trajectories defined by both the sequence of nodes to be visited and the amount of time spent at each node. A class of distributed threshold-based parametric controllers is proposed to reduce the computational complexity. These parameters are optimized through an event-driven IPA gradient-based algorithm and yield optimal controllers within this family of threshold-based policies. The performance of the threshold-based parametric controller is close to that of the optimal controller derived through dynamic programming and its computational complexity is smaller by orders of magnitude. Although effective, the aforementioned optimal controls are established on the assumption that agents are all connected via a centralized controller which is energy-consuming and unreliable in adversarial environments. 
The third part of this work extends the previous controls by developing decentralized controllers which distribute functionality to the agents so that each one acts upon local information and sparse communication with neighbors. The complexity of decentralization for persistent monitoring problems is significant given agent mobility and the overall time-varying graph topology. Conditions are identified and a decentralized framework is proposed under which the centralized solution can be exactly recovered in a decentralized event-driven manner based on local information -- except for one event requiring communication from a non-neighbor agent.
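To make the idea of a threshold-based parametric controller concrete, here is a toy sketch under assumptions that are not taken from the dissertation: a single agent cycles over targets, each target's uncertainty grows at a constant rate and is drained at a constant rate while the agent is present, and the agent leaves a target once its uncertainty falls to that target's threshold. The function name `simulate` and all parameters are hypothetical.

```python
def simulate(growth, drain, thresholds, travel_time, horizon, dt=0.1):
    """Single agent cycling over targets; returns the time-averaged total uncertainty.

    growth[i]     -- rate at which target i's uncertainty grows
    drain         -- rate at which the agent reduces uncertainty at the target it serves
    thresholds[i] -- the agent leaves target i once its uncertainty is <= thresholds[i]
    """
    n = len(growth)
    uncertainty = [0.0] * n
    current, travelling = 0, 0.0  # target being served / remaining travel time
    total, steps = 0.0, int(round(horizon / dt))
    for _ in range(steps):
        for i in range(n):
            rate = growth[i]
            if travelling <= 0 and i == current:
                rate -= drain  # the agent is dwelling at this target
            uncertainty[i] = max(0.0, uncertainty[i] + rate * dt)
        if travelling > 0:
            travelling -= dt
        elif uncertainty[current] <= thresholds[current]:
            current = (current + 1) % n  # threshold reached: move to the next target
            travelling = travel_time
        total += sum(uncertainty) * dt
    return total / horizon
```

In this sketch the thresholds are the only tunable parameters, which is what makes a gradient-based search over them plausible. For example, `simulate([1.0, 1.0], 5.0, [0.5, 0.5], 0.5, 10.0)` can be compared against the same call with different thresholds to see how dwell times trade off against travel.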
53

Online Model-Free Distributed Reinforcement Learning Approach for Networked Systems of Self-organizing Agents

Chen, Yiqing 22 December 2021
Control of large groups of robotic agents is driven by applications in the military, aeronautics and astronautics, transportation networks, and environmental monitoring. Cooperative control of networked multi-agent systems aims at driving the behavior of the group via feedback control inputs that encode the group's dynamics based on information sharing, with inter-agent communications that can be time-varying and spatially non-uniform. Notably, local interaction rules can induce coordinated behaviour, given suitable network topologies. Distributed learning paradigms are often necessary for this class of systems to operate autonomously and robustly, without the need for external units providing centralized information. In contrast to model-based protocols, which can be computationally prohibitive due to their mathematical complexity and their feedback-information requirements, we present an online model-free algorithm for some nonlinear tracking problems with unknown system dynamics. This method prescribes the actuation forces of agents so that they follow the time-varying trajectory of a moving target. The tracking problem is addressed by an online value iteration process which requires measurements collected along the trajectories. A set of simulations illustrates that the presented algorithm performs well in various reference-tracking scenarios.
54

Automated Negotiation for Complex Multi-Agent Resource Allocation

An, Bo 01 February 2011
The problem of constructing and analyzing systems of intelligent, autonomous agents is becoming more and more important. These agents may include people, physical robots, virtual humans, software programs acting on behalf of human beings, or sensors. In a large class of multi-agent scenarios, agents may have different capabilities, preferences, objectives, and constraints. Therefore, efficient allocation of resources among multiple agents is often difficult to achieve. Automated negotiation (bargaining) is the most widely used approach for multi-agent resource allocation and it has received increasing attention in recent years. However, information uncertainty, the existence of multiple contracting partners and competitors, agents' incentive to maximize individual utilities, and market dynamics make it difficult to calculate agents' rational equilibrium negotiation strategies and to develop successful negotiation agents that behave well in practice. To this end, this thesis is concerned with analyzing agents' rational behavior and developing negotiation strategies for a range of complex negotiation contexts. First, we consider the problem of finding agents' rational strategies in bargaining with incomplete information. We focus on the principal alternating-offers finite horizon bargaining protocol with one-sided uncertainty regarding agents' reserve prices. We provide an algorithm based on the combination of game theoretic analysis and search techniques which finds agents' equilibrium in pure strategies when one exists. Our approach is sound, complete and, in principle, can be applied to other uncertainty settings. Simulation results show that there is at least one pure strategy sequential equilibrium in 99.7% of various scenarios. In addition, agents with equilibrium strategies achieved higher utilities than agents with heuristic strategies. 
Next, we extend the alternating-offers protocol to handle concurrent negotiations in which each agent has multiple trading opportunities and faces market competition. We provide an algorithm based on backward induction to compute the subgame perfect equilibrium of concurrent negotiation. We observe that agents' bargaining power is affected by the proposing order and market competition, and that for a large subset of the parameter space, agents' equilibrium strategies depend on the values of a small number of parameters. We also extend our algorithm to find a pure strategy sequential equilibrium in concurrent negotiations where there is one-sided uncertainty regarding the reserve price of one agent. Third, we present the design and implementation of agents that concurrently negotiate with other entities to acquire multiple resources. Negotiation agents are designed to adjust 1) the number of tentative agreements and 2) the amount of concession they are willing to make in response to changing market conditions and negotiation situations. In our approach, agents utilize a time-dependent negotiation strategy in which the reserve price of each resource is dynamically determined by 1) the likelihood that negotiation will not be successfully completed, 2) the expected agreement price of the resource, and 3) the expected number of final agreements. The negotiation deadline of each resource is determined by its relative scarcity. Since agents are permitted to decommit from agreements, a buyer may make more than one tentative agreement for each resource, and the maximum number of tentative agreements is constrained by the market situation. Experimental results show that our negotiation strategy achieved significantly higher utilities than simpler strategies. Finally, we consider the problem of allocating networked resources in dynamic environments, such as cloud computing platforms, where providers strategically price resources to maximize their utility. 
While numerous auction-based approaches have been proposed in the literature, our work explores an alternative approach where providers and consumers negotiate resource leasing contracts. We propose a distributed negotiation mechanism where agents negotiate over both a contract price and a decommitment penalty, which allows agents to decommit from contracts at a cost. We compare our approach experimentally, using representative scenarios and workloads, to both combinatorial auctions and the fixed-price model, and show that the negotiation model achieves a higher social welfare.
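The abstract mentions computing a subgame perfect equilibrium by backward induction. As a hedged illustration, the sketch below solves only the textbook complete-information case: two agents alternate offers over a surplus of size 1 for a fixed number of rounds, each with its own discount factor. It does not cover the thesis's incomplete-information or concurrent-negotiation settings, and the function name and parameters are hypothetical.

```python
def alternating_offers_spe(rounds, delta_a, delta_b):
    """Shares (share_a, share_b) agreed in round 1 of the subgame perfect equilibrium.

    Agent A proposes in odd rounds and agent B in even rounds. A payoff obtained
    one round later is worth delta times as much today. If no agreement is
    reached by the last round, both agents get 0.
    """
    # Last round: the proposer keeps everything, since the responder's alternative is 0.
    share_a = 1.0 if rounds % 2 == 1 else 0.0
    # Work backwards: share_a is always A's equilibrium share if round t is reached.
    for t in range(rounds - 1, 0, -1):
        if t % 2 == 1:
            # A proposes: offer B exactly B's discounted continuation value.
            share_b = delta_b * (1.0 - share_a)
            share_a = 1.0 - share_b
        else:
            # B proposes: offer A exactly A's discounted continuation value.
            share_a = delta_a * share_a
    return share_a, 1.0 - share_a
```

With three rounds and both discount factors equal to 0.5, `alternating_offers_spe(3, 0.5, 0.5)` returns `(0.75, 0.25)`: the first proposer benefits both from moving first and from also holding the final offer.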
55

Stability Analysis of Swarms

Gazi, Veysel 11 September 2002
No description available.
56

Towards the Application of Software Architectures in Multi-Agent Systems

Garcia-Martinez, Salvador 07 1900
<p> Software Architecture is a concept that arose during the last two decades as a consequence of the need for a structured design at an early stage. Software Architecture is defined as a pattern of interconnected components satisfying some structural rule. Software architectures are widely used in many types of systems; Multi-Agent Systems should not be an exception. Multi-Agent Systems have emerged as a design paradigm for large and distributed systems. They are composed of autonomous elements that work together in order to pursue a common goal. They are mostly used in Electronic Commerce, Human-Computer Interfaces, and so on.</p> <p> In this research, we investigate the state of the art of Software Architectures in the Multi-Agent Systems field, showing that, generally, Multi-Agent Systems do not use the software architecture concept properly and, when they do, they do not show specific architectures for Multi-Agent Systems. The approach followed is based on the analysis of six case studies, which are implemented applications that have been published in some of the most important conferences in the area. Additionally, we show that, based on the initial design of each case and existing architectural patterns, it is possible to impose a software architecture on each case.</p> <p> Furthermore, we analyze the way that the term "software architecture" is used in the Multi-Agent Systems literature, showing that it usually refers to abstract architectures proposed in standards and frameworks or to an initial design of a system. In addition, we clarify related concepts, such as reference architectures, reference models, architectural patterns and design patterns. Finally, we carry out an exhaustive comparison of the case studies, which aims to highlight commonalities and differences. 
The objective of this comparison is to analyze whether they share a similar architecture that can be reused in more cases and to show how specific properties of Multi-Agent Systems affect the design of an architecture.</p> / Thesis / Master of Science (MSc)
57

An agent based manufacturing scheduling module for Advanced Planning and Scheduling

Attri, Hitesh 11 April 2005
A software-agent-based manufacturing scheduling module for Advanced Planning and Scheduling (APS) is presented. The problem considered is the scheduling of jobs with multiple operations, distinct operation processing times, arrival times, and due dates in a job shop environment. Sequence-dependent setups and the additional constraints of material and resource availability are also taken into consideration. The scheduling is integrated with production planning: production plans can be changed dynamically, and the schedule is regenerated to reflect the changes. The design of a generic, domain-independent multi-agent framework, along with the algorithms used by the agents, is also discussed. / Master of Science
58

Using Norms To Control Open Multi-Agent Systems

Criado Pacheco, Natalia 13 November 2012
The Internet is perhaps the most relevant scientific advance of our time. Among other things, the Internet has enabled traditional computing paradigms to evolve towards the distributed computing paradigm, which is characterized by the use of an open network of computers. Multi-agent systems (MAS) are a suitable technology for addressing the challenges posed by these open distributed systems. MAS are applications formed by heterogeneous and autonomous agents that may have been designed independently, according to different goals and motivations. Therefore, no a priori assumption can be made about the agents' behaviour. For this reason, MAS need coordination and cooperation mechanisms, such as norms, to guarantee social order and avoid the emergence of conflicts. The term norm covers two different dimensions: (i) norms as an instrument that guides citizens when performing actions and activities, so that norms define the procedures and/or protocols to be followed in a given situation; and (ii) norms as orders or prohibitions backed by a system of sanctions, so that norms are means of preventing or punishing certain actions. In the MAS field, norms have been used as a formal specification of what is permitted, obliged and forbidden within a society. In this way, norms make it possible to regulate the life of software agents and the interactions among them. The main motivation of this thesis is to allow MAS designers to use norms as a mechanism for controlling and coordinating open MAS. Our aim is to develop normative mechanisms at two levels: the agent level and the infrastructure level. 
Therefore, this thesis first addresses the problem of defining autonomous normative agents that are capable of deliberating about / Criado Pacheco, N. (2012). Using Norms To Control Open Multi-Agent Systems [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/17800
59

Interpretations of epistemic mu-calculus over multi-agent games / Tolkningar av epistemisk mu-kalkyl över multiagent-spel

Stathatos, Nikitas January 2022
In this work, we are interested in expressing and studying certain formal properties of multi-agent games. In particular, we are interested in the case in which a team of agents with imperfect information is playing against the environment. This is modeled by a non-deterministic game, where the agents can only partially distinguish its states, to varying degrees. We will study these games under the lens of the multi-agent knowledge-based subset construction (MKBSC), which, when applied to a game, reduces the degree of imperfect information the agents have. An appropriate language to express interesting and complex properties in this type of game is the epistemic μ-calculus, an extension of classic epistemic logic with a recursive operator. We define two semantics for this language, one corresponding to a global view of the game, and one corresponding to a local one. We state a claim relating these two semantics, while proving an analogous statement for epistemic logic.
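As background for the epistemic operators mentioned above, here is a small sketch of the standard textbook semantics on a finite model. It is not the thesis's global or local semantics, nor the MKBSC. An agent knows a property at a state if the property holds in every state the agent cannot distinguish from it, and common knowledge is computed as a greatest fixpoint, which is the kind of recursive definition a μ-calculus can express. The model, agent names, and helper names are illustrative assumptions.

```python
def cells_to_relation(partition):
    """Map each state to the set of states the agent cannot tell apart from it."""
    return {s: frozenset(cell) for cell in partition for s in cell}

def K(agent, target, relation):
    """States where `agent` knows that the current state lies in `target`."""
    return {s for s, cell in relation[agent].items() if cell <= target}

def common_knowledge(agents, target, relation, states):
    """Greatest fixpoint nu X . E(target & X), where E means 'everyone knows'."""
    current = set(states)  # start from all states and shrink until stable
    while True:
        inner = target & current
        nxt = set(states)
        for agent in agents:
            nxt &= K(agent, inner, relation)
        if nxt == current:
            return current
        current = nxt

# Hypothetical three-state model with two agents.
STATES = {"s0", "s1", "s2"}
RELATION = {
    "a": cells_to_relation([{"s0", "s1"}, {"s2"}]),  # a confuses s0 with s1
    "b": cells_to_relation([{"s0"}, {"s1", "s2"}]),  # b confuses s1 with s2
}
P = {"s0", "s1"}  # states where the atomic proposition p holds
```

In this model both agents know `p` at `s0`, since `K("a", P, RELATION)` is `{"s0", "s1"}` and `K("b", P, RELATION)` is `{"s0"}`. Even so, `common_knowledge(["a", "b"], P, RELATION, STATES)` is empty, because agent `a` cannot rule out `s1`, where `b` does not know `p`.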
60

O planejamento de alocação de recursos baseado em sistemas multiagentes / Resource allocation planning using multi-agent systems

Bastos, Ricardo Melo January 1998
The objective of this work is to propose an approach, based on the multi-agent paradigm, to the problem of dynamic resource allocation in production environments. To this end, a generic multi-agent reference architecture called M-DRAP (Multi-agent Dynamic Resource Allocation Planning) is specified, from which particular models can be derived. The main contributions of this work are: (i) the definition of a decentralised and distributed strategy for the dynamic planning of each resource in meeting the demands of production activities, using a project-oriented approach; (ii) the proposition of a social organisation based on a market-oriented multi-agent approach, in which autonomous agents negotiate to serve their individual interests while contributing effectively to the global objectives and the time and cost constraints of the production system as a whole; (iii) the specification of a multi-agent architecture derived from the CIMOSA reference architecture, consistently representing the functional and organisational structure of a production system; and (iv) the definition of negotiation-based strategies among the agents that handle, in real time, the disturbances affecting the production system. As an associated contribution, we propose a methodology for the conceptual modelling of multi-agent systems in the domain of enterprise modelling applications.
