Global ETD Search

71	Multi-Agent Reinforcement Learning: Analysis and Application Paulo Cesar Heredia (12428121) 20 April 2022 <p>With the increasing availability of data and the rise of networked systems such as autonomous vehicles, drones, and smart grids, the application of data-driven machine learning methods to multi-agent systems has become an important topic. In particular, reinforcement learning has gained a lot of popularity due to its similarities with optimal control, with the potential of allowing us to develop optimal control systems using only observed data and without the need for a model of a system's state dynamics. In this thesis work, we explore the application of reinforcement learning to multi-agent systems, which is known as multi-agent reinforcement learning (MARL). We have developed algorithms that address some challenges in the cooperative setting of MARL. We have also done work on better understanding the convergence guarantees of some known multi-agent reinforcement learning algorithms, which combine reinforcement learning with distributed consensus methods. And, with the aim of making MARL better suited to real-world problems, we have also developed algorithms to address some practical challenges with MARL and we have applied MARL to a real-world problem.</p> <p>In the first part of this thesis, we focus on developing algorithms to address some open problems in MARL. One of these challenges is learning with output feedback, which is known as partial observability in the reinforcement learning literature. One of the main assumptions of reinforcement learning in the single-agent case is that the agent can fully observe the state of the plant it is controlling (the “plant” is often referred to as the “environment” in the reinforcement learning literature; we use these terms interchangeably). In the single-agent case this assumption can be reasonable since it only requires one agent to fully observe its environment. 
In the multi-agent setting, however, this assumption would require all agents to fully observe the state and, furthermore, since each agent could affect the plant (or environment) with its actions, the assumption would also require that agents know the actions of other agents. We have also developed algorithms to address practical issues that may arise when applying reinforcement learning (RL) or MARL to large-scale real-world systems. One such algorithm is a distributed reinforcement learning algorithm that allows us to learn in cases where the states and actions are both continuous and of large dimensionality, which is the case for many real-world applications. Without the ability to handle continuous states and actions, many algorithms require discretization, which can become impractical for high-dimensional systems. We have also developed a distributed reinforcement learning algorithm that addresses the data scalability of RL. By data scalability we mean how to learn from a very large dataset that cannot be efficiently processed by a single agent with limited resources.</p> <p>In the second part of this thesis, we provide a finite-sample analysis of some distributed reinforcement learning algorithms. By finite-sample analysis, we mean that we provide an upper bound on the squared error of the algorithm at a given iteration. Equivalently, since each iteration uses one data sample, we provide an upper bound on the squared error for a given number of data samples used. This type of analysis had been missing in the MARL literature, where most works have only provided asymptotic results for their proposed algorithms, which only tell us how the algorithmic error behaves as the number of samples used goes to infinity. </p> <p>The third part of this thesis focuses on applications with real-world systems. We have explored a real-world problem, namely transactive energy systems (TES), which can be represented as a multi-agent system. 
We have applied various reinforcement learning algorithms with the aim of learning an optimal control policy for this system. Through simulations, we have compared the performance of these algorithms and have illustrated the effect of partial observability (output feedback) when compared to full state feedback.</p> <p>In the last part we present some other work: a distributed observer that aims to address learning with output feedback by estimating the state. The proposed algorithm is designed so that we do not require a complete model of the state dynamics; instead we use a parameterized model whose parameters are estimated along with the state.</p> Aerospace Engineering multi-agent systems reinforcement learning (RL) Distributed Algorithms
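The abstract above mentions algorithms that combine reinforcement learning with distributed consensus methods. As a rough illustration of the consensus ingredient only, the following minimal sketch shows agents repeatedly averaging a scalar estimate with their neighbours. The function names, the three-agent line network, and the weight matrix are assumptions made for this example and are not taken from the thesis.

```python
def consensus_step(values, weights):
    """One consensus round: agent i replaces its value with a weighted
    average of the values held by the agents it communicates with."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def run_consensus(values, weights, rounds):
    """Apply consensus_step a fixed number of times."""
    for _ in range(rounds):
        values = consensus_step(values, weights)
    return values


# Hypothetical line network 0 - 1 - 2. The weight matrix is doubly
# stochastic (rows and columns sum to 1), so the average is preserved.
WEIGHTS = [
    [0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
]

local_estimates = [0.0, 3.0, 6.0]  # e.g. each agent's local parameter estimate
agreed = run_consensus(local_estimates, WEIGHTS, rounds=50)
```

In a consensus-based learning scheme, a step like this would typically be interleaved with each agent's local learning update; that part is omitted here.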
72	Cooperative control for multi-agent persistent monitoring problems Zhou, Nan 04 June 2019 In persistent monitoring tasks, cooperating mobile agents are used to monitor a dynamically changing environment that cannot be fully covered by a stationary team of agents. The exploration process leads to the discovery of various "points of interest" (targets) to be perpetually monitored. Through an optimal control approach, the first part of this dissertation shows that in a one-dimensional mission space the solution can be reduced to a simpler parametric problem. The behavior of agents under optimal control is described by a hybrid system which can be analyzed using Infinitesimal Perturbation Analysis (IPA) to obtain an on-line solution. IPA allows the modeling of virtually arbitrary stochastic effects in target uncertainty and its event-driven nature renders the solution scalable in the number of events rather than the state space. The second part of this work extends the results of the one-dimensional persistent monitoring problem to a two-dimensional space with constrained agent mobility. Under a general graph setting, the properties of the one-dimensional optimal control solution are largely inherited. The solution involves the design of agent trajectories defined by both the sequence of nodes to be visited and the amount of time spent at each node. A class of distributed threshold-based parametric controllers is proposed to reduce the computational complexity. These parameters are optimized through an event-driven IPA gradient-based algorithm and yield optimal controllers within this family of threshold-based policies. The performance of the threshold-based parametric controller is close to that of the optimal controller derived through dynamic programming and its computational complexity is smaller by orders of magnitude. 
Although effective, the aforementioned optimal controls are established on the assumption that agents are all connected via a centralized controller which is energy-consuming and unreliable in adversarial environments. The third part of this work extends the previous controls by developing decentralized controllers which distribute functionality to the agents so that each one acts upon local information and sparse communication with neighbors. The complexity of decentralization for persistent monitoring problems is significant given agent mobility and the overall time-varying graph topology. Conditions are identified and a decentralized framework is proposed under which the centralized solution can be exactly recovered in a decentralized event-driven manner based on local information -- except for one event requiring communication from a non-neighbor agent. Engineering Event-driven algorithms Multi-agent systems Optimization
73	Online Model-Free Distributed Reinforcement Learning Approach for Networked Systems of Self-organizing Agents Chen, Yiqing 22 December 2021 Control of large groups of robotic agents is driven by applications including military, aeronautics and astronautics, transportation networks, and environmental monitoring. Cooperative control of networked multi-agent systems aims at driving the behavior of the group via feedback control inputs that encode the group's dynamics based on information sharing, with inter-agent communications that can be time-varying and spatially non-uniform. Notably, local interaction rules can induce coordinated behaviour, given suitable network topologies. Distributed learning paradigms are often necessary for this class of systems to be able to operate autonomously and robustly, without the need for external units providing centralized information. In contrast to model-based protocols, which can be computationally prohibitive due to their mathematical complexity and requirements in terms of feedback information, we present an online model-free algorithm for some nonlinear tracking problems with unknown system dynamics. This method prescribes the actuation forces of agents to follow the time-varying trajectory of a moving target. The tracking problem is addressed by an online value iteration process which requires measurements collected along the trajectories. A set of simulations is conducted to illustrate that the presented algorithm performs well in various reference-tracking scenarios. Distributed Reinforcement Learning Cooperative Control Model-free Multi-agent systems
74	Automated Negotiation for Complex Multi-Agent Resource Allocation An, Bo 01 February 2011 The problem of constructing and analyzing systems of intelligent, autonomous agents is becoming more and more important. These agents may include people, physical robots, virtual humans, software programs acting on behalf of human beings, or sensors. In a large class of multi-agent scenarios, agents may have different capabilities, preferences, objectives, and constraints. Therefore, efficient allocation of resources among multiple agents is often difficult to achieve. Automated negotiation (bargaining) is the most widely used approach for multi-agent resource allocation and it has received increasing attention in recent years. However, information uncertainty, the existence of multiple contracting partners and competitors, agents' incentive to maximize individual utilities, and market dynamics make it difficult to calculate agents' rational equilibrium negotiation strategies and to develop successful negotiation agents that behave well in practice. To this end, this thesis is concerned with analyzing agents' rational behavior and developing negotiation strategies for a range of complex negotiation contexts. First, we consider the problem of finding agents' rational strategies in bargaining with incomplete information. We focus on the principal alternating-offers finite horizon bargaining protocol with one-sided uncertainty regarding agents' reserve prices. We provide an algorithm based on the combination of game theoretic analysis and search techniques which finds agents' equilibria in pure strategies when they exist. Our approach is sound, complete and, in principle, can be applied to other uncertainty settings. Simulation results show that there is at least one pure strategy sequential equilibrium in 99.7% of various scenarios. In addition, agents with equilibrium strategies achieved higher utilities than agents with heuristic strategies. 
Next, we extend the alternating-offers protocol to handle concurrent negotiations in which each agent has multiple trading opportunities and faces market competition. We provide an algorithm based on backward induction to compute the subgame perfect equilibrium of concurrent negotiation. We observe that agents' bargaining power is affected by the proposing order and market competition and that, for a large subset of the parameter space, agents' equilibrium strategies depend on the values of a small number of parameters. We also extend our algorithm to find a pure strategy sequential equilibrium in concurrent negotiations where there is one-sided uncertainty regarding the reserve price of one agent. Third, we present the design and implementation of agents that concurrently negotiate with other entities for acquiring multiple resources. Negotiation agents are designed to adjust 1) the number of tentative agreements and 2) the amount of concession they are willing to make in response to changing market conditions and negotiation situations. In our approach, agents utilize a time-dependent negotiation strategy in which the reserve price of each resource is dynamically determined by 1) the likelihood that negotiation will not be successfully completed, 2) the expected agreement price of the resource, and 3) the expected number of final agreements. The negotiation deadline of each resource is determined by its relative scarcity. Since agents are permitted to decommit from agreements, a buyer may make more than one tentative agreement for each resource and the maximum number of tentative agreements is constrained by the market situation. Experimental results show that our negotiation strategy achieved significantly higher utilities than simpler strategies. Finally, we consider the problem of allocating networked resources in dynamic environments, such as cloud computing platforms, where providers strategically price resources to maximize their utility. 
While numerous auction-based approaches have been proposed in the literature, our work explores an alternative approach where providers and consumers negotiate resource leasing contracts. We propose a distributed negotiation mechanism where agents negotiate over both a contract price and a decommitment penalty, which allows agents to decommit from contracts at a cost. We compare our approach experimentally, using representative scenarios and workloads, to both combinatorial auctions and the fixed-price model, and show that the negotiation model achieves a higher social welfare. Automated Negotiation Bargaining Multi-Agent Systems Resource Allocation Computer Sciences
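The abstract above refers to computing equilibria of alternating-offers bargaining by backward induction. The sketch below illustrates that idea on the simplest textbook case only: two agents, complete information, a unit surplus, per-round discount factors, and a fixed deadline. It is not the thesis's algorithm, which handles concurrent negotiations, reserve prices, and uncertainty. The function name and parameters are assumptions made for this example.

```python
def alternating_offers_spe(rounds, delta_a, delta_b):
    """Backward induction for finite-horizon alternating offers.

    Agent A proposes in odd rounds, agent B in even rounds. If the last
    round ends without agreement, both agents get nothing. Returns the
    (share_a, share_b) split agreed in round 1.
    """
    # Last round: the proposer can keep the whole surplus, because the
    # responder's alternative is zero.
    share_a = 1.0 if rounds % 2 == 1 else 0.0

    # Walk backwards: in each earlier round the proposer offers the
    # responder exactly the discounted value of waiting one more round.
    for t in range(rounds - 1, 0, -1):
        if t % 2 == 1:
            # A proposes; B must receive delta_b times B's next-round share.
            share_a = 1.0 - delta_b * (1.0 - share_a)
        else:
            # B proposes; A must receive delta_a times A's next-round share.
            share_a = delta_a * share_a
    return share_a, 1.0 - share_a


split_three_rounds = alternating_offers_spe(3, 0.5, 0.5)
```

With three rounds and both discount factors equal to 0.5, agent A proposes first and last, which is why the computed split favours A.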
75	Stability Analysis of Swarms Gazi, Veysel 11 September 2002 No description available. swarms stability analysis multi-agent systems formation control
76	Towards the Application of Software Architectures in Multi-Agent Systems Garcia-Martinez, Salvador 07 1900 <p> Software Architecture is a concept that arose during the last two decades as a consequence of the need for a structured design at an early stage. Software Architecture is defined as a pattern of interconnected components satisfying some structural rule. Software architectures are widely used in many types of systems; Multi-Agent Systems should not be an exception. Multi-Agent Systems have emerged as a design paradigm for large and distributed systems. They are composed of autonomous elements that work together in order to pursue a common goal. They are mostly used in Electronic Commerce, Human-Computer Interfaces, and so on.</p> <p> In this research, we investigate the state of the art of Software Architectures in the Multi-Agent Systems field, showing that, generally, Multi-Agent Systems do not use the software architecture concept properly and, when they do, they do not show specific architectures for Multi-Agent Systems. The approach followed is based on the analysis of six case studies, which are implemented applications that have been published in some of the most important conferences in the area. Additionally, we show that, based on the initial design of each case and existing architectural patterns, it is possible to impose a software architecture on each case.</p> <p> Furthermore, we analyze the way that the term "software architecture" is used in the Multi-Agent Systems literature, showing that, usually, it refers to abstract architectures proposed in standards and frameworks or to an initial design in a system. In addition, we clarify related concepts, such as reference architecture, reference models, architectural patterns and design patterns. Finally, we do an exhaustive comparison of the case studies, which aims to highlight commonalities and differences. 
The objective of this comparison is to analyze whether they share a similar architecture that can be reused in more cases and to show how specific properties of Multi-Agent Systems affect the design of an architecture.</p> / Thesis / Master of Science (MSc)
77	Scalable Multi-Agent Systems in Restricted Environments Heintzman, Larkin Lee 15 February 2023 Modern robotics demonstrates the reality of near sci-fi solutions regularly. Swarms of interconnected robotic agents have been proven to have benefits in scalability, robustness, and efficiency. In communication restricted environments, such teams of robots are often required to support their own navigation, planning, and decision making processes, through use of onboard processors and collaboration. Example scenarios that exhibit restriction include unmanned underwater surveys and robots operating in indoor or remote environments without cloud connectivity. We begin this thesis by discussing multi-agent state estimation and its observability properties, specifically for the case of an agent-to-agent range measurement system. For this case, inspired by navigation requirements underwater, we derive several conditions under which the system's state is guaranteed to be locally weakly observable. Ensuring a state is observable is necessary to maintain an estimate of it via filters; thus observability is required to support higher level navigation and planning. We conclude this section by creating an observability-based planner to control a subset of the agents' inputs. For the next contribution, we discuss scalability for coverage maximizing path planners. Typically, planning for many individual robots incurs significant computational complexity which increases exponentially with the number of agents; this is often exacerbated when the objective function is collaborative, as in coverage optimization. To maintain feasibility while planning for a large team of robots, we call upon a powerful relation from combinatorics which utilizes the greedy selection algorithm and a matroid condition to create an efficient planner that maintains a fixed performance ratio when compared to the optimal path. 
We then introduce a motivating example of autonomously assisted search and rescue using multiple aerial agents, and derive planners and models to suit the application. The framework begins by estimating the likely locations of a lost person through a Monte Carlo simulation, yielding a heatmap covering the area of interest. The heatmap is then used in combination with parametrized agent trajectories and a machine learning optimization algorithm to maximize the search efficiency. The search and rescue use case provides an excellent computational testbed for the final portion of the work. We close by discussing a computation architecture to support multi-agent system autonomy. Modern robotic autonomy results, especially computer vision and machine learning algorithms, often require large amounts of processing to yield quality results. With general purpose computing devices reaching a progression barrier, one that is not expected to be solved in the near term, devices must increasingly be designed with their end purposes in mind. To better support autonomy in multi-agent systems, we propose to use a distributed cluster of embedded processors which allows the sharing of computation and storage resources among the component members with minimal communication overhead. Our proposed architecture is composed of mature software already well-known in the robotics community, Kubernetes and the Robot Operating System, allowing ease of use and interoperability with existing algorithms. / Doctor of Philosophy / The traditional approach of robotics typically uses a single large platform capable of accomplishing all tasks assigned to it. However, it has been discovered that deploying multiple smaller platforms, each with their own processor and specific expertise, can have massive performance benefits compared to previous approaches. This development has been driven largely by readily available computing and mobility hardware. 
Termed multi-agent systems, they can excel in areas that benefit from multiple perspectives, simultaneous task execution, and redundancy. In addition, planning algorithms developed for previous approaches often can map well onto multi-agent systems, provided there is adequate computational support. In cases where network or cloud connectivity is limited, teams of agents must use their own processors and sensors to make decisions and communicate. However, often an individual agent's computing hardware is limited in mass or size, thus limiting its processing capabilities. In this work we will first discuss several multi-agent system algorithms, starting with estimation and navigation and ending with area search. We then conclude the work by proposing a novel architecture designed to distribute the computation load across the team in a highly scalable way. Multi-Agent Systems Path Planning Agent-Based Systems
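The abstract above describes a coverage planner built on greedy selection under a matroid condition. A minimal sketch of that general technique follows, assuming each agent picks exactly one path from its own candidate list and that coverage is the number of distinct cells visited. The function name, the data layout, and the tiny example are hypothetical and not taken from the thesis.

```python
def greedy_coverage(candidate_paths):
    """Greedily choose one path per agent to maximise covered cells.

    candidate_paths maps an agent name to a non-empty list of paths,
    where each path is a set of grid cells it would cover. At every
    step the (agent, path) pair with the largest marginal gain is
    selected and that agent is removed from further consideration.
    """
    covered = set()
    chosen = {}
    remaining = set(candidate_paths)
    while remaining:
        best = None  # (gain, agent, path index)
        for agent in sorted(remaining):
            for idx, path in enumerate(candidate_paths[agent]):
                gain = len(path - covered)  # cells not yet covered
                if best is None or gain > best[0]:
                    best = (gain, agent, idx)
        _, agent, idx = best
        chosen[agent] = idx
        covered |= candidate_paths[agent][idx]
        remaining.remove(agent)
    return chosen, covered


# Hypothetical two-agent example with cells labelled by integers.
paths = {
    "a": [{1, 2, 3}, {4, 5}],
    "b": [{1, 2, 3, 4}, {5, 6}],
}
chosen, covered = greedy_coverage(paths)
```

The cost grows with the total number of candidate paths rather than with the number of joint path combinations, which is the kind of saving the abstract alludes to.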
78	An agent based manufacturing scheduling module for Advanced Planning and Scheduling Attri, Hitesh 11 April 2005 A software-agent-based manufacturing scheduling module for Advanced Planning and Scheduling (APS) is presented. The problem considered is the scheduling of jobs with multiple operations, distinct operation processing times, arrival times, and due dates in a job shop environment. Sequence-dependent setups are also considered, as are the additional constraints of material and resource availability. The scheduling is considered in integration with production planning: production plans can be changed dynamically, and the schedule is regenerated to reflect the changes. The design of a generic, domain-independent multi-agent framework, along with the algorithms used by the agents, is also discussed. / Master of Science Manufacturing Scheduling Advanced Planning and Scheduling Multi Agent Systems
79	Distributed Map Creation and Planning for a Multi-Agent System with CARLA Environment Andersson, Alfred January 2024 The pursuit of multi-agent exploration is driven by its capacity to enhance operational robustness and efficiency in complex, dynamic environments, paving the way for advancements in autonomous systems and robotics. This thesis explores the development and assessment of decentralised planning algorithms within multi-agent systems, using the CARLA simulation environment. A methodology combining simulation-based testing and theoretical analysis was employed to evaluate the efficiency and scalability of various decentralised planning strategies. The study systematically analysed three different exploration strategies for multi-agent systems: Greedy, MinPos, and Hungarian Assignments, across various configurations concerning the number of agents and communication demands. The Hungarian Assignment strategy demonstrates the highest efficiency in area coverage and coordination, especially as the number of agents increases. Meanwhile, the Greedy Assignment strategy requires the least communication bandwidth, indicating its potential for scenarios with limited communication capabilities. The MinPos Assignment, while facilitating better spatial distribution of agents than the Greedy Assignment, showed a moderate increase in communication demands and did not significantly outperform the Greedy Assignment in terms of efficiency. This work contributes to the field by providing insights into the trade-offs between exploration efficiency and communication overhead in multi-agent systems. Future work could explore synchronisation mechanisms, collision-avoidance strategies, and further decentralisation of the system's components. Multi-agent systems Environment exploration CARLA Robotics Robotteknik och automation
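The abstract above contrasts greedy and Hungarian assignment of agents to exploration targets. The sketch below illustrates why the two can differ, under one plausible reading of "greedy" in which agents pick their cheapest free target in turn. For brevity, the optimal assignment is found by exhaustive search rather than the Hungarian algorithm itself; on small inputs both reach the same minimum total cost. All names and the cost matrix are assumptions made for this example.

```python
from itertools import permutations


def greedy_assignment(cost):
    """Agents choose in index order; each takes its cheapest free target."""
    taken = set()
    assignment = []
    for row in cost:
        free = [j for j in range(len(row)) if j not in taken]
        j = min(free, key=lambda k: row[k])
        taken.add(j)
        assignment.append(j)
    return assignment


def optimal_assignment(cost):
    """Minimum-total-cost assignment by exhaustive search.

    Stand-in for the Hungarian algorithm, which solves the same problem
    in polynomial time; only suitable for a handful of agents.
    """
    n_agents, n_targets = len(cost), len(cost[0])
    best = min(
        permutations(range(n_targets), n_agents),
        key=lambda p: sum(cost[i][p[i]] for i in range(n_agents)),
    )
    return list(best)


def total_cost(cost, assignment):
    """Sum of the costs of the chosen agent-target pairs."""
    return sum(cost[i][j] for i, j in enumerate(assignment))


# Hypothetical travel costs: cost[i][j] is agent i's cost to reach target j.
cost = [
    [1, 2],
    [1, 10],
]
greedy = greedy_assignment(cost)
optimal = optimal_assignment(cost)
```

Here the greedy choice of agent 0 forces agent 1 onto its expensive target, while the coordinated assignment swaps them. Coordination of this kind needs the full cost matrix to be shared, which is consistent with the communication trade-off the abstract reports.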
80	Interpretations of epistemic mu-calculus over multi-agent games / Tolkningar av epistemisk mu-kalkyl över multiagent-spel Stathatos, Nikitas January 2022 In this work, we are interested in expressing and studying certain formal properties of multi-agent games. In particular, we are interested in the case in which a team of agents with imperfect information is playing against the environment. This is modeled by a non-deterministic game, where the agents can only partially distinguish its states, to varying degrees. We will study these games under the lens of the multi-agent knowledge-based subset construction (MKBSC), which, when applied to a game, reduces the degree of imperfect information the agents have. An appropriate language to express interesting and complex properties in this type of game is the epistemic μ-calculus, an extension of classic epistemic logic with a recursive operator. We define two semantics for this language, one corresponding to a global view of the game, and one corresponding to a local one. We state a claim relating these two semantics, while proving an analogous statement for epistemic logic. epistemic logic game theory multi-agent systems semantics Other Mathematics
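The entry above concerns agents that can only partially distinguish game states. As a small illustration of the standard knowledge operator from epistemic logic, the sketch below treats an agent's imperfect information as a partition of the states and checks whether the agent knows a fact at a given state. The function name, the state labels, and the partition are hypothetical; the μ-calculus fixpoint operator and the MKBSC construction are not modelled.

```python
def knows(agent_partition, state, fact_states):
    """Evaluate K_i(fact) at `state`.

    agent_partition is a list of frozensets; states in the same block
    are indistinguishable to the agent. The agent knows the fact at
    `state` if the fact holds in every state of that state's block.
    """
    block = next(b for b in agent_partition if state in b)
    return block <= fact_states


# Hypothetical game with three states; the agent cannot tell s0 from s1.
partition = [frozenset({"s0", "s1"}), frozenset({"s2"})]
fact = {"s1", "s2"}  # states where the fact is true

known_at_s2 = knows(partition, "s2", fact)
known_at_s1 = knows(partition, "s1", fact)
```

At s1 the fact is true but not known, because the agent considers s0 possible and the fact fails there.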
