131. DISTRIBUTED CONTROL AND OPTIMIZATION IN MULTI-AGENT SYSTEMS. Xuan Wang, 16 June 2020.
In recent years, collective behaviors in nature have motivated rapidly expanding research efforts in the control of multi-agent systems. A multi-agent system is composed of multiple interacting subsystems (agents). In order to seek approaches that respect the network nature of multi-agent systems, distributed algorithms have recently received a significant amount of research attention; their goal is to allow multi-agent systems to accomplish global objectives through only local coordination.

Under this scope, we consider three major problems in this dissertation: distributed computation, distributed optimization, and the resilience of distributed algorithms. First, for distributed computation, we devise distributed algorithms for solving linear equations that (i) eliminate the initialization step for agents; (ii) converge to the minimum $l_1$ and $l_2$ solutions of under-determined linear equations; and (iii) achieve ultimate scalability in terms of agents' local storage and local states. Second, for distributed optimization, we introduce a new method for algorithm discretization so that the agents no longer have to carefully choose their step-size. We also introduce a new distributed optimization approach that achieves a better convergence rate with lower bandwidth requirements. Finally, for the resilience of distributed algorithms, we propose a new approach that allows normal agents in the multi-agent system to automatically isolate false information from malicious agents without an identification process. Throughout the dissertation, all mentioned results are theoretically guaranteed and numerically validated.
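As a rough illustration of the distributed-computation theme, here is a minimal sketch of a classical projection-consensus scheme for linear equations (assuming each agent privately holds one block of rows of a consistent system Ax = b and a connected communication graph; this is not the dissertation's initialization-free or scalable variant):

```python
import numpy as np

def kernel_projector(A_i):
    # Orthogonal projector onto the null space of agent i's block A_i.
    n = A_i.shape[1]
    return np.eye(n) - np.linalg.pinv(A_i) @ A_i

def solve_distributed(blocks, neighbors, iters=2000):
    # blocks: list of (A_i, b_i); neighbors: adjacency list of the graph.
    x = [np.linalg.pinv(A_i) @ b_i for A_i, b_i in blocks]  # local solutions
    P = [kernel_projector(A_i) for A_i, _ in blocks]
    for _ in range(iters):
        avg = [np.mean([x[j] for j in nbrs + [i]], axis=0)
               for i, nbrs in enumerate(neighbors)]
        # Move toward the neighborhood average while staying on the
        # affine set {x : A_i x = b_i}.
        x = [x[i] + P[i] @ (avg[i] - x[i]) for i in range(len(blocks))]
    return x

# Toy example: two agents, each holding one row of a 2x2 system.
A = np.array([[1.0, 2.0], [3.0, -1.0]])
b = np.array([5.0, 4.0])
xs = solve_distributed([(A[:1], b[:1]), (A[1:], b[1:])], neighbors=[[1], [0]])
print(xs[0], xs[1])  # both agents converge to the unique solution of A x = b
```

Each update keeps an agent's estimate consistent with its own equation while pulling it toward its neighbors' estimates, so agreement forces a solution of the full system.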
132. Inteligentní křižovatka / Smart Traffic Intersection. Věra Škopková, January 2019.
This thesis is concerned with the problem of planning paths for autonomous cars through a smart traffic intersection. We describe existing concepts for solving this problem and discuss the possibilities of approaching intersection problems theoretically. Then, we choose one specific approach and design a declarative model for solving the problem. We use that model to perform a series of theoretical experiments testing the throughput and the quality of intersection paths described by different graphs. After that, we translate the theoretical plans into actions for real robots and run them. In these experiments, we measure the degree of robot desynchronization and the success of the plans based on the collision rate. We also describe how to improve the action translation so that real robots perform better than when following the straightforward plans.
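For intuition about what intersection planning must guarantee, the toy sketch below (our illustration, not the thesis's declarative model) greedily schedules cars over a time-expanded reservation table so that no vertex of the intersection graph is occupied by two cars at once:

```python
def schedule(paths):
    # paths: {car: [v0, v1, ...]} fixed routes through the intersection graph.
    reserved = set()   # (vertex, time) pairs already claimed
    starts = {}
    for car, path in paths.items():
        t0 = 0
        # Delay each car until every step of its route is unreserved.
        while any((v, t0 + k) in reserved for k, v in enumerate(path)):
            t0 += 1
        for k, v in enumerate(path):
            reserved.add((v, t0 + k))
        starts[car] = t0
    return starts

crossing = {
    "car_A": ["in_N", "mid", "out_S"],
    "car_B": ["in_E", "mid", "out_W"],  # shares "mid" with car_A
}
print(schedule(crossing))  # {'car_A': 0, 'car_B': 1}: "mid" is never shared
```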
133. Multiple Agent Target Tracking in GPS-Denied Environments. Skyler Tolman, 17 December 2019.
Unmanned aerial systems (UAS) are effective for surveillance and monitoring, but they struggle with persistent, long-term tracking, especially without GPS, due to limited flight time. Persistent tracking can be accomplished using multiple vehicles if one vehicle can effectively hand off the tracking information to a replacement vehicle. This work presents a solution to the moving-target handoff problem in the absence of GPS. The proposed solution comprises (a) a nonlinear complementary filter for self-pose estimation using only an IMU, (b) a particle filter for relative pose estimation between UAS using relative range measurements, (c) visual target tracking using a gimballed camera when the target is close to the handoff UAS, and (d) track correlation logic using Procrustes analysis to perform the final target handoff between vehicles. We present hardware results for the self-pose estimation and visual target tracking, as well as extensive simulation results demonstrating the effectiveness of the full system; Monte Carlo simulations indicate a 97% successful handoff rate using the proposed methods.
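The track-correlation step (d) can be illustrated with orthogonal Procrustes alignment; the sketch below (the function name and the 0.1 threshold are ours, for illustration) aligns two tracks expressed in different local frames and uses the residual to decide whether they describe the same target:

```python
import numpy as np

def procrustes_residual(track_a, track_b):
    # track_a, track_b: (N, 2) arrays of time-aligned positions.
    mu_a, mu_b = track_a.mean(axis=0), track_b.mean(axis=0)
    A, B = track_a - mu_a, track_b - mu_b      # centering removes translation
    # Orthogonal Procrustes: R = argmin ||A R - B||_F over rotations, via SVD.
    U, _, Vt = np.linalg.svd(A.T @ B)
    R = U @ Vt
    return np.linalg.norm(A @ R - B) / np.sqrt(len(track_a))

t = np.linspace(0, 1, 50)
truth = np.stack([np.cos(t), np.sin(t)], axis=1)
theta = 0.7  # unknown frame offset between the two vehicles
Rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
observed = truth @ Rot.T + np.array([3.0, -1.0])   # rotated + shifted track
same_target = procrustes_residual(truth, observed) < 0.1
print(same_target)  # True: the tracks correlate after alignment
```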
134. Multi-Agent Reinforcement Learning: Analysis and Application. Paulo Cesar Heredia, 20 April 2022.
With the increasing availability of data and the rise of networked systems such as autonomous vehicles, drones, and smart grids, the application of data-driven machine learning methods to multi-agent systems has become an important topic. In particular, reinforcement learning has gained popularity due to its similarities with optimal control, potentially allowing us to develop optimal control systems using only observed data, without a model of a system's state dynamics. In this thesis, we explore the application of reinforcement learning to multi-agent systems, known as multi-agent reinforcement learning (MARL). We have developed algorithms that address some challenges in the cooperative setting of MARL. We have also worked on better understanding the convergence guarantees of some known multi-agent reinforcement learning algorithms, which combine reinforcement learning with distributed consensus methods. And, with the aim of making MARL better suited to real-world problems, we have developed algorithms to address some practical challenges with MARL and have applied MARL to a real-world problem.
In the first part of this thesis, we focus on developing algorithms to address some open problems in MARL. One of these challenges is learning with output feedback, known as partial observability in the reinforcement learning literature. One of the main assumptions of reinforcement learning in the single-agent case is that the agent can fully observe the state of the plant it is controlling (the "plant" is often referred to as the "environment" in the reinforcement learning literature; we use these terms interchangeably). In the single-agent case this assumption can be reasonable, since it only requires one agent to fully observe its environment. In the multi-agent setting, however, this assumption would require all agents to fully observe the state; furthermore, since each agent can affect the plant (or environment) with its actions, the assumption would also require that agents know the actions of other agents. We have also developed algorithms to address practical issues that may arise when applying reinforcement learning (RL) or MARL to large-scale real-world systems. One such algorithm is a distributed reinforcement learning algorithm that allows us to learn in cases where the states and actions are both continuous and of large dimensionality, which is the case for many real-world applications. Without the ability to handle continuous states and actions, many algorithms require discretization, which can become impractical for high-dimensional systems. We have also developed a distributed reinforcement learning algorithm that addresses the data scalability of RL, that is, how to learn from a very large dataset that cannot be efficiently processed by a single agent with limited resources.
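As a concrete, minimal illustration of how distributed RL algorithms of this kind combine local learning with consensus (the toy chain, linear features, and mixing matrix below are our assumptions, not the thesis's algorithms), each agent takes a TD(0) step on its private reward and then averages parameters with its neighbors:

```python
import numpy as np

def distributed_td(env_step, phi, W, n_agents, dim,
                   alpha=0.05, gamma=0.9, steps=5000):
    theta = np.zeros((n_agents, dim))
    s = 0
    for _ in range(steps):
        s_next, rewards = env_step(s)              # one reward per agent
        for i in range(n_agents):
            td_err = (rewards[i] + gamma * phi(s_next) @ theta[i]
                      - phi(s) @ theta[i])
            theta[i] = theta[i] + alpha * td_err * phi(s)  # local TD step
        theta = W @ theta                          # consensus averaging
        s = s_next
    return theta   # rows approach a common value-function estimate

# Toy two-state chain; the two agents see different local rewards.
rng = np.random.default_rng(0)
P = np.array([[0.9, 0.1], [0.2, 0.8]])
def env_step(s):
    s_next = int(rng.choice(2, p=P[s]))
    return s_next, [float(s == 0), float(s == 1)]  # private rewards
phi = lambda s: np.eye(2)[s]                       # one-hot features
W = np.full((2, 2), 0.5)                           # doubly stochastic mixing
print(distributed_td(env_step, phi, W, n_agents=2, dim=2))
```

The mixing step is what lets each agent's estimate reflect the team-average reward rather than only its own.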
In the second part of this thesis, we provide a finite-sample analysis of some distributed reinforcement learning algorithms. By finite-sample analysis, we mean that we provide an upper bound on the squared error of the algorithm at a given iteration; equivalently, since each iteration uses one data sample, we provide an upper bound on the squared error for a given number of data samples used. This type of analysis had been missing in the MARL literature, where most works have provided only asymptotic results for their proposed algorithms, which only tell us how the algorithmic error behaves as the number of samples goes to infinity.
The third part of this thesis focuses on applications with real-world systems. We have explored a real-world problem, namely transactive energy systems (TES), which can be represented as a multi-agent system. We have applied various reinforcement learning algorithms with the aim of learning an optimal control policy for this system. Through simulations, we have compared the performance of these algorithms and have illustrated the effect of partial observability (output feedback) when compared to full state feedback.
In the last part, we present a distributed observer that aims to address learning with output feedback by estimating the state. The proposed algorithm is designed so that we do not require a complete model of the state dynamics; instead we use a parameterized model whose parameters are estimated along with the state.
135. Cooperative control for multi-agent persistent monitoring problems. Nan Zhou, 4 June 2019.
In persistent monitoring tasks, cooperating mobile agents are used to monitor a dynamically changing environment that cannot be fully covered by a stationary team of agents. The exploration process leads to the discovery of various "points of interest" (targets) to be perpetually monitored. Through an optimal control approach, the first part of this dissertation shows that in a one-dimensional mission space the solution can be reduced to a simpler parametric problem. The behavior of agents under optimal control is described by a hybrid system which can be analyzed using Infinitesimal Perturbation Analysis (IPA) to obtain an on-line solution. IPA allows the modeling of virtually arbitrary stochastic effects in target uncertainty and its event-driven nature renders the solution scalable in the number of events rather than the state space.
The second part of this work extends the results of the one-dimensional persistent monitoring problem to a two-dimensional space with constrained agent mobility. Under a general graph setting, the properties of the one-dimensional optimal control solution are largely inherited. The solution involves the design of agent trajectories defined by both the sequence of nodes to be visited and the amount of time spent at each node. A class of distributed threshold-based parametric controllers is proposed to reduce the computational complexity. These parameters are optimized through an event-driven IPA gradient-based algorithm, yielding optimal controllers within this family of threshold-based policies. The performance of the threshold-based parametric controller is close to that of the optimal controller derived through dynamic programming, while its computational complexity is smaller by orders of magnitude.
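A toy version of such a threshold-based policy might look as follows (the uncertainty dynamics and parameter names are illustrative only; in the dissertation the thresholds themselves are tuned online via IPA gradient estimates):

```python
def simulate(adj, grow, shrink, theta, start, horizon=200):
    # R[v]: uncertainty of target v; it grows when unattended and
    # shrinks while an agent dwells at v (travel time ignored for brevity).
    R = {v: 1.0 for v in adj}
    pos, visits = start, [start]
    for _ in range(horizon):
        for v in R:
            R[v] = max(R[v] - shrink, 0.0) if v == pos else R[v] + grow[v]
        if R[pos] <= theta[pos]:          # threshold rule: time to move on
            pos = max(adj[pos], key=lambda v: R[v])
            visits.append(pos)
    return R, visits

adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}   # a line graph of targets
grow = {"A": 0.05, "B": 0.02, "C": 0.08}          # uncertainty growth rates
theta = {"A": 0.2, "B": 0.2, "C": 0.2}            # per-target dwell thresholds
R, visits = simulate(adj, grow, shrink=0.3, theta=theta, start="B")
print(visits[:10])  # the agent cycles, biased toward fast-growing targets
```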
Although effective, the aforementioned optimal controls are established on the assumption that agents are all connected via a centralized controller which is energy-consuming and unreliable in adversarial environments. The third part of this work extends the previous controls by developing decentralized controllers which distribute functionality to the agents so that each one acts upon local information and sparse communication with neighbors. The complexity of decentralization for persistent monitoring problems is significant given agent mobility and the overall time-varying graph topology. Conditions are identified and a decentralized framework is proposed under which the centralized solution can be exactly recovered in a decentralized event-driven manner based on local information -- except for one event requiring communication from a non-neighbor agent.
136. Semi-Informed Multi-Agent Patrol Strategies. Chad E. Hardin, 1 January 2018.
The adversarial multi-agent patrol problem is an active research topic with many real-world applications, such as physical robots guarding an area and software agents protecting a computer network. In it, agents patrol a graph looking for so-called critical vertices that are subject to attack by adversaries. The agents are initially unaware of which vertices are critical, and when they encounter such a vertex they attempt to protect it from being compromised (an adversary must occupy the vertex it targets for a certain amount of time for the attack to succeed). Although the terms adversary and attack are used, the problem domain extends to patrolling a graph in other, noncompetitive contexts such as search and rescue. The problem statement adopted in this work is formulated such that agents obtain knowledge of local graph topology and critical vertices over the course of their travels via an API; there is no global knowledge of the graph and no communication between agents. The challenge is to balance exploration, necessary to discover critical vertices, with exploitation, necessary to protect critical vertices from attack. Four types of adversaries were used for experiments: three from previous research (waiting, random, and statistical) and a fourth that is a hybrid of those three. Agent strategies for countering each of these adversaries were designed and evaluated under a range of conditions, using benchmark graphs and parameter settings from related research. The results of this work are agent strategies in which each agent becomes solely responsible for protecting the critical vertices it discovers. The agents use emergent behavior to minimize successful attacks and maximize the discovery of new critical vertices. A set of seven edge-choosing primitives (ECPs) is defined; these are combined in different ways, via the chain-of-responsibility OOP design pattern, to yield a range of agent strategies. Every permutation of them was tested and measured in order to identify the strategies that perform well. One strategy performed particularly well across all adversaries, graph topologies, and other experimental variables. This strategy combines the ECPs of a hard-deadline return to covered vertices to counter the random adversary, efficient checking of vertices to see if they are being attacked by the waiting adversary, and random movement to impede the statistical adversary.
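A minimal sketch of this chain-of-responsibility structure follows (the primitive names and the tiny Agent stub are invented stand-ins for the dissertation's seven ECPs): each primitive either picks the next move or defers to its successor, and a full strategy is just one ordering of primitives.

```python
import random

class ECP:
    """One edge-choosing primitive; defers to its successor if it declines."""
    def __init__(self, successor=None):
        self.successor = successor
    def choose(self, agent, edges):
        picked = self.pick(agent, edges)
        return picked if picked is not None else self.successor.choose(agent, edges)
    def pick(self, agent, edges):
        raise NotImplementedError

class ReturnToCritical(ECP):
    # Hard-deadline return to a discovered critical vertex.
    def pick(self, agent, edges):
        overdue = [v for v in edges if v in agent.critical
                   and agent.clock - agent.last_visit[v] > agent.deadline]
        return overdue[0] if overdue else None

class RandomMove(ECP):
    # Always succeeds, so it safely terminates any chain.
    def pick(self, agent, edges):
        return random.choice(edges)

class Agent:
    def __init__(self):
        self.critical = {"v3"}          # critical vertices found so far
        self.last_visit = {"v3": 0}     # when each was last protected
        self.clock, self.deadline = 10, 5

strategy = ReturnToCritical(successor=RandomMove())
print(strategy.choose(Agent(), ["v1", "v2", "v3"]))  # -> "v3" (overdue)
```

Reordering or swapping primitives produces a different strategy, which is how every permutation could be enumerated and tested.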
137. NUMERICAL SIMULATION OF EVACUATION PROCESS AGAINST TSUNAMI DISASTER IN MALAYSIA BY USING DISTINCT-ELEMENT-METHOD BASED MULTI-AGENT MODEL. Muhammad Salleh Bin Haji Abustan, 24 September 2013.
Kyoto University, Graduate School of Engineering, Department of Civil and Earth Resources Engineering. Doctor of Philosophy (Engineering). Examining committee: Prof. Hitoshi Gotoh (chair), Prof. Keiichi Toda, Assoc. Prof. Eiji Harada.
138. MODELING FOR ACTIONS OF DIRECTIONAL SWITCHING AND GROUPING IN DEM-BASED CROWD BEHAVIOR SIMULATOR. Noorhazlinda Binti Abd Rahman, 23 March 2015.
Kyoto University, Graduate School of Engineering, Department of Civil and Earth Resources Engineering. Doctor of Philosophy (Engineering). Examining committee: Prof. Hitoshi Gotoh (chair), Prof. Keiichi Toda, Assoc. Prof. Eiji Harada.
139. Online Model-Free Distributed Reinforcement Learning Approach for Networked Systems of Self-organizing Agents. Yiqing Chen, 22 December 2021.
Control of large groups of robotic agents is driven by applications including military operations, aeronautics and astronautics, transportation networks, and environmental monitoring. Cooperative control of networked multi-agent systems aims at driving the behavior of the group via feedback control inputs that encode the group's dynamics based on information sharing, with inter-agent communications that can be time-varying and spatially non-uniform. Notably, local interaction rules can induce coordinated behavior, provided suitable network topologies.
Distributed learning paradigms are often necessary for this class of systems to operate autonomously and robustly, without the need for external units providing centralized information. In contrast with model-based protocols, which can be computationally prohibitive due to their mathematical complexity and their requirements in terms of feedback information, we present an online model-free algorithm for a class of nonlinear tracking problems with unknown system dynamics. The method prescribes the actuation forces of agents so that they follow the time-varying trajectory of a moving target. The tracking problem is addressed by an online value-iteration process that requires measurements collected along the trajectories. A set of simulations illustrates that the presented algorithm functions well in various reference-tracking scenarios.
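As a toy illustration of the online model-free idea (a tabular Q-learning stand-in on a discretized tracking error, our construction rather than the thesis's value-iteration method), an agent can learn which actuation to apply purely from measured running costs, never seeing the dynamics:

```python
import numpy as np

rng = np.random.default_rng(1)
actions = np.array([-1, 0, 1])        # discretized actuation force

def step(e, a):
    # Unknown-to-the-agent dynamics: the target drifts randomly; the
    # agent only measures the resulting tracking error and running cost.
    drift = int(rng.choice([-1, 0, 1]))
    e_next = int(np.clip(e + a + drift, -5, 5))
    return e_next, float(e_next ** 2)  # quadratic tracking cost

Q = np.zeros((11, 3))                  # errors -5..5  x  3 actions
alpha, gamma, eps = 0.2, 0.9, 0.1
e = 3
for _ in range(20000):
    si = e + 5
    ai = int(rng.integers(3)) if rng.random() < eps else int(np.argmin(Q[si]))
    e_next, cost = step(e, int(actions[ai]))
    # Online value-iteration-style update from measured cost only.
    Q[si, ai] += alpha * (cost + gamma * Q[e_next + 5].min() - Q[si, ai])
    e = e_next

# The learned policy should push the tracking error toward zero.
print([int(actions[int(np.argmin(Q[e + 5]))]) for e in (-4, 0, 4)])  # ~[1, 0, -1]
```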
140. Cooperative Localization based Multi-Agent Coordination and Control. Anusna Chakraborty, 5 October 2021.
No description available.