Global ETD Search

141	Exploiting Opponent Modeling For Learning In Multi-agent Adversarial Games Laviers, Kennard R 01 January 2011 (has links) An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent’s actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players’ physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real-time is a daunting task, but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Confidence Bounds applied to Trees) to create new offensive plays and to learn play repairs to counter predicted opponent actions. In team games, players must coordinate effectively to accomplish tasks while foiling their opponents either in a preplanned or emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. 
Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships. We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics. Artificial intelligence Games -- Data processing Machine learning Multiagent systems Electrical and Computer Engineering Electrical and Electronics Engineering
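The UCT search mentioned in entry 141 builds on an upper-confidence-bound rule for deciding which branch of the tree to explore next. The following is a minimal sketch of the standard UCB1 selection rule only, not the modified version the dissertation describes; the function names, the play names, and the exploration constant `c` are assumptions chosen for illustration.

```python
import math

def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = total_reward / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return mean + bonus

def select_child(children, c=math.sqrt(2)):
    """children: list of (name, total_reward, visits) tuples (hypothetical layout).

    Returns the name of the child with the highest UCB1 score.
    """
    parent_visits = sum(visits for _, _, visits in children)
    best = max(children, key=lambda ch: ucb1_score(ch[1], ch[2], parent_visits, c))
    return best[0]

if __name__ == "__main__":
    # Hypothetical play statistics: (play, summed yardage, times tried).
    plays = [("run", 30.0, 10), ("pass", 12.0, 3)]
    print(select_child(plays))
```

Setting `c` to 0 makes the rule purely greedy on the mean reward, while larger values favor branches that have been tried less often.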
142	Multiagent Learning Through Indirect Encoding D'Ambrosio, David B 01 January 2011 (has links) Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach. 
Interestingly, because the teams in multiagent HyperNEAT are represented as patterns they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding. Evolutionary computation Multiagent systems Neural networks (Computer Science) Electrical and Computer Engineering Electrical and Electronics Engineering
143	Spatio-temporal Negotiation Protocols Luo, Yi 01 January 2011 (has links) Canonical problems are simplified representations of a class of real world problems. They allow researchers to compare algorithms in a standard setting which captures the most important challenges of the real world problems being modeled. In this dissertation, we focus on negotiating a collaboration in space and time, a problem with many important real world applications. Although technically a multi-issue negotiation, we show that the problem cannot be represented in a satisfactory manner by previous models. We propose the "Children in the Rectangular Forest" (CRF) model as a possible canonical problem for negotiating spatio-temporal collaboration. In the CRF problem, two embodied agents are negotiating the synchronization of their movement for a portion of the path from their respective sources to destinations. The negotiation setting is zero initial knowledge and it happens in physical time. As equilibrium strategies are not practically possible, we are interested in strategies with bounded rationality, which achieve good performance in a wide range of practical negotiation scenarios. We design a number of negotiation protocols to allow agents to exchange their offers. The simple negotiation protocol can be enhanced by schemes in which the agents add information about the negotiation flow to aid the negotiation partner in offer formation. Naturally, the performance of a strategy is dependent on the strategy of the opponent and the characteristics of the scenario. Thus we develop a set of metrics for the negotiation scenario which formalizes our intuition of collaborative scenarios (where the agents’ interests are closely aligned) versus competitive scenarios (where the gain of the utility for one agent is paid off with a loss of utility for the other agent). 
Finally, we further investigate sophisticated strategies that allow agents to learn about their opponents while negotiating. We find that strategies can be augmented by collaborativeness analysis: the approximate collaborativeness metric can be used to cut short the negotiation. Then, we develop an approach to model the opponent through Bayesian learning. We assume the agents do not disclose their information voluntarily: the learning needs to rely on the study of the offers exchanged during normal negotiation. Lastly, we explore a setting where the agents are able to perform physical action (movement) while the negotiation is ongoing. We formalize a method to represent and update the beliefs about the valuation function, the current state of negotiation and strategy of the opponent agent using a particle filter. By exploring a number of different negotiation protocols and several peer-to-peer negotiation based strategies, we claim that the CRF problem captures the main challenges of the real world problems while allowing us to simplify away some of the computationally demanding but semantically marginal features of real world problems. Artificial intelligence Multiagent systems Negotiation Electrical and Computer Engineering Electrical and Electronics Engineering
144	Increasing Scalability in Algorithms for Centralized and Decentralized Partially Observable Markov Decision Processes: Efficient Decision-Making and Coordination in Uncertain Environments Amato, Christopher 01 September 2010 (has links) As agents are built for ever more complex environments, methods that consider the uncertainty in the system have strong advantages. This uncertainty is common in domains such as robot navigation, medical diagnosis and treatment, inventory management, sensor networks and e-commerce. When a single decision maker is present, the partially observable Markov decision process (POMDP) model is a popular and powerful choice. When choices are made in a decentralized manner by a set of decision makers, the problem can be modeled as a decentralized partially observable Markov decision process (DEC-POMDP). While POMDPs and DEC-POMDPs offer rich frameworks for sequential decision making under uncertainty, the computational complexity of each model presents an important research challenge. As a way to address this high complexity, this thesis develops several solution methods based on utilizing domain structure, memory-bounded representations and sampling. These approaches address some of the major bottlenecks for decision-making in real-world uncertain systems. The methods include a more efficient optimal algorithm for DEC-POMDPs as well as scalable approximate algorithms for POMDPs and DEC-POMDPs. Key contributions include optimizing compact representations as well as automatic structure extraction and exploitation. These approaches increase the scalability of algorithms, while also increasing their solution quality. Artificial Intelligence Decision Theory Game Theory Machine Learning Multiagent Systems Reasoning Under Uncertainty Computer Sciences
145	Cohesive behaviors of cooperative multiagent systems with information flow constraints Liu, Yanfei 29 September 2004 (has links) No description available. Biological systems foraging multiagent systems stability analysis swarming discrete-time systems
146	Behavior-based model predictive control for networked multi-agent systems Droge, Greg Nathanael 22 May 2014 (has links) We present a motion control framework which allows a group of robots to work together to decide upon their motions by minimizing a collective cost without any central computing component or any one agent performing a large portion of the computation. When developing distributed control algorithms, care must be taken to respect the limited computational capacity of each agent as well as the information and communication constraints of the network. To address these issues, we develop a distributed, behavior-based model predictive control (MPC) framework which alleviates the computational difficulties present in many distributed MPC frameworks, while respecting the communication and information constraints of the network. In developing the multi-agent control framework, we make three contributions. First, we develop a distributed optimization technique which respects the dynamic communication constraints of the network, converges to a collective minimum of the cost, and has transients suitable for robot motion control. Second, we develop a behavior-based MPC framework to control the motion of a single agent and apply the framework to robot navigation. The third contribution is to combine the concepts of distributed optimization and behavior-based MPC to develop the aforementioned multi-agent behavior-based MPC algorithm suitable for multi-robot motion control. Model predictive control Distributed optimization Networked-control Multi-agent control Automatic control Predictive control Multiagent systems Algorithms
147	Aplicação de sistemas multiagentes para gerenciamento de sistemas de distribuição tipo Smart Grids / Application of multiagent systems for management of distribution systems like Smart Grids Saraiva, Filipe de Oliveira 23 March 2012 (has links) Os smart grids são tidos como a nova geração dos sistemas elétricos de potência, combinando avanços em computação, sistemas distribuídos e inteligência artificial para prover maiores funcionalidades sobre acompanhamento em tempo real da demanda e do consumo de energia elétrica, gerenciamento em larga escala de geradores distribuídos, entre outras, a partir de um sistema de controle distribuído sobre a rede elétrica. Esta abordagem alteraria fundamentalmente a maneira como se dá o planejamento e a operação de sistemas de distribuição, e há grandes possibilidades de pesquisa e desenvolvimento possibilitada pela busca de implementação destas funcionalidades. Com esse cenário em vista, o presente trabalho utiliza uma abordagem a partir do uso de sistemas multiagentes para estudar o gerenciamento de sistemas de distribuição, do ponto de vista da reconfiguração da topologia da rede, simulando as características de um smart grid. Nesta dissertação, foi desenvolvido um sistema multiagente para simulação computacional de um sistema de distribuição elétrico do tipo smart grid, buscando executar a reconfiguração topológica do sistema a partir de dados de carga capturados de forma distribuída pelos agentes dispersos na rede elétrica. Espera-se que o desenrolar da pesquisa conduza à vários estudos sobre algoritmos e técnicas que melhor implementem tais funcionalidades a serem transpostas para um ambiente de produção. 
/ Smart grids are regarded as the new generation of electric power systems, combining advances in computing, distributed systems and artificial intelligence to provide more functionality for real-time monitoring of electricity demand and consumption and for large-scale management of distributed generators, among others, by means of a control system distributed over the electrical grid. This approach would fundamentally alter the way distribution systems are planned and operated, and the effort to implement these features opens great possibilities for research and development. With this scenario in view, this work uses multi-agent systems to study the management of distribution systems from the standpoint of network topology reconfiguration, simulating the characteristics of a smart grid. In this dissertation, a multiagent system was developed for the computational simulation of a smart-grid-type electrical distribution system, seeking to perform the topological reconfiguration of the system from load data captured in a distributed manner by agents dispersed across the electrical grid. The research is expected to lead to several studies on algorithms and techniques that better implement such functionality for transfer to a production environment. Distributed systems Multiagent systems Sistemas distribuídos Sistemas multiagentes Smart Grids Smart Grids
148	Influência da complexidade da representação de estratégias em modelos evolucionários para o dilema do prisioneiro com n jogadores. / Influence of strategy representation complexity in evolutionary models for the n-players Prisoner's Dilemma. Bó, Inácio Guerberoff Lanari 19 December 2007 (has links) Em Teoria dos Jogos, o Dilema do Prisioneiro para N Participantes (DPNP) é o problema que representa, em sua forma elementar, o paradoxo que gera as dificuldades existentes na formação da cooperação entre mais de dois agentes. Diversos trabalhos foram e continuam sendo feitos sobre esse tema, no sentido de compreender melhor os fatores que influenciam o surgimento e a evolução da cooperação numa sociedade. Neste trabalho, o objetivo principal é o de analisar o impacto do poder expressivo de um modelo de representação de estratégias neste surgimento e evolução. Para tal, foi desenvolvido um modelo computacional de jogos evolutivos, onde agentes participam repetidamente do DPNP. Nele, as estratégias que definem qual será a jogada de um determinado agente são desenvolvidas e selecionadas através de mecanismos de mutação e reprodução daquelas que obtiveram melhores resultados nas iterações anteriores, e implementadas através de duas representações com diferentes poderes computacionais: autômatos finitos e autômatos adaptativos. Este modelo foi implementado num sistema denominado S2E2 onde foram executados diversos experimentos de simulação. Através da comparação dos resultados obtidos para ambas as representações, verificou-se que em ambos os casos a sociedade consegue atingir, após um período inicial, um nível de cooperação relativamente alto e estável. A análise das estratégias utilizadas pelos agentes, entretanto, mostrou que o uso de autômatos adaptativos resulta em uma pequena vantagem, embora estatisticamente não significativa, pois permite surgir estratégias que visam retornar a uma situação de cooperação. 
/ In Game Theory, the n-Player Prisoner's Dilemma (NPPD) is a problem that represents, in its elementary form, the paradox that leads to the existing difficulties in the development of cooperation among more than two agents. Many studies have been and continue to be carried out on this subject, seeking to better understand the factors that influence the emergence and evolution of cooperation in a society. In this work, the main objective is to analyze the impact of the expressive power of the strategy representation model on this emergence and evolution. To do so, a computational model of evolutionary games was developed, in which agents are spatially distributed and participate in the NPPD with five participants, interacting only with their neighbors. In this model, the strategies that define the agent's decisions are developed and selected through mutation and reproduction of those strategies that obtained better results in previous iterations, and they are implemented by two representations with different computational power: finite automata and adaptive automata. This model was implemented in a system called S2E2, and several simulation experiments were carried out. Comparing the results obtained in those experiments showed that, in both cases, the society achieved a relatively high and stable level of cooperation after an initial period. On the other hand, the analysis of the strategies used by the agents showed that the use of adaptive automata resulted in a slight advantage, although not a statistically significant one, because they allow the emergence of strategies that seek to return to a situation of cooperation. Autômatos finitos Finite automata Game theory Multiagent systems Simulação de sistemas Sistemas multiagentes Systems simulation Teoria dos jogos
149	Modelagem e simulação de agentes com aspectos cognitivos para avaliação de comportamento social. / Modeling and simulation of agents with cognitive aspects to evaluate social behavior. Paiva, Daniel Costa de 12 May 2011 (has links) Este trabalho foi elaborado considerando conceitos de quatro áreas de pesquisa: ciência da computação, ciência cognitiva, ciência da informação e comunicação social. A contribuição principal aqui se dá em definir agentes minimamente cognitivos que participam ativamente na dinâmica do fluxo de informações, sofrendo influência das mensagens que recebem e também interferindo no que irá passar adiante. O modelo contempla tanto características relacionadas ao ambiente e sociedade, definidos usando ontologias, quanto à mente, arquitetura e funcionamento dos agentes. Cada personagem possui três módulos de decisão, podendo acessar meios de comunicação em massa, assimilar informações recebidas e também falar com seus amigos. Para o módulo de assimilação, estão apresentadas quatro formas que os agentes podem usar para avaliar as informações que recebem. Primeiramente foi elaborada uma função, depois foram definidas três máquinas de estados finitos, aumentando gradativamente a complexidade e a interdependência entre os parâmetros envolvidos em cada uma. Foi também elaborada uma função adaptativa, a qual a partir de uma regra definida pelo usuário para a troca entre as máquinas de estado que os agentes dispõem, propiciou resultados satisfatórios mesmo nos casos onde alguma das máquinas de estado apresentou deficiências. Visando reproduzir uma situação inspirada na vida real, estão apresentados resultados de uma versão combinada (definida pelo autor) onde foram considerados três grupos e a possibilidade de acesso a meios de comunicação em massa e/ou troca de mensagens entre amigos. 
Usando o modelo como base, um simulador foi desenvolvido onde é possível ter não só uma visão global da dinâmica que está acontecendo na sociedade, mas também o que alguns agentes ficam sabendo e que podem falar sobre. Os estudos de caso visaram comparar as diferentes formas de assimilação de informações que os agentes podem usar (elaboradas pelo autor), avaliar a influência da variação de alguns parâmetros e reproduzir a dinâmica do fluxo de informações em redes sociais, a comparando com o que acontece quando se tem a divulgação broadcast. / This work was developed considering concepts from four research areas: computer science, cognitive science, information science and social communication. The main contribution here is to define minimally cognitive agents that actively participate in the dynamics of the information flow, being influenced by the messages they receive and also interfering in what they pass on. The model covers characteristics related to the environment and the society, defined using ontologies, as well as the "mind", architecture and functioning of the agents. Each character has three decision modules: it can access mass media, assimilate the information received and talk to its friends. For the assimilation module, four ways in which agents can evaluate the information they receive are presented: first a function, then three finite state machines (FSMs) of gradually increasing complexity and interdependence among the parameters involved in each one. An adaptive function was also developed which, based on a user-defined rule for switching between the state machines available to the agents, provided satisfactory results even in cases where one of the FSMs showed deficiencies. To reproduce a situation inspired by real life, results are presented for a combined version (defined by the author) in which three groups were considered, with the possibility of accessing mass media and/or exchanging messages among friends. 
Based on the model, a simulator was developed that provides not only an overview of the dynamics of the society, but also a view of what some agents learn and what they can talk about. The case studies aimed to compare the different forms of information assimilation that agents can use (developed by the author), to evaluate the influence of varying some parameters, and to reproduce the dynamics of information flow in social networks, comparing it with what happens under broadcast dissemination. Aspectos cognitivos Cognitive aspects Fluxo de informação Information flow Multiagent systems Ontologia Ontology Simulação de humanos virtuais Sistemas multiagentes Virtual humans simulation
150	Decentralized control and analysis of cluster patterned networks / Commande décentralisée et analyse des réseaux partitionnés en groupes Bragagnolo, Marcos Cesar 27 November 2015 (has links) Les réseaux sont présents dans plusieurs domaines scientifiques et d’ingénierie tels que la biologie, la physique, la sociologie ainsi que la robotique ou la théorie de la communication. L’étude de ces réseaux montre qu’ils sont souvent structurés en sous-groupes. Entre eux il n’y a pas ou il y a très peu d’interaction. Par conséquent, un accord local au sein de chaque groupe est naturellement atteint alors que le consensus associé à un tel réseau doit être imposé par une loi de commande spécifique. Nous proposons donc un contrôleur discret quasi-périodique pour échanger des informations entre les groupes. Un agent de chaque groupe est choisi le « leader » et, à certains moments, ces leaders communiquent entre eux à travers un nouveau réseau. Ceci permet d’obtenir le consensus dans tout le réseau mais engendre des réinitialisation/sauts dans l’état de leaders. La première contribution de la thèse est la caractérisation de la valeur de consensus dans le cadre des systèmes linéaires impulsifs. Il est remarquable que la valeur de consensus dépende seulement des conditions initiales et des topologies des réseaux impliqués. Elle n’est donc pas sensible aux instants de réinitialisation des états de leaders. Afin d’étudier la stabilité de la valeur de consensus obtenue, nous proposons une méthode fondée sur la vérification d’une condition LMI. Cela peut être adaptée pour la conception du réseau d’interaction entre les leaders permettant d’atteindre une valeur de consensus a priori choisie. Il est aussi possible d’utiliser la condition LMI afin de garantir une vitesse de convergence désirée vers le consensus. Pour ces derniers objectifs, la topologie du réseau continue à être considérée comme fixe et connue pour chaque groupe. 
L’ensemble des valeurs de consensus qui peuvent être atteintes est contenu dans l’intervalle défini par le minimum et le maximum des accords locaux initiaux. Ensuite nous présentons l’étude d’un problème pratique. Des robots mobiles non-holonome, séparés dans des groupes, doivent atteindre une formation donnée. L’algorithme de consensus à pour mission de définir les trajectoires de référence pour ces robots en prenant en compte juste les informations locale. Le robot poursuit la trajectoire de référence en utilisant une commande classique pour cela. / Networks appear in several areas of science and engineering such as biology, physics, sociology as well as robotics and communication theory. Studying these networks reveals cluster-like structures, which are disconnected or very weakly connected to one another. The presence of these clusters hampers consensus throughout the overall network. Instead, local agreement is reached within each cluster. To enforce consensus we have to design an appropriate decentralized controller that imposes interactions between clusters. While the interactions inside each cluster are continuous, we propose a quasi-periodic discrete controller to exchange information between clusters. A single agent from each cluster is chosen to be the leader, and at certain moments, the leaders communicate with each other through a new network. This allows consensus in the entire network but generates resets/jumps in the leaders’ states. The first contribution of this manuscript is related to the characterization of the consensus value in the linear impulsive dynamics framework. It is noteworthy that the consensus value depends only on the initial conditions and the topologies of the involved networks. Therefore, the consensus value does not depend on the reset sequence used for the leaders’ states. To study the stability of the consensus value, an LMI-based condition is proposed. The main advantage of this approach is its flexibility. 
Indeed, with some modifications to the LMI condition it is possible to analyse the convergence speed of the network or to design the leaders’ network. The purpose of leaders’ network design is to reach an a priori specified consensus value with a specified convergence speed. Whatever the objective, throughout the manuscript we consider that the network topology is fixed and known for each cluster. The set of consensus values that can be reached is restricted to the interval defined by the minimum and maximum initial local agreements. A last contribution is related to the application of the proposed methodology to a practical situation. We consider a fleet of non-holonomic mobile robots separated into clusters. Communication inside each cluster is secure and cheap, while communication between clusters is expensive and insecure. Nevertheless the robots have to reach a given formation. In this case our consensus algorithm is in charge of providing reference trajectories to each robot by using only the available local information. The robot follows the reference by using a classical trajectory tracking control. Systèmes multi-Agents Commande décentralisée Théorie des graphes Multiagent systems Decentralized control Graph theory 629.831 2 005.3 006.3
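The clustered consensus scheme summarized in entry 150 can be illustrated with a small simulation. This is a sketch under several assumptions that the abstract does not specify: every cluster is a complete undirected graph, the continuous intra-cluster dynamics are approximated by Euler steps, the leaders reset at strictly periodic instants (the thesis allows quasi-periodic ones), and each reset replaces the leaders' states with their plain average. The function name and all parameters are hypothetical.

```python
def simulate_cluster_consensus(clusters, step=0.1, reset_every=5, n_steps=2000):
    """Simulate consensus in a network partitioned into clusters.

    clusters: list of lists of initial scalar states; agent 0 of each
    cluster is treated as its leader (an assumption of this sketch).
    Returns the final states in the same nested layout.
    """
    states = [list(c) for c in clusters]
    for k in range(1, n_steps + 1):
        # Intra-cluster flow: each agent moves toward its cluster mates.
        # One Euler step of x_i' = sum_j (x_j - x_i) on a complete graph.
        for c in states:
            total, n = sum(c), len(c)
            c[:] = [x + step * (total - n * x) for x in c]
        # Inter-cluster reset: leaders jump to the average of the leaders.
        if k % reset_every == 0:
            mean_leader = sum(c[0] for c in states) / len(states)
            for c in states:
                c[0] = mean_leader
    return states

if __name__ == "__main__":
    # Two clusters whose initial local agreements are 0 and 9.
    final = simulate_cluster_consensus([[0.0, 0.0], [9.0, 9.0, 9.0, 9.0]])
    print(final)
```

Under these particular assumptions both the flow and the reset preserve the sum of all states, so the agents converge to the overall average of the initial conditions, which lies between the initial local agreements of the clusters, in line with the interval property stated in the abstract. The Euler step is stable only while `step` times the largest cluster size stays below 2.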
