Global ETD Search

111	Value methods for efficiently solving stochastic games of complete and incomplete information Mac Dermed, Liam Charles 13 January 2014 Multi-agent reinforcement learning (MARL) poses the same planning problem as traditional reinforcement learning (RL): what actions should an agent take over time in order to maximize its rewards? MARL tackles a challenging set of problems that can be better understood by modeling them as having a relatively simple environment but with complex dynamics attributed to the presence of other agents who are also attempting to maximize their rewards. A great wealth of research has developed around specific subsets of this problem, most notably when the rewards for each agent are either the same or directly opposite each other. However, relatively little progress has been made on the general problem. This thesis addresses that gap. Our goal is to tackle the most general, least restrictive class of MARL problems. These are general-sum, non-deterministic, infinite-horizon, multi-agent sequential decision problems of complete and incomplete information. Towards this goal, we engage in two complementary endeavors: the creation of tractable models and the construction of efficient algorithms to solve these models. We tackle three well-known models: stochastic games, decentralized partially observable Markov decision problems, and partially observable stochastic games. We also present a new fourth model, Markov games of incomplete information, to help solve the partially observable models. For stochastic games and decentralized partially observable Markov decision problems, we develop novel and efficient value iteration algorithms to solve for game-theoretic solutions. We empirically evaluate these algorithms on a range of problems, including well-known benchmarks, and show that our value iteration algorithms perform better than current policy iteration algorithms.
Finally, we argue that our approach is easily extendable to new models and solution concepts, thus providing a foundation for a new class of multi-agent value iteration algorithms. Multi-agent planning Game theory Reinforcement learning Multiagent systems
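For readers unfamiliar with the term, the sketch below shows textbook single-agent value iteration on a small Markov decision process. It is background only and is not the multi-agent algorithm developed in the thesis, which the abstract does not specify. The function name `value_iteration`, the `MDP` dictionary and its two states are invented for illustration.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Textbook single-agent value iteration (not the thesis's algorithm).

    transitions[state][action] is a list of (probability, next_state, reward).
    Returns a dict mapping each state to its estimated optimal value.
    """
    values = {state: 0.0 for state in transitions}
    while True:
        delta = 0.0
        for state, actions in transitions.items():
            # Bellman backup: best expected one-step reward plus discounted value.
            best = max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place update, reused within the same sweep
        if delta < tol:
            return values


# Invented two-state example: "go" moves from a to b, where staying pays 2 per step.
MDP = {
    "a": {"stay": [(1.0, "a", 0.0)], "go": [(1.0, "b", 1.0)]},
    "b": {"stay": [(1.0, "b", 2.0)]},
}

if __name__ == "__main__":
    print(value_iteration(MDP))
```

With a discount of 0.9, state b is worth 2 / (1 - 0.9) = 20 and state a is worth 1 + 0.9 × 20 = 19, which the loop approaches up to the tolerance.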
112	Trust and reputation for formation and evolution of multi-robot teams Pippin, Charles Everett 13 January 2014 Agents in most types of societies use information about potential partners to determine whether to form mutually beneficial partnerships. When an agent uses this information to decide to form a partnership, we say that it trusts the other agent; when agents work together for mutual benefit in a partnership, we refer to this as a form of cooperation. Current multi-robot teams typically have the team's goals either explicitly or implicitly encoded into each robot's utility function and are expected to cooperate and perform as designed. However, there are many situations in which robots may not be interested in full cooperation, or may not be capable of performing as expected. In addition, the control strategy for robots may be fixed, with no mechanism for modifying the team structure if teammate performance deteriorates. This dissertation investigates the application of trust to multi-robot teams. It also addresses the problem of how cooperation can be enabled through the use of incentive mechanisms. We posit a framework wherein robot teams may be formed dynamically, using models of trust. These models are used to improve the team's performance through evolution of the team dynamics. In this context, robots learn online which of their peers are capable and trustworthy, and dynamically adjust their teaming strategy accordingly. We apply this framework to multi-robot task allocation and patrolling domains and show that performance is improved when this approach is used on teams that may have poorly performing or untrustworthy members. The contributions of this dissertation include algorithms for applying performance characteristics of individual robots to task allocation, methods for monitoring performance of robot team members, and a framework for modeling trust of robot team members.
This work also includes experimental results gathered using simulations and on a team of indoor mobile robots to show that the use of a trust model can improve performance on multi-robot teams in the patrolling task. Trust Reputation Multi-robot cooperation Task assignment Robots Evolutionary robotics Multiagent systems
113	A Regulatory Theory of Cortical Organization and its Applications to Robotics Thangavelautham, Jekanthan 05 March 2010 Fundamental aspects of biologically-inspired regulatory mechanisms are considered in a robotics context, using artificial neural-network control systems. Regulatory mechanisms are used to control the expression of genes and the adaptation of form and behavior in organisms. Traditional neural network control architectures assume networks of neurons are fixed and are interconnected by wires. However, these architectures tend to be specified by a designer and face several limitations that reduce scalability and tractability for tasks with larger search spaces. The traditional way to overcome these limitations with fixed network topologies is for a designer to provide more supervision. More supervision, as shown, does not guarantee improvement during training, particularly when incorrect assumptions are made about little-known task domains. Biological organisms often do not require such external intervention (more supervision) and have self-organized through adaptation. Artificial neural tissues (ANT) address limitations of current neural-network architectures by modeling both wired interactions between neurons and wireless interactions through the use of chemical diffusion fields. An evolutionary (Darwinian) selection process is used to ‘breed’ ANT controllers for the task at hand, and the framework facilitates the emergence of creative solutions since only a system goal function and a generic set of basis behaviours need be defined. Regulatory mechanisms are formed dynamically within ANT through superpositioning of chemical diffusion fields from multiple sources and are used to select neuronal groups. Regulation drives competition and cooperation among neuronal groups and results in areas of specialization forming within the tissue.
These regulatory mechanisms are also shown to increase tractability without requiring more supervision, using a new statistical theory developed to predict performance characteristics of fixed network topologies. Simulations also confirm the significance of regulatory mechanisms in solving certain tasks found intractable for fixed network topologies. The framework also shows general improvement in training performance over existing fixed-topology neural network controllers for several robotic and control tasks. ANT controllers evolved in a low-fidelity simulation environment have been demonstrated for a number of tasks on hardware using groups of mobile robots and have given insight into self-organizing systems. Evidence of sparse activity and use of decentralized, distributed functionality within ANT controller solutions is found to be consistent with observations from neurobiology. Evolutionary Robotics Neural Networks Developmental systems Gene regulation Space Robotics Multiagent systems
114	Modélisation et Supervision d'Institutions Multi-Agents [Modeling and Supervision of Multi-Agent Institutions] Gâteau, Benjamin 26 June 2007 Our research falls within the field of multi-agent systems. Within this field, we are interested in building global models that define and structure the joint activity of autonomous agents. An agent's autonomy is reflected in its capacity to determine its own goals in light of the configuration of the environment and its own motivations. Because of this autonomy, conflicts arise: the goals specific to each autonomous agent in a society of agents may be incompatible with, complementary to, or in competition with the objectives of the society. Conflict-management models exist, and in this work we focus on organizational models. We defend the thesis that global constraints influencing the operation of autonomous agents must be defined and put in place in a way that accounts for the possibility that agents may not comply with them, depending on the context and on each agent's individual objectives. To this end, we propose an Electronic Institution model comprising an explicit, declarative description of the constraints to be applied to the agents, together with a system for supervising and enforcing those constraints. The model uses MoiseInst to define and structure the activity of a set of autonomous agents within a normative organization, and uses the Synai arbitration system to supervise compliance. Synai is composed of a set of supervisor agents able to put regulation and enforcement mechanisms for these constraints in place on the autonomous agents.
The model is applied in two entirely different domains: the creation and execution of content for interactive television, and the management of electronic contracts on an e-commerce platform. Both nevertheless exhibit the same problem of arbitrating among autonomous entities that must respect constraints. Moreover, the two applications are complementary in that the scope of the Institution differs between them. Multi-agent system Electronic institution Arbitration system Norm Organization
115	Approche cognitive pour la planification de trajectoire sous contraintes [A cognitive approach to trajectory planning under constraints] Gaillard, François 02 February 2012 I present a cognitive approach to trajectory planning under constraints. It rests first on DKP, a sampling-based trajectory planner. DKP builds an exploration tree in the reachable parts of the environment, in the manner of A*. The solutions are quadratic spline trajectories suited to controlling robots with two independent wheels. The propagation layer builds a parameter space containing all the values of the parameters left free in the solutions. It is represented as a surface containing all the local solutions that satisfy the constraints of the problem: speed, acceleration, obstacle avoidance, and so on. A search for solutions is then carried out over the parameter space according to an optimization criterion. DKP is fully deterministic: its results are reproducible and its behavior is completely controllable. I use this control to define steering behaviors. They are expressed as a behavior tree: each behavior acts on how the exploration tree produced by DKP progresses through the environment. In addition, behaviors are applied according to the part explored, so only feasible behaviors are developed. TÆMS is used to describe these steering behaviors and then to evaluate the solutions built in this way. In summary, my cognitive approach rests on the joint evolution of a tree of steering behaviors and an exploration tree: it thereby links classical planning and trajectory planning under constraints. Trajectory planning Behaviors Kinematic constraints Optimization
116	On intentional and social agents with graded attitudes Casali, Ana 16 December 2008 The central contribution of this dissertation is the proposal of a graded BDI agent model (g-BDI), specifying an architecture capable of representing and reasoning with graded mental attitudes. We consider that making the BDI architecture more flexible will allow us to design and develop agents capable of improved performance in uncertain and dynamic environments, serving other agents (human or not) that may have a set of graded motivations. In the g-BDI model, the agent's graded attitudes have an explicit and suitable representation. Belief degrees represent the extent to which the agent believes a formula to be true. Degrees of positive or negative desires allow the agent to set different levels of preference or rejection, respectively. Intention degrees also give a preference measure but, in this case, they model the cost/benefit trade-off of achieving an agent's goal. Agents having different kinds of behaviour can then be modelled on the basis of the representation and interaction of their graded attitudes. The formalization of the g-BDI agent model is based on multi-context systems, and in order to represent and reason about beliefs, desires and intentions, we follow a many-valued modal approach. A sound and complete axiomatics for representing each graded attitude is also proposed.
To cope with the operational semantics of the g-BDI agent model, we first define a Multi-context calculus for the execution of multi-context systems and then use this calculus to give the agent model computational meaning. Furthermore, a software engineering process to develop graded BDI agents in a multiagent scenario is presented. The aim of the proposed methodology is to guide the design of a multiagent system starting from a real-world problem. Through the development of a tourism recommender system, in which one of the principal agents is modelled as a g-BDI agent, we show that the model is useful for designing and implementing concrete agents. Finally, using the case study, we carried out experiments concerning the flexibility and performance of the g-BDI agent model, demonstrating that it is useful for developing agents showing varied and rich behaviours. We also show that the results obtained by these particular recommender agents using graded attitudes improve on those achieved by agents using non-graded attitudes. Agent Oriented software engineering Process calculus Uncertainty BDI agents Multiagent systems Artificial Intelligence
117	An artificial life approach to evolutionary computation: from mobile cellular algorithms to artificial ecosystems Vulli, Srinivasa Shivakar, January 2010 Thesis (M.S.)--Missouri University of Science and Technology, 2010. / Vita. The entire thesis text is included in file. Title from title screen of thesis/dissertation PDF file (viewed July 19, 2010) Includes bibliographical references (p. 57-60).
118	[en] AN AGENT-BASED SOFTWARE FRAMEWORK FOR MACHINE LEARNING TUNING / [pt] UM FRAMEWORK BASEADO EM AGENTES PARA A CALIBRAGEM DE MODELOS DE APRENDIZADO DE MÁQUINA JEFRY SASTRE PEREZ 23 November 2018 Nowadays, the enormous amount of data available online poses a new challenge for knowledge discovery processes. The most widely used approaches to tackling that challenge are based on machine learning techniques. Although very powerful, those techniques require their parameters to be calibrated in order to generate better-quality models. Such calibration processes are time-consuming and rely on the skills of machine learning experts. Within this context, this research presents a framework based on software agents for automating the calibration of machine learning models. This approach integrates concepts from Agent Oriented Software Engineering (AOSE) and Machine Learning (ML).
As a proof of concept, we first train a model for the Iris dataset and show how our approach improves the quality of new models generated by our framework. We then create instances of the framework to generate models for a medical images dataset, and finally we use the Grid Sector dataset for a final experiment. MACHINE LEARNING MULTIAGENT SYSTEMS CALIBRATION PROCESS
119	Uma arquitetura escalável para recuperação e atualização de informações com relação de ordem total. / A scalable architecture for retrieving information with total order relationship. Vladimir Emiliano Moreira Rocha 17 November 2017 Since the beginning of the 21st century, we have experienced explosive growth in the generation of information of many kinds, such as photos, audio and video. Some of this information can be divided into smaller parts that must be related following a total order. For example, a video file can be divided into ten segments identified with numbers from 1 to 10. To play the original video from these segments, their identifiers must be ordered. A structure called the Distributed Hash Table (DHT) has been widely used to efficiently store, update, and retrieve this kind of information in several application domains, such as sensor monitoring and video on demand. However, a DHT encounters scalability issues when one of its members cannot keep up with the requests it receives, making the information inaccessible. This work presents MATe, a layered architecture that addresses the problem of scalability on two levels: extending the DHT with the introduction of utility-based agents, and organizing the volume of requests.
The first layer manages scalability by allowing the creation of new agents to distribute the requests, so that no single agent has its scalability compromised. The second layer is composed of groups of devices, organized in such a way that only a few of them are chosen to perform requests. The architecture was implemented in two application scenarios where scalability problems arise: (i) sensor monitoring; and (ii) video on demand. For both scenarios, the experimental results show that MATe improves scalability when compared to the original DHT implementations. Software architecture Multiagent systems Data aggregation Distributed hash table
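The video example in the abstract can be illustrated with a minimal Python sketch: segments are placed on nodes by hashing their keys, then fetched and reassembled in identifier order. This is not the MATe architecture and not a real DHT (there is no routing, replication or node churn). The node names, the modulo placement rule, and the helpers `node_for`, `put` and `get_ordered` are all assumptions made for illustration.

```python
import hashlib

# Hypothetical node names; a real DHT would discover its members dynamically.
NODES = ["node-a", "node-b", "node-c"]


def node_for(key: str) -> str:
    """Map a key to a node by hashing it (simple modulo placement, an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def put(store: dict, video: str, segment_id: int, payload: bytes) -> None:
    """Store one segment on the node responsible for its key."""
    key = f"{video}:{segment_id}"
    store.setdefault(node_for(key), {})[key] = (segment_id, payload)


def get_ordered(store: dict, video: str, count: int) -> bytes:
    """Fetch segments 1..count from their nodes and join them in identifier order."""
    parts = []
    for segment_id in range(1, count + 1):
        key = f"{video}:{segment_id}"
        parts.append(store[node_for(key)][key])
    # Sorting by identifier restores the total order regardless of fetch order.
    return b"".join(payload for _, payload in sorted(parts))


if __name__ == "__main__":
    store = {}
    for i in range(10, 0, -1):  # insert out of order on purpose
        put(store, "movie", i, f"[{i}]".encode())
    print(get_ordered(store, "movie", 10))
```

In this sketch every request for a given key goes to the one node that holds it, which is where the scalability problem described in the abstract would appear if that node could not keep up.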
120	Uma abordagem multiagente para dinâmica de pedestres / Walker - Multiagent Based Approach for simulation of Pedestrian Dynamics Toyama, Marcelo Costa January 2006 This work presents an improvement, called Walker, to the model of Schadschneider and coworkers. The improvement turns Schadschneider's model into a model based on multiagent systems. Unlike cellular automata and continuous models, Walker represents pedestrians with different characteristics: gender, speed, knowledge of the environment, and group (herding) behavior. We also implement a prototype of Walker. Pedestrian dynamics models have attracted attention for several reasons. First, researchers have found that modelling pedestrian flow is challenging and complex. For example, pedestrian corridors can have several entrances, are not regulated in an orderly way as roads are, and are usually bidirectional. Second, pedestrian models can be important tools for the development and planning of pedestrian areas, such as subway stations, train stations, buildings and shopping centers. Computer simulations of pedestrian dynamics therefore make it possible to observe a wide range of characteristics of the flow of people and contribute to a better understanding of its basic principles. Knowledge of pedestrian behavior is valuable because it provides information on how to design better exits and better room and stadium geometries. To define the important features of this improvement, we studied the state of the art in pedestrian dynamics, and important aspects of the models studied were used in creating Walker. To validate Walker and demonstrate its capabilities, 18 experiments were carried out.
The scenarios range from validation of Walker against experiments performed by other authors, through verification of the influence of parameter variation on the simulations and simulation of several experiments with different door and room sizes, to simulation with two different groups of pedestrians.
The experiments show the qualities of the proposed improvement, as well as its ability to simulate many situations. Artificial intelligence Multiagent systems Pedestrian dynamics Computer simulation Collective and emergent behavior Artificial social systems

Search results