51 |
Micro-Data Reinforcement Learning for Adaptive Robots / Apprentissage micro-data pour l'adaptation en robotique
Chatzilygeroudis, Konstantinos, 14 December 2018 (has links)
Robots have to face the real world, in which trying something might take seconds, hours, or even days. Unfortunately, current state-of-the-art reinforcement learning algorithms (e.g., deep reinforcement learning) require long interaction times to find effective policies. In this thesis, we explored approaches that tackle the challenge of learning by trial-and-error in a few minutes on physical robots. We call this challenge “micro-data reinforcement learning”. In our first contribution, we introduced a novel learning algorithm called “Reset-free Trial-and-Error” that allows complex robots to quickly recover from unknown circumstances (e.g., damage or different terrain) while completing their tasks and taking the environment into account; in particular, a damaged physical hexapod robot recovered most of its locomotion abilities in an environment with obstacles, and without any human intervention.
In our second contribution, we introduced a novel model-based reinforcement learning algorithm, called Black-DROPS, that: (1) does not impose any constraint on the reward function or the policy (they are treated as black boxes); (2) is as data-efficient as the state-of-the-art algorithm for data-efficient RL in robotics; and (3) is as fast as (or faster than) analytical approaches when several cores are available. We additionally proposed Multi-DEX, a model-based policy search approach that takes inspiration from novelty-based ideas and effectively solves several sparse-reward scenarios. In our third contribution, we introduced a new model-learning procedure in Black-DROPS (which we call GP-MI) that leverages parameterized black-box priors to scale up to high-dimensional systems; for instance, it found high-performing walking policies for a damaged physical hexapod robot (48D state and 18D action space) in less than 1 minute of interaction time. Finally, in the last part of the thesis, we explored a few ideas on how to incorporate safety constraints, improve robustness, and leverage multiple priors in Bayesian optimization in order to tackle the micro-data reinforcement learning challenge. Throughout this thesis, our goal was to design algorithms that work on physical robots, not only in simulation. Consequently, all the proposed approaches have been evaluated on at least one physical robot. Overall, this thesis aimed at providing methods and algorithms that allow physical robots to be more autonomous and able to learn in a handful of trials.
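To make the micro-data idea concrete, here is a minimal sketch of the model-based policy-search loop that Black-DROPS instantiates: learn a dynamics model from a handful of real episodes, then optimize a black-box policy on the model rather than on the robot. Everything here is illustrative; the actual algorithm uses Gaussian-process dynamics models and CMA-ES, for which least-squares dynamics and random search stand in below.

```python
import numpy as np

def true_step(x, u):
    # Unknown real dynamics: only used to collect data (stands in for the robot).
    return 0.9 * x + 0.5 * u + 0.01 * np.random.randn()

def rollout_model(A, B, policy, x0, horizon=20):
    # Evaluate a candidate policy on the *learned* model, not the real system.
    x, ret = x0, 0.0
    for _ in range(horizon):
        u = policy @ np.array([x, 1.0])
        x = A * x + B * u
        ret += -(x - 1.0) ** 2          # black-box reward: reach x = 1
    return ret

rng = np.random.default_rng(0)
data = []                               # (x, u, x') transitions from the robot
policy = np.zeros(2)
for episode in range(5):                # micro-data: a handful of trials
    x = 0.0
    for _ in range(20):                 # one short real-world episode
        u = policy @ np.array([x, 1.0]) + 0.1 * rng.standard_normal()
        x_next = true_step(x, u)
        data.append((x, u, x_next))
        x = x_next
    # Fit a linear dynamics model x' ≈ A x + B u by least squares.
    X = np.array([[d[0], d[1]] for d in data])
    y = np.array([d[2] for d in data])
    A, B = np.linalg.lstsq(X, y, rcond=None)[0]
    # Black-box policy search on the model (random search stands in for CMA-ES).
    candidates = [policy + 0.3 * rng.standard_normal(2) for _ in range(200)]
    policy = max(candidates, key=lambda p: rollout_model(A, B, p, 0.0))
```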
|
52 |
DEEP LEARNING BASED MODELS FOR NOVELTY ADAPTATION IN AUTONOMOUS MULTI-AGENT SYSTEMS
Marina Wagdy Wadea Haliem (13121685), 20 July 2022 (has links)
Autonomous systems are often deployed in dynamic environments and are challenged with unexpected changes (novelties): they receive novel data that was not seen during training. Given this uncertainty, they should be able to operate without (or with limited) human intervention, and they are expected to (1) adapt to such changes while remaining effective and efficient at their multiple tasks, providing continuous availability of their critical functionalities; (2) make informed decisions independently of any central authority; (3) be cognitive: learn the new context and its possible actions, and be rich in knowledge discovery through mining and pattern recognition; (4) be reflexive: react to novel unknown data as well as to security threats without terminating ongoing critical missions. Together, these characteristics define the workflow of the autonomous decision-making process in multi-agent environments: any action taken by the system must pass through these characteristic models so that the system can autonomously make an appropriate decision for the situation.
In this dissertation, we propose novel learning-based models to enhance the decision-making process in autonomous multi-agent systems, in which agents are able to detect novelties (i.e., unexpected changes in the environment) and adapt to them in a timely manner. For this purpose, we explore two complex and highly dynamic domains:
(1) Transportation networks (e.g., ridesharing), for which we develop AdaPool, a novel distributed diurnal-adaptive decision-making framework for multi-agent autonomous vehicles using model-free deep reinforcement learning and change-point detection. (2) Multi-agent games (e.g., Monopoly), for which we propose a hybrid approach that combines deep reinforcement learning (for frequent but complex decisions) with a fixed-policy approach (for infrequent but straightforward decisions) to facilitate decision-making while remaining adaptive to novelties. (3) Further, we present a domain-agnostic approach to decision-making without prior knowledge in dynamic environments, using Bootstrapped DQN. (4) To enhance the security of autonomous multi-agent systems, we develop machine-learning-based resilience testing of address-randomization moving target defense. (5) Finally, to further improve the decision-making process, we present a novel framework for multi-agent deep covering-option discovery, designed to accelerate exploration (the first step of decision-making for autonomous agents) by identifying potential collaborative agents and encouraging visits to under-represented states in their joint observation space.
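As an illustration of the change-point-detection component mentioned for AdaPool, the sketch below flags a drop in a reward stream with a one-sided CUSUM test; the detector, thresholds, and reward model are stand-ins assumed for the example, not the dissertation's actual design.

```python
import numpy as np

class CusumDetector:
    """One-sided CUSUM on a reward stream: flags a drop in mean reward
    that may indicate a novelty (e.g., a demand shift in ridesharing)."""
    def __init__(self, baseline, drift=0.05, threshold=5.0):
        self.baseline, self.drift, self.threshold = baseline, drift, threshold
        self.cum = 0.0
    def update(self, reward):
        # Accumulate evidence that reward has fallen below the baseline.
        self.cum = max(0.0, self.cum + (self.baseline - reward) - self.drift)
        return self.cum > self.threshold

rng = np.random.default_rng(1)
detector = CusumDetector(baseline=1.0)
for t in range(200):
    mean = 1.0 if t < 100 else 0.6      # novelty injected at t = 100
    r = mean + 0.1 * rng.standard_normal()
    if detector.update(r):
        print(f"change detected at t={t}; trigger policy adaptation")
        detector.cum = 0.0              # reset after reacting
```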
|
53 |
Improving and Extending Behavioral Animation Through Machine Learning
Dinerstein, Jonathan J., 20 April 2005 (has links) (PDF)
Behavioral animation has become popular for creating virtual characters that are autonomous agents and thus self-animating. This is useful for lessening the workload of human animators, populating virtual environments with interactive agents, etc. Unfortunately, current behavioral animation techniques suffer from three key problems: (1) deliberative behavioral models (i.e., cognitive models) are slow to execute; (2) interactive virtual characters cannot adapt online in response to interaction with a human user; (3) programming behavioral models is a difficult and time-intensive process. This dissertation presents a collection of papers that seek to overcome each of these problems. Specifically, these issues are alleviated through novel machine learning schemes. Problem 1 is addressed by using fast regression techniques to quickly approximate a cognitive model. Problem 2 is addressed by a novel multi-level technique composed of custom machine learning methods that gather salient knowledge with which to guide decision making. Finally, Problem 3 is addressed through programming-by-demonstration, allowing a non-technical user to quickly and intuitively specify agent behavior.
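A minimal sketch of the idea behind the solution to Problem 1, approximating a slow deliberative model with fast regression: sample the expensive model offline, then answer runtime queries with a cheap regressor. The cognitive model and the k-nearest-neighbour regressor here are illustrative placeholders, not the dissertation's actual methods.

```python
import numpy as np

def cognitive_model(state):
    """Stand-in for an expensive deliberative planner (e.g., search over
    future states); this is the component that is too slow to run
    per-frame for every character."""
    return np.tanh(3.0 * state[0] - state[1])   # placeholder decision rule

# Offline: sample the slow model over the state space.
rng = np.random.default_rng(2)
states = rng.uniform(-1, 1, size=(500, 2))
actions = np.array([cognitive_model(s) for s in states])

def fast_policy(state, k=5):
    # Cheap runtime approximation: k-nearest-neighbour regression
    # over the precomputed samples (any fast regressor would do).
    d = np.linalg.norm(states - state, axis=1)
    return actions[np.argsort(d)[:k]].mean()

print(fast_policy(np.array([0.2, -0.4])))
```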
|
54 |
Cognitive and Behavioral Model Ensembles for Autonomous Virtual Characters
Whiting, Jeffrey S., 08 June 2007 (links) (PDF)
Cognitive and behavioral models have become popular methods for creating autonomous, self-animating characters. Creating these models presents the following challenges: (1) creating a cognitive or behavioral model is a time-intensive and complex process that must be done by an expert programmer; (2) the models are created to solve a specific problem in a given environment and, because of their specific nature, cannot easily be reused. Combining existing models would allow an animator, without the help of a programmer, to create new characters in less time, leverage each model's strengths to improve a character's performance, and create new behaviors and animations. This thesis provides a framework that aggregates existing behavioral and cognitive models into an ensemble. An animator only has to rate how appropriately a character performed, and through machine learning the system determines how the character should act in the current situation. Empirical results from multiple case studies validate the approach.
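The ensemble idea can be sketched as follows: several existing behavioral models each propose an action, the blend is governed by per-model weights, and animator ratings adjust those weights. This is a deliberately crude stand-in for the learning scheme in the thesis; the component behaviors and the update rule are assumptions made for the example.

```python
import numpy as np

def walk(state):   return np.array([1.0, 0.0])    # toy component behaviors
def flee(state):   return np.array([-1.0, 0.5])
def wander(state): return np.array([0.2, -0.3])

models = [walk, flee, wander]
weights = np.ones(len(models))          # one blending weight per model

def ensemble_action(state):
    # Blend the component models' proposals by their current weights.
    proposals = np.array([m(state) for m in models])
    return weights @ proposals / weights.sum()

def feedback(state, rating):
    # Animator rates the blended behavior in [0, 1]; models whose proposal
    # was close to the blend receive more credit (a crude stand-in for the
    # machine-learning credit assignment in the thesis).
    blend = ensemble_action(state)
    for i, m in enumerate(models):
        agreement = -np.linalg.norm(m(state) - blend)
        weights[i] *= np.exp((rating - 0.5) * np.exp(agreement))

state = None                            # toy behaviors ignore the state
print(ensemble_action(state))
feedback(state, rating=0.9)
print(weights)
```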
|
55 |
Towards Novelty-Resilient AI: Learning in the Open World
Trevor A Bonjour (18423153), 22 April 2024 (has links)
Current artificial intelligence (AI) systems are proficient at tasks in a closed-world setting where the rules are often rigid. However, in real-world applications, the environment is usually open and dynamic. In this work, we investigate the effects of such dynamic environments on AI systems and develop ways to mitigate those effects. Central to our exploration is the concept of novelties. Novelties encompass structural changes, unanticipated events, and environmental shifts that can confound traditional AI systems. We categorize novelties based on their representation, anticipation, and impact on agents, laying the groundwork for systematic detection and adaptation strategies. We explore novelties in the context of stochastic games. Decision-making in stochastic games exercises many of the same reasoning capabilities needed by AI agents acting in the real world. A multi-agent stochastic game allows for infinitely many ways to introduce novelty. We propose an extension of the deep reinforcement learning (DRL) paradigm to develop agents that can detect and adapt to novelties in these environments. To address the sample-efficiency challenge in DRL, we introduce a hybrid approach that combines fixed-policy methods with traditional DRL techniques, offering enhanced performance in complex decision-making tasks. We present a novel method for detecting anticipated novelties in multi-agent games, leveraging information theory to discern patterns indicative of collusion among players. Finally, we introduce DABLER, a pioneering deep reinforcement learning architecture that dynamically adapts to changing environmental conditions through broad learning approaches and environment recognition. Our findings underscore the importance of developing AI systems equipped to navigate the uncertainties of the open world, offering promising pathways for advancing AI research and application in real-world settings.
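The hybrid fixed-policy/DRL idea can be sketched as a simple dispatcher: infrequent, straightforward decisions go to hand-written rules, while frequent, complex ones go to a learned value function. The decision labels and the lookup table standing in for a DQN below are hypothetical, chosen only to make the routing pattern concrete.

```python
FIXED_DECISIONS = {"jail", "mortgage", "trade_accept"}   # hypothetical labels

def fixed_policy(decision, state):
    # Hand-written rules for infrequent, straightforward decisions.
    if decision == "jail":
        return "pay_bail" if state["cash"] > 200 else "roll"
    return "decline"

def learned_policy(state, q_table):
    # Stand-in for a DQN: greedy choice over a (state-key, action) value table.
    key = state["phase"]
    return max(q_table[key], key=q_table[key].get)

def act(decision, state, q_table):
    # Hybrid dispatch: route each decision type to the appropriate policy.
    if decision in FIXED_DECISIONS:
        return fixed_policy(decision, state)
    return learned_policy(state, q_table)

q_table = {"buy": {"buy_property": 1.2, "pass": 0.3}}
print(act("buy", {"cash": 500, "phase": "buy"}, q_table))    # learned branch
print(act("jail", {"cash": 500, "phase": "buy"}, q_table))   # fixed branch
```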
|
56 |
Un mécanisme constructiviste d'apprentissage automatique d'anticipations pour des agents artificiels situés / A Constructivist Anticipatory Learning Mechanism for Situated Artificial Agents
Studzinski Perotto, Filipo, 11 June 2010 (has links)
This research is characterized, first, by a theoretical discussion on the concept of an autonomous agent, based on elements taken from the Situated AI and the Affective AI paradigms. Secondly, this thesis presents the problem of learning world models, providing a bibliographic review of related work. From these discussions, the CAES architecture and the CALM mechanism are presented. CAES (Coupled Agent-Environment System) is an architecture for describing systems based on the agent-environment dichotomy. It defines the agent and the environment as two partially open systems in dynamic coupling. In CAES, the agent is composed of two sub-systems, mind and body, following the principles of situativity and intrinsic motivation. CALM (Constructivist Anticipatory Learning Mechanism) is a learning mechanism based on the constructivist approach to Artificial Intelligence. It allows a situated agent to build a model of the world in partially observable and partially deterministic environments, in the form of a Factored Partially Observable Markov Decision Process (FPOMDP). The constructed world model is then used by the agent to define an action policy aimed at improving its own performance.
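In the spirit of CALM, a toy constructivist learner can be sketched as inducing (context, action) → outcome schemas from experience, where low schema reliability hints at a hidden property of the environment. This is a strong simplification assuming discrete contexts and outcomes; the actual mechanism builds a factored POMDP with hidden state.

```python
from collections import defaultdict

class SchemaLearner:
    """Toy constructivist learner: induces (context, action) -> outcome
    schemas from experience, loosely in the spirit of CALM (the real
    mechanism constructs an FPOMDP with hidden properties)."""
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
    def observe(self, context, action, outcome):
        self.counts[(context, action)][outcome] += 1
    def predict(self, context, action):
        seen = self.counts[(context, action)]
        if not seen:
            return None                       # no schema constructed yet
        best = max(seen, key=seen.get)
        reliability = seen[best] / sum(seen.values())
        # Low reliability suggests a hidden (unobserved) property
        # that the agent should try to model.
        return best, reliability

learner = SchemaLearner()
learner.observe(context="wall_ahead", action="forward", outcome="bump")
learner.observe(context="wall_ahead", action="forward", outcome="bump")
print(learner.predict("wall_ahead", "forward"))   # ('bump', 1.0)
```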
|
57 |
Reconnaissance comportementale et suivi multi-cible dans des environnements partiellement observés / Behavioral Recognition and Multi-Target Tracking in Partially Observed Environments
Fansi Tchango, Arsène, 04 December 2015 (has links)
In this thesis, we are interested in the problem of behavioral tracking of pedestrians within a critical environment only partially under sensory coverage. While most works in the literature focus only on either the location of a pedestrian or the activity the pedestrian is undertaking, we adopt a general view and estimate both simultaneously. The contributions presented in this document are organized in two parts. The first part focuses on the representation and exploitation of the environmental context for the purpose of improving behavioral estimation. The state of the art shows few studies addressing this issue, and those rely on graphical models with limited expressiveness, such as dynamic Bayesian networks, for modeling prior environmental knowledge. We propose, instead, to rely on richer contextual models issued from autonomous agent-based behavioral simulators, and we demonstrate the effectiveness of our approach through extensive experimental evaluations. The second part of the thesis addresses the general problem of pedestrians' mutual influences, commonly known as targets' interactions, on their respective behaviors during the tracking process.
Under the assumption that a generic simulator (or a function) modeling the tracked targets' behaviors is available, we develop a scalable approach in which interactions are taken into account at low computational cost. The originality of the proposed approach resides in the introduction of density-based aggregated information, called "representatives", computed from each target's distribution in such a way as to guarantee behavioral diversity; the filtering system relies on these representatives to compute fine-grained behavioral estimates, even in case of occlusions. We present the modeling choices, the resulting algorithms, and a set of challenging scenarios on which the proposed approach is evaluated.
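The joint estimation of location and activity can be sketched as a particle filter in which each particle hypothesizes both quantities and prediction is delegated to a behavioral simulator. The simulator, the activity set, and the resampling scheme below are assumptions made for the example; the thesis's "representatives" are only hinted at by the comment on diversity-preserving resampling.

```python
import numpy as np

rng = np.random.default_rng(3)
ACTIVITIES = ["walking", "loitering"]            # hypothetical activity set

def behavior_simulator(pos, activity):
    # Stand-in for the agent-based behavioral simulator the thesis relies on.
    speed = 1.0 if activity == "walking" else 0.1
    return pos + speed + 0.2 * rng.standard_normal()

# Each particle jointly hypothesizes location AND activity.
N = 500
particles = [(rng.uniform(0, 5), rng.choice(ACTIVITIES)) for _ in range(N)]
weights = np.ones(N) / N

def update(observation, sigma=0.5):
    global particles, weights
    # Predict with the behavioral simulator, then weight by the observation.
    particles = [(behavior_simulator(p, a), a) for p, a in particles]
    pos = np.array([p for p, _ in particles])
    weights = np.exp(-0.5 * ((pos - observation) / sigma) ** 2)
    weights /= weights.sum()
    # Resample; a per-activity quota here would play the role of the
    # "representatives", preserving behavioral diversity through occlusions.
    idx = rng.choice(N, size=N, p=weights)
    particles = [particles[i] for i in idx]

update(observation=1.0)
acts = [a for _, a in particles]
print({a: acts.count(a) / N for a in ACTIVITIES})
```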
|
58 |
Um mecanismo construtivista para aprendizagem de antecipações em agentes artificiais situados / Un mécanisme constructiviste d'apprentissage automatique d'anticipations pour des agents artificiels situés / A Constructivist Anticipatory Learning Mechanism for Situated Artificial Agents
Perotto, Filipo Studzinski, January 2010 (has links)
This research is characterized, first, by a theoretical discussion on the concept of an autonomous agent, based on elements taken from the Situated AI and the Affective AI paradigms. Secondly, this thesis presents the problem of learning world models, providing a bibliographic review of related work. From these discussions, the CAES architecture and the CALM mechanism are presented. CAES (Coupled Agent-Environment System) is an architecture for describing systems based on the agent-environment dichotomy. It defines the agent and the environment as two partially open systems in dynamic coupling. The agent is composed of two sub-systems, mind and body, following the principles of situativity and intrinsic motivation. CALM (Constructivist Anticipatory Learning Mechanism) is a learning mechanism based on the constructivist approach to Artificial Intelligence. It allows a situated agent to build a model of the world in partially observable and partially deterministic environments, in the form of a Factored Partially Observable Markov Decision Process (FPOMDP). The constructed world model is then used by the agent to define an action policy aimed at improving its own performance.
|