  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
101

Capturing continuous human movement on a linear network with mobile phone towers / Skattning av kontinuerlig mänsklig rörelse på ett linjärt nätverk med hjälp av mobiltelefon-master

Dejby, Jesper January 2017 (has links)
Anonymous Call Detail Records (CDRs) from mobile phone towers provide a unique opportunity to aggregate individual location data into overall human mobility patterns. Flowminder uses this data to improve the welfare of low- and middle-income countries. The movement patterns are studied through key measurements of mobility. This thesis evaluates the estimates of key measurements obtained with mobile phone towers by simulating continuous human movement on a linear network. The simulation uses an agent-based approach. Spatial point processes are used to distribute continuous start points of the agents on the linear network. Each start point is then equipped with a mark, a path whose end point depends on the start point. The path from an agent's start point to its end point is modeled with a Markov Decision Process. The simulated human movement can then be captured with different mobile phone tower distributions realized from spatial point processes. The thesis initially considers homogeneous Poisson and Simple Sequential Inhibition (SSI) processes on a plane and then introduces local clusters (heterogeneity) with Matérn Cluster and SSI processes. The goal of the thesis is to investigate the effects of changes in mobile phone tower distribution and call frequency on the estimates of key measurements of mobility. The results suggest that a decrease in the total number of towers generally worsens the estimates and that introducing local clusters also has a negative effect; the effects of call frequency are less clear and invite more detailed study. The presented methodology provides a new and flexible way to model continuous human movement along a linear network.
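The capture step described above, assigning simulated positions to towers drawn from a spatial point process, can be sketched as follows for the homogeneous Poisson case. This is an illustrative reconstruction, not the thesis's code; the function names (`simulate_towers`, `nearest_tower`) and all parameter values are assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_towers(intensity, width, height, rng):
    """Homogeneous Poisson process: the number of towers is
    Poisson(intensity * area) and positions are uniform on the window."""
    n = rng.poisson(intensity * width * height)
    return rng.uniform([0.0, 0.0], [width, height], size=(n, 2))

def nearest_tower(points, towers):
    """Record each call event at its closest tower (Voronoi capture)."""
    d = np.linalg.norm(points[:, None, :] - towers[None, :, :], axis=2)
    return d.argmin(axis=1)

towers = simulate_towers(intensity=0.01, width=100.0, height=100.0, rng=rng)
calls = rng.uniform([0.0, 0.0], [100.0, 100.0], size=(500, 2))
tower_ids = nearest_tower(calls, towers)
```

Swapping `simulate_towers` for an inhibition or Matérn cluster sampler changes only the tower layout; the capture step stays the same.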
102

Learning dynamics and decision paradigms in social-ecological dilemmas

Barfuss, Wolfram 10 July 2019 (has links)
Collective action is required to enter sustainable development pathways in coupled social-ecological systems, safely away from dangerous tipping elements. Without denying the usefulness of other model design principles, this thesis proposes the agent-environment interface as the mathematical foundation for the design of social-ecological system models. First, this work refines techniques from the statistical-physics literature on learning dynamics to derive a deterministic limit of established reinforcement learning algorithms from artificial intelligence research. Illustrations of the resulting learning dynamics reveal a wide range of dynamical regimes, such as fixed points, periodic orbits, and deterministic chaos. Second, the derived multi-state learning equations are applied to a newly introduced environment, the Ecological Public Good. It models a coupled social-ecological dilemma, extending established repeated social dilemma games by an ecological tipping element. Known theoretical and empirical results are reproduced, and novel, qualitatively different parameter regimes are discovered, including one in which these reward-optimizing agents prefer to collectively suffer in environmental collapse rather than cooperate in a prosperous environment. Third, this thesis challenges the reward-optimizing paradigm of the learning equations. It presents a novel formal comparison of the three decision paradigms of economic optimization, sustainability, and safety for the governance of an environmental tipping element. It is shown that no paradigm guarantees fulfilling the requirements imposed by another paradigm. Further, the absence of a master paradigm is shown to be of special relevance for governing the climate system, since the latter may reside at the edge between parameter regimes where economic welfare optimization becomes neither sustainable nor safe.
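The ecological tipping element in the Ecological Public Good can be illustrated with a deliberately simplified, deterministic sketch. The thesis's actual game is richer and stochastic; the function `epg_step`, the threshold, and all payoff values here are illustrative assumptions:

```python
def epg_step(state, n_coop, n_agents, threshold=0.5,
             cost=1.0, benefit=2.0, collapse=-3.0):
    """One round of a simplified Ecological Public Good: the environment
    is prosperous in the next round only while the fraction of cooperators
    reaches a tipping threshold; in the degraded state everyone receives
    the collapse payoff, and cooperating still pays its cost."""
    frac = n_coop / n_agents
    next_state = "prosperous" if frac >= threshold else "degraded"
    if state == "degraded":
        coop_payoff = collapse - cost
        defect_payoff = collapse
    else:
        shared = benefit * n_coop / n_agents  # public good shared by all
        coop_payoff = shared - cost
        defect_payoff = shared
    return next_state, coop_payoff, defect_payoff

# full cooperation keeps the environment prosperous; within any prosperous
# round, a hypothetical defector would still earn more by exactly the cost
state, coop, defect = epg_step("prosperous", n_coop=4, n_agents=4)
```

The sketch captures the dilemma's core tension: defection dominates every single round, yet universal defection tips the environment into collapse, where everyone is worse off.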
103

Learning in Partially Observable Markov Decision Processes

Sachan, Mohit 21 August 2013 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / Learning in Partially Observable Markov Decision Processes (POMDPs) is motivated by the essential need to address a number of realistic problems. A number of methods exist for learning in POMDPs, but learning with a limited amount of information about the model remains an open challenge. Learning with minimal information is desirable in complex systems, since methods requiring complete information among decision makers become impractical as problem dimensionality grows. In this thesis we address the problem of decentralized control of POMDPs with unknown transition probabilities and rewards. We suggest learning in POMDPs using a tree-based approach. States of the POMDP are estimated using this tree. Each node in the tree contains an automaton and acts as a decentralized decision maker for the POMDP. The start state of the POMDP is known as the landmark state. Each automaton in the tree uses a simple learning scheme to update its action choice and requires minimal information. The principal result derived is that, without knowledge of the transition probabilities and rewards, the automata tree of decision makers converges to a set of actions that maximizes the long-term expected reward per unit time obtained by the system. The analysis is based on learning in sequential stochastic games and properties of ergodic Markov chains. Simulation results are presented to compare the long-term rewards of the system under different decision control algorithms.
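The abstract does not pin down the automaton's "simple learning scheme". A standard choice for learning automata of this kind is the linear reward-inaction (L_R-I) update, sketched below; the class name, learning rate, and the toy two-armed environment are illustrative assumptions, not details from the thesis:

```python
import random

class LinearRewardInaction:
    """L_R-I automaton: on a reward, shift probability mass toward the
    chosen action; on a penalty, leave the probabilities unchanged."""
    def __init__(self, n_actions, lr=0.1, rng=None):
        self.p = [1.0 / n_actions] * n_actions
        self.lr = lr
        self.rng = rng or random.Random(0)

    def choose(self):
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, rewarded):
        if rewarded:  # "inaction" on penalty: only rewards move the probabilities
            for a in range(len(self.p)):
                if a == action:
                    self.p[a] += self.lr * (1.0 - self.p[a])
                else:
                    self.p[a] -= self.lr * self.p[a]

# toy environment: action 0 is rewarded with probability 0.9, action 1 with 0.1
auto = LinearRewardInaction(2, lr=0.05)
for _ in range(2000):
    a = auto.choose()
    rewarded = auto.rng.random() < (0.9 if a == 0 else 0.1)
    auto.update(a, rewarded)
```

Because the update is a convex combination, the action probabilities always remain a valid distribution, which is the property the convergence analysis of such schemes relies on.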
104

Dynamic Routing for Fuel Optimization in Autonomous Vehicles

Regatti, Jayanth Reddy 14 August 2018 (has links)
No description available.
105

Opportunistic Scheduling Using Channel Memory in Markov-modeled Wireless Networks

Murugesan, Sugumar 26 October 2010 (has links)
No description available.
106

An empirical study of stability and variance reduction in Deep Reinforcement Learning

Lindström, Alexander January 2024 (has links)
Reinforcement Learning (RL) is a branch of AI concerned with solving complex sequential decision-making problems, such as training robots, trading on patterns and trends, and the optimal control of industrial processes. These applications span various fields, including data science, manufacturing, and finance [1]. The most popular RL algorithm today is Deep Q-Learning (DQL), developed by a team at DeepMind, which successfully combines RL with neural networks (NNs). However, combining RL and NNs introduces challenges such as numerical instability and unstable learning due to high variance. Among other causes, these issues stem from the "moving target problem", and the target network was introduced to mitigate it. Using a target network, however, slows down learning, vastly increases memory requirements, and adds overhead in running the code. In this thesis, we conduct an empirical study to investigate the importance of target networks across three scenarios. In the first scenario, we train agents in online learning. The aim is to demonstrate that the target network can be removed after some point in time without negatively affecting performance; to evaluate this, we introduce the concept of the stabilization point. In the second scenario, we pre-train agents before continuing to train them in online learning. Here we demonstrate the redundancy of the target network by showing that it can be completely omitted. In the third scenario, we evaluate a newly developed activation function called the Truncated Gaussian Error Linear Unit (TGeLU). We train an agent in online learning and show that, with TGeLU as the activation function, the target network can be removed entirely. Through this empirical study, we conjecture and verify that a target network has only transient benefits concerning stability. We show that it has no influence on the quality of the policy found. We also observed that variance was generally higher when using a target network in the later stages of training than in cases where the target network had been removed. Additionally, in the second scenario, we observed that the number of pre-training iterations affected the agent's performance in the online learning phase. This thesis provides a deeper understanding of how the target network affects the training process of DQL; some of the findings, particularly those surrounding variance reduction, run contrary to popular belief. The results also suggest future work: further exploring the benefits of the lower variance observed when the target network is removed, and conducting more efficient convergence analyses for the pre-training part of the second scenario.
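The target-network mechanism under study can be sketched in tabular form, with Q-tables standing in for the actual neural networks. The sync period, learning rate, and toy MDP below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
gamma, lr, sync_every = 0.9, 0.1, 100

# tabular stand-ins for the online and target Q-networks
q_online = np.zeros((n_states, n_actions))
q_target = q_online.copy()

def td_update(s, a, r, s_next):
    """One DQL-style update: bootstrap from the frozen target table
    rather than the constantly moving online table."""
    td_error = r + gamma * q_target[s_next].max() - q_online[s, a]
    q_online[s, a] += lr * td_error

for step in range(1000):
    s = int(rng.integers(n_states))
    a = int(rng.integers(n_actions))
    r = 1.0 if (s == n_states - 1 and a == 0) else 0.0  # toy reward signal
    s_next = int(rng.integers(n_states))
    td_update(s, a, r, s_next)
    if step % sync_every == 0:
        q_target[:] = q_online  # periodic hard sync of the target
```

Removing the sync and bootstrapping directly from `q_online` corresponds to the no-target-network variant the thesis investigates.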
107

Un mécanisme constructiviste d'apprentissage automatique d'anticipations pour des agents artificiels situés / A Constructivist Anticipatory Learning Mechanism for Situated Artificial Agents

Studzinski Perotto, Filipo 11 June 2010 (has links)
This research is characterized, first, by a theoretical discussion of the concept of an autonomous agent, based on elements taken from the Situated AI and Affective AI paradigms. Second, the thesis presents the problem of learning world models, providing a bibliographic review of related work. From these discussions, the CAES architecture and the CALM mechanism are presented. CAES (Coupled Agent-Environment System) is an architecture for describing systems based on the agent-environment dichotomy. It defines the agent and the environment as two partially open systems in dynamic coupling. In CAES, the agent is composed of two subsystems, mind and body, following the principles of situativity and intrinsic motivation. CALM (Constructivist Anticipatory Learning Mechanism) is a learning mechanism based on the constructivist approach to Artificial Intelligence. It allows a situated agent to build a model of the world in partially observable and partially deterministic environments, in the form of a Factored Partially Observable Markov Decision Process (FPOMDP). The constructed world model is then used by the agent to define a policy of actions in order to improve its own performance.
108

Uma arquitetura de Agentes BDI para auto-regulação de Trocas Sociais em Sistemas Multiagentes Abertos / Self-Regulation of Personality-Based Social Exchanges in Open Multiagent Systems

Gonçalves, Luciano Vargas 31 March 2009 (has links)
The study and development of systems to control interactions in multiagent systems is an open problem in Artificial Intelligence. Piaget's system of social exchange values is a social approach that provides a foundation for modeling interactions between agents, where interactions are seen as service exchanges between pairs of agents, with the evaluation of the services performed and received (that is, the investments and profits in the exchange), and of the credits and debits to be charged or received, respectively, in future exchanges. This evaluation may be performed in different ways by the agents, since they may have different exchange personality traits. In an exchange process over time, these different ways of evaluating profits and losses may cause disequilibrium in the exchange balances, where some agents accumulate profits and others accumulate losses. To solve the exchange-equilibrium problem, Partially Observable Markov Decision Processes (POMDPs) are used to help each agent decide on actions that can lead to the equilibrium of the social exchanges. Each agent thus has its own internal process to evaluate the current balance of its exchanges with the other agents by observing its internal state, and, with observations of its partner's exchange behavior, it is able to deliberate on the best action to perform in order to reach the equilibrium of the exchanges. In an open multiagent system, a mechanism is needed to recognize the different personality traits, so that the POMDPs managing the exchanges between pairs of agents can be built. This recognition task is done with Hidden Markov Models (HMMs), which, from models of known personality traits, can approximate the personality traits of new partners simply by analyzing observations of the agents' behavior in exchanges. The aim of this work is to develop a hybrid agent architecture for the self-regulation of social exchanges between personality-based agents in an open multiagent system, based on the BDI (Beliefs, Desires, Intentions) architecture, where the agent plans are obtained from optimal policies of POMDPs, which model personality traits recognized by HMMs. To evaluate the proposed approach, simulations were run considering different (known and new) personality traits.
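The HMM-based trait recognition step can be sketched with the forward algorithm: the agent scores an observed exchange sequence under each candidate trait model and keeps the best-scoring trait. The two trait models below ("altruist" and "egoist") and all probabilities are illustrative assumptions, not values from the dissertation:

```python
import numpy as np

def forward_loglik(obs, start, trans, emit):
    """Log-likelihood of an observation sequence under an HMM via the
    scaled forward algorithm (scaling avoids numerical underflow)."""
    alpha = start * emit[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
        loglik += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return loglik

start = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
# observation 0 = a "generous" exchange move, 1 = a "selfish" move
emit_altruist = np.array([[0.8, 0.2],
                          [0.6, 0.4]])
emit_egoist = np.array([[0.2, 0.8],
                        [0.4, 0.6]])

obs = [0, 0, 1, 0, 0]  # observed exchange behaviour, mostly generous
ll_alt = forward_loglik(obs, start, trans, emit_altruist)
ll_ego = forward_loglik(obs, start, trans, emit_egoist)
best = "altruist" if ll_alt > ll_ego else "egoist"
```

In the architecture described above, the recognized trait would then select which POMDP policy governs the agent's exchanges with that partner.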
109

Um mecanismo construtivista para aprendizagem de antecipações em agentes artificiais situados / Un mecanisme constructiviste d'apprentissage automatique d'anticipations pour des agents artificiels situes / A constructivist anticipatory learning mechanism for situated artificial agents

Perotto, Filipo Studzinski January 2010 (has links)
This research is characterized, first, by a theoretical discussion of the concept of an autonomous agent, based on elements taken from the Situated AI and Affective AI paradigms. Second, the thesis presents the problem of learning world models, providing a bibliographic review of related work. From these discussions, the CAES architecture and the CALM mechanism are presented. CAES (Coupled Agent-Environment System) is an architecture for describing systems based on the agent-environment dichotomy. It defines the agent and the environment as two partially open systems in dynamic coupling. The agent, in turn, is composed of two subsystems, mind and body, following the principles of situativity and intrinsic motivation. CALM (Constructivist Anticipatory Learning Mechanism) is a learning mechanism based on the constructivist approach to Artificial Intelligence. It allows a situated agent to build a model of the world in partially observable and partially deterministic environments, in the form of a Factored Partially Observable Markov Decision Process (FPOMDP). The constructed world model is then used by the agent to define a policy of actions in order to improve its own performance.
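The setting CALM targets, a partially observable and partially deterministic environment coupled to an agent, can be illustrated with a minimal interaction loop in the spirit of CAES. Everything here (class names, dynamics, the choice to make the toy fully deterministic for simplicity) is an illustrative assumption:

```python
class ToyEnv:
    """Toy environment: action 1 reliably toggles the hidden state
    (the deterministic part), and the agent observes the state only
    when it equals 0 (the partially observable part)."""
    def __init__(self):
        self.state = 1  # start in the hidden, unrewarded state

    def step(self, action):
        if action == 1:
            self.state = 1 - self.state
        obs = 0 if self.state == 0 else None  # None = state not observable
        reward = 1 if self.state == 0 else 0
        return obs, reward

class ToyAgent:
    """Minimal reactive 'mind': toggle whenever the state is hidden."""
    def act(self, obs):
        return 1 if obs is None else 0

env, agent = ToyEnv(), ToyAgent()
obs, total = None, 0  # nothing has been observed before the first step
for _ in range(10):
    action = agent.act(obs)
    obs, reward = env.step(action)
    total += reward
```

A mechanism like CALM would go beyond this reactive rule by learning a factored model of the hidden dynamics from such observation streams and planning against it.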
