131

Uma arquitetura de Agentes BDI para auto-regulação de Trocas Sociais em Sistemas Multiagentes Abertos / SELF-REGULATION OF PERSONALITY-BASED SOCIAL EXCHANGES IN OPEN MULTIAGENT SYSTEMS

Gonçalves, Luciano Vargas 31 March 2009 (has links)
The study and development of systems to control interactions in multiagent systems is an open problem in Artificial Intelligence. Piaget's system of social exchange values is a social approach that provides foundations for modeling interactions between agents: interactions are seen as service exchanges between pairs of agents, with an evaluation of the services performed or received, that is, of the investments and profits of the exchange, together with the credits and debits to be charged or received, respectively, in future exchanges. This evaluation may be performed in different ways by the agents, since they may have different exchange personality traits. Over the course of an exchange process, the different ways of evaluating profits and losses may cause disequilibrium in the exchange balances, with some agents accumulating profits and others accumulating losses. To solve this exchange equilibrium problem, we use Partially Observable Markov Decision Processes (POMDPs) to help each agent decide on the actions that can lead the social exchanges to equilibrium. Each agent has its own internal process for evaluating the current balance of its exchanges with the other agents: by observing its internal state and its partners' exchange behavior, it can deliberate on the best action to perform in order to bring the exchanges to equilibrium. In an open multiagent system, a mechanism is also needed to recognize the different personality traits, so that the POMDPs managing the exchanges between pairs of agents can be built. This recognition task is done by Hidden Markov Models (HMMs), which, starting from models of known personality traits, can approximate the personality traits of new partners simply by analyzing observations of their behavior in exchanges. The aim of this work is to develop a hybrid agent architecture for the self-regulation of social exchanges between personality-based agents in an open multiagent system. The architecture is based on the BDI (Beliefs, Desires, Intentions) architecture, with agent plans obtained from optimal policies of POMDPs, which model personality traits recognized by HMMs. To evaluate the proposed approach, simulations were run considering different (known and new) personality traits.
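As a rough sketch of the recognition step described above, the code below scores a sequence of observed exchange behaviors against candidate HMMs, one per known personality trait, using the standard scaled forward algorithm, and picks the closest model. The trait models, state counts, and behavior encoding are hypothetical illustrations, not taken from the thesis.

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | HMM with initial probs pi,
    transition matrix A, emission matrix B)."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # forward recursion
        c = alpha.sum()                 # rescale to avoid underflow
        log_lik += np.log(c)
        alpha /= c
    return log_lik

# Hypothetical trait models: 2 hidden states, 3 observable exchange
# behaviors (e.g. undervalue / fairly value / overvalue a received service).
traits = {
    "egoism": (np.array([0.7, 0.3]),
               np.array([[0.8, 0.2], [0.3, 0.7]]),
               np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])),
    "altruism": (np.array([0.4, 0.6]),
                 np.array([[0.6, 0.4], [0.2, 0.8]]),
                 np.array([[0.1, 0.3, 0.6], [0.3, 0.4, 0.3]])),
}

observed = [0, 0, 1, 0, 2, 0]  # behavior sequence of a new partner
best = max(traits, key=lambda t: forward_log_likelihood(observed, *traits[t]))
print("closest known trait:", best)
```

The same likelihood comparison would let an agent approximate a new partner's trait before instantiating the corresponding POMDP policy.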
132

Um mecanismo construtivista para aprendizagem de antecipações em agentes artificiais situados / Un mecanisme constructiviste d'apprentissage automatique d'anticipations pour des agents artificiels situes / A constructivist anticipatory learning mechanism for situated artificial agents

Perotto, Filipo Studzinski January 2010 (has links)
This research is characterized, first, by a theoretical discussion of the concept of autonomous agent, based on elements taken from the Situated AI and Affective AI paradigms. The thesis then presents the problem of learning world models, reviewing the related literature. Following these discussions, the CAES architecture and the CALM mechanism are presented. CAES (Coupled Agent-Environment System) is an architecture for describing systems based on the agent-environment dichotomy. It defines the agent and the environment as two partially open systems in dynamic coupling. The agent, in turn, is composed of two subsystems, mind and body, following the principles of situatedness and intrinsic motivation. CALM (Constructivist Anticipatory Learning Mechanism) is a learning mechanism grounded in the constructivist approach to Artificial Intelligence. It allows a situated agent to build a world model in partially observable and partially deterministic environments, in the form of a factored, partially observable Markov decision process (FPOMDP). The constructed world model is then used by the agent to define an action policy aimed at improving its own performance.
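To make the idea of anticipatory, constructivist learning concrete, here is a toy sketch, loosely in the spirit of mechanisms such as CALM but not the thesis's actual algorithm: the agent maintains context-action schemas that anticipate the next observation, and failed anticipations mark where the world model must be refined. The schema format and update rule are illustrative assumptions.

```python
from collections import defaultdict

class Schema:
    """context + action -> anticipated next observation (a toy stand-in,
    not the schema structure defined in the thesis)."""
    def __init__(self, context, action):
        self.context, self.action = context, action
        self.counts = defaultdict(int)  # outcomes seen so far, with frequencies

    def anticipate(self):
        # predict the most frequently observed outcome, if any
        return max(self.counts, key=self.counts.get) if self.counts else None

    def update(self, outcome):
        self.counts[outcome] += 1

class AnticipatoryLearner:
    def __init__(self):
        self.schemas = {}

    def step(self, context, action, outcome):
        """Record one experience; return True if the anticipation failed."""
        schema = self.schemas.setdefault((context, action),
                                         Schema(context, action))
        predicted = schema.anticipate()   # anticipate before seeing the outcome
        schema.update(outcome)
        return predicted is not None and predicted != outcome
```

In a mechanism like CALM, persistent anticipation failures of this kind would signal that the observable context is insufficient, motivating the hidden distinctions that give rise to the factored POMDP structure mentioned above.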
134

Online Learning and Simulation Based Algorithms for Stochastic Optimization

Lakshmanan, K January 2012 (has links) (PDF)
In many optimization problems, the relationship between the objective and the parameters is not known. The objective function itself may be stochastic, such as a long-run average over random cost samples. In such cases the gradient of the objective cannot be computed directly, and stochastic approximation algorithms, which work with gradient estimates, are used instead. Among gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and the Smoothed Functional (SF) scheme are widely used. In this thesis we propose a novel multi-timescale quasi-Newton based smoothed functional (QN-SF) algorithm for unconstrained as well as constrained optimization. The algorithm uses the smoothed functional scheme for estimating the gradient and the quasi-Newton method to solve the optimization problem, and is shown to converge with probability one. We also provide experimental results on the problem of optimal routing in a multi-stage network of queues. Policies like Join the Shortest Queue or Least Work Left assume knowledge of queue length values, which can change rapidly or be hard to estimate. If the only information available is the expected end-to-end delay, as in our case, such policies cannot be used. The QN-SF based probabilistic routing algorithm uses only the total end-to-end delay for tuning the routing probabilities. We observe from the experiments that the QN-SF algorithm performs better than the gradient and Jacobi versions of Newton-based smoothed functional algorithms. Next we consider constrained routing in a similar queueing network and extend the QN-SF algorithm to this case. We study the convergence behavior of the algorithm and observe that the constraints are satisfied at the point of convergence; we provide experimental results for the constrained routing setup as well. We then study reinforcement learning algorithms, which are useful for solving Markov Decision Processes (MDPs) when precise information on the transition probabilities is not available. When the state and action sets are very large, it is not possible to store all the state-action tuples, and function approximators such as neural networks are used instead. The popular Q-learning algorithm is known to diverge when used with linear function approximation due to the 'off-policy' problem, so developing learning algorithms that remain stable with function approximation is an important problem. We present in this thesis a variant of Q-learning with linear function approximation that is based on two-timescale stochastic approximation. The Q-value parameters for a given policy are updated on the slower timescale, while the policy parameters themselves are updated on the faster scale. We perform a gradient search in the space of policy parameters. Since the objective function, and hence the gradient, is not analytically known, we employ efficient one-simulation SPSA gradient estimates based on Hadamard-matrix deterministic perturbations. Our algorithm has the advantage that, unlike Q-learning, it does not suffer from high oscillations due to the off-policy problem when using function approximators. Whereas it is difficult to prove convergence of regular Q-learning with linear function approximation because of the off-policy problem, we prove that our algorithm, being on-policy, is convergent.

Numerical results on a multi-stage stochastic shortest path problem show that our algorithm exhibits significantly better performance and is more robust than Q-learning; future work would be to compare it with other policy-based reinforcement learning algorithms. Finally, we develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average-cost MDP framework, in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample-path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost-sure convergence of our algorithm to a locally optimal solution. We also provide results of numerical experiments on a problem of routing in a multi-stage queueing network with constraints on long-run average queue lengths, where the algorithm exhibits good performance and converges to a feasible point.
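For reference, the basic SPSA gradient estimate mentioned above perturbs all parameters simultaneously along a random Rademacher direction and needs only two noisy function evaluations per iteration. The quadratic objective and gain sequences below are placeholder choices for a minimal sketch, not the thesis's multi-timescale QN-SF algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_cost(theta):
    # placeholder stochastic objective: a quadratic plus observation noise
    return np.sum((theta - 1.0) ** 2) + rng.normal(scale=0.1)

def spsa_step(theta, a, c):
    """One SPSA iteration: two-sided gradient estimate along a random
    simultaneous perturbation, then a gradient-descent step."""
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # Rademacher direction
    g_hat = (noisy_cost(theta + c * delta)
             - noisy_cost(theta - c * delta)) / (2 * c * delta)
    return theta - a * g_hat

theta = np.zeros(4)
for k in range(1, 2001):
    # standard decaying gain sequences a_k ~ k^-0.602, c_k ~ k^-0.101
    theta = spsa_step(theta, a=0.1 / k ** 0.602, c=0.1 / k ** 0.101)
print(theta)  # should approach the minimizer (1, 1, 1, 1)
```

The thesis's one-simulation variant instead uses deterministic perturbations derived from Hadamard matrices rather than the random directions sketched here.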
136

Ant colony optimization and its application to adaptive routing in telecommunication networks

Di Caro, Gianni 10 November 2004 (has links)
In ant societies and, more generally, in insect societies, the activities of the individuals, as well as of the society as a whole, are not regulated by any explicit form of centralized control. On the other hand, adaptive and robust behaviors transcending the behavioral repertoire of the single individual can easily be observed at the society level. These complex global behaviors are the result of self-organizing dynamics driven by local interactions and communications among a number of relatively simple individuals.

The simultaneous presence of these and other fascinating and unique characteristics has made ant societies an attractive and inspiring model for building new algorithms and new multi-agent systems. In the last decade, ant societies have been taken as a reference for an ever-growing body of scientific work, mostly in the fields of robotics, operations research, and telecommunications.

Among the different works inspired by ant colonies, the Ant Colony Optimization (ACO) metaheuristic is probably the most successful and popular one. The ACO metaheuristic is a multi-agent framework for combinatorial optimization whose main components are a set of ant-like agents, the use of memory and of stochastic decisions, and strategies of collective and distributed learning. It finds its roots in the experimental observation of a specific foraging behavior of some ant colonies that, under appropriate conditions, are able to select the shortest among the few possible paths connecting their nest to a food site. The mediator of this behavior is the pheromone, a volatile chemical substance laid on the ground by the ants while walking which, in turn, affects their moving decisions according to its local intensity.

All the elements playing an essential role in the ant colony foraging behavior were understood, thoroughly reverse-engineered, and put to work to solve problems of combinatorial optimization by Marco Dorigo and his co-workers at the beginning of the 1990s. From that moment on, new combinatorial optimization algorithms designed after those first algorithms of Dorigo et al. have flourished, along with related scientific events.

In 1999 the ACO metaheuristic was defined by Dorigo, Di Caro and Gambardella with the purpose of providing a common framework for describing and analyzing all the algorithms inspired by the same ant colony behavior and by the same process of reverse-engineering it. The ACO metaheuristic was therefore defined a posteriori, as the result of a synthesis effort over the characteristics of all these ant-inspired algorithms and an abstraction of their common traits. The synthesis was also motivated by the usually good performance shown by the algorithms (e.g., for several important combinatorial problems such as quadratic assignment, vehicle routing, and job-shop scheduling, ACO implementations have outperformed state-of-the-art algorithms).

The definition and study of the ACO metaheuristic is one of the two fundamental goals of the thesis. The other, strictly related to the first, consists in the design, implementation, and testing of ACO instances for problems of adaptive routing in telecommunication networks.

This thesis is an in-depth journey through the ACO metaheuristic, during which we have (re)defined ACO and tried to gain a clear understanding of its potentialities, limits, and relationships with other frameworks and with its biological background. The thesis takes into account all the developments that have followed the original 1999 definition, and provides a formal and comprehensive systematization of the subject, as well as an up-to-date and fairly comprehensive review of current applications. We have also identified dynamic problems in telecommunication networks as the most appropriate domain of application for the ACO ideas; accordingly, in the most applied part of the thesis we focus on problems of adaptive routing in networks and develop and test four new algorithms.

Adopting an original point of view with respect to the way ACO was first defined (but maintaining full conceptual and terminological consistency), ACO is here defined and mainly discussed in terms of sequential decision processes and Monte Carlo sampling and learning. More precisely, ACO is characterized as a policy search strategy aimed at learning the distributed parameters (called pheromone variables, in accordance with the biological metaphor) of the stochastic decision policy used by so-called ant agents to generate solutions. Each ant represents, in practice, an independent sequential decision process aimed at constructing a possibly feasible solution for the optimization problem at hand, using only information local to the decision step. Ants are repeatedly and concurrently generated in order to sample the solution set according to the current policy. The outcomes of the generated solutions are used to partially evaluate the current policy, spot the most promising search areas, and update the policy parameters so as to focus the search on those areas while keeping a satisfactory level of overall exploration.

This way of looking at ACO has helped disclose the close relationships between ACO and other well-known frameworks, such as dynamic programming, Markov and non-Markov decision processes, and reinforcement learning. In turn, this has favored reasoning about the general properties of ACO in terms of the amount of complete state information used by the ants to make optimized decisions and to encode in pheromone variables a memory of both the decisions that belonged to the sampled solutions and their quality.

The biological context of inspiration is fully acknowledged in the thesis. We report, with extensive discussions, on the shortest-path behaviors of ant colonies and on the identification and analysis of the few nonlinear dynamics that are at the very core of self-organized behaviors in both ants and other societal organizations. We discuss these dynamics in the general framework of stigmergic modeling, based on asynchronous, environment-mediated communication protocols and on (pheromone) variables priming coordinated responses of a number of "cheap" and concurrent agents.

The second half of the thesis is devoted to the application of ACO to problems of online routing in telecommunication networks, the class of problems identified in the thesis as the most appropriate for the multi-agent, distributed, and adaptive nature of the ACO architecture. Four novel ACO algorithms for adaptive routing in telecommunication networks are thoroughly described. They cover a wide spectrum of network types: two of them deliver best-effort traffic in wired IP networks, one is intended for quality-of-service (QoS) traffic in ATM networks, and the fourth is for best-effort traffic in mobile ad hoc networks.

The two algorithms for wired IP networks have been extensively tested in simulation studies and compared to state-of-the-art algorithms over a wide set of reference scenarios. The algorithm for mobile ad hoc networks is still under development, but fairly extensive results and comparisons with a popular state-of-the-art algorithm are reported. No results are reported for the QoS algorithm, which has not been fully tested. The observed experimental performance is excellent, especially in the case of wired IP networks: our algorithms always perform comparably to, or much better than, the state-of-the-art competitors.

In the thesis we try to understand the rationale behind this performance and the popularity reached by our algorithms. More generally, we discuss the reasons for the efficacy of the ACO approach on network routing problems in comparison with more classical approaches. Moving further, we also informally define Ant Colony Routing (ACR), a multi-agent framework that explicitly integrates learning components into the ACO design in order to define a general and, in a sense, futuristic architecture for autonomic network control.

Most of the material of the thesis is a re-elaboration of material co-authored and published in a number of books, journal papers, conference proceedings, and technical reports; the detailed list of references is provided in the Introduction.
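The policy-search view of ACO described above can be made concrete in a few lines: ant agents sample paths with probabilities shaped by pheromone variables, and the pheromone is evaporated and then reinforced along the best sampled solution. The toy graph, parameter values, and update rule below are generic textbook choices, not the routing algorithms developed in the thesis.

```python
import random

# Toy ACO for a shortest nest-to-food path; instance and parameters
# are illustrative assumptions.
graph = {"nest": {"a": 2.0, "b": 1.0}, "a": {"food": 2.0},
         "b": {"c": 1.0}, "c": {"food": 1.0}}
tau = {(u, v): 1.0 for u in graph for v in graph[u]}  # pheromone variables

def build_path(alpha=1.0, beta=1.0):
    """One ant = one sequential decision process sampling the current policy."""
    node, path, cost = "nest", [], 0.0
    while node != "food":
        edges = list(graph[node].items())
        # decision weight: pheromone^alpha * (inverse edge cost)^beta
        weights = [tau[(node, v)] ** alpha * (1.0 / w) ** beta for v, w in edges]
        v, w = random.choices(edges, weights=weights)[0]
        path.append((node, v)); cost += w; node = v
    return path, cost

for _ in range(100):
    ants = [build_path() for _ in range(10)]          # sample solutions
    best_path, best_cost = min(ants, key=lambda s: s[1])
    for e in tau:                                     # evaporation
        tau[e] *= 0.9
    for e in best_path:                               # reinforce best solution
        tau[e] += 1.0 / best_cost

print(min((build_path() for _ in range(20)), key=lambda s: s[1]))
```

Even on this tiny instance the pheromone concentrates on the cheapest nest-to-food path, which is exactly the policy-parameter update loop described above.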
137

Comprendre et prévenir l’erreur récurrente dans les processus de décision stratégique : l’apport de la Behavioral Strategy / Understanding and preventing recurring errors in strategic decision processes : a Behavioral Strategy approach

Sibony, Olivier 14 December 2017 (has links)
Many types of strategic decisions result in recurring, systematic errors, and extant theories of organizations are insufficient to account for this phenomenon. Behavioral Strategy suggests that an explanation may be found in the psychology of decision makers, particularly in their cognitive biases. This, however, calls for a link between individual-level cognition and affect and organization-level choices. We propose Strategic Choice Routines as a middle level of analysis to bridge this gap, and identify three broad types of Strategic Choice Routines. This leads us to formulate hypotheses on how Strategic Choice Routines can be modified to minimize strategic errors. We illustrate these hypotheses through case studies, test some of them quantitatively, and analyze the preferences that drive their adoption by executives. Finally, we discuss theoretical and managerial implications.
138

Online Resource Allocation in Dynamic Optical Networks

Romero Reyes, Ronald 13 May 2019 (has links)
Conventional optical transport networks have leveraged the provisioning of high-speed connectivity in the form of long-term installed, constant bit-rate connections. The setup times of such connections are in the order of weeks, given that in most cases manual installation is required; once installed, connections remain active for months or years. The advent of grid computing and cloud-based services brings new connectivity requirements which cannot be met by the present-day optical transport network. This has raised awareness of the need for a changeover to dynamic optical networks that enable the provisioning of bandwidth on demand (BoD) in the optical domain. These networks will have to serve connections with different bit-rate requirements, with random interarrival times and durations, and with stringent setup latencies. Ongoing research has shown that grid computing and cloud-based services may in some cases request connections with holding times ranging from seconds to hours, and with setup latencies that must be in the order of milliseconds. To provide BoD, dynamic optical networks must perform connection setup, maintenance, and teardown without manual labour. For that, software-configurable networks are needed that are deployed with enough capacity to automatically establish connections. Network architectures have recently been proposed for that purpose that embrace flex-grid wavelength division multiplexing, reconfigurable optical add/drop multiplexers, and bandwidth-variable, tunable transponders as the main technology drivers. To exploit the benefits of these technologies, online resource allocation methods are necessary to ensure that, during network operation, the installed capacity is efficiently assigned to connections. As connections may arrive and depart randomly, the traffic matrix is unknown, and hence each connection request submitted to the network has to be processed independently. This implies that resource allocation must be tackled as an online optimization problem which, for each connection request and depending on the network state, decides whether the request is admitted or rejected; if admitted, a further decision is made on which resources are assigned to the connection. The decisions are calculated so that, in the long run, a desired performance objective is optimized. To achieve its goal, resource allocation implements control functions for routing and spectrum allocation (RSA), connection admission control (CAC), and grade-of-service (GoS) control.

In this dissertation we tackle the problem of online resource allocation in dynamic optical networks. The theory of Markov decision processes (MDPs) is applied to formulate resource allocation as an online optimization problem. An MDP-based formulation has two relevant advantages. First, the problem can be solved to optimize an arbitrarily defined performance objective (e.g., minimization of blocking probability or maximization of economic revenue). Secondly, it can provide GoS control for groups of connections with different statistical properties.

To solve the optimization problem, a fast, adaptive, and state-dependent online algorithm is proposed to calculate a resource allocation policy. The calculation is performed recursively during network operation and uses algorithms for RSA and CAC. The resulting policy is a course of action that instructs the network how to process each connection request. Furthermore, an implementation of the method is proposed that uses a 3-way handshake protocol for connection setup, and an analytical performance evaluation model is derived to estimate the connection setup latency. Our study is complemented by an evaluation of the capital expenditures of dynamic optical networks, identifying the main cost drivers. The performance of the methods proposed in this thesis, including the accuracy of the analytical evaluation of the connection setup latency, was evaluated by simulations. The contributions of the thesis provide a novel approach that meets the requirements envisioned for resource allocation in dynamic optical networks.
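As a toy illustration of the MDP-based formulation (not the dissertation's model), the sketch below computes a state-dependent admission policy for two connection classes sharing one link, via value iteration on a uniformized, discounted MDP; the capacity, rates, and rewards are made-up numbers.

```python
# Toy MDP for connection admission control on one link; all parameters
# below are illustrative assumptions.
C = 10                      # link capacity in spectrum slots
b = {1: 1, 2: 3}            # slots required per connection class
lam = {1: 0.6, 2: 0.3}      # arrival rates
mu = {1: 1.0, 2: 0.5}       # per-connection departure rates
r = {1: 1.0, 2: 5.0}        # reward for admitting a connection
gamma = 0.99

states = [(n1, n2) for n1 in range(C + 1) for n2 in range(C + 1)
          if n1 * b[1] + n2 * b[2] <= C]
Lam = sum(lam.values()) + C * max(mu.values())  # uniformization constant

V = {s: 0.0 for s in states}
for _ in range(500):
    V_new = {}
    for (n1, n2) in states:
        n = {1: n1, 2: n2}
        q = 0.0
        for k in (1, 2):
            # arrival of class k: choose the better of reject (stay) / admit
            fits = (n1 + (k == 1)) * b[1] + (n2 + (k == 2)) * b[2] <= C
            s_adm = (n1 + (k == 1), n2 + (k == 2))
            admit = r[k] + V[s_adm] if fits else -float("inf")
            q += lam[k] * max(V[(n1, n2)], admit)
            # departure of one class-k connection
            s_dep = (n1 - (k == 1), n2 - (k == 2))
            q += n[k] * mu[k] * V[s_dep] if n[k] > 0 else 0.0
        # fictitious self-transition completes the uniformized step
        rest = Lam - sum(lam.values()) - n[1] * mu[1] - n[2] * mu[2]
        q += rest * V[(n1, n2)]
        V_new[(n1, n2)] = gamma * q / Lam
    V = V_new

def admit(state, k):
    """Extract the CAC decision for a class-k request in a given state."""
    n1, n2 = state
    fits = (n1 + (k == 1)) * b[1] + (n2 + (k == 2)) * b[2] <= C
    return fits and r[k] + V[(n1 + (k == 1), n2 + (k == 2))] > V[state]

print(admit((8, 0), 1), admit((8, 0), 2))
```

A policy of this kind can reject a low-reward request even when capacity is available, reserving spectrum for more valuable connections; this is the sort of GoS control that the MDP formulation makes possible.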
