Global ETD Search

51	B-Spline Based Multitarget Tracking Sithiravel, Rajiv January 2014 Multitarget tracking in the presence of false alarms is a difficult problem. The objective of multitarget tracking is to estimate the number of targets and their states recursively from the available observations. At any given time, targets can be born, die, or spawn from existing targets. Sensors detect these targets using a defined threshold, and the observations are normally corrupted by false alarms. In addition, targets with a low signal-to-noise ratio (SNR) may not be detected. Random Finite Set (RFS) filters can solve such multitarget problems efficiently. In particular, one of the best and most widely used RFS-based filters is the Probability Hypothesis Density (PHD) filter. The PHD filter approximates the posterior probability density function (PDF) by its first-order moment only, under the assumption that the target SNR is relatively high. The PHD filter supports target death, birth, spawning and missed detection through well-known implementations, including the Sequential Monte Carlo Probability Hypothesis Density (SMC-PHD) and Gaussian Mixture Probability Hypothesis Density (GM-PHD) methods. The SMC-PHD filter suffers from the well-known degeneracy problem, while the GM-PHD filter may not be suitable for nonlinear and non-Gaussian target tracking problems. It is desirable to have a filter that can provide continuous estimates for any distribution. This is the motivation for the use of B-Splines in this thesis. One of the main focuses of the thesis is the B-Spline based PHD (SPHD) filter. Spline theory is well developed and has been used in academia and industry for more than five decades. The B-Spline can represent any numerical, geometrical and statistical functions and models, including the PDF and PHD. The SPHD filter can be applied to linear, nonlinear, Gaussian and non-Gaussian multitarget tracking applications. 
The continuity of the SPHD can be maintained by selecting splines of order three or higher, which avoids the degeneracy-related problem. Another important characteristic of the SPHD filter is that the SPHD can be controlled locally, which allows the SPHD to be manipulated and gives it a natural ability to handle nonlinear problems. The SPHD filter can be further extended to support maneuvering multitarget tracking, where it can serve as an alternative to any available PHD filter implementation. The PHD filter does not work well for very low observable (VLO) target tracking problems, where the target SNR is normally very low. For very low SNR scenarios the PDF must be approximated by higher-order moments. Therefore the PHD implementations may not be suitable for the problem considered in this thesis. One of the best estimators for VLO target tracking problems is the Maximum-Likelihood Probabilistic Data Association (ML-PDA) algorithm. The standard ML-PDA algorithm is widely used in single-target initialization or geolocation problems with high false alarm rates. The B-Spline is also used in an ML-PDA implementation (SML-PDA). The SML-PDA algorithm can determine the global maximum of the ML-PDA log-likelihood ratio with high efficiency in terms of state estimates and with low computational complexity. For fast passive track initialization and for search and rescue operations, the SML-PDA algorithm can be used more efficiently than the standard ML-PDA algorithm. With an extension, the SML-PDA algorithm also supports multitarget tracking. / Thesis / Doctor of Philosophy (PhD)
52	Learning in Partially Observable Markov Decision Processes Sachan, Mohit 21 August 2013 Indiana University-Purdue University Indianapolis (IUPUI) / Learning in Partially Observable Markov Decision Processes (POMDPs) is motivated by the need to address a number of realistic problems. A number of methods exist for learning in POMDPs, but learning with only a limited amount of information about the POMDP model remains a highly sought-after capability. Learning with minimal information is desirable in complex systems, because methods that require complete information to be shared among decision makers become impractical as the problem dimensionality grows. In this thesis we address the problem of decentralized control of POMDPs with unknown transition probabilities and rewards. We propose learning in POMDPs using a tree-based approach. The states of the POMDP are guessed using this tree. Each node in the tree contains an automaton and acts as a decentralized decision maker for the POMDP. The start state of the POMDP is known as the landmark state. Each automaton in the tree uses a simple learning scheme to update its action choice and requires minimal information. The principal result is that, without knowledge of the transition probabilities and rewards, the automata tree of decision makers converges to a set of actions that maximizes the long-term expected reward per unit time obtained by the system. The analysis is based on learning in sequential stochastic games and on properties of ergodic Markov chains. Simulation results are presented to compare the long-term rewards of the system under different decision control algorithms. Learning in POMDP Learning automata tree POMDP Computer programming Data structures (Computer science) Stochastic systems -- Research Game theory -- Mathematical models Sequences (Mathematics) Markov processes Decision making -- Simulation methods User interfaces (Computer systems)
53	Scheduling in Wireless Networks with Limited and Imperfect Channel Knowledge Ouyang, Wenzhuo 18 August 2014 No description available. Electrical Engineering Computer Science Engineering Operations Research
54	Opportunistic Scheduling Using Channel Memory in Markov-modeled Wireless Networks Murugesan, Sugumar 26 October 2010 No description available. Computer Science Electrical Engineering Engineering wireless communication downlink network broadcast opportunistic scheduling imperfect channel state information channel memory Markov channel ARQ feedback restless multi-armed bandit processes
55	Uma arquitetura de Agentes BDI para auto-regulação de Trocas Sociais em Sistemas Multiagentes Abertos / Self-Regulation of Personality-Based Social Exchanges in Open Multiagent Systems Gonçalves, Luciano Vargas 31 March 2009 The study and development of systems to control interactions in multiagent systems is an open problem in Artificial Intelligence. Piaget's system of social exchange values is a social approach that provides a foundation for modeling interactions between agents. Interactions are seen as exchanges of services between pairs of agents, with an evaluation of the services performed or received, that is, the investments and profits in the exchange, and of the credits and debits to be charged or received, respectively, in future exchanges. This evaluation may be performed in different ways by the agents, since they may have different exchange personality traits. Over the course of an exchange process, the different ways of evaluating profits and losses may cause disequilibrium in the exchange balances, with some agents accumulating profits and others accumulating losses. To solve the exchange equilibrium problem, we use Partially Observable Markov Decision Processes (POMDPs) to help agents decide on actions that can lead to the equilibrium of the social exchanges. Each agent has its own internal process for evaluating its current balance of the results of the exchange process with the other agents. By observing its internal state and its partner's exchange behavior, it is able to deliberate on the best action to perform in order to reach the equilibrium of the exchanges. 
In an open multiagent system, a mechanism is needed to recognize the different personality traits, so that POMDPs can be built to manage the exchanges between pairs of agents. This recognition task is performed by Hidden Markov Models (HMMs), which, starting from models of known personality traits, can approximate the personality traits of new partners just by analyzing observations of the agents' behavior in exchanges. The aim of this work is to develop a hybrid agent architecture for the self-regulation of social exchanges between personality-based agents in an open multiagent system. The proposed architecture is based on the BDI (Beliefs, Desires, Intentions) architecture, where the agent plans are obtained from optimal policies of POMDPs, which model personality traits that are recognized by HMMs. To evaluate the proposed approach, simulations were carried out with both known and new personality traits. social exchange values self-regulation of social exchanges personality-based multiagent systems BDI Architecture Hidden Markov Models
56	On Cooperative Surveillance, Online Trajectory Planning and Observer Based Control Anisi, David A. January 2009 The main body of this thesis consists of six appended papers. In the first two, different cooperative surveillance problems are considered. The second two consider different aspects of the trajectory planning problem, while the last two deal with observer design for mobile robotic and Euler-Lagrange systems respectively. In Papers A and B, a combinatorial optimization based framework for cooperative surveillance missions using multiple Unmanned Ground Vehicles (UGVs) is proposed. In particular, Paper A considers the Minimum Time UGV Surveillance Problem (MTUSP) while Paper B treats the Connectivity Constrained UGV Surveillance Problem (CUSP). The minimum time formulation is the following. Given a set of surveillance UGVs and a polyhedral area, find waypoint-paths for all UGVs such that every point of the area is visible from a point on a waypoint-path and such that the time for executing the search in parallel is minimized. The connectivity constrained formulation extends the MTUSP by additionally requiring the induced information graph to be kept recurrently connected at the time instants when the UGVs perform the surveillance mission. In these two papers, the NP-hardness of both problems is shown, and decomposition techniques are proposed that allow an approximate solution to be found efficiently in an algorithmic manner. Paper C addresses the problem of designing a real-time, high-performance trajectory planner for an aerial vehicle that uses information about terrain and enemy threats to fly low and avoid radar exposure on the way to a given target. The high-level framework augments Receding Horizon Control (RHC) with a graph-based terminal cost that captures the global characteristics of the environment. 
An important issue with RHC is to make sure that the greedy, short term optimization does not lead to long term problems, which in our case boils down to two things: not getting into situations where a collision is unavoidable, and making sure that the destination is actually reached. Hence, the main contribution of this paper is to present a trajectory planner with provable safety and task completion properties. Direct methods for trajectory optimization are traditionally based on a priori temporal discretization and collocation methods. In Paper D, the problem of adaptive node distribution is formulated as a constrained optimization problem, which is to be included in the underlying nonlinear mathematical programming problem. The benefits of utilizing the suggested method for online trajectory optimization are illustrated by a missile guidance example. In Paper E, the problem of active observer design for an important class of non-uniformly observable systems, namely mobile robotic systems, is considered. The set of feasible configurations and the set of output flow equivalent states are defined. It is shown that the inter-relation between these two sets may serve as the basis for design of active observers. The proposed observer design methodology is illustrated by considering a unicycle robot model, equipped with a set of range-measuring sensors. Finally, in Paper F, a geometrically intrinsic observer for Euler-Lagrange systems is defined and analyzed. This observer is a generalization of the observer proposed by Aghannan and Rouchon. Their contractivity result is reproduced and complemented by a proof that the region of contraction is infinitely thin. Moreover, assuming a priori bounds on the velocities, convergence of the observer is shown by means of Lyapunov's direct method in the case of configuration manifolds with constant curvature. 
/ QC 20100622 / TAIS, AURES Surveillance Missions Minimum-Time Surveillance Unmanned Ground Vehicles Connectivity Constraints Combinatorial Optimization Computational Optimal Control Receding Horizon Control Mission Uncertainty Safety Task Completion Adaptive Grid Methods Missile Guidance Nonlinear Observer Design Active Observers Non-uniformly Observable Systems Mobile Robotic Systems Intrinsic Observers Differential Geometric Methods Euler-Lagrange Systems Contraction Analysis. Optimization, systems theory Applied mathematics
57	Online trajectory planning and observer based control Anisi, David A. January 2006 The main body of this thesis consists of four appended papers. The first two consider different aspects of the trajectory planning problem, while the last two deal with observer design for mobile robotic and Euler-Lagrange systems respectively. The first paper addresses the problem of designing a real-time, high-performance trajectory planner for aerial vehicles. The main contribution is two-fold. Firstly, by augmenting a novel safety maneuver at the end of the planned trajectory, this paper extends previous results by having provable safety properties in a 3D setting. Secondly, assuming initial feasibility, the planning method is shown to have finite time task completion. Moreover, in the second part of the paper, the problem of simultaneous arrival of multiple aerial vehicles is considered. By using a time-scale separation principle, one is able to adapt standard Laplacian control to this consensus problem, which is neither unconstrained nor first order. Direct methods for trajectory optimization are traditionally based on a priori temporal discretization and collocation methods. In the second paper, the problem of adaptive node distribution is formulated as a constrained optimization problem, which is to be included in the underlying nonlinear mathematical programming problem. The benefits of utilizing the suggested method for online trajectory optimization are illustrated by a missile guidance example. In the third paper, the problem of active observer design for an important class of non-uniformly observable systems, namely mobile robotic systems, is considered. The set of feasible configurations and the set of output flow equivalent states are defined. It is shown that the inter-relation between these two sets may serve as the basis for design of active observers. 
The proposed observer design methodology is illustrated by considering a unicycle robot model, equipped with a set of range-measuring sensors. Finally, in the fourth paper, a geometrically intrinsic observer for Euler-Lagrange systems is defined and analyzed. This observer is a generalization of the observer recently proposed by Aghannan and Rouchon. Their contractivity result is reproduced and complemented by a proof that the region of contraction is infinitely thin. However, assuming a priori bounds on the velocities, convergence of the observer is shown by means of Lyapunov's direct method in the case of configuration manifolds with constant curvature. Computational Optimal Control Receding Horizon Control Mission Uncertainty Safety Task Completion Consensus Problem Simultaneous Arrival Adaptive Grid Methods Missile Guidance Nonlinear Observer Design Active Observers Non-uniformly Observable Systems Mobile Robotic Systems Intrinsic Observers Differential Geometric Methods Euler-Lagrange Systems Contraction Analysis. Optimization, systems theory
58	Strojové učení ve strategických hrách / Machine Learning in Strategic Games Vlček, Michael January 2018 Machine learning is spearheading progress in artificial intelligence when it comes to competing against human opponents in strategy games, be it chess, Go or poker. The area of machine learning that shows the most promising results in playing strategy games is reinforcement learning. The next milestone for current research is the computer game StarCraft II, which surpasses the previous games in complexity and represents a potential new breakthrough in this field. The paper analyzes the problem and proposes a solution that combines the reinforcement learning algorithm A2C with the hyperparameter optimization method PBT, which could be a step forward from the current state of the art.
59	Utilisation des communications Device-to-Device pour améliorer l'efficacité des réseaux cellulaires / Use of Device-to-Device communications for efficient cellular networks Ibrahim, Rita 04 February 2019 This thesis considers direct communications between mobile devices, known as Device-to-Device (D2D) communications, as a promising technique for enhancing future cellular networks. This technology allows two mobile terminals to communicate directly without passing through the base station. Modeling, evaluating and optimizing D2D features are the fundamental goals of this thesis and are mainly achieved using the following mathematical tools: queuing theory, Lyapunov optimization and Partially Observable Markov Decision Processes (POMDPs). The findings of this study are presented in three parts. In the first part, we investigate a scheme for selecting between cellular mode and D2D mode. We derive the queuing stability regions of two scenarios: pure cellular networks and D2D-enabled cellular networks. Comparing the two scenarios leads to a D2D versus cellular mode selection algorithm that improves the capacity of the network. In the second part, we develop a D2D resource allocation algorithm. D2D users are able to estimate their local Channel State Information (CSI), whereas the base station needs signaling exchanges to acquire this information. Based on the D2D users' knowledge of their local CSI, we propose an energy-efficient resource allocation framework and show that distributed scheduling outperforms the centralized version. 
In the distributed approach, collisions may occur between the CSI reports of different users; we therefore propose a collision reduction algorithm. Moreover, we describe in detail how both the centralized and distributed algorithms can be implemented in an LTE-like cellular network. In the third part, we propose a mobile relay selection policy for a D2D relay-aided network. Relay mobility is one of the main challenges facing any relay selection strategy. The problem is formulated as a constrained POMDP that captures the dynamics of the relays and aims to find the relay selection policy that optimizes the performance of the cellular network under cost constraints. Device-to-Device (D2D) communications Cellular Networks Mode selection Resource Allocation Relay selection Queuing theory Lyapunov optimization
60	Online trajectory planning and observer based control Anisi, David A. January 2006 The main body of this thesis consists of four appended papers. The first two consider different aspects of the trajectory planning problem, while the last two deal with observer design for mobile robotic and Euler-Lagrange systems respectively. The first paper addresses the problem of designing a real-time, high-performance trajectory planner for aerial vehicles. The main contribution is two-fold. Firstly, by augmenting a novel safety maneuver at the end of the planned trajectory, this paper extends previous results by having provable safety properties in a 3D setting. Secondly, assuming initial feasibility, the planning method is shown to have finite time task completion. Moreover, in the second part of the paper, the problem of simultaneous arrival of multiple aerial vehicles is considered. By using a time-scale separation principle, one is able to adapt standard Laplacian control to this consensus problem, which is neither unconstrained nor first order. Direct methods for trajectory optimization are traditionally based on a priori temporal discretization and collocation methods. In the second paper, the problem of adaptive node distribution is formulated as a constrained optimization problem, which is to be included in the underlying nonlinear mathematical programming problem. The benefits of utilizing the suggested method for online trajectory optimization are illustrated by a missile guidance example. In the third paper, the problem of active observer design for an important class of non-uniformly observable systems, namely mobile robotic systems, is considered. The set of feasible configurations and the set of output flow equivalent states are defined. It is shown that the inter-relation between these two sets may serve as the basis for design of active observers. 
The proposed observer design methodology is illustrated by considering a unicycle robot model, equipped with a set of range-measuring sensors. Finally, in the fourth paper, a geometrically intrinsic observer for Euler-Lagrange systems is defined and analyzed. This observer is a generalization of the observer recently proposed by Aghannan and Rouchon. Their contractivity result is reproduced and complemented by a proof that the region of contraction is infinitely thin. However, assuming a priori bounds on the velocities, convergence of the observer is shown by means of Lyapunov's direct method in the case of configuration manifolds with constant curvature. / QC 20101108 Computational Optimal Control Receding Horizon Control Mission Uncertainty Safety Task Completion Consensus Problem Simultaneous Arrival Adaptive Grid Methods Missile Guidance Nonlinear Observer Design Active Observers Non-uniformly Observable Systems Mobile Robotic Systems Intrinsic Observers Differential Geometric Methods Euler-Lagrange Systems Contraction Analysis. Computational Mathematics
