Global ETD Search

61	Localização multirrobo cooperativa com planejamento / Planning for multi-robot localization Pinheiro, Paulo Gurgel, 1983- 11 September 2018 (has links) Orientador: Jacques Wainer / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-09-11T21:14:07Z (GMT). No. of bitstreams: 1 Pinheiro_PauloGurgel_M.pdf: 1259816 bytes, checksum: a4783df9aa3755becb68ee233ad43e3c (MD5) Previous issue date: 2009 / Resumo: Em um problema de localização multirrobô cooperativa, um grupo de robôs encontra-se em um determinado ambiente, cuja localização exata de cada um dos robôs é desconhecida. Neste cenário, uma distribuição de probabilidades aponta as chances de um robô estar em um determinado estado. É necessário então, que os robôs se movimentem pelo ambiente e gerem novas observações que serão compartilhadas, para calcular novas estimativas. Nos últimos anos, muitos trabalhos têm focado no estudo de técnicas probabilísticas, modelos de comunicação e modelos de detecções, para resolver o problema de localização. No entanto, a movimentação dos robôs é, em geral, definida por ações aleatórias. Ações aleatórias geram observações que podem ser inúteis para a melhoria da estimativa. Este trabalho apresenta uma proposta de localização com suporte a planejamento de ações. O objetivo é apresentar um modelo cujas ações realizadas pelos robôs são definidas por políticas. Escolhendo a melhor ação a ser realizada, é possível receber informações mais úteis dos sensores internos e externos e estimar as posturas mais rapidamente. O modelo proposto, denominado Modelo de Localização Planejada - MLP, utiliza POMDPs para modelar os problemas de localização e algoritmos específicos de geração de políticas. Foi utilizada a localização de Markov como técnica probabilística de localização e implementadas versões de modelos de detecção e propagação de informação. 
Neste trabalho, um simulador de problemas de localização multirrobô foi desenvolvido, no qual foram realizados experimentos em que o modelo proposto foi comparado a um modelo que não faz uso de planejamento de ações. Os resultados obtidos apontam que o modelo proposto é capaz de estimar as posturas dos robôs com uma menor quantidade de passos, sendo significativamente mais eficiente do que o modelo comparado sem planejamento. / Abstract: In a cooperative multi-robot localization problem, a group of robots is placed in an environment where the exact location of each robot is unknown. In this scenario, there is only a probability distribution indicating the chance that a robot is in a particular state. The robots must therefore move through the environment and generate new observations, which are shared to compute new estimates. In recent years, many studies have focused on probabilistic techniques, communication models and detection models to solve the localization problem. However, the movement of the robots is generally defined by random actions, and random actions generate observations that can be useless for improving the estimate. This work presents a proposal for multi-robot localization with support for action planning. The objective is to present a model in which the actions performed by the robots are defined by policies. By choosing the best action to perform, the robot gets more useful information from its internal and external sensors and estimates its pose more quickly. The proposed model, called Model of Planned Localization (MPL), uses POMDPs to model the localization problems and specific algorithms to generate policies. Markov localization was used as the probabilistic localization technique, and versions of detection models and an information propagation model were implemented. In this work, a simulator for multi-robot localization problems was developed, in which experiments were performed.
The proposed model was compared to a model that does not use action planning. The results showed that the proposed model is able to estimate the poses of the robots in fewer steps, being significantly more efficient than the compared model without planning. / Mestrado / Inteligencia Artificial / Mestre em Ciência da Computação Markov, Localização de Robótica Markov, Processos de Localização multirrobô Planejamento Markov localization Robotic Multi-robot localization Planning
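As an illustration of the Markov localization technique named in record 61, the following is a minimal sketch of a discrete Bayes filter for a single robot on a cyclic one-dimensional corridor. The corridor layout, the probabilities `p_move` and `p_hit`, and the function names are assumptions chosen for the example; they are not taken from the dissertation, and the multi-robot detection and planning components are not modeled.

```python
def predict(belief, step, p_move=0.8):
    """Motion update on a cyclic corridor.

    With probability p_move the robot advances `step` cells;
    otherwise it stays where it is (hypothetical motion model).
    """
    n = len(belief)
    return [p_move * belief[(i - step) % n] + (1 - p_move) * belief[i]
            for i in range(n)]


def correct(belief, world, observation, p_hit=0.9):
    """Measurement update: weight each cell by the sensor likelihood, then normalize."""
    weighted = [b * (p_hit if cell == observation else 1 - p_hit)
                for b, cell in zip(belief, world)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical map: what the robot's sensor would see in each cell.
world = ["door", "wall", "wall", "door", "wall"]

# Start from a uniform belief: the pose is completely unknown.
belief = [1 / len(world)] * len(world)

belief = correct(belief, world, "door")   # robot senses a door
belief = predict(belief, step=1)          # robot moves one cell forward
belief = correct(belief, world, "wall")   # robot senses a wall
```

After these three updates the belief concentrates on the cells that lie one step past a door, which is the kind of estimate a planner could try to sharpen by choosing informative actions instead of random ones.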
62	Estratégias para otimização do algoritmo de Iteração de Valor Sensível a Risco / Strategies for optimization of Risk Sensitive Value Iteration algorithm Borges, Igor Oliveira 11 October 2018 (has links) Processos de decisão markovianos sensíveis a risco (Risk Sensitive Markov Decision Process - RS-MDP) permitem modelar atitudes de aversão e propensão ao risco no processo de tomada de decisão usando um fator de risco para representar a atitude ao risco. Para esse modelo, existem operadores que são baseados em funções de transformação linear por partes que incluem fator de risco e fator de desconto. Nesta dissertação são formulados dois algoritmos de Iteração de Valor Sensível a Risco baseados em um desses operadores, esses algoritmos são chamados de Iteração de Valor Sensível a Risco Síncrono (Risk Sensitive Value Iteration - RSVI) e Iteração de Valor Sensível a Risco Assíncrono (Asynchronous Risk Sensitive Value Iteration - A-RSVI). Também são propostas duas heurísticas que podem ser utilizadas para inicializar os valores dos algoritmos de forma a torná-los mais eficientes. Os resultados dos experimentos no domínio de Travessia do Rio em dois cenários de recompensas distintos mostram que: (i) o custo de processamento de políticas extremas a risco, tanto de aversão quanto de propensão, é elevado; (ii) um desconto elevado aumenta o tempo de convergência do algoritmo e reforça a sensibilidade ao risco adotada; (iii) políticas com valores para o fator de risco intermediários possuem custo computacional baixo e já possuem certa sensibilidade ao risco dependendo do fator de desconto utilizado; e (iv) o algoritmo A-RSVI com a heurística baseada no fator de risco pode reduzir o tempo para o algoritmo convergir, especialmente para valores extremos do fator de risco. / Risk Sensitive Markov Decision Processes (RS-MDPs) allow modeling risk-averse and risk-prone attitudes in the decision-making process, using a risk factor to represent the risk attitude.
For this model, there are operators based on a piecewise linear transformation function that includes a risk factor and a discount factor. In this dissertation we formulate two Risk Sensitive Value Iteration algorithms based on one of these operators: Synchronous Risk Sensitive Value Iteration (RSVI) and Asynchronous Risk Sensitive Value Iteration (A-RSVI). We also propose two heuristics that can be used to initialize the values of the RSVI or A-RSVI algorithms in order to make them more efficient. The results of experiments with the River Crossing domain in two distinct reward scenarios show that: (i) the processing cost of extreme risk policies, both risk-averse and risk-prone, is high; (ii) a high discount value increases the convergence time and reinforces the chosen risk attitude; (iii) policies with intermediate risk factor values have a low computational cost and already show a certain sensitivity to risk, depending on the discount factor used; and (iv) the A-RSVI algorithm with the heuristic based on the risk factor can decrease the convergence time of the algorithm, especially for extreme values of the risk factor. Planejamento Estocástico Política Sensível a Risco Risk Sensitive Markov Decision Process Risk Sensitive Policy Stochastic Planning
63	A Partially Observable Markov Decision Process for Breast Cancer Screening Hudson, Joshua January 2019 (has links) In the US, breast cancer is one of the most common forms of cancer and the most lethal. There are many decisions that must be made by the doctor and/or the patient when dealing with a potential breast cancer. Many of these decisions are made under uncertainty, whether it is the uncertainty related to the progression of the patient's health, or that related to the accuracy of the doctor's tests. Each possible action under consideration can have positive effects, such as a surgery successfully removing a tumour, and negative effects: a post-surgery infection for example. The human mind simply cannot take into account all the variables involved and possible outcomes when making these decisions. In this report, a detailed Partially Observable Markov Decision Process (POMDP) for breast cancer screening decisions is presented. It includes 151 states, covering 144 different cancer states, and 2 competing screening methods. The necessary parameters were first set up using relevant medical literature and a patient history simulator. Then the POMDP was solved optimally for an infinite horizon, using the Perseus algorithm. The resulting policy provided several recommendations for breast cancer screening. The results indicated that clinical breast examinations are important for screening younger women. Regarding the decision to operate on a woman with breast cancer, the policy showed that invasive cancers with either a tumour size above 1.5 cm or which are in metastasis, should be surgically removed as soon as possible. However, the policy also recommended that patients who are certain to be healthy should have a breast biopsy. The cause of this error was explored further and the conclusion was reached that a finite horizon may be more appropriate for this application. 
POMDP Markov Decision Process Breast Cancer Screening Operations Research Probability Theory and Statistics Sannolikhetsteori och statistik Computer Sciences Datavetenskap (datalogi)
64	A Reinforcement Learning Approach To Obtain Treatment Strategies In Sequential Medical Decision Problems Poolla, Radhika 14 August 2003 (has links) Medical decision problems are extremely complex owing to their dynamic nature, large number of variable factors, and the associated uncertainty. Decision support technology entered the medical field long after other areas such as the airline industry and the manufacturing industry. Yet, it is rapidly becoming an indispensable tool in medical decision making problems including the class of sequential decision problems. In these problems, physicians decide on a treatment plan that optimizes a benefit measure such as the treatment cost, and the quality of life of the patient. The last decade saw the emergence of many decision support applications in medicine. However, the existing models have limited applications to decision problems with very few states and actions. An urgent need is being felt by the medical research community to expand the applications to more complex dynamic problems with large state and action spaces. This thesis proposes a methodology which models the class of sequential medical decision problems as a Markov decision process, and solves the model using a simulation based reinforcement learning (RL) algorithm. Such a methodology is capable of obtaining near optimal treatment strategies for problems with large state and action spaces. This methodology overcomes, to a large extent, the computational complexity of the value-iteration and policy-iteration algorithms of dynamic programming. An average reward reinforcement-learning algorithm is developed. The algorithm is applied on a sample problem of treating hereditary spherocytosis. The application demonstrates the ability of the proposed methodology to obtain effective treatment strategies for sequential medical decision problems. 
dynamic decision model markov decision process hereditary spherocytosis intervention quality adjusted life years average reward American Studies Arts and Humanities
65	Scaling Up Reinforcement Learning without Sacrificing Optimality by Constraining Exploration Mann, Timothy 1984- 14 March 2013 (has links) The purpose of this dissertation is to understand how algorithms can efficiently learn to solve new tasks based on previous experience, instead of being explicitly programmed with a solution for each task that we want them to solve. Here a task is a series of decisions, such as a robot vacuum deciding which room to clean next or an intelligent car deciding to stop at a traffic light. In such a case, state-of-the-art learning algorithms are difficult to employ in practice because they often make thousands of mistakes before reliably solving a task. However, humans learn solutions to novel tasks, often making fewer mistakes, which suggests that efficient learning algorithms may exist. One advantage that humans have over state-of-the-art learning algorithms is that, while learning a new task, humans can apply knowledge gained from previously solved tasks. The central hypothesis investigated by this dissertation is that learning algorithms can solve new tasks more efficiently when they take into consideration knowledge learned from solving previous tasks. Although this hypothesis may appear to be obviously true, what knowledge to use and how to apply that knowledge to new tasks is a challenging, open research problem. I investigate this hypothesis in three ways. First, I developed a new learning algorithm that is able to use prior knowledge to constrain the exploration space. Second, I extended a powerful theoretical framework in machine learning, called Probably Approximately Correct, so that I can formally compare the efficiency of algorithms that solve only a single task to algorithms that consider knowledge from previously solved tasks.
With this framework, I found sufficient conditions for using knowledge from previous tasks to improve the efficiency of learning to solve new tasks and also identified conditions where transferring knowledge may impede learning. I present situations where transfer learning can be used to intelligently constrain the exploration space so that optimality loss can be minimized. Finally, I tested the efficiency of my algorithms in various experimental domains. These theoretical and empirical results provide support for my central hypothesis. The theory and experiments of this dissertation provide a deeper understanding of what makes a learning algorithm efficient so that it can be widely used in practice. Finally, these results also contribute to the general goal of creating autonomous machines that can be reliably employed to solve complex tasks. pruning scaling multiarmed bandit Markov decision process exploration/exploitation dilemma exploration machine learning transfer learning reinforcement learning
66	Delay-aware Scheduling in Wireless Coding Networks: To Wait or Not to Wait Ramasamy, Solairaja 2010 December 1900 (has links) Wireless technology has become an increasingly popular way to gain network access. Wireless networks are expected to provide efficient and reliable service and support a broad range of emerging applications, such as multimedia streaming and video conferencing. However, limited wireless spectrum together with interference and fading pose significant challenges for network designers. The novel technique of network coding has significant potential for improving the throughput and reliability of wireless networks by taking advantage of the broadcast nature of the wireless medium. Reverse carpooling is one of the main techniques used to realize the benefits of network coding in wireless networks. With reverse carpooling, two flows travel in opposite directions, sharing a common path. The network coding is performed in the intermediate (relay) nodes, which saves up to 50% of transmissions. In this thesis, we focus on the scheduling at the relay nodes in wireless networks with reverse carpooling. When two packets traveling in opposite directions are available at the relay node, the relay node combines them and broadcasts the resulting packet. This event is referred to as a coding opportunity. When only one packet is available, the relay node needs to decide whether to wait for future coding opportunities or to transmit it without coding. Though the choice of holding packets exploits the positive aspects of network coding, without a proper policy in place that controls how long the packets should wait, it will have an adverse impact on delays and thus the overall network performance. Accordingly, our goal is to find an optimal control strategy that delicately balances the tradeoff between the number of transmissions and delays incurred by the packets.
We also address the fundamental question of what local information we should keep track of and use in deciding whether to transmit an uncoded packet or wait for the next coding opportunity. The available information consists of queue length and time stamps indicating the arrival time of packets in the queue. We could also store the history of all previous states and actions. However, using all this information makes the control very complex, so we try to determine whether the overhead of collecting waiting times and historical information is worth it. A major contribution of this thesis is a stochastic control framework that uses state information based on what can be observed and prescribes an optimal action. For that, we formulate and solve a stochastic dynamic program with the objective of minimizing the long-run average cost per unit time incurred due to transmissions and delays. Subsequently, we show that a stationary policy based on queue lengths is optimal, and the optimal policy is of threshold type. Then, we describe a non-linear optimization procedure to obtain the optimal thresholds. Further, we substantiate our analytical findings by performing numerical experiments under varied settings. We compare systems that use only queue length with those where more information is available, and we show that optimal control that uses only the queue length is as good as any optimal control that relies on knowing the entire history. wireless networks network coding reverse carpooling markov decision process distributed controller relay node scheduling tradeoff queueing delay
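The threshold-type policy on queue lengths described in record 66 can be illustrated with a small simulation. This is only a sketch under assumed dynamics: Bernoulli arrivals in each direction, one relay decision per time slot, a unit cost per transmission and a linear holding cost per waiting packet. The function names, parameter values and cost model are hypothetical and are not taken from the thesis.

```python
import random


def relay_action(q_fwd, q_rev, threshold):
    """Threshold rule on queue lengths (hypothetical form of the policy)."""
    if q_fwd > 0 and q_rev > 0:
        return "code"            # coding opportunity: combine and broadcast
    if max(q_fwd, q_rev) > threshold:
        return "send_uncoded"    # waited long enough, transmit without coding
    return "wait"                # hold packets and hope for a coding opportunity


def average_cost(threshold, p_fwd=0.5, p_rev=0.5, tx_cost=1.0,
                 hold_cost=0.1, slots=20000, seed=0):
    """Estimate the long-run average cost per slot of a given threshold."""
    rng = random.Random(seed)
    q_fwd = q_rev = 0
    total = 0.0
    for _ in range(slots):
        # Assumed arrival model: at most one packet per direction per slot.
        if rng.random() < p_fwd:
            q_fwd += 1
        if rng.random() < p_rev:
            q_rev += 1
        action = relay_action(q_fwd, q_rev, threshold)
        if action == "code":
            q_fwd -= 1
            q_rev -= 1
            total += tx_cost     # one broadcast serves both packets
        elif action == "send_uncoded":
            if q_fwd > 0:
                q_fwd -= 1
            else:
                q_rev -= 1
            total += tx_cost
        # Delay cost for every packet still waiting at the relay.
        total += hold_cost * (q_fwd + q_rev)
    return total / slots


# Compare a few candidate thresholds under the assumed cost model.
costs = {t: average_cost(t) for t in range(0, 6)}
best_threshold = min(costs, key=costs.get)
```

A brute-force sweep like this only stands in for the non-linear optimization procedure mentioned in the abstract; it shows how the transmission cost and the delay cost pull the threshold in opposite directions.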
67	Lifetime Condition Prediction For Bridges Bayrak, Hakan 01 October 2011 (has links) (PDF) Infrastructure systems are crucial facilities. They supply the necessary transportation, water and energy utilities for the public. However, as they age, these systems gradually deteriorate and approach the end of their lifespans. As a result, they require periodic maintenance and repair in order to function and be reliable throughout their lifetimes. Bridge infrastructure is an essential part of the transportation infrastructure. Bridge management systems (BMSs), used to monitor the condition and safety of the bridges in a bridge infrastructure, have evolved considerably in the past decades. The aim of BMSs is to use the resources in an optimal manner while keeping the bridges out of risk of failure. The BMSs use lifetime performance curves to predict the future condition of the bridge elements or bridges. The most widely implemented condition-based performance prediction and maintenance optimization model is the Markov Decision Process (MDP)-based model. The importance of the Markov Decision Process-based model is that it defines the time-variant deterioration using the Markov Transition Probability Matrix and performs the lifetime cost optimization by finding the optimum maintenance policy. In this study, the Markov decision process-based model is examined and a computer program to find the optimal policy with discounted life-cycle cost is developed. The other performance prediction model investigated in this study is a probabilistic Bi-linear model which takes into account the uncertainties in the deterioration process and the application of maintenance actions by the use of random variables.
As part of the study, in order to further analyze and develop the Bi-linear model, a Latin Hypercube Sampling-based (LHS) simulation program is also developed and integrated into the main computational algorithm which can produce condition, safety, and life-cycle cost profiles for bridge members with and without maintenance actions. Furthermore, a polynomial-based condition prediction is also examined as an alternative performance prediction model. This model is obtained from condition rating data by applying regression analysis. Regression-based performance curves are regenerated using the Latin Hypercube sampling method. Finally, the results from the Markov chain-based performance prediction are compared with Simulation-based Bi-linear prediction and the derivation of the transition probability matrix from simulated regression based condition profile is introduced as a newly developed approach. It has been observed that the results obtained from the Markov chain-based average condition rating profiles match well with those obtained from Simulation-based mean condition rating profiles. The result suggests that the Simulation-based condition prediction model may be considered as a potential model in future BMSs. TG Bridge Engineering. 1-470
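To make the role of the Markov transition probability matrix in record 67 concrete, the sketch below propagates a condition-state distribution year by year and reports the expected condition rating. The four-state matrix, the rating scale and the function names are illustrative assumptions, not values from the thesis, and no maintenance actions are modeled.

```python
def step(dist, P):
    """Advance the state distribution one period: dist' = dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def expected_rating_profile(P, ratings, years):
    """Expected condition rating for years 0..years, starting in the best state."""
    dist = [1.0] + [0.0] * (len(P) - 1)
    profile = []
    for _ in range(years + 1):
        profile.append(sum(d * r for d, r in zip(dist, ratings)))
        dist = step(dist, P)
    return profile


# Hypothetical yearly transition matrix without maintenance: an element
# either keeps its condition state or drops to the next worse one.
P = [
    [0.90, 0.10, 0.00, 0.00],
    [0.00, 0.85, 0.15, 0.00],
    [0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 1.00],
]
ratings = [4, 3, 2, 1]  # assumed scale: 4 = best condition, 1 = worst

profile = expected_rating_profile(P, ratings, years=50)
```

The resulting list is an average condition rating profile of the kind the abstract compares against simulation-based mean profiles; adding maintenance actions would turn the single matrix into one matrix per action, which is where the decision process comes in.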
68	Extensions of Multistage Stochastic Optimization with Applications in Energy and Healthcare Kuznia, Ludwig Charlemagne 01 January 2012 (has links) This dissertation focuses on extending solution methods in the area of stochastic optimization. Attention is focused on three specific problems in the field. First, a solution method for mixed integer programs subject to chance constraints is discussed. This class of problems serves as an effective modeling framework for a wide variety of applied problems. Unfortunately, chance constrained mixed integer programs tend to be very challenging to solve. Thus, the aim of this work is to address some of these challenges by exploiting the structure of the deterministic reformulation for the problem. Second, a stochastic program for integrating renewable energy sources into traditional energy systems is developed. As the global push for higher utilization of such green resources increases, such models will prove invaluable to energy system designers. Finally, a process for transforming clinical medical data into a model to assist decision making during the treatment planning phase for palliative chemotherapy is outlined. This work will likely provide decision support tools for oncologists. Moreover, given the new requirements for the usage of electronic medical records, such techniques will have applicability to other treatment planning applications in the future. Benders' decomposition chemotherapy Markov decision process probabilistic programming random processes renewable energy systems American Studies Arts and Humanities Operational Research
69	A MARKOV DECISION PROCESS EMBEDDED WITH PREDICTIVE MODELING: A MODELING APPROACH FROM SYSTEM DYNAMICS MATHEMATICAL MODELS, AGENT-BASED MODELS TO A CLINICAL DECISION MAKING Shi, Zhenzhen January 1900 (has links) Doctor of Philosophy / Department of Industrial & Manufacturing Systems Engineering / David H. Ben-Arieh / Chih-Hang Wu / Patients who suffer from sepsis or septic shock are of great concern in the healthcare system. Recent data indicate that more than 900,000 severe sepsis or septic shock cases developed in the United States with mortality rates between 20% and 80%. In the United States alone, almost $17 billion is spent each year for the treatment of patients with sepsis. Clinical trials of treatments for sepsis have been extensively studied in the last 30 years, but there is no general agreement on the effectiveness of the proposed treatments for sepsis. Therefore, it is necessary to find accurate and effective tools that can help physicians predict the progression of disease in a patient-specific way, and then provide physicians with treatment recommendations that lower the risk of patients dying from sepsis. The goal of this research is to develop a risk assessment tool and a risk management tool for sepsis. In order to achieve this goal, two system dynamic mathematical models (SDMMs) are initially developed to predict dynamic patterns of sepsis progression in innate immunity and adaptive immunity. The two SDMMs are able to identify key indicators and key processes of inflammatory responses to an infection and of sepsis progression. Second, an integrated-mathematical-multi-agent-based model (IMMABM) is developed to capture the stochastic nature embedded in the development of inflammatory responses to sepsis. Unlike existing agent-based models, this agent-based model is enhanced by incorporating the developed SDMMs and extensive experimental data.
Building on the risk assessment tools, a Markov decision process (MDP) is proposed as a risk management tool for clinical decision making on sepsis. Supported by extensive computational studies, the major contributions of this research are, first, risk assessment tools that identify the risk of sepsis developing while the immune system responds to an infection and, second, a decision-making framework to manage the risk of infected individuals dying from sepsis. The methodology and modeling framework used in this dissertation can be extended to other diseases and treatment applications, and can have a broad impact on research areas related to computational modeling, biology, medical decision making, and industrial engineering. Health Care Management (0769) Immunology (0982) Industrial Engineering (0546)
70	Combinatorial optimization and Markov decision process for planning MRI examinations Geng, Na 29 April 2010 (has links) (PDF) This research is motivated by our collaborations with a large French university teaching hospital in order to reduce the Length of Stay (LoS) of stroke patients treated in the neurovascular department. Quick diagnosis is critical for stroke patients but relies on expensive and heavily used imaging facilities such as MRI (Magnetic Resonance Imaging) scanners. Therefore, it is very important for the neurovascular department to reduce the patient LoS by reducing their waiting time of imaging examinations. From the neurovascular department perspective, this thesis proposes a new MRI examinations reservation process in order to reduce patient waiting times without degrading the utilization of MRI. The service provider, i.e., the imaging department, reserves each week a certain number of appropriately distributed contracted time slots (CTS) for the neurovascular department to ensure quick MRI examination of stroke patients. In addition to CTS, it is still possible for stroke patients to get MRI time slots through regular reservation (RTS). This thesis first proposes a stochastic programming model to simultaneously determine the contract decision, i.e., the number of CTS and its distribution, and the patient assignment policy to assign patients to either CTS or RTS. To solve this problem, structure properties of the optimal patient assignment policy for a given contract are proved by an average cost Markov decision process (MDP) approach. The contract is determined by a Monte Carlo approximation approach and then improved by local search. Computational experiments show that the proposed algorithms can efficiently solve the model. The new reservation process greatly reduces the average waiting time of stroke patients. 
At the same time, some CTS cannot be used for lack of patients. To reduce the unused CTS, we further explore the possibility of the advance cancellation of CTS. Structural properties of optimal control policies for one-day and two-day advance cancellation are established separately via an average-cost MDP approach with appropriate modeling and advanced convexity concepts used in the control of queueing systems. Computational experiments show that appropriate advance cancellations of CTS greatly reduce the unused CTS with nearly the same waiting times. [SDV] Life Sciences [SDV] Sciences du Vivant Planning MRI exams Contract Advance cancellation Markov decision process Stochastic programming Optimal strategies

Search results