About: The Global ETD Search service is a free service for researchers to find electronic theses and dissertations, provided by the Networked Digital Library of Theses and Dissertations (NDLTD). Metadata is collected from universities around the world; managers of university, consortium, or country archives who want to be added can find details on the NDLTD website.
101

Processos de decisão Markovianos com probabilidades imprecisas e representações relacionais: algoritmos e fundamentos. / Markov decision processes with imprecise probabilities and relational representations: foundations and algorithms.

Shirota Filho, Ricardo 03 May 2012
This work is devoted to the theoretical and algorithmic development of Markov Decision Processes with Imprecise Probabilities and relational representations. In the literature, this configuration has been important within artificial intelligence planning, where relational representations allow compact descriptions and imprecise probabilities yield a more general form of uncertainty. There are three main contributions. First, we present a discussion of the foundations of sequential decision making with imprecise probabilities, pointing toward key questions that remain open.
These results directly affect (though are not restricted to) the model of interest in this work, Markov Decision Processes with Imprecise Probabilities. Second, we propose three algorithms for Markov Decision Processes with Imprecise Probabilities based on mathematical programming. Third, we develop ideas proposed by Trevizan, Cozman, and de Barros (2008) on the use of variants of Real-Time Dynamic Programming to solve probabilistic planning problems described by an extension of the Probabilistic Planning Domain Definition Language (PPDDL).
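The maximin (robust) Bellman backup underlying such models can be illustrated with a short sketch. This is not one of the thesis's mathematical-programming algorithms; it is a generic interval-probability value iteration, assuming transitions are given by per-successor probability intervals `[P_lo, P_hi]` (with `P_lo` sums at most 1 and `P_hi` sums at least 1) and the worst-case distribution is chosen greedily within those bounds:

```python
import numpy as np

def worst_case_expectation(lo, hi, v):
    # Minimize E[v] over the interval credal set: start from the lower
    # bounds, then push the remaining probability mass onto the
    # lowest-value successor states first.
    p = lo.astype(float)
    slack = 1.0 - p.sum()
    for s in np.argsort(v):          # cheapest successors first
        add = min(hi[s] - p[s], slack)
        p[s] += add
        slack -= add
    return float(p @ v)

def robust_value_iteration(R, P_lo, P_hi, gamma=0.9, iters=200):
    # R[s, a]: immediate reward; P_lo/P_hi[s, a, s2]: transition bounds.
    n_s, n_a = R.shape
    v = np.zeros(n_s)
    for _ in range(iters):
        q = np.array([[R[s, a] + gamma *
                       worst_case_expectation(P_lo[s, a], P_hi[s, a], v)
                       for a in range(n_a)] for s in range(n_s)])
        v = q.max(axis=1)            # maximin: best action vs. worst model
    return v, q.argmax(axis=1)
```

Because an interval credal set has this simple structure, the inner minimization reduces to a sort over successor values; more general imprecise-probability models may require solving a linear program at each backup. With `P_lo == P_hi` the routine reduces to ordinary value iteration.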
102

Automated Hierarchy Discovery for Planning in Partially Observable Domains

Charlin, Laurent January 2006
Planning in partially observable domains is a notoriously difficult problem. However, in many real-world scenarios, planning can be simplified by decomposing the task into a hierarchy of smaller planning problems, which can then be solved independently of one another. Several approaches, mainly dealing with fully observable domains, have been proposed to optimize a plan that decomposes according to a hierarchy specified a priori. Some researchers have also proposed to discover hierarchies in fully observable domains. In this thesis, we investigate the problem of automatically discovering planning hierarchies in partially observable domains. The main advantage of discovering hierarchies is that, for a plan of a fixed size, hierarchical plans can be more expressive than non-hierarchical ones. Our solution frames the discovery and optimization of a hierarchical policy as a non-convex optimization problem. By encoding the hierarchical structure as variables of the optimization problem, we can automatically discover a hierarchy. Successfully solving the optimization problem therefore yields an optimal hierarchy and an optimal policy. We describe several techniques to solve the optimization problem: general non-linear solvers, mixed-integer linear and non-linear solvers, and a form of bounded hierarchical policy iteration. Our method is flexible enough to allow any part of the hierarchy to be specified from prior knowledge while letting the optimization discover the unknown parts. It can also discover hierarchical policies, including recursive policies, that are more compact, potentially with infinitely fewer parameters.
104

Efficient Partially Observable Markov Decision Process Based Formulation Of Gene Regulatory Network Control Problem

Erdogdu, Utku 01 April 2012
The need to analyze and closely study gene-related mechanisms motivated research on the modeling and control of gene regulatory networks (GRN). Different approaches exist to model GRNs; they are mostly simulated as mathematical models that represent relationships between genes. Though it turns into a more challenging problem, we argue that partial observability is a more natural and realistic setting for handling the control of GRNs. Partial observability is a fundamental aspect of the problem; it is mostly ignored and substituted by the assumption that the states of the GRN are known precisely, prescribed as full observability. On the other hand, current work addressing partial observability focuses on formulating algorithms for the finite-horizon GRN control problem. In this work we therefore explore the feasibility of realizing the problem in a partially observable setting, mainly with Partially Observable Markov Decision Processes (POMDP). We propose a POMDP formulation for the infinite-horizon version of the problem. Knowing that POMDP problems suffer from the curse of dimensionality, we also propose a POMDP solution method that automatically decomposes the problem by isolating unrelated parts, and then solves the reduced subproblems. We also propose a method to enrich the gene expression data sets given as input to the POMDP control task, because available data sets contain thousands of genes but only tens or, rarely, hundreds of samples. The method is based on the idea of generating more than one model from the available data sets, sampling data from each model, and finally filtering the generated samples with the help of metrics that measure compatibility, diversity, and coverage of the newly generated samples.
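The belief tracking that makes a partially observable formulation work can be sketched in a few lines. This is the generic POMDP Bayes filter, not the decomposition method proposed in the thesis; the transition and observation matrices in the usage example are illustrative placeholders, not a real GRN model:

```python
import numpy as np

def belief_update(b, a, o, T, O):
    # One step of POMDP belief tracking (Bayes filter).
    #   b: belief over hidden states (e.g. gene activity profiles)
    #   a: index of the applied intervention
    #   o: index of the received observation (e.g. a noisy expression readout)
    #   T[a][s, s2]: state-transition model under action a
    #   O[a][s2, o]: observation model after action a
    b_pred = b @ T[a]               # predict: propagate belief through dynamics
    b_new = b_pred * O[a][:, o]     # correct: weight by observation likelihood
    norm = b_new.sum()
    if norm == 0.0:
        raise ValueError("observation has zero likelihood under current belief")
    return b_new / norm
```

A controller would carry this belief vector forward after every intervention and readout; the curse of dimensionality the abstract mentions shows up here as the length of `b`, which grows exponentially with the number of genes.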
105

On the control of airport departure operations.

Burgain, Pierrick Antoine 15 November 2010
This thesis is focused on airport departure operations; its objective is to assign a value to surface surveillance information within a collaborative framework. The research develops a cooperative concept that improves the control of departure operations at busy airports and evaluates its merit using a classical and widely accepted airport departure model. The research then assumes departure operations are collaboratively controlled and develops a stochastic model of taxi operations on the airport surface. Finally, this study investigates the effect of feeding back different levels of surface surveillance information to the departure control process. More specifically, it examines the environmental and operational impact of aircraft surface location information on the taxi clearance process. Benefits are evaluated by measuring and comparing engine emissions for given runway utilization rates.
106

A reinforcement learning approach to obtain treatment strategies in sequential medical decision problems

Poolla, Radhika. January 2003
Thesis (M.S.I.E.)--University of South Florida, 2003. Includes bibliographical references.
ABSTRACT: Medical decision problems are extremely complex owing to their dynamic nature, large number of variable factors, and the associated uncertainty. Decision support technology entered the medical field long after other areas such as the airline and manufacturing industries, yet it is rapidly becoming an indispensable tool in medical decision making, including the class of sequential decision problems in which physicians decide on a treatment plan that optimizes a benefit measure such as treatment cost or the patient's quality of life. The last decade saw the emergence of many decision support applications in medicine. However, existing models are limited to decision problems with very few states and actions, and the medical research community has an urgent need to extend these applications to more complex dynamic problems with large state and action spaces.
This thesis proposes a methodology that models the class of sequential medical decision problems as a Markov decision process and solves the model using a simulation-based reinforcement learning (RL) algorithm. Such a methodology can obtain near-optimal treatment strategies for problems with large state and action spaces, overcoming, to a large extent, the computational complexity of the value-iteration and policy-iteration algorithms of dynamic programming. An average-reward reinforcement learning algorithm is developed and applied to a sample problem of treating hereditary spherocytosis. The application demonstrates the ability of the proposed methodology to obtain effective treatment strategies for sequential medical decision problems.
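An average-reward RL scheme of the kind described can be sketched as follows. This is Schwartz-style R-learning on a hypothetical two-state treatment problem, not the thesis's actual algorithm or its hereditary-spherocytosis model; the toy dynamics and hyperparameters are assumptions for illustration:

```python
import random

def r_learning(step, n_states, n_actions, steps=20000,
               alpha=0.1, beta=0.01, epsilon=0.1, seed=0):
    # Average-reward RL (R-learning): learns relative action values
    # Q[s][a] and an estimate rho of the long-run reward per step.
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    rho, s = 0.0, 0
    for _ in range(steps):
        greedy_a = max(range(n_actions), key=lambda x: Q[s][x])
        a = rng.randrange(n_actions) if rng.random() < epsilon else greedy_a
        s2, r = step(s, a, rng)
        if a == greedy_a:            # update rho only on greedy steps
            rho += beta * (r + max(Q[s2]) - max(Q[s]) - rho)
        Q[s][a] += alpha * (r - rho + max(Q[s2]) - Q[s][a])
        s = s2
    return Q, rho

# Toy two-state "patient" MDP: state 0 = healthy, state 1 = sick.
# Action 0 = wait (reward 1 if healthy but may fall sick, -1 if sick);
# action 1 = treat (reward 0, always returns the patient to healthy).
def toy_step(s, a, rng):
    if a == 1:
        return 0, 0.0
    if s == 0:
        return (1 if rng.random() < 0.2 else 0), 1.0
    return 1, -1.0
```

On this toy problem the learned greedy policy waits while healthy and treats when sick, and `rho` approaches that policy's average reward per step; no clinical realism is claimed.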
107

Multimedia Delivery over Heterogeneous Wireless Networks

Xing, Min 29 April 2015
There is an increasing demand for multimedia services in heterogeneous wireless networks. Given highly dynamic wireless channels and the relatively large size of multimedia data, how to support efficient and reliable multimedia delivery is a pressing issue. In this dissertation, we investigate multimedia delivery algorithms in heterogeneous wireless networks from three different aspects. First, we study single-flow rate adaptation for video streaming over multiple wireless interfaces. In order to maintain high video streaming quality while reducing the wireless service cost, the optimal video streaming process with multiple links is formulated as a Markov Decision Process (MDP). The reward function is designed to capture the quality-of-service (QoS) requirements of video traffic, such as startup latency, playback fluency, average playback quality, playback smoothness, and wireless service cost. To solve the MDP in real time, we propose an adaptive best-action search algorithm that obtains a sub-optimal solution. To evaluate the proposed adaptation algorithm, we implemented a testbed using an Android mobile phone and the Scalable Video Coding (SVC) codec and conducted experiments with real video flows. Second, with multiple multimedia flows competing for limited wireless resources, we propose a utility-based scheduling algorithm for multimedia transmission in Drive-thru Internet. A utility model maps throughput to the user's satisfaction level in terms of multimedia data quality, such as the Peak Signal-to-Noise Ratio (PSNR) of video. The objective of the scheduling problem is to maximize total utility. The optimization problem is formulated as a finite-state decision problem under the assumption that future arrival information is known, and it is solved by a search algorithm as a benchmark.
To obtain a real-time solution, we devise a practical heuristic algorithm based on the concept of utility potential. We implemented the solution and conducted extensive simulations using NS-3. Third, the multimedia dissemination problem in large-scale VANETs is investigated. We utilize a hybrid-network framework to address the mobility and scalability issues in large-scale VANET content distribution, and formulate a utility-maximization problem to find the best delivery strategy and select an optimal path for multimedia data dissemination, where the utility function takes delivery delay, quality of service (QoS), and storage cost into consideration. We obtain the closed form of the utility function and then the optimal solution of the problem via convex optimization theory. We conducted extensive trace-driven simulations to evaluate the proposed algorithm using real traces collected by taxis in Shanghai. In summary, this dissertation contributes to three aspects of multimedia delivery in heterogeneous wireless networks: a real-time rate adaptation algorithm for video streaming with multiple wireless interfaces that maintains high quality while reducing wireless service cost; a scheduling algorithm that maximizes total satisfaction for multimedia transmission in Drive-thru Internet; and a theoretical analysis of utility functions covering delivery delay, QoS, and storage cost, yielding an optimal solution for multimedia data dissemination in large-scale VANETs.
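The dissertation's utility-potential heuristic is not reproduced here, but the core idea of utility-based scheduling with a concave satisfaction curve can be sketched. The logarithmic utility and the identical-slot model below are assumptions for illustration, not the dissertation's actual model:

```python
import math

def schedule_slots(rates, total_slots):
    # Greedy marginal-utility scheduler: each transmission slot goes to
    # the flow whose concave utility u(x) = log(1 + x) of cumulative
    # throughput would increase the most. Because marginal gains are
    # diminishing under a concave utility, this greedy allocation of
    # identical slots maximizes the total utility.
    alloc = [0.0] * len(rates)       # cumulative throughput per flow
    for _ in range(total_slots):
        gains = [math.log1p(alloc[i] + rates[i]) - math.log1p(alloc[i])
                 for i in range(len(rates))]
        best = max(range(len(rates)), key=gains.__getitem__)
        alloc[best] += rates[best]
    return alloc
```

For example, with per-slot rates `[2, 1]` and three slots, the fast flow gets two slots and the slow flow one: the concave utility keeps the fast flow from monopolizing the channel once its marginal satisfaction flattens.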
108

Contributions to Simulation-based High-dimensional Sequential Decision Making

Hoock, Jean-Baptiste 10 April 2013
My thesis is entitled "Contributions to Simulation-based High-dimensional Sequential Decision Making". The context of the thesis is games, planning, and Markov Decision Processes. An agent interacts with its environment by successively making decisions. The agent starts from an initial state and proceeds until a final state in which it can no longer make decisions. At each timestep, the agent receives an observation of the state of the environment. From this observation and its knowledge, the agent makes a decision which modifies the state of the environment. The agent then receives a reward and a new observation. The goal is to maximize the sum of rewards obtained during a simulation from an initial state to a final state. The policy of the agent is the function which, from the history of observations, returns a decision. We work in a context where (i) the number of states is huge, (ii) the reward carries little information, (iii) the probability of quickly reaching a good final state is low, and (iv) prior knowledge is either nonexistent or hard to exploit. Both applications described in this thesis present these constraints: the game of Go and a 3D simulator from the European project MASH (Massive Sets of Heuristics). To make satisfactory decisions in this context, several solutions are proposed: (1) simulating with the exploration/exploitation trade-off (MCTS); (2) reducing complexity by local solving (GoldenEye); (3) building a policy which improves itself (RBGP); (4) learning prior knowledge (CluVo+GMCTS). Monte-Carlo Tree Search (MCTS) is the state of the art for the game of Go. From a model of the environment, MCTS incrementally and asymmetrically builds a tree of possible futures by performing Monte-Carlo simulations. The tree starts from the current observation of the agent. The agent switches between exploration of the model and exploitation of decisions which statistically give a good cumulative reward.
We discuss two ways of improving MCTS: parallelization and the addition of prior knowledge. Parallelization does not address some weaknesses of MCTS; in particular, some local problems remain challenging. We propose an algorithm (GoldenEye) composed of two parts: detection of a local problem and then its resolution. The resolution algorithm reuses concepts of MCTS and solves difficult problems from a classical database. Adding prior knowledge by hand is laborious and tedious, so we propose a method called Racing-based Genetic Programming (RBGP) to add prior knowledge automatically. Its strong point is that RBGP rigorously validates each addition of prior knowledge, and RBGP can be used to build a policy rather than only to optimize an algorithm. In some applications such as MASH, simulations are too expensive in time, and there is no prior knowledge and no model of the environment, so Monte-Carlo Tree Search cannot be used directly. To make MCTS usable in this context, we propose a method for learning prior knowledge (CluVo). We then use the learned knowledge to speed up the agent's learning and to build a model, from which we run an adapted version of Monte-Carlo Tree Search (GMCTS). This method solves difficult problems from MASH and gives good results in an application to a word game.
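A minimal sketch of the MCTS/UCT loop described above (selection, expansion, rollout, backpropagation) may make the mechanism concrete. This is textbook UCT, not GMCTS or any of the thesis's modified variants, and the toy environment in the test is an assumption for illustration:

```python
import math
import random

class Node:
    def __init__(self, state, reward=0.0, parent=None):
        self.state, self.reward, self.parent = state, reward, parent
        self.children = {}                 # action -> Node
        self.visits, self.value = 0, 0.0

def uct_child(node, c=1.4):
    # UCB1 score: mean value (exploitation) + exploration bonus.
    def score(child):
        if child.visits == 0:
            return float("inf")
        return (child.value / child.visits
                + c * math.sqrt(math.log(node.visits) / child.visits))
    return max(node.children.values(), key=score)

def mcts(root_state, actions, step, iters=400, rollout_depth=5, seed=0):
    # actions(s) -> list of legal actions; step(s, a) -> (s2, reward).
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iters):
        node, total = root, 0.0
        # 1. Selection: descend through fully expanded nodes.
        while node.children and len(node.children) == len(actions(node.state)):
            node = uct_child(node)
            total += node.reward
        # 2. Expansion: add one child for an untried action.
        untried = [a for a in actions(node.state) if a not in node.children]
        if untried:
            a = rng.choice(untried)
            s2, r = step(node.state, a)
            node.children[a] = Node(s2, r, node)
            node = node.children[a]
            total += r
        # 3. Rollout: random playout from the new node.
        s = node.state
        for _ in range(rollout_depth):
            s, r = step(s, rng.choice(actions(s)))
            total += r
        # 4. Backpropagation of the simulated return.
        while node is not None:
            node.visits += 1
            node.value += total
            node = node.parent
    # Recommend the most-visited root action.
    return max(root.children, key=lambda a: root.children[a].visits)
```

The asymmetric growth the abstract mentions comes from step 1: branches with good average returns are revisited and deepened far more often than poor ones, with the `c` constant controlling the exploration/exploitation trade-off.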
109

Load allocation for optimal risk management in systems with incipient failure modes

Bole, Brian McCaslyn 13 January 2014
This thesis investigates the development and implementation challenges of a proposed load allocation paradigm for fault risk assessment and system health management based on uncertain fault diagnostic and failure prognostic information. Health management actions are formulated in terms of a value associated with improving system reliability and a cost associated with inducing deviations from a system's nominal performance. Three simulated case-study systems are considered to highlight some of the fundamental challenges of formulating and solving an optimization over the space of available supervisory control actions in the described health management architecture. Repeated simulation studies on the three case-study systems illustrate an empirical approach for tuning the conservatism of health management policies by adjusting risk assessment metrics in the proposed paradigm. The implementation and testing of a real-world prognostic system is presented to illustrate model development challenges not directly addressed in the analysis of the simulated case studies: real-time battery charge depletion prediction for a small unmanned aerial vehicle. An architecture for offline testing of prognostics and decision-making algorithms is explained, facilitating empirical tuning of risk assessment metrics and health management policies, as demonstrated for the three simulated case-study systems.
110

Optimal Control and Estimation of Stochastic Systems with Costly Partial Information

Kim, Michael J. 31 August 2012
Stochastic control problems that arise in sequential decision making applications typically assume that information used for decision-making is obtained according to a predetermined sampling schedule. In many real applications however, there is a high sampling cost associated with collecting such data. It is therefore of equal importance to determine when information should be collected as it is to decide how this information should be utilized for optimal decision-making. This type of joint optimization has been a long-standing problem in the operations research literature, and very few results regarding the structure of the optimal sampling and control policy have been published. In this thesis, the joint optimization of sampling and control is studied in the context of maintenance optimization. New theoretical results characterizing the structure of the optimal policy are established, which have practical interpretation and give new insight into the value of condition-based maintenance programs in life-cycle asset management. Applications in other areas such as healthcare decision-making and statistical process control are discussed. Statistical parameter estimation results are also developed with illustrative real-world numerical examples.
