  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Contributions to the theory of Gittins indices : with applications in pharmaceutical research and clinical trials

Wang, You-Gan January 1991
No description available.
2

Search behaviour : an analysis of information collection and usage during the decision process

Fletcher, Keith January 1986
The purpose of this research was to investigate the nature of consumer decision making. It considered the purchase of a video cassette recorder and investigated whether the assumptions of a model based on satisficing behaviour could be justified. It considered the nature of search behaviour and evaluation during the decision process and the factors which might influence it. The research therefore studied the stages of the decision process from the nature of Problem Recognition and Problem Classification, including the development of evoked sets during the decision process, the preference for and use of different information sources, the nature of search behaviour, the importance of choice criteria and the decision rules used while employing these choice criteria. This was investigated using three separate but linked research approaches. A sample of the population in the West of Scotland was analysed to investigate differences between video owners and non-video owners, while qualitative interviews were conducted to study the decision process itself. Conjoint Analysis was used to consider the relative importance of choice criteria. The study confirmed the sequential nature of the decision process and found a phased sequence of choice and search. Despite the nature of the good (expensive and innovative), the decision was generally considered to be of a low involvement nature. While the predictions of low involvement learning that a satisficing decision would be taken were found to be true, our findings disagreed with the accepted theory on the use of information sources. It was also considered that it would be wrong to assume no cognitive processes were taking place, as various choice heuristics were found which simplified the decision for the consumer.
3

Acceleration of Iterative Methods for Markov Decision Processes

Shlakhter, Oleksandr 21 April 2010
This research focuses on Markov Decision Processes (MDPs). MDPs are one of the most important and challenging areas of Operations Research. Every day people make many decisions: today's decisions impact tomorrow's, and tomorrow's will impact the ones made the day after. Problems in Engineering, Science, and Business often pose similar challenges: a large number of options and uncertainty about the future. MDPs are one of the most powerful tools for solving such problems. There are several standard methods for finding optimal or approximately optimal policies for MDPs. Approaches widely employed to solve MDP problems include value iteration and policy iteration. Although simple to implement, these approaches are nevertheless limited in the size of problems that can be solved, due to the excessive computation required to find close-to-optimal solutions. This thesis proposes new value iteration and modified policy iteration methods for classes of expected discounted MDPs and average cost MDPs. We establish a class of operators that can be integrated into value iteration and modified policy iteration algorithms for Markov Decision Processes, so as to speed up the convergence of the iterative search. Application of these operators requires a little additional computation per iteration but reduces the number of iterations significantly. The development of the acceleration operators relies on two key properties of the Markov operator, namely contraction mapping and monotonicity in a restricted region. Since the Markov operators of the classical value iteration and modified policy iteration methods for average cost MDPs do not possess the contraction mapping property, for these models we restrict our study to average cost problems that can be formulated as the stochastic shortest path problem. The performance improvement is significant, while the implementation of the operator into value iteration is trivial.
Numerical studies show that the accelerated methods can be hundreds of times more efficient for solving MDP problems than the other known approaches. The computational savings can be significant, especially when the discount factor approaches 1 and the transition probability matrix becomes dense, in which case the standard iterative algorithms suffer from slow convergence.
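To make the slow-convergence remark concrete, here is a minimal sketch of classical (unaccelerated) value iteration for a discounted MDP. It is not the thesis's accelerated operator, which the abstract does not specify. The function name `value_iteration`, the reward-maximising formulation, and the tiny two-state example are all assumptions chosen for illustration.

```python
def value_iteration(P, R, gamma, tol=1e-8, max_iter=100_000):
    """Classical value iteration (illustrative sketch, not the thesis's method).

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[a][s]:    expected one-step reward for taking action a in state s.
    gamma:      discount factor, 0 <= gamma < 1.
    Returns (values, greedy policy, number of sweeps performed).
    """
    n_actions = len(P)
    n_states = len(P[0])
    V = [0.0] * n_states
    for iteration in range(1, max_iter + 1):
        # Bellman backup: Q[s][a] = R(a, s) + gamma * sum_t P(t | s, a) * V(t)
        Q = [[R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(q_row) for q_row in Q]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:  # stop once successive iterates are close
            break
    policy = [max(range(n_actions), key=lambda a: Q[s][a])
              for s in range(n_states)]
    return V, policy, iteration


# Hypothetical two-state, two-action MDP used only for this demonstration.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0],
     [0.5, 0.5]]

_, _, sweeps_low = value_iteration(P, R, gamma=0.5)
_, _, sweeps_high = value_iteration(P, R, gamma=0.99)
print("sweeps at gamma=0.5: ", sweeps_low)
print("sweeps at gamma=0.99:", sweeps_high)
```

Because the Bellman backup is a contraction with modulus `gamma`, the number of sweeps needed for a fixed tolerance grows as `gamma` approaches 1, which is the regime where the abstract reports the largest savings.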
5

A minimum cost and risk mitigation approach for blood collection

Zeng, Chenxi 27 May 2016
Due to the limited supply and perishable nature of blood products, effective management of blood collection is critical for high quality healthcare delivery. Whole blood is typically collected over a 6 to 8 hour collection window from volunteer donors at sites, e.g., schools, universities, churches, companies, that are a significant distance from the blood products processing facility and then transported from collection site to processing facility by a blood mobile. The length of time between collecting whole blood and processing it into cryoprecipitate ("cryo"), a critical blood product for controlling massive hemorrhaging, cannot exceed 8 hours (the 8 hour collection to completion constraint), while the collection to completion constraint for other blood products is 24 hours. In order to meet the collection to completion constraint for cryo, it is often necessary to have a "mid-drive collection"; i.e., for a vehicle other than the blood mobile to pick up and transport, at extra cost, whole blood units collected early in the collection window to the processing facility. In this dissertation, we develop analytical models to: (1) analyze which collection sites should be designated as cryo collection sites to minimize total collection costs while satisfying the collection to completion constraint and meeting the weekly production target (the non-split case), (2) analyze the impact of changing the current process to allow collection windows to be split into two intervals and then determining which intervals should be designated as cryo collection intervals (the split case), (3) ensure that the weekly production target is met with high probability. These problems lead to MDP models with large state and action spaces and constraints to guarantee that the weekly production target is met with high probability. These models are computationally intractable for problems having state and action spaces of realistic cardinality.
We consider two approaches to guarantee that the weekly production target is met with high probability: (1) a penalty function approach and (2) a chance constraint approach. For the MDP with penalty function approach, we first relax a constraint, which significantly reduces the cardinality of the state space and provides a lower bound on the optimal expected weekly cost of collecting whole blood for cryo while satisfying the collection to completion constraint. We then present an action elimination procedure that, coupled with the constraint relaxation, leads to a computationally tractable lower bound. We then develop several heuristics that generate sub-optimal policies and provide an analytical description of the difference between the upper and lower bounds in order to determine the quality of the heuristics. For the multiple decision epoch MDP model with chance constraint approach, we first note by example that a straightforward application of dynamic programming can lead to a sub-optimal policy. We then restrict the model to a single decision epoch. We then use a computationally tractable rolling horizon procedure for policy determination. We also present a simple greedy heuristic (another rolling horizon decision making procedure) based on ranking the collection intervals by mid-drive pickup cost per unit of expected cryo collected, which results in a competitive sub-optimal solution and leads to the development of a practical decision support tool (DST). Using real data from the American Red Cross (ARC), we estimate that this DST reduces total cost by about 30% for the non-split case and 70% for the split case, compared to the current practice. Initial implementation of the DST at the ARC Southern regional manufacturing and service center supports our estimates and indicates the potential for significant improvement in current practice.
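The ranking idea behind the greedy heuristic can be sketched roughly as follows. This is only an illustration of "rank by mid-drive pickup cost per unit of expected cryo, then select until the target is covered"; the dissertation's actual rolling horizon procedure and its handling of the probabilistic target are not described in the abstract. The function name `greedy_select`, the dictionary fields, and the example figures are all hypothetical.

```python
def greedy_select(intervals, target_units):
    """Pick collection intervals in order of cost per expected cryo unit.

    intervals: list of dicts with hypothetical keys
        "name", "pickup_cost" (mid-drive pickup cost), "expected_units".
    target_units: weekly production target, in expected units.
    Returns (chosen names, expected units covered, total pickup cost).
    """
    # Ignore intervals that are not expected to yield any cryo.
    usable = [iv for iv in intervals if iv["expected_units"] > 0]
    ranked = sorted(usable, key=lambda iv: iv["pickup_cost"] / iv["expected_units"])

    chosen, covered, cost = [], 0.0, 0.0
    for iv in ranked:
        if covered >= target_units:
            break  # target already met in expectation
        chosen.append(iv["name"])
        covered += iv["expected_units"]
        cost += iv["pickup_cost"]
    return chosen, covered, cost


# Made-up example data, not taken from the dissertation.
intervals = [
    {"name": "school-am", "pickup_cost": 120.0, "expected_units": 30.0},
    {"name": "church-pm", "pickup_cost": 200.0, "expected_units": 25.0},
    {"name": "campus-am", "pickup_cost": 90.0, "expected_units": 45.0},
]
chosen, covered, cost = greedy_select(intervals, target_units=60.0)
print(chosen, covered, cost)
```

Meeting the target only in expectation is a simplification; the abstract's chance constraint requires the target to be met with high probability, which would need a distributional model of collections that is not given here.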
6

Reinforcement learning by incremental patching

Kim, Min Sub, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007
This thesis investigates how an autonomous reinforcement learning agent can improve on an approximate solution by augmenting it with a small patch, which overrides the approximate solution at certain states of the problem. In reinforcement learning, many approximate solutions are smaller and easier to produce than "flat" solutions that maintain distinct parameters for each fully enumerated state, but the best solution within the constraints of the approximation may fall well short of global optimality. This thesis proposes that the remaining gap to global optimality can be efficiently minimised by learning a small patch over the approximate solution. In order to improve the agent's behaviour, algorithms are presented for learning the overriding patch. The patch is grown around particular regions of the problem where the approximate solution is found to be deficient. Two heuristic strategies are proposed for concentrating resources to those areas where inaccuracies in the approximate solution are most costly, drawing a compromise between solution quality and storage requirements. Patching also handles problems with continuous state variables, by two alternative methods: Kuhn triangulation over a fixed discretisation and nearest neighbour interpolation with a variable discretisation. As well as improving the agent's behaviour, patching is also applied to the agent's model of the environment. Inaccuracies in the agent's model of the world are detected by statistical testing, using a selective sampling strategy to limit storage requirements for collecting data. The patching algorithms are demonstrated in several problem domains, illustrating the effectiveness of patching under a wide range of conditions. A scenario drawn from a real-time strategy game demonstrates the ability of patching to handle large complex tasks. These contributions combine to form a general framework for patching over approximate solutions in reinforcement learning.
Complex problems cannot be solved by brute force alone, and some form of approximation is necessary to handle large problems. However, this does not mean that the limitations of approximate solutions must be accepted without question. Patching demonstrates one way in which an agent can leverage approximation techniques without losing the ability to handle fine yet important details.
7

Systems Medicine: An Integrated Approach with Decision Making Perspective

Faryabi, Babak 14 January 2010
Two models are proposed to describe interactions among genes, transcription factors, and signaling cascades involved in regulating a cellular sub-system. These models fall within the class of Markovian regulatory networks, and can accommodate different biological time scales. These regulatory networks are used to study pathological cellular dynamics and discover treatments that beneficially alter those dynamics. The salient translational goal is to design effective therapeutic actions that desirably modify a pathological cellular behavior via external treatments that vary the expressions of targeted genes. The objective of therapeutic actions is to reduce the likelihood of the pathological phenotypes related to a disease. The task of finding effective treatments is formulated as sequential decision making processes that discriminate the gene-expression profiles with high pathological competence versus those with low pathological competence. Thereby, the proposed computational frameworks provide tools that facilitate the discovery of effective drug targets and the design of potent therapeutic actions on them. Each of the proposed system-based therapeutic methods in this dissertation is motivated by practical and analytical considerations. First, it is determined how asynchronous regulatory models can be used as a tool to search for effective therapeutic interventions. Then, a constrained intervention method is introduced to incorporate the side-effects of treatments while searching for a sequence of potent therapeutic actions. Lastly, to bypass the impediment of model inference and to mitigate the numerical challenges of exhaustive search algorithms, a heuristic method is proposed for designing system-based therapies. The presentation of the key ideas of each method is facilitated with the help of several case studies.
9

What if you are not Bayesian? The consequences for decisions involving risk

Goodwin, P., Onkal, Dilek, Stekler, H.O. 22 September 2017
Many studies have examined the extent to which individuals’ probability judgments depart from Bayes’ theorem when revising probability estimates in the light of new information. Generally, these studies have not considered the implications of such departures for decisions involving risk. We identify when such departures will occur in two common types of decisions. We then report on two experiments where people were asked to revise their own prior probabilities of a forthcoming economic recession in the light of new information. When the reliability of the new information was independent of the state of nature, people tended to overreact to it if their prior probability was low and underreact if it was high. When it was not independent, they tended to display conservatism. We identify the circumstances where discrepancies in decisions arising from a failure to use Bayes’ theorem were most likely to occur in the decision context we examined. We found that these discrepancies were relatively rare and, typically, were not serious.
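As a point of reference for what "departing from Bayes' theorem" means here, the normative revision of a prior recession probability after a forecast signal can be sketched as below. The function names, the signal reliabilities, and the judged probability are illustrative assumptions, not values from the study.

```python
def bayes_posterior(prior, p_signal_given_event, p_signal_given_no_event):
    """Posterior P(event | signal) from Bayes' theorem.

    prior:                    P(event) before the new information.
    p_signal_given_event:     P(signal | event), e.g. forecast hit rate.
    p_signal_given_no_event:  P(signal | no event), e.g. false alarm rate.
    """
    numerator = p_signal_given_event * prior
    evidence = numerator + p_signal_given_no_event * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("signal has zero probability under this prior")
    return numerator / evidence


def reaction(judged_posterior, prior, p_hit, p_false_alarm):
    """Signed gap between a person's revised estimate and the Bayesian one.

    Positive values mean the judged probability is above the Bayesian
    posterior; negative values mean it is below.
    """
    return judged_posterior - bayes_posterior(prior, p_hit, p_false_alarm)


# Hypothetical numbers: a low prior and a moderately reliable recession signal.
prior = 0.2
posterior = bayes_posterior(prior, p_signal_given_event=0.8,
                            p_signal_given_no_event=0.3)
print("Bayesian posterior:", round(posterior, 3))
print("gap for a judged 0.6:", round(reaction(0.6, prior, 0.8, 0.3), 3))
```

With a signal that raises the probability of recession, a positive gap corresponds to overreaction and a negative gap to underreaction or conservatism in the sense used in the abstract.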
10

Finite Memory Policies for Partially Observable Markov Decision Processes

Lusena, Christopher 01 January 2001
This dissertation makes contributions to two areas of research on planning with POMDPs: complexity theoretic results and heuristic techniques. The most important contributions are probably the complexity of approximating the optimal history-dependent finite-horizon policy for a POMDP, and the idea of heuristic search over the space of FFTs.
