21

Modelling Infertility with Markov Chains

Dorff, Rebecca 20 June 2013 (has links) (PDF)
Infertility affects approximately 15% of couples. Testing and interventions are costly in time, money, and emotional energy. This paper discusses using Markov decision and multi-armed bandit processes to identify a systematic approach to interventions that leads to the desired baby while minimizing costs.
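As a rough illustration of the bandit view (not from the thesis; the intervention names, costs, and per-cycle success rates below are hypothetical), each intervention can be treated as an arm with an unknown success probability, chosen each cycle by a cost-aware Thompson-sampling heuristic:

```python
import random

# Hypothetical interventions: (cost per attempt, true per-cycle success probability).
# The probabilities are unknown to the decision maker and only used to simulate outcomes.
INTERVENTIONS = {
    "timed_intercourse": (200.0, 0.08),
    "iui":               (1500.0, 0.15),
    "ivf":               (12000.0, 0.35),
}

def choose_intervention(failures):
    """Thompson-style choice: sample a plausible success rate for each intervention
    and pick the one with the best sampled successes-per-dollar ratio."""
    best, best_score = None, -1.0
    for name, (cost, _) in INTERVENTIONS.items():
        sampled_p = random.betavariate(1, 1 + failures[name])  # Beta(1,1) prior, only failures observed
        score = sampled_p / cost
        if score > best_score:
            best, best_score = name, score
    return best

def run(max_cycles=24):
    failures = {name: 0 for name in INTERVENTIONS}
    total_cost = 0.0
    for _ in range(max_cycles):
        arm = choose_intervention(failures)
        cost, p = INTERVENTIONS[arm]
        total_cost += cost
        if random.random() < p:       # simulated outcome of one treatment cycle
            return arm, total_cost    # success ends the episode
        failures[arm] += 1
    return None, total_cost

if __name__ == "__main__":
    print(run())
```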
22

Optimal Control of Non-Conventional Queueing Networks: A Simulation-Based Approximate Dynamic Programming Approach

Chen, Xiaoting 02 June 2015 (has links)
No description available.
23

Reinforcement learning with time perception

Liu, Chong January 2012 (has links)
Classical value estimation reinforcement learning algorithms do not perform very well in dynamic environments. The reinforcement learning of animals, by contrast, is quite flexible: they adapt to dynamic environments very quickly and deal with noisy inputs very effectively. One feature that may contribute to animals' good performance in dynamic environments is that they learn and perceive the time to reward. In this research, we attempt to learn and perceive the time to reward and explore situations where the learned time information can be used to improve the performance of the learning agent in dynamic environments. The dynamic environments we are interested in are switching environments, which stay the same for a long time, change abruptly, and then hold for a long time before the next change. The dynamics we mainly focus on are changes in the time to reward, though we also extend the ideas to learning and perceiving other criteria of optimality, e.g. the discounted return, so that they still work when the amount of reward also changes. Specifically, both the mean and variance of the time to reward are learned and then used to detect changes in the environment and to decide whether the agent should give up a suboptimal action. When a change in the environment is detected, the learning agent responds specifically to the change in order to recover from it quickly. When the current action is found to be still worse than the optimal one, the agent gives up exploring that action for now and remakes its decision, avoiding longer-than-necessary exploration. Experiments on two real-world problems show that these mechanisms speed up learning, reduce the time taken to recover from environmental changes, and improve the agent's performance after learning converges in most of the test cases, compared with classical value estimation reinforcement learning algorithms. In addition, we have successfully used spiking neurons to implement various phenomena of classical conditioning, the simplest form of animal reinforcement learning in dynamic environments, and have pointed out a possible implementation of instrumental conditioning and general reinforcement learning using similar models.
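A minimal sketch of the central mechanism, assuming a simple threshold-style change test rather than whatever the thesis actually uses: keep running statistics of the time to reward (Welford's method), flag a change when an observed time falls far outside the learned distribution, and give up on an action once it has already taken much longer than usual:

```python
class TimeToRewardTracker:
    """Running mean/variance of the time to reward for one action (Welford's method)."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, t):
        self.n += 1
        delta = t - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (t - self.mean)

    @property
    def std(self):
        return (self.m2 / self.n) ** 0.5 if self.n > 1 else float("inf")

    def change_detected(self, t, k=3.0):
        # A time to reward far outside the learned distribution suggests the
        # environment has switched; the agent can then reset or boost learning.
        return self.n > 1 and abs(t - self.mean) > k * self.std

    def should_give_up(self, elapsed, k=3.0):
        # Abandon the current exploration of an action once it has already
        # taken much longer than the time to reward learned so far.
        return self.n > 1 and elapsed > self.mean + k * self.std
```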
24

Efficient algorithms for infinite-state recursive stochastic models and Newton's method

Stewart, Alistair Mark January 2015 (has links)
Some well-studied infinite-state stochastic models give rise to systems of nonlinear equations. These systems of equations have solutions that are probabilities, generally probabilities of termination in the model. We are interested in finding efficient, preferably polynomial-time, algorithms for calculating probabilities associated with these models. The chief tool we use to solve systems of polynomial equations will be Newton's method as suggested by [EY09]. The main contribution of this thesis is to the analysis of this and related algorithms. We give polynomial-time algorithms for calculating probabilities for broad classes of models for which none were known before. Stochastic models that give rise to such systems of equations include such classic and heavily-studied models as Multi-type Branching Processes, Stochastic Context-Free Grammars (SCFGs) and Quasi Birth-Death Processes. We also consider models that give rise to infinite-state Markov Decision Processes (MDPs), giving algorithms for approximating optimal probabilities and for finding policies that achieve probabilities close to optimal, for several classes of infinite-state MDPs. Our algorithms for analysing infinite-state MDPs rely on a non-trivial generalization of Newton's method that works for the max/min polynomial systems that arise as Bellman optimality equations in these models. For SCFGs, which are used in statistical natural language processing, in addition to approximating termination probabilities, we analyse algorithms for approximating the probability that a grammar produces a given string, or produces a string in a given regular language. In most cases, we show that we can calculate an approximation to the relevant probability in time polynomial in the size of the model and the number of bits of desired precision. We also consider more general systems of monotone polynomial equations. For such systems we cannot give a polynomial-time algorithm (pre-existing hardness results make one unlikely), but we can still give an algorithm whose complexity upper bound is exponential only in parameters that are likely to be bounded for the monotone polynomial equations that arise from many interesting stochastic models.
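For a concrete instance of the kind of fixed-point system involved: the termination (extinction) probability of a single-type branching process satisfies x = p0 + p2·x², and Newton's method started from 0 converges to its least nonnegative solution, which is the idea behind the approach of [EY09]. The sketch below assumes this toy two-outcome process, not any particular model from the thesis:

```python
def newton_termination(p0, p2, tol=1e-12, max_iter=100):
    """Least nonnegative solution of x = p0 + p2*x**2 via Newton's method from x = 0.

    Models a single-type branching process where an individual dies with
    probability p0 and has two offspring with probability p2 (p0 + p2 = 1).
    The least fixed point is the termination (extinction) probability.
    """
    x = 0.0
    for _ in range(max_iter):
        f = p0 + p2 * x * x - x          # fixed-point residual
        fprime = 2.0 * p2 * x - 1.0      # derivative of the residual
        x_new = x - f / fprime
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Subcritical case terminates with probability 1; the supercritical case
# p0=0.3, p2=0.7 terminates with probability p0/p2 = 3/7 ≈ 0.4286.
print(newton_termination(0.7, 0.3), newton_termination(0.3, 0.7))
```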
25

Troubleshooting Trucks : Automated Planning and Diagnosis / Felsökning av lastbilar : automatiserad planering och diagnos

Warnquist, Håkan January 2015 (has links)
This thesis considers computer-assisted troubleshooting of heavy vehicles such as trucks and buses. In this setting, the person troubleshooting a vehicle problem is assisted by a computer that can list the possible faults explaining the problem and recommend which actions to take so that the expected cost of restoring the vehicle is low. To achieve this, such a system must solve two problems: the diagnosis problem of finding the possible faults and the decision problem of deciding which action to take. The diagnosis problem has been approached using Bayesian network models. Frameworks have been developed both for the case when the vehicle is in the workshop and for remote diagnosis when the vehicle is monitored over longer periods of time. The decision problem has been solved by creating planners that select actions so that the expected cost of repairing the vehicle is minimized. New methods, algorithms, and models have been developed to improve the performance of the planner. The theory developed has been evaluated on models of an auxiliary braking system, a fuel injection system, and an engine temperature control and monitoring system.
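A hedged illustration of the decision problem, not the thesis's planner (the faults, probabilities, and costs below are invented): under a single-fault assumption where each repair is guaranteed to fix exactly its own fault, ordering repairs by probability-to-cost ratio minimizes the expected cost of restoring the vehicle. A real troubleshooter would interleave observations and repairs and keep the fault probabilities updated with a Bayesian network.

```python
def plan_repairs(fault_probs, repair_costs):
    """Order repair actions to minimize the expected cost of restoring the vehicle.

    Assumes a single fault is present, each repair fixes the problem iff its
    fault is present, and the problem is re-checked after every repair.
    Repairing in decreasing order of P(fault)/cost is then optimal.
    """
    order = sorted(fault_probs, key=lambda f: fault_probs[f] / repair_costs[f],
                   reverse=True)
    expected, prob_not_fixed = 0.0, 1.0
    for fault in order:
        expected += prob_not_fixed * repair_costs[fault]  # pay only if still broken
        prob_not_fixed -= fault_probs[fault]
    return order, expected

# Hypothetical faults for an auxiliary braking problem.
probs = {"worn_pads": 0.5, "leaking_valve": 0.3, "faulty_sensor": 0.2}
costs = {"worn_pads": 150.0, "leaking_valve": 400.0, "faulty_sensor": 90.0}
print(plan_repairs(probs, costs))
```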
26

Post-decision Processes : Consolidation and value conflicts in decision making

Shamoun, Sanny January 2004 (has links)
The studies in the present thesis focus on post-decision processes using the theoretical framework of Differentiation and Consolidation Theory. The thesis consists of three studies. In all of them, pre-decision evaluations are compared with post-decision evaluations in order to explore differences in how decision alternatives are evaluated before and after a decision. The main aim was to describe and gain a clearer understanding of how people re-evaluate information following a decision whose outcome they have experienced. The studies examine how the attractiveness evaluations of important attributes are restructured from the pre-decision to the post-decision phase, particularly the restructuring of value conflicts. Value conflict attributes are those in which information speaks against the chosen alternative. The first study investigates an important real-life decision and illustrates different post-decision (consolidation) processes following the decision. The second study tests whether decisions with value conflicts follow the same consolidation (post-decision restructuring) processes when the conflict is controlled experimentally, as in earlier studies of less controlled real-life decisions. The third study investigates consolidation and value conflicts in decisions whose consequences are controlled and of different magnitudes.

The studies in the present thesis have shown how attractiveness restructuring of attributes in conflict occurs in the post-decision phase. Results from the three studies indicated that this restructuring was stronger for important real-life decisions (Study 1) and in situations where real consequences followed a decision (Study 3) than in more controlled, hypothetical decision situations (Study 2).

Finally, some proposals for future research are suggested, including studies of the effects of outcomes and consequences on the consolidation of prior decisions, and of how a decision maker's involvement affects his or her pre- and post-decision processes.
27

Learning in a state of confusion : employing active perception and reinforcement learning in partially observable worlds

Crook, Paul A. January 2007 (has links)
In applying reinforcement learning to agents acting in the real world we are often faced with tasks that are non-Markovian in nature. Much work has been done using state estimation algorithms to try to uncover Markovian models of tasks in order to allow the learning of optimal solutions using reinforcement learning. Unfortunately these algorithms, which attempt to simultaneously learn a Markov model of the world and how to act, have proved very brittle. Our focus differs. In considering embodied, embedded and situated agents we have a preference for simple learning algorithms which reliably learn satisficing policies. The learning algorithms we consider do not try to uncover the underlying Markovian states; instead they aim to learn successful deterministic reactive policies such that agents' actions are based directly upon the observations provided by their sensors. Existing results have shown that such reactive policies can be arbitrarily worse than a policy that has access to the underlying Markov process, and in some cases no satisficing reactive policy can exist. Our first contribution is to show that providing agents with alternative actions and viewpoints on the task through the addition of active perception can provide a practical solution in such circumstances. We demonstrate empirically that: (i) adding arbitrary active perception actions to agents which can only learn deterministic reactive policies can allow the learning of satisficing policies where none were originally possible; (ii) active perception actions allow the learning of better satisficing policies than those that existed previously; and (iii) our approach converges more reliably to satisficing solutions than existing state estimation algorithms such as U-Tree and the Lion Algorithm. Our other contributions focus on issues which affect the reliability with which deterministic reactive satisficing policies can be learnt in non-Markovian environments. We show that greedy action selection may be a necessary condition for the existence of stable deterministic reactive policies on partially observable Markov decision processes (POMDPs). We also set out the concept of Consistent Exploration: the idea of estimating state-action values by acting as though the policy has been changed to incorporate the action being explored. We demonstrate that this concept can be used to develop better algorithms for learning reactive policies for POMDPs by presenting a new reinforcement learning algorithm, the Consistent Exploration Q(λ) algorithm (CEQ(λ)). We demonstrate on a significant number of problems that CEQ(λ) is more reliable at learning satisficing solutions than SARSA(λ), the algorithm currently regarded as the best for learning deterministic reactive policies.
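A simplified sketch of the consistent-exploration idea, not the CEQ(λ) algorithm itself (it uses one-step backups rather than eligibility traces, and the interface and parameters are assumptions): once an exploratory action is chosen for an observation, the agent keeps taking that action whenever the observation recurs within the episode, so the returns it observes are consistent with a deterministic reactive policy that actually contains the explored action.

```python
import random
from collections import defaultdict

class ConsistentExplorationQ:
    """Q-learning over observations with episode-consistent exploration (sketch)."""
    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)          # (observation, action) -> value estimate
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.episode_policy = {}             # observation -> action fixed for this episode

    def start_episode(self):
        self.episode_policy.clear()

    def act(self, obs):
        if obs not in self.episode_policy:
            if random.random() < self.epsilon:
                a = random.choice(self.actions)                        # explore
            else:
                a = max(self.actions, key=lambda a: self.q[(obs, a)])  # greedy
            self.episode_policy[obs] = a     # stick with this choice if obs recurs
        return self.episode_policy[obs]

    def update(self, obs, action, reward, next_obs):
        # Back up toward the action the episode policy will actually take next,
        # keeping the update consistent with the explored reactive policy.
        next_a = self.episode_policy.get(
            next_obs, max(self.actions, key=lambda a: self.q[(next_obs, a)]))
        target = reward + self.gamma * self.q[(next_obs, next_a)]
        self.q[(obs, action)] += self.alpha * (target - self.q[(obs, action)])
```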
28

A Foundation for Sustainable Product Development

Hallstedt, Sophie January 2008 (has links)
Product development is a particularly critical intervention point for the transformation of society towards sustainability. Current socio-ecological impacts over product life-cycles are evidence that current practices are insufficient. The aim of this thesis is to form a foundation for sustainable product development through the integration of a sustainability perspective into product development procedures and processes. Literature reviews and theoretical considerations, as well as interviews, questionnaires, observations, testing, and action research through case studies in various companies, have indicated gaps in current methodology and have guided the development of a new general Method for Sustainable Product Development (MSPD). This method combines a framework for strategic sustainable development, based on backcasting from basic sustainability principles, with a standard concurrent engineering development model. A modular system of guiding questions, derived by considering the sustainability principles and the product life-cycle, is the key feature. Initial testing indicates that this MSPD works well for identifying sustainability problems as well as for generating possible solutions. However, these tests also indicate that there is sometimes a desire for a quick overview of the sustainability performance of a specific product category, to guide early strategic decisions before the more comprehensive and detailed work with the MSPD is undertaken, or when an overview alone is sufficient to make decisions. In response, a Template for Sustainable Product Development (TSPD) approach is presented as a supplement to the MSPD. To generate products that support sustainable development of society it is necessary to combine sustainability assessments with improvements of technical product properties. An introductory procedure for such sustainability-driven design optimization is suggested based on a case study. For maximum efficiency in finding viable pathways towards sustainability, it is also necessary to coordinate the methods and tools that are useful for sustainable product development and integrate them into the overall decision-making processes at different levels in companies. To find gaps in the sustainability integration of a company's decision system, an assessment approach is suggested based on case studies. A general conclusion from this research is that the support needed for making sustainability-related decisions is not systematically integrated in companies today. However, this thesis also indicates that it is possible to create generic methods and tools that aid the integration of sustainability aspects into companies' strategic decision-making and product development. These methods and tools can be used to guide the prioritization of investments and technical optimization in an increasingly sustainability-driven market, thus providing a foundation for competitive sustainable product development.
29

Oligopolies in private spectrum commons: analysis and regulatory implications

Kavurmacioglu, Emir 17 February 2016 (has links)
In an effort to make more spectrum available, recent initiatives by the FCC let mobile providers offer spot service of their licensed spectrum to secondary users, paving the way to dynamic secondary spectrum markets. This dissertation investigates secondary spectrum markets under different regulatory regimes by identifying profitability conditions and possible competitive outcomes in an oligopoly model. We consider pricing in a market where multiple providers compete for secondary demand. First, we analyze the market outcomes when providers adopt a coordinated access policy, where, besides pricing, a provider can elect to apply admission control on secondary users based on the state of its network. We next consider competition when providers implement an uncoordinated access policy (i.e., no admission control). Through our analysis, we identify profitability conditions and fundamental price thresholds, including break-even and market-sharing prices. We prove that regardless of the specific form of the secondary demand function, competition under coordinated access always leads to a price war. In contrast, under uncoordinated access, market sharing becomes a viable outcome if the intervals of prices for which the providers are willing to share the market overlap. We then turn our attention to how a network provider can use carrier (spectrum) aggregation to lower its break-even price and gain an edge over its competition. To this end, we determine the optimal (minimum) level of carrier aggregation that a smaller provider needs. Under a quality-driven (QD) regime, we establish an efficient way of numerically calculating the optimal carrier aggregation and derive scaling laws. We extend the results to delay-related metrics and show their applications to profitable pricing in secondary spectrum markets. Finally, we consider the problem of profitability over a spatial topology, where identifying system behavior suffers from the curse of dimensionality. We therefore propose an approximation model that captures system behavior to first order, give an expression for calculating the break-even price at each network location, and provide simulation results for accuracy comparison. All of our results hold for general forms of demand, thus avoiding restrictive assumptions about customer preferences and the valuation of the spectrum.
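As a rough numeric illustration, assuming a made-up linear secondary demand curve and per-admission opportunity cost rather than anything from the dissertation, a break-even price can be located by bisection on the profit function, and market sharing is viable only when the providers' acceptable price intervals overlap:

```python
def break_even_price(profit, lo=0.0, hi=100.0, tol=1e-6):
    """Smallest price at which profit is nonnegative (bisection).

    Assumes profit is negative below the break-even price and nonnegative
    above it on [lo, hi]; `profit` can be any function of price.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if profit(mid) >= 0.0:
            hi = mid
        else:
            lo = mid
    return hi

def market_sharing_possible(interval_a, interval_b):
    """Sharing is viable only if the providers' acceptable price intervals overlap."""
    return max(interval_a[0], interval_b[0]) <= min(interval_a[1], interval_b[1])

# Hypothetical linear secondary demand and per-admission opportunity cost.
demand = lambda p: max(0.0, 10.0 - 0.5 * p)
cost_per_user = 6.0
profit = lambda p: (p - cost_per_user) * demand(p)
print(break_even_price(profit), market_sharing_possible((8.0, 14.0), (11.0, 18.0)))
```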
30

Representações compactas para processos de decisão de Markov e sua aplicação na administração de impressoras. / Compact representations of Markov decision processes and their application to printer management.

Torres, João Vitor 02 June 2006 (has links)
Os Processos de Decisão de Markov (PDMs) são uma importante ferramenta de planejamento e otimização em ambientes que envolvem incertezas. Contudo a especificação e representação computacional das distribuições de probabilidades subjacentes a PDMs é uma das principais dificuldades de utilização desta ferramenta. Este trabalho propõe duas estratégias para representação destas probabilidades de forma compacta e eficiente. Estas estratégias utilizam redes Bayesianas e regularidades entre os estados e as variáveis. As estratégias apresentadas são especialmente úteis em sistemas onde as variáveis têm muitas categorias e possuem forte inter-relação. Além disso, é apresentada a aplicação destes modelos no gerenciamento de grupos de impressoras (um problema real da indústria e que motivou o desenvolvimento do trabalho) permitindo que estas atuem coletiva e não individualmente. O último tópico discutido é uma análise comparativa da mesma aplicação utilizando Lógica Difusa. / Markov Decision Processes (MDPs) are an important tool for planning and optimization in environments under uncertainty. The specification and computational representation of the probability distributions underlying MDPs are central difficulties for their application. This work proposes two strategies for representation of probabilities in a compact and efficient way. These strategies use Bayesian networks and regularities among states and variables. The proposed strategies are particularly useful in systems whose variables have many categories and have strong interrelation. This proposal has been applied to the management of clusters of printers, a real problem that in fact motivated the work. Markov Decision Processes are then used to allow printers to act as a group, and not just individually. The work also presents a comparison between MDPs and Fuzzy Logic in the context of clusters of printers.
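A minimal sketch of a factored, Bayesian-network-style transition model of the kind the abstract describes; the printer state variables, actions, and probabilities below are illustrative assumptions, not taken from the thesis. Each state variable gets its own conditional probability table over a few parents, and the joint transition probability is the product of the per-variable factors, which keeps the representation compact when variables are only weakly coupled:

```python
TONER_CPT = {  # P(toner' | toner, action)
    ("ok",  "print"):   {"ok": 0.9, "low": 0.1},
    ("ok",  "idle"):    {"ok": 1.0, "low": 0.0},
    ("low", "print"):   {"ok": 0.0, "low": 1.0},
    ("low", "replace"): {"ok": 1.0, "low": 0.0},
    ("low", "idle"):    {"ok": 0.0, "low": 1.0},
}

QUEUE_CPT = {  # P(queue' | queue, action)
    ("empty", "print"): {"empty": 1.0, "busy": 0.0},
    ("busy",  "print"): {"empty": 0.7, "busy": 0.3},
    ("busy",  "idle"):  {"empty": 0.0, "busy": 1.0},
    ("empty", "idle"):  {"empty": 0.8, "busy": 0.2},  # new jobs may arrive
}

def transition_prob(state, action, next_state):
    """P(next_state | state, action) as a product of per-variable factors."""
    toner, queue = state
    toner2, queue2 = next_state
    p_toner = TONER_CPT.get((toner, action), {}).get(toner2, 0.0)
    p_queue = QUEUE_CPT.get((queue, action), {}).get(queue2, 0.0)
    return p_toner * p_queue

print(transition_prob(("ok", "busy"), "print", ("ok", "empty")))  # 0.9 * 0.7 = 0.63
```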
