31 |
Dynamic modeling in sustainable operations and supply chain management. Liu, Baolong, 06 September 2018 (has links)
This thesis articulates several important issues in sustainable operations and supply chain management, not only to provide insights for enhancing the performance of firms but also to encourage enterprises to adopt appropriate means toward a better environment for our society. The link from the firm level to the society level is that improving green performance through better operations management in firms and supply chains is an indispensable element of ameliorating the environment in our society. Take China as an example. In recent years (The Straitstimes, 2017; Stanway & Perry, 2018), the government has spared no effort in resolving air pollution problems. An important and useful means is to impose strict regulations and monitor the efforts of firms, which face serious fines if certain standards are not met during random inspections. Therefore, firms have to cooperate to improve their profitability and, more importantly, their environmental impact. Through this sustained endeavor, and despite an uncertain future, air quality has gradually improved in China (Zheng, 2018). This thesis, in a more general setting, aims to provide important insights to firms so that they are not only able to meet the regulations but can genuinely contribute to building a better environment for future generations. Our fundamental goal is to obtain a deep understanding of the trade-offs that companies face, to model the problems in order to seek possible solutions, and to help firms and supply chains enhance their performance from a theoretical point of view. Indirectly, the work will then help firms realize the importance, for our society, of developing sustainable operations and supply chain management practices. The thesis is organized as follows. Chapter 2 introduces the thesis in French. Chapter 3 is the first essay, Environmental Collaboration and Process Innovation in Supply Chain Management with Coordination. Chapter 4 contains the second essay, Remanufacturing of Multi-Component Systems with Product Substitution, and the third essay, Joint Dynamic Pricing and Return Quality Strategies Under Demand Cannibalization, is presented in Chapter 5. Chapter 6 gives general concluding remarks on the three essays, followed by the reference list and the appendices.
|
32 |
Rational Fools: (Ir)rational Choices of Humans, Rhesus Macaques, and Capuchin Monkeys in Dynamic Stochastic Environments. Watzek, Julia, 01 May 2017 (has links)
Human and animal decision-making is known to violate rational expectations in a variety of contexts. Statistical structures of real-world environments may account for such seemingly irrational behavior. In a computerized experiment, 16 capuchins, 7 rhesus monkeys, and 30 humans chose between up to three options of different value. The options disappeared and became available again with different probabilities. Subjects overwhelmingly chose transitively (A>B, B>C, and A>C) in the control condition, where doing so maximized overall gain. However, most subjects also adhered to transitivity in the test condition, where it was suboptimal but led to negligible losses compared to the optimal strategy. Only a few of the capuchins were able to maximize long-term gain by violating transitivity. Adhering to rational choice principles may facilitate the formation of near-optimal decision rules when short- and long-term goals align. Such cognitive shortcuts may have evolved to preserve mental resources.
|
33 |
Simulating interactions among multiple characters. Shum, Pak Ho, January 2010 (has links)
In this thesis, we attack a challenging problem in the field of character animation: synthesizing interactions among multiple virtual characters in real time. Although demand is heavy in the gaming and animation industries, no systematic solution has been proposed, owing to the difficulty of modeling the complex behaviors of the characters. We represent the continuous interactions among characters as a discrete Markov Decision Process and design a general objective function to evaluate the immediate reward of launching an action. By applying game-tree techniques such as tree expansion and min-max search, we select the optimal actions that benefit the character the most in the future. The simulated characters can interact competitively while cooperatively satisfying the animators' requests. Since the interactions between two characters depend on many criteria, it is difficult to exhaustively precompute the optimal actions for all combinations of these criteria. We design an off-policy approach that samples and precomputes only meaningful interactions. With the precomputed policy, the optimal movements under different situations can be evaluated in real time. To simulate interactions for a large number of characters with minimal computational overhead, we propose a method to precompute short durations of interaction between two characters as connectable patches. The patches are concatenated spatially to generate interactions among multiple characters, and temporally to generate longer interactions. Based on optional instructions given by the animators, our system automatically applies these concatenations to create large scenes of interacting crowds. We demonstrate our system by creating scenes with high-quality interactions. On one hand, our algorithm can automatically generate artistic scenes of interaction, such as movie fight scenes involving hundreds of characters. On the other hand, it can create controllable, intelligent characters that interact with opponents in real-time applications such as 3D computer games.
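As a rough illustration of the kind of decision rule described above, the sketch below (Python, purely hypothetical) scores a character's candidate actions by an immediate reward plus an alternating min-max lookahead over the opponent's responses. The two-character health state, the "attack"/"block" actions, and the reward numbers are made-up placeholders, not the thesis's actual animation model.

```python
def step(state, action, attacker_is_a):
    """Toy transition: an 'attack' lowers the other character's health by 1."""
    hp_a, hp_b = state
    if action == "attack":
        if attacker_is_a:
            hp_b -= 1
        else:
            hp_a -= 1
    return (hp_a, hp_b)

def reward(state, action, acting_is_a):
    """Toy immediate reward, measured from character A's point of view."""
    return 1.0 if (action == "attack" and acting_is_a) else 0.0

def minimax(state, depth, a_to_move, actions=("attack", "block")):
    """Worst-case cumulative reward for A, assuming the opponent plays adversarially."""
    if depth == 0:
        return 0.0
    scores = [reward(state, act, a_to_move)
              + minimax(step(state, act, a_to_move), depth - 1, not a_to_move)
              for act in actions]
    return max(scores) if a_to_move else min(scores)

def choose_action(state, depth=4, actions=("attack", "block")):
    """Pick A's next action by its min-max value (ties broken by tuple order)."""
    return max(actions,
               key=lambda act: reward(state, act, True)
               + minimax(step(state, act, True), depth - 1, False))

print(choose_action((3, 3)))  # 'attack' in this toy setting
```

In the thesis, such decisions are precomputed offline as policies so they can be looked up in real time rather than searched per frame.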
|
34 |
A Learning Approach To Obtain Efficient Testing Strategies In Medical Diagnosis. Fakih, Saif, 15 March 2004 (has links)
Determining the most efficient use of diagnostic tests is one of the complex issues facing medical practitioners. It is generally accepted that excessive use of tests is common practice in medical diagnosis. Many tests are performed even though the incremental knowledge gained does not affect the course of diagnosis. With the soaring cost of healthcare in the US, there is a critical need to cut the cost of diagnostic tests while achieving a higher level of diagnostic accuracy. Various decision-making tools for assisting physicians in diagnosis management have been presented in the literature. One such method, the analytic hierarchy process, uses a multilevel structure of decision criteria for sequential pairwise comparison of the available test choices. Many decision-analytic methods are based on Bayes' theorem and decision trees. These methods use threshold treatment probabilities and the performance characteristics of the tests, such as true-positive and false-positive rates, to choose among the available alternatives. Sequential testing approaches tend to elongate the diagnosis process, whereas parallel testing approaches generally involve a higher number of tests.
This research is focused on developing a machine-learning-based methodology for finding an efficient testing strategy for medical diagnosis. The method, based on the patient parameters (both observed and tested), recommends test(s) with the objective of optimizing a measure of performance for the diagnosis process. The performance measure is a combined cost of the testing, the risk and discomfort associated with the tests, and the time taken to reach a diagnosis. The performance measure also considers the diagnostic ability of the tests.
The methodology is developed by combining tools from the fields of data mining (rough set theory in particular), utility theory, Markov decision processes (MDPs), and reinforcement learning (RL). Rough set theory is used to extract diagnostic information, in the form of rules, from medical databases. Utility theory is used to combine three non-homogeneous measures (cost of testing, risk and discomfort, and diagnostic ability) into one cost-based measure of performance. The MDP framework, along with an RL algorithm, facilitates obtaining efficient testing strategies. The methodology is implemented on a sample problem of diagnosing a Solitary Pulmonary Nodule (SPN). The results obtained are compared with those from four other approaches. It is shown that the RL-based methodology holds significant promise for improving the performance of the diagnostic process.
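To make the MDP/RL part of this pipeline concrete, here is a minimal, hypothetical tabular Q-learning sketch in Python for sequential test selection. The tests, their utility-scaled costs, and the toy "two findings yield a diagnosis" environment are invented for illustration; they stand in for the rules extracted with rough set theory and the utility-based cost measure, and are not the dissertation's actual model of SPN diagnosis.

```python
import random

tests = ["blood_test", "ct_scan", "biopsy"]                      # hypothetical tests
test_cost = {"blood_test": 1.0, "ct_scan": 3.0, "biopsy": 8.0}   # utility-scaled costs (made up)

def simulate(state, test):
    """Toy environment: each distinct test reveals one finding; a diagnosis is
    reached after two distinct findings. Reward = -cost, plus a bonus at diagnosis."""
    findings = tuple(sorted(set(state) | {test}))
    done = len(findings) >= 2
    reward = -test_cost[test] + (10.0 if done else 0.0)
    return findings, reward, done

Q = {}
alpha, gamma, epsilon = 0.1, 0.95, 0.2
for episode in range(2000):
    state, done = tuple(), False
    while not done:
        if random.random() < epsilon:
            test = random.choice(tests)
        else:
            test = max(tests, key=lambda t: Q.get((state, t), 0.0))
        nxt, r, done = simulate(state, test)
        target = r if done else r + gamma * max(Q.get((nxt, t), 0.0) for t in tests)
        old = Q.get((state, test), 0.0)
        Q[(state, test)] = old + alpha * (target - old)           # standard Q-learning update
        state = nxt

print(max(tests, key=lambda t: Q.get(((), t), 0.0)))  # learned first test (cheap tests win here)
```

The learned policy trades off test cost against progress toward a diagnosis, which is the same trade-off the combined performance measure above is meant to capture.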
|
35 |
Dynamic Cooperative Secondary Access in Hierarchical Spectrum Sharing Networks. Wang, Liping; Fodor, Viktoria. Unknown Date (has links)
We consider a hierarchical spectrum sharing network consisting of a primary and a cognitive secondary transmitter-receiver pair, with non-backlogged traffic. The secondary transmitter may use cooperative transmission techniques to relay primary traffic while superimposing its own information, or it may transmit opportunistically when the primary user is idle. The secondary user faces a dilemma in this scenario: by choosing cooperation it can transmit a packet immediately even if the primary queue is not empty, but it has to bear the additional cost of relaying, since the primary performance needs to be guaranteed. To resolve this dilemma we propose dynamic cooperative secondary access control that takes the state of the spectrum sharing network into account. We formulate the problem as a Markov Decision Process (MDP) and prove the existence of a stationary policy that is average-cost optimal. We then consider the scenario in which the traffic and link statistics are not known to the secondary user, and propose finding the optimal transmission strategy using reinforcement learning. With extensive numerical evaluation, we demonstrate that dynamic cooperation with state-aware sequential decisions is very efficient in spectrum sharing systems with stochastic traffic, and we show that dynamic cooperation is necessary for the secondary system to be able to adapt to changing load conditions or to a changing available energy resource. Our results show that learning-based access control, with or without a known primary buffer state, has close-to-optimal performance.
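As a rough illustration of the average-cost MDP formulation (not the paper's actual system model), the Python sketch below runs relative value iteration on a toy version of the dilemma: the state is the primary queue length, and the secondary user either cooperates (relays a queued primary packet at an extra cost so it can transmit immediately) or waits for an idle slot. The arrival probability, costs, and tiny state space are all made-up numbers.

```python
states = [0, 1, 2]                       # primary queue length (toy, capped at 2)
actions = ["cooperate", "wait"]
ARR = 0.4                                # primary packet arrival probability (made up)

def cost(s, a):
    relay = 0.5 if a == "cooperate" else 0.0          # extra cost of relaying
    return s + relay                                   # queued primary packets also cost

def transition(s, a):
    """Return {next_state: probability}: cooperation serves one queued primary packet."""
    served = 1 if (a == "cooperate" and s > 0) else 0
    base = s - served
    probs = {}
    for nxt, p in ((min(base + 1, 2), ARR), (base, 1 - ARR)):
        probs[nxt] = probs.get(nxt, 0.0) + p
    return probs

# Relative value iteration for the long-run average cost.
h = {s: 0.0 for s in states}
for _ in range(500):
    q = {s: min(cost(s, a) + sum(p * h[t] for t, p in transition(s, a).items())
                for a in actions)
         for s in states}
    g = q[0]                             # gain, anchored at the reference state 0
    h = {s: q[s] - g for s in states}

policy = {s: min(actions, key=lambda a: cost(s, a)
                 + sum(p * h[t] for t, p in transition(s, a).items()))
          for s in states}
print(round(g, 3), policy)               # e.g. cooperate whenever the queue is non-empty
```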
|
36 |
Simulation and Optimization of Wind Farm Operations under Stochastic Conditions. Byon, Eunshin, 2010 May 1900 (has links)
This dissertation develops a new methodology and associated solution tools to achieve optimal operations and maintenance strategies for wind turbines, helping reduce operational costs and enhance the marketability of wind generation. The proposed integrated framework includes two optimization models for enabling decision support capability, and one discrete event-based simulation model that characterizes the dynamic operations of wind power systems. The problems in the optimization models are formulated as a partially observed Markov decision process to determine an optimal action based on a wind turbine's health status and the stochastic weather conditions.
The first optimization model uses homogeneous parameters with an assumption of stationary weather characteristics over the decision horizon. We derive a set of closed-form expressions for the optimal policy and explore the policy's monotonicity. The second model allows time-varying weather conditions and other practical aspects. Consequently, the resulting strategy is season-dependent. The model is solved using a backward dynamic programming method. The benefits of the optimal policy are highlighted via a case study based upon field data from the literature and industry. We find that the optimal policy provides options for cost-effective actions, because it can be adapted to a variety of operating conditions.
Our discrete event-based simulation model incorporates critical components, such as a wind turbine degradation model, power generation model, wind speed model, and maintenance model. We provide practical insights gained by examining different maintenance strategies. To the best of our knowledge, our simulation model is the first discrete-event simulation model for wind farm operations.
Last, we present the integration framework, which incorporates the optimization results in the simulation model. Preliminary results reveal that the integrated model has the potential to provide practical guidelines that can reduce operation costs as well as enhance the marketability of wind energy.
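The following Python fragment is a toy backward-induction (finite-horizon dynamic programming) sketch in the spirit of the second optimization model. It assumes a fully observed turbine health state for brevity, whereas the dissertation formulates a partially observed MDP; the seasonal deterioration probabilities, maintenance cost, and failure cost are invented numbers.

```python
HEALTH = ["good", "worn", "failed"]
ACTIONS = ["do_nothing", "maintain"]
T = 12                                              # monthly decision epochs (toy)

def deteriorate_prob(month):
    """Hypothetical season-dependent chance that the condition worsens."""
    return 0.3 if month in (11, 0, 1) else 0.1      # harsher winter weather

def cost(state, action):
    return (5.0 if action == "maintain" else 0.0) + (50.0 if state == "failed" else 0.0)

def next_states(state, action, month):
    """Return {next_state: probability} under the toy degradation model."""
    if action == "maintain":
        return {"good": 1.0}                        # maintenance restores the turbine
    if state == "failed":
        return {"failed": 1.0}
    p = deteriorate_prob(month)
    worse = {"good": "worn", "worn": "failed"}[state]
    return {state: 1 - p, worse: p}

V = {s: 0.0 for s in HEALTH}                        # terminal values
policy = {}
for t in reversed(range(T)):                        # backward induction over epochs
    newV = {}
    for s in HEALTH:
        q = {a: cost(s, a) + sum(p * V[s2] for s2, p in next_states(s, a, t).items())
             for a in ACTIONS}
        best = min(q, key=q.get)
        policy[(t, s)] = best
        newV[s] = q[best]
    V = newV

print({t: policy[(t, "worn")] for t in range(T)})   # when to maintain a worn turbine, by month
```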
|
37 |
Adaptive routing in schedule based stochastic time-dependent transit networks. Rambha, Tarun, 29 October 2012 (has links)
In this thesis, an adaptive transit routing (ATR) problem in a schedule-based stochastic time-dependent transit network is defined and formulated as a finite-horizon Markov Decision Process (MDP). The transit link travel times are assumed to be random with known probability distributions. Routing strategies are defined to be conditional on the arrival times at intermediate nodes and on the locations and arrival times of other buses in the network. In other words, a traveler in the network decides to walk, wait, or board a bus based on real-time information about all buses in the network. The objective is to find a strategy that minimizes the expected travel time, subject to constraints that guarantee that the destination is reached within a certain threshold. The value of the threshold is chosen to reflect the risk-averse attitude of travelers and is computed from the earliest time by which the destination can be reached with probability 1. The problem suffers from the curse of dimensionality, and state space reduction through pre-processing is achieved by solving variants of the time-dependent shortest path problem. An interesting analogy between the state space reduction techniques and the concept of light cones is discussed. A dynamic programming framework to solve the problem is developed by defining the state space, decision space, and transition functions. Numerical results on a small instance of the Austin transit network are presented to investigate the extent of the reduction in state space achieved using the proposed methods.
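As a small illustration of the threshold idea mentioned above, the hypothetical Python sketch below computes a guaranteed-arrival time, i.e. the earliest time by which the destination can be reached with probability 1, by running a shortest-path search on worst-case (maximum) link travel times. The three-stop network and its travel-time supports are made up, and the links are static for simplicity; the thesis itself works with schedule-based, time-dependent links on the Austin network.

```python
import heapq

# (from, to): list of possible travel times in minutes (toy distributions)
links = {("A", "B"): [4, 6, 9], ("B", "C"): [3, 5], ("A", "C"): [15, 20]}

def worst_case_arrival(origin, destination, depart=0):
    """Dijkstra on maximum travel times -> a guaranteed-arrival threshold."""
    worst = {(u, v): max(times) for (u, v), times in links.items()}
    best = {origin: depart}
    heap = [(depart, origin)]
    while heap:
        t, node = heapq.heappop(heap)
        if node == destination:
            return t
        if t > best.get(node, float("inf")):
            continue
        for (u, v), w in worst.items():
            if u == node and t + w < best.get(v, float("inf")):
                best[v] = t + w
                heapq.heappush(heap, (t + w, v))
    return float("inf")

print(worst_case_arrival("A", "C"))   # 14 via B (9 + 5), better than 20 direct
```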
|
38 |
FASTER DYNAMIC PROGRAMMING FOR MARKOV DECISION PROCESSES. Dai, Peng, 01 January 2007 (has links)
Markov decision processes (MDPs) are a general framework used by Artificial Intelligence (AI) researchers to model decision-theoretic planning problems. Solving real-world MDPs has been a major and challenging research topic in the AI literature. This paper discusses two main groups of approaches to solving MDPs. The first group combines the strategies of heuristic search and dynamic programming to expedite convergence. The second exploits graphical structure in MDPs to reduce the effort of classic dynamic programming algorithms. Two new algorithms proposed by the author, MBLAO* and TVI, are described here.
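For reference, here is a minimal Python sketch of classic value iteration, the dynamic-programming baseline that heuristic-search and structure-exploiting methods aim to speed up. The three-state MDP is a made-up example, and the sketch does not implement MBLAO* or TVI themselves.

```python
GAMMA, EPS = 0.95, 1e-6

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "s0": {"a": [(0.8, "s1", 0.0), (0.2, "s0", 0.0)],
           "b": [(1.0, "s2", -1.0)]},
    "s1": {"a": [(1.0, "s2", 5.0)]},
    "s2": {"a": [(1.0, "s2", 0.0)]},          # absorbing state
}

V = {s: 0.0 for s in transitions}
while True:
    delta = 0.0
    for s, acts in transitions.items():
        best = max(sum(p * (r + GAMMA * V[t]) for p, t, r in outcomes)
                   for outcomes in acts.values())
        delta = max(delta, abs(best - V[s]))
        V[s] = best                            # in-place (Gauss-Seidel style) update
    if delta < EPS:
        break

policy = {s: max(acts, key=lambda a: sum(p * (r + GAMMA * V[t]) for p, t, r in acts[a]))
          for s, acts in transitions.items()}
print(V, policy)
```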
|
39 |
Active Sensing for Partially Observable Markov Decision Processes. Koltunova, Veronika, 10 January 2013 (has links)
Context information on a smart phone can be used to tailor applications to specific situations (e.g., provide tailored routing advice based on location, gas prices, and traffic). However, typical context-aware smart phone applications use very limited context information, such as user identity, location, and time. In the future, smart phones will need to decide which of a wide range of sensors to gather information from in order to best accommodate user needs and preferences in a given context.
In this thesis, we present a model for active sensor selection within decision-making processes, in which observational features are selected based on their longer-term impact on the decisions made by the smart phone. The thesis formulates the problem as a partially observable Markov decision process (POMDP) and proposes a non-myopic solution using a state-of-the-art approximate planning algorithm, Symbolic Perseus. We have tested our method on three small example domains, comparing different policy types, discount factors, and cost settings. The experimental results show that the proposed approach delivers a better policy when sensors are costly, while at the same time providing faster policy computation with less memory usage.
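At the core of any POMDP approach like this is maintaining a belief over the hidden context from noisy sensor readings. The Python sketch below shows just that Bayes update on a made-up two-context example; it does not reproduce Symbolic Perseus, which is a full point-based approximate solver, and the sensor accuracy and contexts are hypothetical.

```python
CONTEXTS = ["driving", "walking"]

def belief_update(belief, sensor_accuracy, observation):
    """belief: dict context -> prob; observation: the context the sensor reports."""
    posterior = {}
    for c in CONTEXTS:
        likelihood = sensor_accuracy if c == observation else 1 - sensor_accuracy
        posterior[c] = likelihood * belief[c]
    norm = sum(posterior.values())
    return {c: p / norm for c, p in posterior.items()}

prior = {"driving": 0.5, "walking": 0.5}
after_gps = belief_update(prior, sensor_accuracy=0.9, observation="driving")
print(after_gps)   # {'driving': 0.9, 'walking': 0.1} for this symmetric prior
```

A non-myopic policy such as the one computed by Symbolic Perseus decides which sensor to query next based on how the resulting belief will affect decisions several steps ahead, not just on the immediate information gain.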
|