91

Asynchronous policy iteration algorithms for Bounded-parameter Markov Decision Processes

Reis, Willy Arthur Silva, 02 August 2019
A Markov Decision Process (MDP) can be used to model sequential decision problems. However, the probabilities needed to model state transitions may be hard to obtain, or the available information about them may be unreliable. A less restrictive model that addresses this problem is the Bounded-parameter Markov Decision Process (BMDP), which represents transition probabilities imprecisely as intervals and supports reasoning about a robust solution. To solve infinite-horizon BMDPs, there are the synchronous algorithms Interval Value Iteration and Robust Policy Iteration, which are inefficient when the state space is large. This work proposes asynchronous Policy Iteration algorithms based on partitioning the state space into random subsets (Robust Asynchronous Policy Iteration - RAPI) or into strongly connected components (Robust Topological Policy Iteration - RTPI). It also proposes ways to initialize the value function and policy so as to improve convergence. The performance of the proposed algorithms is evaluated against Robust Policy Iteration for BMDPs on existing planning domains and a newly proposed domain. The experimental results show that (i) the more structured the domain, the better the performance of RTPI; (ii) parallelizing RAPI yields only a small computational gain over its sequential version; and (iii) a good initialization of the value function and policy can positively impact the convergence time of both algorithms.
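
For orientation, the synchronous baseline this record improves upon can be sketched compactly: robust value iteration over interval transition bounds, where nature picks the worst feasible distribution at each backup (mass pushed greedily toward low-value successors). This is a minimal illustrative sketch, assuming an (A, S, S) array layout; the function names are not the author's code.

```python
import numpy as np

def worst_case_distribution(p_lo, p_hi, values):
    """Pick the feasible distribution (within [p_lo, p_hi], summing to 1)
    that minimizes expected value: assign remaining mass to the
    lowest-value successor states first."""
    p = p_lo.copy()
    slack = 1.0 - p.sum()                      # mass still to distribute
    for s in np.argsort(values):               # lowest-value states first
        give = min(p_hi[s] - p_lo[s], slack)
        p[s] += give
        slack -= give
    return p

def robust_value_iteration(P_lo, P_hi, R, gamma=0.95, eps=1e-6):
    """P_lo, P_hi: (A, S, S) interval transition bounds; R: (A, S) rewards.
    Returns robust (max over actions, min over nature) values and policy."""
    A, S, _ = P_lo.shape
    V = np.zeros(S)
    while True:
        Q = np.empty((A, S))
        for a in range(A):
            for s in range(S):
                p = worst_case_distribution(P_lo[a, s], P_hi[a, s], V)
                Q[a, s] = R[a, s] + gamma * p @ V
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, Q.argmax(axis=0)
        V = V_new
```
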
92

Policy Explanation and Model Refinement in Decision-Theoretic Planning

Khan, Omar Zia, January 2013
Decision-theoretic systems, such as Markov Decision Processes (MDPs), are used for sequential decision-making under uncertainty. MDPs provide a generic framework that can be applied in various domains to compute optimal policies. This thesis presents techniques that explain optimal policies for MDPs and then refine decision-theoretic models (Bayesian networks and MDPs) based on feedback from experts. Explaining policies for sequential decision-making problems is difficult because of stochastic effects, multiple and possibly competing objectives, and the long-range effects of actions. Explanations are nevertheless needed to help experts validate that a policy is correct and to help users develop trust in the choices it recommends. A set of domain-independent templates for justifying a policy recommendation is presented, along with a process for identifying the minimum number of templates that must be populated to completely justify the policy. Rejection of an explanation by a domain expert indicates a deficiency in the model that generated the rejected policy. The thesis presents techniques for refining the model parameters so that the optimal policy computed from the refined parameters conforms with the expert feedback. The feedback is translated into constraints on the model parameters, which are used during refinement; these constraints are non-convex for both Bayesian networks and MDPs. For Bayesian networks, the refinement approach is based on Gibbs sampling and stochastic hill climbing, and it learns a model that obeys the expert constraints. For MDPs, the parameter space is partitioned so that alternating linear optimization can be applied to learn model parameters that lead to a policy in accordance with the expert feedback. In practice, the state space of an MDP can be very large, which is an issue for real-world problems. Factored MDPs are often used to deal with this: state variables represent the state space and dynamic Bayesian networks model the transition functions, avoiding the exponential growth in the state space associated with large and complex problems. The explanation and refinement approaches are also extended to the factored case to demonstrate their use in real-world applications. Empirical evaluations are presented in three domains: course advising for undergraduate students, assisted hand-washing for people with dementia, and diagnostics for manufacturing.
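
The flavor of a template-based justification can be conveyed with a small sketch: decompose the advantage of the recommended action over an alternative into per-successor contributions, then report only as many successors as needed to account for the advantage (echoing the minimum-template idea). The decomposition and names below are a hypothetical illustration, not the thesis's actual template set.

```python
import numpy as np

def justify(P, R, V, gamma, s, a_star, a_alt):
    """Hypothetical explanation template: which successor states make
    a_star better than a_alt in state s, and by how much?
    P: (A, S, S) transitions, R: (A, S) rewards, V: optimal values."""
    q = lambda a: R[a, s] + gamma * P[a, s] @ V
    advantage = q(a_star) - q(a_alt)
    # Per-successor contribution to the advantage of a_star.
    contrib = gamma * (P[a_star, s] - P[a_alt, s]) * V
    covered, lines = 0.0, []
    for t in np.argsort(contrib)[::-1]:        # largest contributors first
        if covered >= advantage or contrib[t] <= 0:
            break
        covered += contrib[t]
        lines.append(f"action {a_star} reaches state {t} "
                     f"(value {V[t]:.2f}) more often than "
                     f"action {a_alt} (+{contrib[t]:.2f})")
    return advantage, lines
```
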
93

Reinforcement Learning for Parameter Control of Image-Based Applications

Taylor, Graham, January 2004
The significant amount of data contained in digital images presents barriers to methods of learning from the information they hold. Noise and the subjectivity of image evaluation further complicate such automated processes. This thesis examines a particular area in which these difficulties arise: controlling the parameters of a multi-step algorithm that processes visual information. The main contribution of the research is a framework for approaching this parameter selection problem with reinforcement learning agents, focusing on the construction of the state and action spaces and on task-dependent reward. The automatic determination of fuzzy membership functions is first treated as a specific case of the above problem, with the entropy of a fuzzy event used as the reinforcement signal. Membership functions representing brightness were automatically generated for several images, and the results show that the reinforcement learning approach is superior to an existing simulated-annealing-based approach. The framework was also evaluated by optimizing ten parameters of the text detection for semantic indexing algorithm proposed by Wolf et al. Image features are defined and extracted to construct the state space. Generalization to reduce the state space is performed with the fuzzy ARTMAP neural network, offering much faster learning than the previous tabular implementation despite a much larger state and action space. Difficulties in using a continuous action space are overcome by employing the DIRECT method for global optimization without derivatives. The chosen parameters are evaluated using recall and precision metrics and are shown to be superior to the parameters previously recommended. The interplay between intermediate and terminal reinforcement is also discussed.
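
The core idea of casting parameter selection as reinforcement learning can be sketched in a few lines: a tabular epsilon-greedy agent nudges a discretized parameter up or down and is rewarded by a task-dependent quality signal. The toy reward below is an illustrative stand-in (e.g., for the fuzzy-entropy signal above), not the thesis's framework.

```python
import random

# Toy quality signal standing in for an image-dependent reward
# (e.g., negative fuzzy entropy); here it peaks at parameter value 0.7.
def reward(theta):
    return -(theta - 0.7) ** 2

LEVELS = [i / 10 for i in range(11)]        # discretized parameter values
ACTIONS = (-1, 0, +1)                       # lower / keep / raise the parameter
Q = {(s, a): 0.0 for s in range(len(LEVELS)) for a in ACTIONS}

alpha, gamma, eps = 0.2, 0.9, 0.1
s = random.randrange(len(LEVELS))
for step in range(5000):
    a = (random.choice(ACTIONS) if random.random() < eps
         else max(ACTIONS, key=lambda act: Q[(s, act)]))
    s2 = min(max(s + a, 0), len(LEVELS) - 1)
    r = reward(LEVELS[s2])
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
    s = s2

best = max(range(len(LEVELS)), key=lambda st: max(Q[(st, a)] for a in ACTIONS))
print("learned parameter:", LEVELS[best])   # should settle near 0.7
```
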
94

Optimal Streaming of Rate-Adaptable Video

Gurses, Eren, 01 June 2006
This thesis studies the dynamics of network-adaptive video streaming and proposes novel algorithms for rate-distortion control in video streaming, while maintaining inter-protocol fairness with TCP (Transmission Control Protocol), the dominant transport protocol in the current Internet. The proposed algorithms are retransmission-based and require playback buffers to tolerate the extra latency introduced by retransmissions. The first part proposes a practical network-adaptive streaming scheme based on TCP transport and the idea of Selective Frame Discarding (SFD), which makes use of two-layer temporally scalable video. The efficacy of the SFD scheme is validated for playout buffer times on the order of seconds, which makes it suitable mainly for delay-tolerant streaming applications. The second part proposes an application-layer rate-distortion control algorithm that provides Optimal Scheduling and Rate Control (OSRC) policies, in the average-reward sense, for efficient video streaming. The proposed Optimal Scheduling (OS) maximizes the probability of successful on-time delivery under a prespecified set of rate constraints and different channel conditions, using Markov Decision Process (MDP) models. Optimal Rate Control (RC), in turn, computes the rate constraint that minimizes the average distortion of a video streaming session, using a video distortion model derived for lossy channels together with the success probabilities achievable by the set of optimal schedules. The numerical examples focus on an equation-based TCP-friendly rate control (TFRC) protocol with transport-layer retransmissions disabled, using Fine Granular Scalable (FGS) coded video for improved rate adaptation at the cost of an additional rate-distortion penalty. The efficacy of the proposed OSRC algorithm is demonstrated by means of both analytical results and ns-2 simulations.
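
For context, an equation-based TFRC sender derives its fair sending rate from the TCP throughput equation. The sketch below assumes the standard formulation (cf. RFC 5348) rather than anything specific to this thesis; an FGS-coded source would then truncate its enhancement layer to fit within the returned rate.

```python
from math import sqrt

def tfrc_rate(s, R, p, b=1, t_RTO=None):
    """TCP-friendly sending rate (bytes/s) from the TCP throughput
    equation used by equation-based congestion control (cf. RFC 5348).
    s: packet size (bytes), R: round-trip time (s), p: loss event rate,
    b: packets acknowledged per ACK, t_RTO: retransmission timeout
    (commonly approximated as 4 * R)."""
    if t_RTO is None:
        t_RTO = 4 * R
    denom = (R * sqrt(2 * b * p / 3)
             + t_RTO * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denom

# Roughly 160 kB/s for 1460-byte packets, 100 ms RTT, 1% loss events.
print(tfrc_rate(1460, 0.1, 0.01))
```
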
95

Sensor-based prognostics and structured maintenance policies for components with complex degradation

Elwany, Alaa H., 23 September 2009
We propose a mathematical framework that integrates low-level sensory signals from the monitoring of engineering systems and their components with high-level decision models for maintenance optimization. Our objective is to derive optimal adaptive maintenance strategies that capitalize on condition monitoring information to update maintenance actions based upon the current state of health of the system. We refer to this sensor-based decision methodology as "sense-and-respond logistics". As a first step, we develop and extend degradation models to compute and periodically update the remaining life distribution of fielded components using in situ degradation signals. Next, we integrate these sensory-updated remaining life distributions with maintenance decision models to (1) determine, in real time, the optimal time to replace a component, such that the lost opportunity costs due to early replacements are minimized and system utilization is increased, and (2) sequentially determine the optimal time to order a spare part, such that inventory holding costs are minimized while stockouts are prevented. Lastly, we integrate the proposed degradation model with Markov process models to derive structured replacement and spare parts ordering policies. In particular, we show that the optimal maintenance policy for our problem setting is a monotonically non-decreasing control-limit policy. We validate our methodology using real-world data from monitoring a piece of rotating machinery with vibration accelerometers, and demonstrate that the proposed sense-and-respond decision methodology results in better decisions and reduced costs compared with traditional approaches.
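
The shape of such a sensor-driven control-limit rule can be sketched generically: fit a drift to the degradation signal observed so far, estimate remaining life as the time for the extrapolated signal to reach a failure threshold, and act once a conservative quantile of remaining life falls below the replacement lead time. The linear-drift model and names below are illustrative assumptions, not the thesis's degradation model.

```python
import numpy as np
from scipy import stats

def remaining_life_quantile(times, signal, threshold, q=0.1):
    """Conservative (q-quantile) remaining-life estimate from a linear
    fit to the degradation signal; illustrative model only."""
    slope, intercept, _, _, stderr = stats.linregress(times, signal)
    if slope <= 0:
        return np.inf                      # no degradation trend yet
    # A steeper plausible slope gives a shorter, conservative life.
    hi_slope = slope + stats.norm.ppf(1 - q) * stderr
    return (threshold - signal[-1]) / hi_slope

def replace_now(times, signal, threshold, lead_time, q=0.1):
    """Control-limit rule: act when the conservative remaining-life
    estimate drops below the replacement lead time."""
    return remaining_life_quantile(times, signal, threshold, q) <= lead_time
```
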
96

Value of information and supply uncertainty in supply chains

Cheong, Tae Su, 16 August 2011
This dissertation focuses on topics related to the value of real-time information and/or to supply uncertainties due to uncertain lead-times and yields in supply chains. The first two of these topics address issues associated with freight transportation, while the remaining two topics are concerned with inventory replenishment. We first assess the value of dynamic tour determination for the traveling salesman problem (TSP). Given a network with traffic dynamics that can be modeled as a Markov chain, we present a policy determination procedure that optimally builds a tour dynamically. We then explore the potential for expected total travel cost reduction due to dynamic tour determination, relative to two a priori tour determination procedures. Second, we consider the situation where the decision to continue or abort transporting perishable freight from an origin to a destination can be made at intermediate locations, based on real-time freight status monitoring. We model the problem as a partially observed Markov decision process (POMDP) and develop an efficient procedure for determining an optimal policy. We determine structural characteristics of an optimal policy and upper and lower bounds on the optimal reward function. Third, we analyze a periodic review inventory control problem with lost sales and random yields and present conditions that guarantee the existence of an optimal policy having a so-called staircase structure. We make use of this structure to accelerate both value iteration and policy evaluation. Lastly, we examine a model of inventory replenishment where both lead time and supply qualities are uncertain. We model this problem as an MDP and show that the weighted sum of inventory in transit and inventory at the destination is a sufficient statistic, assuming that random shrinkage can occur from the origin to the supply system or destination, shrinkage is deterministic within the supply system and from the supply system to the destination, and no shrinkage occurs once goods reach the destination.
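
The computational payoff of a staircase (monotone) policy structure can be sketched generically: if the optimal action index is known to be non-increasing in a scalar state index, each value-iteration sweep can shrink its action search range as it scans states upward. This is a generic monotone-policy acceleration under assumed structure, not the dissertation's exact algorithm.

```python
import numpy as np

def monotone_value_iteration(P, R, gamma=0.95, eps=1e-6):
    """P: (A, S, S) transitions, R: (A, S) rewards. Assumes the optimal
    action index is non-increasing in the state index (staircase
    structure), so the search range shrinks during each sweep."""
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        V_new = np.empty(S)
        pi = np.empty(S, dtype=int)
        a_hi = A - 1                     # best action at s-1 bounds the search
        for s in range(S):
            q = np.array([R[a, s] + gamma * P[a, s] @ V
                          for a in range(a_hi + 1)])
            pi[s] = q.argmax()
            V_new[s] = q[pi[s]]
            a_hi = pi[s]                 # monotonicity: a*(s+1) <= a*(s)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, pi
        V = V_new
```
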
98

Lexicographic refinements in possibilistic sequential decision-making models

El Khalfi, Zeineb, 31 October 2017
This work contributes to possibilistic decision theory, and more specifically to sequential decision-making under possibilistic uncertainty, at both the theoretical and practical levels. Although appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback: qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power, several refinements have been proposed in the literature. Lexicographic refinements are particularly appealing since they benefit from the expected utility background while remaining "qualitative". However, these refinements are defined for non-sequential decision problems only. This thesis presents results on the extension of lexicographic preference relations to sequential decision problems, in particular to possibilistic decision trees and Markov Decision Processes, leading to new planning algorithms that are more "decisive" than their original possibilistic counterparts. We first present optimistic and pessimistic lexicographic preference relations between policies, with and without intermediate utilities, that refine the optimistic and pessimistic qualitative utilities respectively. We prove that the proposed criteria satisfy the principle of Pareto efficiency as well as the property of strict monotonicity; the latter guarantees that a dynamic programming algorithm can be used to compute lexicographically optimal policies. For the problem of policy optimization in possibilistic decision trees and finite-horizon Markov Decision Processes, we provide adaptations of the dynamic programming algorithm that compute a lexicographically optimal policy in polynomial time. These algorithms are based on the lexicographic comparison of the matrices of trajectories associated with the sub-policies. This algorithmic work is completed by an experimental study that shows the feasibility and interest of the proposed approach. We then prove that the lexicographic criteria still benefit from an expected utility grounding and can be represented by infinitesimal expected utilities. The last part of the work is devoted to policy optimization in (possibly infinite) stationary Markov Decision Processes. We propose a value iteration algorithm for computing lexicographically optimal policies and extend these results to the infinite-horizon case. Since the size of the trajectory matrices grows exponentially (which is especially problematic in the infinite-horizon case), we propose an approximation algorithm that keeps only the most informative part of each matrix, namely its first rows and columns. Finally, we report experimental results that demonstrate the effectiveness of the algorithms based on matrix truncation.
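
The comparison driving these algorithms can be rendered schematically: trajectories are ranked by a leximax rule (sort utility vectors in decreasing order, compare componentwise), and two trajectory matrices are compared by sorting their rows under that rule and comparing them pairwise. This sketch adopts assumed conventions for illustration and is not the thesis's exact definitions.

```python
from functools import cmp_to_key

def leximax(u, v):
    """Leximax order on utility vectors: compare sorted-descending
    components one by one. Returns -1, 0, or 1."""
    for x, y in zip(sorted(u, reverse=True), sorted(v, reverse=True)):
        if x != y:
            return -1 if x < y else 1
    return (len(u) > len(v)) - (len(u) < len(v))

def lex_matrix_compare(M, N):
    """Compare two trajectory matrices: sort the rows (trajectories)
    of each matrix by leximax, then compare them pairwise."""
    key = cmp_to_key(leximax)
    for r, s in zip(sorted(M, key=key, reverse=True),
                    sorted(N, key=key, reverse=True)):
        c = leximax(r, s)
        if c:
            return c
    return (len(M) > len(N)) - (len(M) < len(N))

# Example: the first matrix wins on its best trajectory (0.9 > 0.8).
print(lex_matrix_compare([[0.9, 0.3], [0.5, 0.5]],
                         [[0.8, 0.4], [0.5, 0.5]]))   # -> 1
```
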
99

The Supreme Court and international human rights treaties: a political analysis of judicial decisions

Barreira, Karen Elaine Sakalauska, 23 August 2018
Advisor: Andrei Koerner. Master's dissertation, Universidade Estadual de Campinas, Instituto de Filosofia e Ciências Humanas. In recent decades, owing to the powers conferred on the Supremo Tribunal Federal by the 1988 Federal Constitution, research on the judiciary has come to examine the institution from the perspectives of political science. Starting from the idea that the ministers' decisions are not a merely formal activity of subsuming facts under established norms, but at times interfere in the political arena, this research seeks to delineate the political role played by the Court and to reveal the ministers' decision-making patterns on various issues through empirical analysis of their decisions. In this context, this dissertation studies the decision-making behavior of the STF ministers on an issue unexplored by this research: the incorporation of international human rights treaties into the domestic legal order. The work takes as its starting point Constitutional Amendment 45/2004, which introduced a new rule and thereby reconfigured the debate. It analyzes the ministers' positions in decisions after 2004, emphasizing the strategic and normative choices in their votes and exploring the logic of their decision making.
100

Aren't we going to make any decisions today? A study of decision processes in schools, with focus on the role of the head teacher

Kronqvist Håård, Malin, January 2017
The purpose of this study is to contribute knowledge about head teachers' and schools' decision processes and how these can be understood from an organizational-theory perspective. A case study was carried out in a so-called "F-9" school (pre-school class and years 1-9 of compulsory school), triangulating meeting observations, document analysis, and interviews. The theoretical point of departure is organizational theory, with a special focus on decision theory; two models are used as analytical tools: phases of the decision process and a summary of decision models. The role of the head teacher is also an important factor in the analysis. One question in the study focuses on how decision processes are carried out at the studied school and what the head teacher's role in them is. The results show that most of the information gathered by the head teacher before a decision is verbal, and that few choices between explicit alternatives are made. Decisions are often vague, and those that are made primarily concern practical questions. Very little implementation and follow-up of decisions was observed in the study. The second question is how the decision processes can be described and understood through concepts and models from organizational theory and decision theory. The results indicate that the incremental decision model best describes the decision processes of the participating school; that said, no model is ever a direct reflection of reality, and elements of other models can also be seen in the study. Since the results point to vague and unclear decision processes, the conclusion is that schools would gain from clarifying these processes for all concerned; there is a need to talk about a designed decision process.
