331 |
Résolution de processus décisionnels de Markov à espace d'état et d'action factorisés - Application en agroécologie / Solving Markov decision processes with factored state and action spaces - Application in agroecology
Radoszycki, Julia 09 October 2015
This PhD thesis focuses on solving sequential decision-making problems under uncertainty, modelled as Markov decision processes (MDPs) whose state and action spaces are both high-dimensional. Solving these problems with a good trade-off between approximation quality and scalability remains a challenge. Algorithms for this type of problem are rare when the dimension of both spaces exceeds 30, and they impose certain limits on the nature of the problems that can be represented.
We proposed a new framework, called F3MDP, as well as associated approximate solution algorithms. An F3MDP is a Markov decision process with factored state and action spaces (FA-FMDP) whose solution policies are constrained to a certain factored form and can be stochastic. The algorithms we proposed belong to the family of approximate policy iteration algorithms and make use of continuous optimisation techniques and inference methods for graphical models.
These policy iteration algorithms have been validated on a large number of numerical experiments. For small F3MDPs, for which the optimal global policy is available, they provide solution policies that are close to the optimal global policy. For larger problems from the graph-based Markov decision process (GMDP) subclass, they are competitive with state-of-the-art algorithms in terms of quality. We also show that our algorithms can handle very large F3MDPs outside the GMDP subclass, on toy problems inspired by real problems in agronomy or ecology. The state and action spaces are then both of dimension 100 and of size 2^100. In this case, we compare the quality of the returned policies with that of expert policies.
In the second part of the thesis, we applied the framework and the proposed algorithms to determine ecosystem services management strategies in an agricultural landscape. Weed species, i.e. wild plants of agricultural environments, have antagonistic functions: they compete with the crop for resources and are at the same time keystone species in the trophic networks of agroecosystems. We seek to explore which organizations of the landscape (here composed of oilseed rape, wheat and pasture) in space and time make it possible to provide at the same time production services (production of cereals, fodder and honey), regulation services (regulation of weed populations and wild pollinators) and cultural services (conservation of weed species and wild pollinators). We developed a model of weed and pollinator dynamics and reward functions modelling different objectives (production, conservation of biodiversity or trade-off between services). The state space of this F3MDP is of size 32^100 and the action space of size 3^100, which makes it a problem of substantial size. By solving this F3MDP, we identified various landscape organizations that provide different bundles of ecosystem services, which differ in the magnitude of each of the three classes of ecosystem services.
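To give a feel for why a factored policy form matters at this scale, the sketch below represents a stochastic policy as a product of small local tables, one per action variable. It is only an illustration under stated assumptions: the binary variables, the ring-shaped neighbourhood and every function name are hypothetical and are not taken from the thesis, whose exact factored form is not given in the abstract.

```python
import random

# Hypothetical toy setting: n binary state variables and n binary action
# variables. Action variable i only looks at state variable i and its
# right-hand neighbour (an assumed neighbourhood structure).
def neighbourhood(i, n):
    return (i, (i + 1) % n)

def make_uniform_policy(n):
    """One local table per action variable: local state -> P(a_i = 1)."""
    return [{(x, y): 0.5 for x in (0, 1) for y in (0, 1)} for _ in range(n)]

def joint_probability(policy, state, action):
    """pi(a | s) computed as a product of local factors pi_i(a_i | s_N(i))."""
    n = len(policy)
    prob = 1.0
    for i, table in enumerate(policy):
        local = tuple(state[j] for j in neighbourhood(i, n))
        p_one = table[local]
        prob *= p_one if action[i] == 1 else 1.0 - p_one
    return prob

def sample_action(policy, state, rng):
    """Draw each action variable independently from its local table."""
    n = len(policy)
    action = []
    for i, table in enumerate(policy):
        local = tuple(state[j] for j in neighbourhood(i, n))
        action.append(1 if rng.random() < table[local] else 0)
    return action

n = 100
policy = make_uniform_policy(n)
state = [0] * n
action = sample_action(policy, state, random.Random(0))

table_entries = sum(len(table) for table in policy)  # parameters actually stored
joint_actions = 2 ** n                               # size of the joint action space
```

In this toy setting the policy is stored in 400 table entries, whereas the joint action space it covers has 2^100 elements, which is the kind of gap that makes an explicit, unfactored policy table impractical.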
|
332 |
Nákupní chování a postoje zákazníků maloobchodní jednotky / Buying behavior and the attitudes of customers of a retail unit
MAREŠOVÁ, Petra January 2013
The main objective of this diploma thesis was to analyze the buying behaviour and attitudes of customers and then propose concrete measures for improvement. The first part of the work contains the theoretical background. It characterizes basic terms such as consumer and buying behaviour, the purchase decision process, factors affecting consumers' purchase decisions, individual types of buyers, marketing research, the marketing research process and marketing research techniques. The second part is practical and contains information about the companies FLOP JIH spol. s r.o. and FLOSMAN a. s., and about the retail unit Flop Diskont Mladá Vožice. This part also contains an analysis of the buying behaviour and attitudes of the customers of this unit. The analysis was carried out through a questionnaire survey. The work concludes with recommendations in several areas in which the retail unit has potential to improve.
|
333 |
Online Learning and Simulation Based Algorithms for Stochastic Optimization
Lakshmanan, K January 2012
In many optimization problems, the relationship between the objective and the parameters is not known. The objective function itself may be stochastic, such as a long-run average over some random cost samples. In such cases, computing the gradient of the objective analytically is not possible. It is in this setting that stochastic approximation algorithms are used. These algorithms rely on estimates of the gradient and are stochastic in nature. Among gradient estimation techniques, Simultaneous Perturbation Stochastic Approximation (SPSA) and the Smoothed Functional (SF) scheme are widely used. In this thesis we propose a novel multi-timescale quasi-Newton based smoothed functional (QN-SF) algorithm for unconstrained as well as constrained optimization. The algorithm uses the smoothed functional scheme to estimate the gradient and the quasi-Newton method to solve the optimization problem. The algorithm is shown to converge with probability one.
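As a rough illustration of the smoothed functional idea mentioned above, the following sketch estimates a gradient purely from noisy function evaluations, using a two-sided Gaussian perturbation. It is not the QN-SF algorithm of the thesis: there is no quasi-Newton step and no multi-timescale update, and the test function, the smoothing parameter `beta` and the sample count are assumptions chosen for the example.

```python
import random

def sf_gradient(f, theta, beta=0.1, samples=20000, seed=0):
    """Two-sided Gaussian smoothed functional gradient estimate.

    Averages eta * (f(theta + beta*eta) - f(theta - beta*eta)) / (2*beta)
    over standard normal perturbation vectors eta.
    """
    rng = random.Random(seed)
    d = len(theta)
    grad = [0.0] * d
    for _ in range(samples):
        eta = [rng.gauss(0.0, 1.0) for _ in range(d)]
        plus = f([t + beta * e for t, e in zip(theta, eta)], rng)
        minus = f([t - beta * e for t, e in zip(theta, eta)], rng)
        scale = (plus - minus) / (2.0 * beta)
        for i in range(d):
            grad[i] += eta[i] * scale / samples
    return grad

def noisy_quadratic(x, rng):
    """Hypothetical objective: sum of squares observed with Gaussian noise."""
    return sum(v * v for v in x) + rng.gauss(0.0, 0.1)

# The true gradient of the noise-free objective at (1, -2) is (2, -4).
estimate = sf_gradient(noisy_quadratic, [1.0, -2.0])
```

The estimator only ever calls `f`, which is the point of the technique: it applies when the objective can be simulated but not differentiated.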
We also provide experimental results on the problem of optimal routing in a multi-stage network of queues. Policies like Join the Shortest Queue or Least Work Left assume knowledge of the queue length values, which can change rapidly or be hard to estimate. If the only information available is the expected end-to-end delay, as in our case, such policies cannot be used. The QN-SF based probabilistic routing algorithm uses only the total end-to-end delay for tuning the probabilities. We observe from the experiments that the QN-SF algorithm performs better than the gradient and Jacobi versions of Newton based smoothed functional algorithms. Next we consider constrained routing in a similar queueing network. We extend the QN-SF algorithm to this case. We study the convergence behavior of the algorithm and observe that the constraints are satisfied at the point of convergence. We provide experimental results for the constrained routing setup as well.
Next we study reinforcement learning algorithms, which are useful for solving Markov decision processes (MDPs) when precise information on the transition probabilities is not available. When the state and action sets are very large, it is not possible to store all the state-action tuples. In such cases, function approximators like neural networks have been used. The popular Q-learning algorithm is known to diverge when used with linear function approximation due to the 'off-policy' problem. Hence, developing learning algorithms that remain stable when used with function approximation is an important problem.
We present in this thesis a variant of Q-learning with linear function approximation that is based on two-timescale stochastic approximation. The Q-value parameters for a given policy in our algorithm are updated on the slower timescale while the policy parameters themselves are updated on the faster scale. We perform a gradient search in the space of policy parameters. Since the objective function and hence the gradient are not analytically known, we employ efficient one-simulation simultaneous perturbation stochastic approximation (SPSA) gradient estimates that use Hadamard matrix based deterministic perturbations. Our algorithm has the advantage that, unlike Q-learning, it does not suffer from high oscillations due to the off-policy problem when using function approximators. Whereas it is difficult to prove convergence of regular Q-learning with linear function approximation because of the off-policy problem, we prove that our algorithm, which is on-policy, is convergent. Numerical results on a multi-stage stochastic shortest path problem show that our algorithm exhibits significantly better performance and is more robust than Q-learning. Future work would be to compare it with other policy-based reinforcement learning algorithms.
Finally, we develop an online actor-critic reinforcement learning algorithm with function approximation for a problem of control under inequality constraints. We consider the long-run average cost Markov decision process (MDP) framework in which both the objective and the constraint functions are suitable policy-dependent long-run averages of certain sample path functions. The Lagrange multiplier method is used to handle the inequality constraints. We prove the asymptotic almost sure convergence of our algorithm to a locally optimal solution. We also provide the results of numerical experiments on a problem of routing in a multistage queueing network with constraints on long-run average queue lengths. We observe that our algorithm exhibits good performance in this setting and converges to a feasible point.
|
334 |
When the physical patient becomes digital : A study of the innovation “digital health care center” on the Swedish market
Telemo Nilsson, Sara, Rexha, Laurinda January 2016
Object of study: The innovation “digital health care center” from a multi-level stakeholder perspective. Problem: A new technological era has opened the way for new kinds of innovation. Digital health care centers are a service that has recently been introduced on the Swedish market and needs further investigation. To better understand, explain and predict the future behavior of an innovation, it can be theoretically conceptualized and classified. In health care specifically, new innovations should preferably be investigated from a multi-level perspective that includes the opinions of different stakeholders. One of these stakeholder groups is the customers. For new innovative products and services to succeed, consumers must adopt them, yet relatively few studies have focused on the adoption of technology services among customers. Purpose: The purpose of this thesis is to gain a better understanding of the innovation “digital health care center” in Sweden. Research question: How can the innovation “digital health care center” be described through a stakeholder perspective? Method: The empirical data were collected through qualitative semi-structured interviews and a structured quantitative questionnaire. Conclusions: From a multi-level perspective, the innovation digital health care center can be described as an innovation that contributes to and has an impact on the market and the healthcare industry in many ways. The innovation could be described as a complement to traditional health care. It draws on different theoretical classes of innovation, which means that it cannot be placed in a single class. The innovation can be considered successful because it makes things easier for the patient. In the stakeholder group of potential patients, a majority of the respondents think that increased availability and time-efficiency would be facilitating factors and reasons for using the service. The various stakeholders describe the innovation as contributing to a better society. Care becomes more productive, more cost-effective and more available and, in the broader perspective, the innovation contributes to increased digitalization of the healthcare sector as a whole. There are many possible new fields of application in the healthcare industry that could develop the innovation further. The strengths and opportunities of the innovation can be considered to outweigh its weaknesses and potential threats.
|
335 |
Ekologiska val i en e-handelskontext : En mental modell över användares köpbeteende
Ekwall, Andreas, Braaf, Johanna January 2013
Buying organic is not a new phenomenon, but as both e-commerce and the demand for organic products are constantly increasing, designers need support to meet these challenges. Designing e-commerce websites with organic options requires an understanding of the users and their needs. By mapping goal-directed behaviors, it is possible to show when a user is receptive to an organic choice on an e-commerce website. A mental model is built from users' goal-directed behavior, and designers can use it as support for understanding when a user is more receptive to organic choices on an e-commerce website. This study results in an identification and mapping of when a user is more open to organic choices.
|
336 |
Kupní rozhodovací proces spotřebitele v cestovním ruchu / Buyer decision process in travel and tourism
Batulková, Monika January 2008
The aim of this diploma thesis is to analyze the buyer decision process of consumers in the travel industry, with a focus on outbound tourism. The thesis is divided into six parts. The first part covers the theory of consumer behavior, followed by the marketing of services with an emphasis on tourism. The next part covers the buyer decision process of consumers in the travel industry, based on the results of a questionnaire analysis. These results are combined with agency data. At the end of the thesis, a summary of the results is compared against all phases of the buyer decision process.
|
337 |
Förstagångsköpare av högengagemangsprodukter : Hur de söker information och utvärderar alternativ / First-time buyers of high involvement products : How they search for information and evaluate alternatives
Rutgerson, Isabelle, Alm, Jessica, Liljhagen, Hampus January 2020
The purpose of this study is to generate an understanding of first-time buyers of high-involvement products by examining how they search for information and evaluate alternatives. Three research questions were formulated to achieve the purpose of the study. Two of them concern first-time buyers' behavior, and the third aims to identify possible explanations for that behavior. The study is based on theories within the research field of consumer behavior regarding purchase behavior, the consumer decision process, decision-making style, involvement, and knowledge and uncertainty. To address the purpose and the associated research questions, data were collected with a qualitative approach through semi-structured interviews. A model was derived from the theoretical framework and served as the basis for a thematic analysis of the empirical data. The results indicate that first-time buyers do not consider their internal information search adequate for high-involvement purchases and therefore search for further information externally. Their external information search tends to involve several sources. The sources' credibility seems to be judged on the basis of previous experiences from other situations. How they evaluate alternatives also seems to be influenced by previous use of cut-offs and decision rules that simplify decision making. Further, the results suggest that the stages of information search and evaluation of alternatives are integrated and iterative rather than separate. The results confirm that decision-making style, involvement, and levels of knowledge and uncertainty influence the information search and evaluation process of first-time buyers of high-involvement products. However, first-time buyers of high-involvement products differ in their decision-making style, degree of involvement, level of knowledge and experienced uncertainty. Both complex and dissonance-reducing buying behavior occur among first-time buyers of high-involvement products. Additionally, the results indicate that tendencies toward both complex and dissonance-reducing buying behavior can be identified in a single individual. The study is written in Swedish.
|
338 |
Strojové učení ve strategických hrách / Machine Learning in Strategic Games
Vlček, Michael January 2018
Machine learning is spearheading progress in artificial intelligence when it comes to competing against human opponents in strategy games such as chess, Go or poker. The branch of machine learning that shows the most promising results in playing strategy games is reinforcement learning. The next milestone for current research is the computer game StarCraft II, which surpasses the previous games in complexity and represents a potential new breakthrough in this field. The paper analyses the problem and proposes a solution that combines the reinforcement learning algorithm A2C with the hyperparameter optimization method PBT, which could be a step forward for current research.
|
339 |
Quality of Service Aware Mechanisms for (Re)Configuring Data Stream Processing Applications on Highly Distributed Infrastructure / Mécanismes prenant en compte la qualité de service pour la (re)configuration d’applications de traitement de flux de données sur une infrastructure hautement distribuée
Da Silva Veith, Alexandre 23 September 2019
A large part of this big data is most valuable when analysed quickly, as it is generated. In several emerging application scenarios, such as smart cities, operational monitoring of large infrastructure, and the Internet of Things (IoT), continuous data streams must be processed within very short delays. In multiple domains, data streams need to be processed to detect patterns, identify failures, and gain insights. Data is often gathered and analysed by Data Stream Processing Engines (DSPEs).
A DSPE commonly structures an application as a directed graph or dataflow. A dataflow has one or multiple sources (i.e., sensors, gateways or actuators); operators that perform transformations on the data (e.g., filtering and aggregation); and sinks (i.e., queries that consume or store the data). Most complex operator transformations are stateful: they store information about previously received data as new data is streamed in. A dataflow also has stateless operators that consider only the current data. Traditionally, Data Stream Processing (DSP) applications were conceived to run in clusters of homogeneous resources or on the cloud. In a cloud deployment, the whole application is placed on a single cloud provider to benefit from virtually unlimited resources. This approach allows for elastic DSP applications that can allocate additional resources or release idle capacity on demand at runtime to match the application requirements.
We introduce a set of strategies for placing operators onto cloud and edge resources while considering the characteristics of the resources and meeting the requirements of the applications. In particular, we first decompose the application graph by identifying behaviours such as forks and joins, and then dynamically split the dataflow graph across edge and cloud. Comprehensive simulations and a real testbed considering multiple application settings demonstrate that our approach can reduce the end-to-end latency by more than 50% and also improve other QoS metrics. The solution search space for operator reassignment can be enormous depending on the number of operators, streams, resources and network links. Moreover, it is important to minimise the cost of migration while improving latency. Reinforcement Learning (RL) and Monte-Carlo Tree Search (MCTS) have previously been used to tackle problems with large action and state spaces, performing at human level or better in games such as Go. We model the application reconfiguration problem as a Markov Decision Process (MDP) and investigate the use of RL and MCTS algorithms to devise reconfiguration plans that improve QoS metrics.
|
340 |
Network Utility Maximization Based on Information Freshness
Cho-Hsin Tsai (12225227) 20 April 2022
It is predicted that there will be 41.6 billion IoT devices by 2025, which has kindled new interest in the timing coordination between sensors and controllers, i.e., how to use the waiting time to improve performance. Sun et al. showed that a controller can strictly improve data freshness, measured by the so-called Age-of-Information (AoI), via careful scheduling designs. The optimal waiting policy for the sensor side was later characterized in the context of remote estimation. The first part of this work develops the jointly optimal sensor/controller waiting policy. It generalizes the above two important results in that not only do we consider joint sensor/controller designs, but we also assume random delay in both the forward and feedback directions.
The second part of the work revisits and significantly strengthens the seminal results of Sun et al. on the following fronts: (i) when designing the optimal offline schemes with full knowledge of the delay distributions, a new fixed-point-based method is proposed with a quadratic convergence rate; (ii) when the distributional knowledge is unavailable, two new low-complexity online algorithms are proposed, which provably attain the optimal average AoI penalty; and (iii) the online schemes also admit a modular architecture, which allows the designer to upgrade certain components to handle additional practical challenges. Two such upgrades are proposed: (iii.1) the AoI penalty function incurred at the destination is unknown to the source node and must also be estimated on the fly, and (iii.2) the unknown delay distribution is Markovian instead of i.i.d.
With the exponential growth of interconnected IoT devices and the increasing risk of excessive resource consumption in mind, the third part of this work derives an optimal joint cost-and-AoI minimization solution for multiple coexisting source-destination (S-D) pairs. The results admit a new AoI-market-price-based interpretation and are applicable to the setting of (i) general heterogeneous AoI penalty functions and Markov delay distributions for each S-D pair, and (ii) a general network cost function of the aggregate throughput of all S-D pairs.
In each part of this work, extensive simulation is used to demonstrate the superior performance of the proposed schemes. The discussion of the analytical as well as numerical results sheds some light on designing practical network utility maximization protocols.
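The claim that waiting can strictly improve freshness is easy to check numerically in a toy case. The sketch below compares a zero-wait sender with a simple threshold rule that holds the next update until the age at the destination reaches a given value. Everything here is an assumption made for illustration: the two-point delay distribution, the threshold of 1.0, the absence of feedback delay and the function name are not taken from the thesis, and the rule shown is not claimed to be optimal.

```python
import random

def average_age(threshold, delays):
    """Time-average age when the source waits until the age reaches `threshold`.

    Each update is generated at the moment it is sent, so the age right
    after a delivery equals that update's delay.
    """
    total_area = 0.0
    total_time = 0.0
    for prev_delay, next_delay in zip(delays, delays[1:]):
        wait = max(0.0, threshold - prev_delay)  # threshold 0 means zero-wait
        length = wait + next_delay               # time between two deliveries
        # Age starts at prev_delay just after a delivery and grows at rate 1,
        # so the area under the age curve is a trapezoid.
        total_area += prev_delay * length + 0.5 * length * length
        total_time += length
    return total_area / total_time

# Hypothetical i.i.d. delays: 0 or 2 time units with equal probability.
rng = random.Random(1)
delays = [rng.choice((0.0, 2.0)) for _ in range(100_000)]

zero_wait = average_age(0.0, delays)
with_waiting = average_age(1.0, delays)
```

For this delay distribution the expected average age works out to 2 under zero-wait and 11/6 with the threshold of 1.0, so sending as fast as possible is not the freshest choice; the intuition is that an update sent immediately after a zero-delay delivery carries almost no new information.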
|