91 |
Desigualdade regional de renda e migrações: mobilidade intergeracional educacional e intrageracional de renda no Brasil
Netto Junior, José Luis da Silva January 2008
A presente tese tem como objetivo analisar as relações entre as variáveis educacionais e a desigualdade de renda no Brasil e suas repercussões no que se refere a mobilidade intergeracional educacional e intrageracional de renda. O objetivo específico é o de verificar como a mobilidade intergeracional educacional e intrageracional de renda se diferencia regionalmente e de que modo se distingue entre os migrantes e não migrantes. Os resultados sugerem que a desigualdade de renda e de capital humano têm uma relação positiva não linear. Nas áreas onde o indicador de desigualdade de capital humano é maior, a influência dos pais nos mais baixos estratos educacionais é grande se comparado as regiões onde a desigualdade educacional é mais baixa. De um modo geral, nas regiões e estados mais pobres, os pais menos qualificados têm maior influência sobre a trajetória educacional de seus filhos. Em paralelo na região onde os estados têm os mais altos indicadores de desigualdade educacional apresenta a menor mobilidade de renda dentre as regiões analisadas. Os pais migrantes com baixa escolaridade têm uma influência menor sobre a educação dos seus filhos que seus equivalentes nas áreas de origem. E por último, os migrantes têm uma mobilidade de renda maior que a população de suas áreas de origem o que sugere uma seletividade positiva destes. / This thesis aims to analyze the relationship between educational variables and income inequality in Brazil and its repercussion related to educational and income mobility. The specific goal is to verify how the income mobility and human capital accumulation behave considering the regional differences in Brazil and migrant and native population. The results show a non-linear and positive relationship between income and human capital inequality. In the areas where the human capital inequality is higher, parents with no schooling have more influence than in the places where educational inequality is lower. 
At the same time, income mobility is higher in the Center and Southeast regions and lower in the Northeast. Migrant parents with low schooling have less influence over their children's schooling than their counterparts in the regions of origin. The migrant population has higher income mobility than the non-migrant population.
|
92 |
Propriedades de filtros lineares para sistemas lineares com saltos markovianos a tempo discreto / Properties of linear filters for discrete-time Markov jump linear systems
Maria Josiane Ferreira Gomes 12 March 2015
Este trabalho é dedicado ao estudo do erro de estimação em filtragem linear para sistemas lineares com parâmetros sujeitos a saltos markovianos a tempo discreto. Introduzimos o conceito de alcançabilidade média para uma classe de sistemas. Construímos um conjunto de matrizes de alcançabilidade e mostramos que o conceito usual de alcançabilidade definido através da positividade do gramiano é caracterizado pela definição por posto completo destas matrizes. A alcançabilidade média funciona como condição necessária e suficiente para positividade do segundo momento do estado do sistema, resultado esse que auxilia na caracterização da positividade uniforme da matriz de covariância do erro de estimação. Abordamos a estabilidade de estimadores com a interpretação de que a covariância do erro permanece limitada na presença de erro de qualquer magnitude no modelo do ruído, que é uma característica relevante para aplicações. Apresentamos uma prova de que filtros markovianos são estáveis sempre que o segundo momento condicionado é positivo. Exemplos numéricos encontram-se inclusos. / This work studies linear filtering for discrete-time systems with Markov jump parameters. We introduce a notion of average reachability for these systems and present a set of matrices that play the role of reachability matrices, in the sense that their rank is full if and only if the system is average reachable. Reachability is also a sufficient condition for the second moment of the system to be positive. Uniform positiveness of the error covariance matrix is studied for general (possibly non-Markovian) linear estimators, relying on the positiveness of the state second moment. Stability of linear Markovian estimators is also addressed, which allows us to show that Markovian estimators are stable whenever the system is reachable, with the interpretation that the error covariance remains bounded in the presence of errors of any magnitude in the noise model, which is a relevant feature for applications. Numerical examples are included.
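To make the role of the state second moment in this abstract concrete, the following minimal sketch propagates it for a scalar two-mode Markov jump linear system and checks it against simulation. It is not taken from the thesis: the model form x[k+1] = a[m] x[k] + g[m] w[k], the numerical values in `A`, `G`, and `P`, and the function names are all illustrative assumptions.

```python
import random

# Hypothetical two-mode scalar system (illustrative values, not from the thesis):
#   x[k+1] = A[m] * x[k] + G[m] * w[k],  m = theta[k],  w[k] ~ N(0, 1)
# theta follows a Markov chain with transition matrix P.
A = [0.5, 1.1]
G = [1.0, 0.0]          # noise enters only while the chain is in mode 0
P = [[0.9, 0.1],
     [0.3, 0.7]]


def second_moment(steps, pi0=(1.0, 0.0), q0=(0.0, 0.0)):
    """Exactly propagate Q_i(k) = E[x_k^2 * 1{theta_k = i}] for each mode i."""
    pi, q = list(pi0), list(q0)
    for _ in range(steps):
        # Second moment produced in mode i before the chain jumps.
        inner = [A[i] ** 2 * q[i] + G[i] ** 2 * pi[i] for i in range(2)]
        # Redistribute over the next mode j with probability P[i][j].
        q = [sum(P[i][j] * inner[i] for i in range(2)) for j in range(2)]
        pi = [sum(P[i][j] * pi[i] for i in range(2)) for j in range(2)]
    return q


def monte_carlo(steps, runs=20000, seed=0):
    """Estimate E[x_k^2] by simulation, starting from x = 0 in mode 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        x, mode = 0.0, 0
        for _ in range(steps):
            x = A[mode] * x + G[mode] * rng.gauss(0.0, 1.0)
            mode = 0 if rng.random() < P[mode][0] else 1
        total += x * x
    return total / runs
```

In this toy setting the noise acts only in mode 0, yet the mode-1 component of the second moment becomes positive after one step because the chain can jump from mode 0 to mode 1. Comparing `sum(second_moment(k))` with `monte_carlo(k)` gives a quick consistency check of the recursion.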
|
93 |
[en] STOCHASTIC HARMONIC MODEL FOR PRICE FLUCTUATIONS / [pt] MODELO HARMÔNICO ESTOCÁSTICO PARA AS FLUTUAÇÕES DE PREÇO
VICTOR JORGE LIMA GALVAO ROSA 18 December 2017
[pt] Consideramos o oscilador harmônico com amortecimento aleatório em presença de ruído externo. Os ruídos, representando perturbações externas e internas, são modelados pelo processo de Ornstein-Uhlenbeck ou ruído branco e pelo processo dicotômico ou ruído branco, respectivamente. Usando técnicas de sistemas dinâmicos, analisamos o valor médio e a dispersão da posição e da velocidade do oscilador harmônico estocástico, apresentando resultados analíticos e numéricos. Em particular, obtemos expressões para a expansão de baixa-ordem em relação ao tempo de correlação da perturbação interna, no caso da atuação do ruído dicotômico. Finalmente, usando o modelo de oscilador harmônico com amortecimento aleatório como referência, investigamos a série intradiária de preços do mercado brasileiro. / [en] We consider the harmonic oscillator with random damping in the presence of external noise. The noises, representing external and internal perturbations, are modeled as an Ornstein-Uhlenbeck process or a white noise and as a dichotomous process or a white noise, respectively. Using dynamical systems tools, we analyze the expected value as well as the dispersion of the stochastic harmonic oscillator's position and velocity, presenting analytical and numerical results. In particular, we also provide expressions for the low-order expansion in the correlation time of the internal perturbation for the case of dichotomous noise. Using the harmonic oscillator model with random damping as a reference, we conclude by investigating the intra-day Brazilian stock price series.
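As a rough illustration of the model class described in this abstract, the sketch below integrates a harmonic oscillator whose damping is modulated by dichotomous (telegraph) noise and which is driven by external white noise. The equation form x'' + gamma (1 + xi(t)) x' + omega^2 x = sigma eta(t), the Euler-type time stepping, and every parameter name and value are assumptions made for this example, not details taken from the thesis.

```python
import math
import random


def simulate(t_end=10.0, dt=1e-3, gamma=0.5, omega=1.0,
             delta=0.5, switch_rate=1.0, noise_strength=0.2, seed=0):
    """Integrate one trajectory and return the final (position, velocity).

    Assumed model: x'' + gamma * (1 + xi(t)) * x' + omega**2 * x = sigma * eta(t),
    where xi(t) jumps between +delta and -delta (dichotomous noise) and
    eta(t) is Gaussian white noise with strength sigma = noise_strength.
    """
    rng = random.Random(seed)
    x, v = 1.0, 0.0
    xi = delta if rng.random() < 0.5 else -delta
    for _ in range(int(round(t_end / dt))):
        # The telegraph noise flips sign with probability switch_rate * dt per step.
        if rng.random() < switch_rate * dt:
            xi = -xi
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment
        v += (-gamma * (1.0 + xi) * v - omega ** 2 * x) * dt + noise_strength * dw
        x += v * dt                          # position uses the updated velocity
    return x, v


def ensemble_stats(n=200, **kwargs):
    """Mean and variance of the final position over n independent trajectories."""
    xs = [simulate(seed=s, **kwargs)[0] for s in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return mean, var
```

Calling `ensemble_stats(n=200, t_end=5.0)` gives Monte Carlo estimates of the mean and dispersion of the position, the two quantities the abstract analyzes. Setting `delta=0.0` and `noise_strength=0.0` recovers the ordinary damped oscillator, which is a convenient sanity check.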
|
94 |
Analysis And Optimization Of Queueing Models With Markov Modulated Poisson Input
Hemachandra, Nandyala 06 1900
No description available.
|
95 |
Discrete event modeling and analysis for systems biology models
Soueidan, Hayssam 04 December 2009
Les travaux effectués durant cette thèse portent sur la spécification, l'analyse et l'application de systèmes a événements discrets pour la modélisation de processus biologiques stochastiques en biologie des systèmes. Le point de départ de cette thèse est le langage de modélisation AltaRica, que nous étendons afin de permettre de décrire des événements temporisés selon des distributions de probabilités quelconques (dégénérées, discrètes et continues). Nous définissons ensuite la sémantique de ce langage en terme d'automates de mode stochastiques et présentons trois opérations de compositions permettant de modéliser des systèmes hiérarchiques avec événements synchronisés et partage de valeurs via un mécanisme de connexion. Nous donnons ensuite au automates de mode stochastiques une sémantique en termes de systèmes de transitions dont les transitions sont étiquetées par des distributions de probabilités et des probabilités de transitions instantanées. Nous caractérisons ensuite 6 sous classes de ces systèmes de transitions et donnons pour chacune de ces classes un algorithme de simulation ainsi qu'une mesure de probabilité sur les chemins finis. Nous montrons que pour certaines de ces classes, notre sémantique est conforme avec les mesures de probabilité de chemin usuellement associées aux chaînes de Markov a temps discret, a temps continu et aux processus semi-Markoviens généralisés. Nous abordons ensuite le problème de la réutilisation de modèles continus existant dans un système discret. Nous donnons une méthode d'abstraction permettant de représenter un ensemble de trajectoires bornées ou non d'un modèle continu sous forme d'un système de transition stochastique fini. 
A travers des exemples tirés de la littérature, nous montrons que notre abstraction préserve les propriétés "qualitatives" (par exemple oscillations, hystérie) des modèles continus et qu'une comparaison entre trajectoires basée sur leurs représentations en termes de systèmes de transitions permet de regrouper les trajectoires en fonction de comportements qualitatifs plus fins que ceux permis par la théorie des bifurcations. Finalement, nous étudions a l'aide de ces modèles des processus liés a la division cellulaire chez les levures. En particulier, nous définissons un modèle pour le vieillissement cellulaire dans une population de levure où le comportement individuel d'une cellule est régi par une équation différentielle ordinaire et où le processus de division est régi par un système de transition. Nous montrons a l'aide de ce modèle que la survie d'une population de levure de type Schizosaccharomyces Pombe, qui se divisent par une fission médiane, n'est possible que grâce a un mécanisme de distribution non symétrique des dégâts oxydatifs entre la progéniture et la cellule souche. Cette hypothèse fut validée expérimentalement lors d'une collaboration avec le laboratoire de micro-biologie de Göteborg. / A general goal of systems biology is to acquire a detailed understanding of the dynamics of living systems by relating functional properties of whole systems with the interactions of their constituents. Often this goal is tackled through computer simulation. A number of different formalisms are currently used to construct numerical representations of biological systems, and a certain wealth of models is proposed using ad hoc methods. There arises an interesting question of to what extent these models can be reused and composed, together or in a larger framework. In this thesis, we propose BioRica as a means to circumvent the difficulty of incorporating disparate approaches in the same modeling study. 
BioRica is an extension of the AltaRica specification language to describe hierarchical non-deterministic General Semi-Markov processes. We first extend the syntax and automata semantics of AltaRica in order to account for stochastic labeling. We then provide a semantics for BioRica programs in terms of stochastic transition systems, that is, transition systems with stochastic labeling. We then develop numerical methods to symbolically compute the probability of a given finite path in a stochastic transition system. We then define algorithms and rules to compile a BioRica system into a stand-alone C++ simulator that simulates the underlying stochastic process. We also present language extensions that enable the modeler to include, in a hierarchical BioRica system, nodes that use numerical libraries (e.g. Mathematica, Matlab, GSL). Such nodes can be used to perform numerical integration or flux balance analysis during discrete event simulation. We then consider the problem of using models with uncertain parameter values. Quantitative models in systems biology depend on a large number of free parameters, whose values completely determine the behavior of the models. Some ranges of parameter values produce similar system dynamics, making it possible to define general trends for trajectories of the system (e.g. oscillating behavior) for some parameter values. In this work, we define an automata-based formalism to describe the qualitative behavior of a system's dynamics. Qualitative behaviors are represented by finite transition systems whose states contain predicate valuations and whose transitions are labeled by probabilistic delays. We provide algorithms to automatically build such automata representations by random sampling over the parameter space, and algorithms to compare and cluster the resulting qualitative transition systems.
Finally, we validate our approach by studying a rejuvenation effect in yeast cell populations, using a hierarchical population model defined in BioRica. Models of ageing for yeast cells aim to provide insight into the general biological processes of ageing. For this study, we used the BioRica framework to generate a hierarchical simulation tool that allows dynamic creation of entities during simulation. The predictions of our hierarchical mathematical model have been validated experimentally by the microbiology laboratory of Gothenburg.
|
96 |
Modelling and analysis of wireless MAC protocols with applications to vehicular networks
Jafarian, Javad January 2014
The popularity of the wireless networks is so great that we will soon reach the point where most of the devices work based on that, but new challenges in wireless channel access will be created with these increasingly widespread wireless communications. Multi-channel CSMA protocols have been designed to enhance the throughput of the next generation wireless networks compared to single-channel protocols. However, their performance analysis still needs careful considerations. In this thesis, a set of techniques are proposed to model and analyse the CSMA protocols in terms of channel sensing and channel access. In that respect, the performance analysis of un-slotted multi-channel CSMA protocols is studied through considering the hidden terminals. In the modelling phase, important parameters such as shadowing and path loss impairments are being considered. Following that, due to the high importance of spectrum sensing in CSMA protocols, the Double-Threshold Energy Detector (DTED) is thoroughly investigated in this thesis. An iterative algorithm is also proposed to determine optimum values of detection parameters in a sensing-throughput problem formulation. Vehicle-to-Roadside (V2R) communication, as a part of Intelligent Transportation System (ITS), over multi-channel wireless networks is also modelled and analysed in this thesis. In this respect, through proposing a novel mathematical model, the connectivity level which an arbitrary vehicle experiences during its packet transmission with a RSU is also investigated.
|
97 |
Phénoménologie de particules actives à états internes finis et discrets : une étude individuelle et collective / Phenomenology of active particles with finite and discrete internal states: an individual and collective study
Gómez Nava, Luis Alberto 05 November 2018
Dans cette thèse, nous présentons un cadre théorique pour étudier les systèmes de particules actives fonctionnant avec une quantité discrète d'états internes qui contrôlent le comportement externe de ces objets. Les concepts théoriques développés dans cette thèse sont introduits afin de comprendre un grand nombre de systèmes biologiques multi-agents dont les individus présentent différents types de comportements se succédant au cours du temps. Par construction, le modèle théorique suppose que l'observateur extérieur a accès uniquement au comportement visible des individus, et non pas à leurs états internes. C'est seulement après une étude détaillée de la dynamique comportementale que l'existence de ces états internes devient évidente. Cette analyse est cruciale pour pouvoir associer les comportements observés expérimentalement avec un ou plusieurs états internes du modèle. Cette association entre les états et les comportements doit être faite selon les observations et la phénoménologie du système biologique faisant l'objet de l'étude. Les scénarios qui peuvent être observés en utilisant notre modèle théorique sont déterminés par la conception du mécanisme interne des individus (nombre d'états internes, taux de transition, etc…) et seront de nature markovienne par construction. Tous les travaux expérimentaux et théoriques contenus dans cette thèse démontrent que notre modèle est approprié pour décrire des systèmes réels montrant des comportements intermittents individuels ou collectifs. Ce nouveau cadre théorique pour des particules actives avec états internes, introduit ici, est encore en développement et nous sommes convaincus qu'il peut potentiellement ouvrir de nouvelles branches de recherche à l'interface entre la physique, la biologie et les mathématiques. 
/ In this thesis we introduce a theoretical framework to understand collections of active particles that operate with a finite number of discrete internal states that control the external behavior of these entities. The theoretical concepts developed in this thesis are conceived to understand the large number of existing multiagent biological systems where the individuals display distinct behavioral phases that alternate with each other. By construction, the premise of our theoretical model is that an external observer has access only to the external behavior of the individuals, but not to their internal state. It is only after careful examination of the behavioral dynamics that the existence of these internal states becomes evident. This analysis is key to be able to associate the experimentally observed behaviors of individuals with one or many internal states of the model. This association between states and behaviors should be done accordingly to the observations and the phenomenology displayed by the biological system that is being the subject of study. The possible scenarios that can be observed using our theoretical model are determined by the design of the internal mechanism of the individuals (number of internal states, transition rates, etc...) and will be of markovian nature by construction. All the experimental and theoretical work contained in this thesis is evidence that our model is suitable to be used to describe real-life systems showing individual or collective intermittent behaviors. This here-introduced new framework of active particles with internal states is still in development and we are convinced that it can potentially open new branches of research at the interface between physics, biology and mathematics.
|
98 |
Reduced Density Matrix Approach to the Laser-Assisted Electron Transport in Molecular Wires
Welack, Sven 30 November 2005
The electron transport through a molecular wire under the influence of an
external laser field is studied using a reduced density matrix formalism.
The full system is partitioned into the relevant part, i.e. the wire, electron
reservoirs and a phonon bath. An earlier second-order perturbation theory approach of Meier and Tannor for
bosonic environments which employs a numerical decomposition of the spectral
density is used to describe the coupling to the phonon bath and is extended
to deal with the electron transfer between the reservoirs and the molecular wire.
Furthermore, from the resulting time-nonlocal (TNL) scheme a time-local (TL)
approach can be determined. Both are employed to propagate the reduced density
operator in time for an arbitrary time-dependent system Hamiltonian which
incorporates the laser field non-perturbatively.
Within the TL formulation, one can extract a current operator for the open quantum system.
This enables a more general formulation of the problem which is necessary to
employ an optimal control algorithm for open quantum systems in order to
compute optimal control fields for time-distributed target states, e.g. current patterns. Thus, we take
a fundamental step towards optimal control in molecular electronics. Numerical examples of the population dynamics, laser controlled current, TNL vs. TL and optimal control fields are presented to demonstrate the diverse applicability of
the derived formalism.
|
99 |
Non-Markovian Dissipative Quantum Mechanics with Stochastic Trajectories
Koch, Werner 12 October 2010
All fields of physics - be it nuclear, atomic and molecular, solid state, or optical - offer examples of systems which are strongly influenced by the environment of the actual system under investigation. The scope of what is called "the environment" may vary, i.e., how far from the system of interest an interaction between the two does persist. Typically, however, it is much larger than the open system itself. Hence, a fully quantum mechanical treatment of the combined system without approximations and without limitations of the type of system is currently out of reach.
With the single assumption of the environment to consist of an internally thermalized set of infinitely many harmonic oscillators, the seminal work of Stockburger and Grabert [Chem. Phys., 268:249-256, 2001] introduced an open system description that captures the environmental influence by means of a stochastic driving of the reduced system. The resulting stochastic Liouville-von Neumann equation describes the full non-Markovian dynamics without explicit memory but instead accounts for it implicitly through the correlations of the complex-valued noise forces.
The present thesis provides a first application of the Stockburger-Grabert stochastic Liouville-von Neumann equation to the computation of the dynamics of anharmonic, continuous open systems. In particular, it is demonstrated that trajectory based propagators allow for the construction of a numerically stable propagation scheme. With this approach it becomes possible to achieve the tremendous increase of the noise sample count necessary to stochastically converge the results when investigating such systems with continuous variables. After a test against available analytic results for the dissipative harmonic oscillator, the approach is subsequently applied to the analysis of two different realistic, physical systems.
As a first example, the dynamics of a dissipative molecular oscillator is investigated. Long time propagation - until thermalization is reached - is shown to be possible with the presented approach. The properties of the thermalized density are determined and they are ascertained to be independent of the system's initial state. Furthermore, the dependence on the bath's temperature and coupling strength is analyzed and it is demonstrated how a change of the bath parameters can be used to tune the system from the dissociative to the bound regime.
A second investigation is conducted for a dissipative tunneling scenario in which a wave packet impinges on a barrier. The dependence of the transmission probability on the initial state's kinetic energy as well as the bath's temperature and coupling strength is computed.
For both systems, a comparison with the high-temperature Markovian quantum Brownian limit is performed. The importance of a full non-Markovian treatment is demonstrated as deviations are shown to exist between the two descriptions both in the low temperature cases where they are expected and in some of the high temperature cases where their appearance might not be anticipated as easily.
1 Introduction
2 Theory of Open Quantum Systems
2.1 Influence Functional Formalism
2.2 Quantum Brownian Limit
2.3 Stochastic Unraveling of the Influence Functional
2.4 Improved Statistics
2.4.1 Modified Dynamic Response
2.4.2 Guide Trajectory Transformation
2.5 Obtaining Properly Correlated Stochastic Samples from Filtered White Noise
3 Unified Stochastic Trajectory Propagation
3.1 Semiclassical Brownian Motion
3.1.1 Guide Trajectory
3.1.2 Real Coherent State Center Coordinates
3.1.3 Propagation Scheme Including Stochastic Forces
3.2 Stochastic Bohmian Mechanics with Complex Action
3.2.1 Hydrodynamic Formulation of Bohmian Mechanics
3.2.2 Bohmian Mechanics with Complex Action
3.2.3 Stochastic BOMCA Trajectories
3.3 Noise Distribution Preserving Removal of Adverse Samples
4 Dissipative Harmonic Oscillator
4.1 Reservoir Specification
4.2 Harmonic Oscillator Analytic Expectation Values
4.2.1 Ohmic Bath
4.2.2 Drude Regularized Bath
4.3 Sampling Strategies and Analytic Comparison
4.4 Limits of the Markovian Approximation
5 Dissipative Vibrational Dynamics of Diatomics
5.1 Molecular Morse Potential
5.2 Anharmonic Phenomena
5.3 Transient Non-Markovian Effects
5.4 Trapping by Dissipation and Thermalization
6 Tunneling with Dissipation
6.1 Eckart Barrier Scattering
6.2 Dissipative Tunneling
6.3 Investigation of Markovianity
7 Summary and Outlook
Appendix A Conventions for Constants, Reservoir Kernels, and Influence Phases
Appendix B Stochastic Calculus
B.1 Stochastic Differential Equations
B.2 Position Verlet Scheme
B.3 Runge-Kutta Fourth Order Scheme
Appendix C Morse Oscillator Expectation Values
Appendix D Prerequisites of a Successful Stochastic Propagation
D.1 Hubbard-Stratonovich Transformation
D.2 Kernels of the Reservoir
D.2.1 Quadratic Cutoff
D.2.2 Quartic Cutoff
D.2.3 Strictly Ohmic Reservoir
D.3 Guide Trajectory Integration
D.3.1 Quadratic Cutoff
D.3.2 Quartic Cutoff
D.3.3 Strictly Ohmic Reservoir
D.4 Computation of Matrix Elements and Expectation Values
Bibliography
|
100 |
Représentations graphiques de fonctions et processus décisionnels Markoviens factorisés / Graphical representations of functions and factored Markovian decision processes
Magnan, Jean-Christophe 02 February 2016
En planification théorique de la décision, le cadre des Processus Décisionnels Markoviens Factorisés (Factored Markov Decision Process, FMDP) a produit des algorithmes efficaces de résolution des problèmes de décisions séquentielles dans l'incertain. L'efficacité de ces algorithmes repose sur des structures de données telles que les Arbres de Décision ou les Diagrammes de Décision Algébriques (ADDs). Ces techniques de planification sont utilisées en Apprentissage par Renforcement par l'architecture SDYNA afin de résoudre des problèmes inconnus de grandes tailles. Toutefois, l'état-de-l'art des algorithmes d'apprentissage, de programmation dynamique et d'apprentissage par renforcement utilisés par SDYNA, requière que le problème soit spécifié uniquement à l'aide de variables binaires et/ou utilise des structures améliorables en termes de compacité. Dans ce manuscrit, nous présentons nos travaux de recherche visant à élaborer et à utiliser une structure de donnée plus efficace et moins contraignante, et à l'intégrer dans une nouvelle instance de l'architecture SDYNA. Dans une première partie, nous présentons l'état-de-l'art de la modélisation de problèmes de décisions séquentielles dans l'incertain à l'aide de FMDP. Nous abordons en détail la modélisation à l'aide d'DT et d'ADDs.Puis nous présentons les ORFGs, nouvelle structure de données que nous proposons dans cette thèse pour résoudre les problèmes inhérents aux ADDs. Nous démontrons ainsi que les ORFGs s'avèrent plus efficaces que les ADDs pour modéliser les problèmes de grandes tailles. Dans une seconde partie, nous nous intéressons à la résolution des problèmes de décision dans l'incertain par Programmation Dynamique. Après avoir introduit les principaux algorithmes de résolution, nous nous attardons sur leurs variantes dans le domaine factorisé. Nous précisons les points de ces variantes factorisées qui sont améliorables. 
Nous décrivons alors une nouvelle version de ces algorithmes qui améliore ces aspects et utilise les ORFGs précédemment introduits. Dans une dernière partie, nous abordons l'utilisation des FMDPs en Apprentissage par Renforcement. Puis nous présentons un nouvel algorithme d'apprentissage dédié à la nouvelle structure que nous proposons. Grâce à ce nouvel algorithme, une nouvelle instance de l'architecture SDYNA est proposée, se basant sur les ORFGs : l'instance SPIMDDI. Nous testons son efficacité sur quelques problèmes standards de la littérature. Enfin nous présentons quelques travaux de recherche autour de cette nouvelle instance. Nous évoquons d'abord un nouvel algorithme de gestion du compromis exploration-exploitation destiné à simplifier l'algorithme F-RMax. Puis nous détaillons une application de l'instance SPIMDDI à la gestion d'unités dans un jeu vidéo de stratégie en temps réel. / In decision-theoretic planning, the factored framework (Factored Markovian Decision Process, FMDP) has produced several efficient algorithms for solving large sequential decision-making problems under uncertainty. The efficiency of these algorithms relies on data structures such as decision trees or algebraic decision diagrams (ADDs). These planning techniques are exploited in Reinforcement Learning by the SDyna architecture in order to solve large and unknown problems. However, the state-of-the-art learning and planning algorithms used in SDyna require the problem to be specified using only binary variables and/or use data structures whose compactness can be improved. In this manuscript, we present our research work, which seeks to design and use a new data structure that is more efficient and less restrictive, and to integrate it into a new instance of the SDyna architecture. In the first part, we present the state-of-the-art modeling tools used in the algorithms that tackle large sequential decision-making problems under uncertainty.
We detail the modeling using decision trees and ADDs. Then we introduce the Ordered and Reduced Graphical Representation of Function (ORGRF), a new data structure that we propose in this thesis to deal with the various problems concerning ADDs. We demonstrate that ORGRFs improve on ADDs for modeling large problems. In the second part, we turn to solving large sequential decision problems under uncertainty using Dynamic Programming. After introducing the main algorithms, we look in detail at their factored variants. We point out where these factored versions can be improved. We describe our new algorithm, which improves on these points and exploits the ORGRFs previously introduced. In the last part, we discuss the use of FMDPs in Reinforcement Learning. Then we introduce a new algorithm to learn the new data structure we propose. Thanks to this new algorithm, a new instance of the SDyna architecture is proposed, based on ORGRFs: the SPIMDDI instance. We test its efficiency on several standard problems from the literature. Finally, we present some work around this new instance. We detail a new algorithm for efficient management of the exploration-exploitation trade-off, aiming to simplify F-RMax. Then we discuss an application of SPIMDDI to the management of units in a real-time strategy video game.
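For readers unfamiliar with the Dynamic Programming step mentioned in this abstract, the sketch below shows plain (flat, non-factored) value iteration on a tiny Markov decision process. It is only a baseline illustration: it does not use decision diagrams, ORGRFs, or anything specific to SDyna or SPIMDDI, and the two-state problem, the function name `value_iteration`, and all numerical values are hypothetical.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Flat value iteration.

    P[s][a][t] is the probability of moving from state s to state t under
    action a; R[s][a] is the immediate reward. Returns (values, policy).
    """
    n_states = len(R)
    V = [0.0] * n_states
    while True:
        # Bellman backup: Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) * V(t)
        Q = [[R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
              for a in range(len(R[s]))]
             for s in range(n_states)]
        new_V = [max(q) for q in Q]
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            policy = [q.index(max(q)) for q in Q]   # greedy action per state
            return new_V, policy
        V = new_V


# Hypothetical toy problem: action 0 stays in place, action 1 switches state.
# Staying in state 1 pays a reward of 1; everything else pays 0.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]

values, policy = value_iteration(P, R)
```

Here the optimal behavior is to move to state 1 and stay there. Factored approaches such as those discussed in the abstract aim to avoid enumerating states explicitly as this flat version does, which is what makes them applicable to large problems.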
|