<p dir="ltr">This thesis presents probably approximately correct (PAC) bounds on estimates of the transition kernels of Controlled Markov chains (CMC’s). CMC’s are a natural choice for modelling various industrial and medical processes, and are also relevant to reinforcement learning (RL). Learning the transition dynamics of CMC’s in a sample efficient manner is an important question that is open. This thesis aims to close this gap in knowledge in literature.</p><p dir="ltr">In Chapter 2, we lay the groundwork for later chapters by introducing the relevant concepts and defining the requisite terms. The two subsequent chapters focus on non-parametric estimation. </p><p dir="ltr">In Chapter 3, we restrict ourselves to a finitely supported CMC with d states and k controls and produce a general theory for minimax sample complexity of estimating the transition matrices.</p><p dir="ltr">In Chapter 4 we demonstrate the applicability of this theory by using it to recover the sample complexities of various controlled Markov chains, as well as RL problems.</p><p dir="ltr">In Chapter 5 we move to a continuous state and action spaces with compact supports. We produce a robust, non-parametric test to find the best histogram based estimator of the transition density; effectively reducing the problem into one of model selection based on empricial processes.</p><p dir="ltr">Finally, in Chapter 6 we move to a parametric and Bayesian regime, and restrict ourselves to Markov chains. Under this setting we provide a PAC-Bayes bound for estimating model parameters under tempered posteriors.</p>
Identifer | oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/24735312 |
Date | 06 December 2023 |
Creators | Imon Banerjee (17555685) |
Source Sets | Purdue University |
Detected Language | English |
Type | Text, Thesis |
Rights | CC BY 4.0 |
Relation | https://figshare.com/articles/thesis/PROBABLY_APPROXIMATELY_CORRECT_BOUNDS_FOR_ESTIMATING_MARKOV_TRANSITION_KERNELS/24735312 |
Page generated in 0.0022 seconds