• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • No language data
  • Tagged with
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

PROBABLY APPROXIMATELY CORRECT BOUNDS FOR ESTIMATING MARKOV TRANSITION KERNELS

Imon Banerjee (17555685) 06 December 2023 (has links)
<p dir="ltr">This thesis presents probably approximately correct (PAC) bounds on estimates of the transition kernels of Controlled Markov chains (CMC’s). CMC’s are a natural choice for modelling various industrial and medical processes, and are also relevant to reinforcement learning (RL). Learning the transition dynamics of CMC’s in a sample efficient manner is an important question that is open. This thesis aims to close this gap in knowledge in literature.</p><p dir="ltr">In Chapter 2, we lay the groundwork for later chapters by introducing the relevant concepts and defining the requisite terms. The two subsequent chapters focus on non-parametric estimation. </p><p dir="ltr">In Chapter 3, we restrict ourselves to a finitely supported CMC with d states and k controls and produce a general theory for minimax sample complexity of estimating the transition matrices.</p><p dir="ltr">In Chapter 4 we demonstrate the applicability of this theory by using it to recover the sample complexities of various controlled Markov chains, as well as RL problems.</p><p dir="ltr">In Chapter 5 we move to a continuous state and action spaces with compact supports. We produce a robust, non-parametric test to find the best histogram based estimator of the transition density; effectively reducing the problem into one of model selection based on empricial processes.</p><p dir="ltr">Finally, in Chapter 6 we move to a parametric and Bayesian regime, and restrict ourselves to Markov chains. Under this setting we provide a PAC-Bayes bound for estimating model parameters under tempered posteriors.</p>

Page generated in 0.1001 seconds