Markov chain Monte Carlo (MCMC) is an extremely popular class of algorithms for
computing summaries of posterior distributions. One problem for MCMC in the so-called Big Data regime is the growing computational cost of most MCMC algorithms. Most popular and basic MCMC algorithms, like Metropolis-Hastings algorithm (MH) and Gibbs algorithm, have to take the full data set into account in every iteration. In Big Data case, it is a fact that datasets of more than 100 GB are now fairly common. The running time of standard MCMC on such large datasets is prohibitively long.
To solve this problem, some papers develop algorithms that use only a subset of the data at each step to obtain an approximate or exact posterior distribution. Korattikara et al (2013) merely estimates the transition probabilities of a typical MH chain using a subset of the data at each step of the chain, with some controllable error. The FireFly Monte Carlo (FLYMC) algorithm, presented by Maclaurin and Adams, augments the original dataset and only explicitly evaluates an “active" subset in each step. They show that the marginal distribution of the FLYMC algorithm at stationarity in fact still equal to the posterior distribution of interest. However, Both of the above two papers and other literature in this thesis are restrained to a special kind of posteriors with "product-form" likelihoods. Such posteriors require all data points are conditionally independent and under the same likelihood.
However, what problem we want to solve is targeting a uniform distribution on a convex hull. In this case, \product-form" is not applicable. The reason why we focus on this problem is in statistics we sometimes face the problem to compute the volume of distributions which have a convex hull shape or their shape is able to transformed into a convex hull. It is impossible to compute via decomposing and reducing convex hulls of high dimension. According to Barany et al in 1987, the ratio of the estimated upper and lower bound of the volume of a certain convex hull is quite big. It is not possible to estimate the volume well, either. Fast-mixing Markov chains are basically the only way to actually do volume computations.
The initial work in this thesis is to de ne a data-augmentation algorithm along the lines of FLYMC. We also introduce an auxiliary random variable to mark subsets. However, as our situation is more complicated, we also have one more variable to help selecting subsets than FLYMC algorithm. For the extra variable, we utilize pseudo-marginal algorithm (PMMH), which allows us to replace interest parameter's distribution conditional on augmented variable by an estimator. Although our algorithm is not a standard case because our estimator is biased, bounds of the individual approximating measure of the parameter of interest is able to be directly translated into bounds of the error in the stationary measure of the algorithm.
After fi nishing an implementable algorithm, we then use two tricks including Locality
Sensitive Hash function (LSH) and Taylor's expansion to improve the original algorithm. LSH helps raise the e ciency of proposing new samples of the augmented variable. Taylor's expansion is able to produce a more accurate estimator of the parameter of interest.
Our main theoretical result is a bound on the pointwise bias of our estimator, which
results in a bound on the total error of the chain's stationary measure. We prove the total error will converge under a certain condition. Our simulation results illustrate this, and we use a large collection of simulations to illustrate some tips on how to choose parameters and length of chains in real cases.
Identifer | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/39529 |
Date | 20 August 2019 |
Creators | Wang, Pengfei |
Contributors | Smith, Aaron |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Page generated in 0.0021 seconds