
Some Advances in Local Approximate Gaussian Processes

Gaussian processes (GPs) are now recognized as an indispensable statistical tool in computer experiments. Because of their computational complexity and storage demands, however, their application to real-world problems, especially in "big data" settings, is quite limited. Among the many strategies for tailoring GPs to such settings, Gramacy and Apley (2015) proposed the local approximate GP (laGP), which builds approximate predictive equations from small local designs constructed around each predictive location under a chosen criterion. This dissertation proposes several methodological extensions of laGP. The first is multilevel global/local modeling, which deploys global hyper-parameter estimates to perform local prediction. The second extends the laGP notion of "locale" to a set of predictive locations along paths in the input space. Both contributions are applied to satellite drag emulation, as illustrated in Chapter 3. The multilevel GP modeling strategy, combined with inverse-variance weighting, is also applied to synthesize field data and computer model outputs of solar irradiance across the continental United States, as detailed in Chapter 4. Finally, in Chapter 5, laGP's performance is tested on emulating daytime land surface temperatures estimated via satellites, where the grid locations are irregular. / Doctor of Philosophy / In many real-life settings, we want to understand a physical relationship or phenomenon. When limited resources or ethical concerns make it impossible to collect data through physical experiments, we have to rely on computer experiments, whose evaluation usually requires expensive simulation involving complex mathematical equations. To reduce the computational effort, we look for a relatively cheap alternative, called an emulator, to serve as a surrogate model.
A Gaussian process (GP) is such an emulator, and it has become very popular because of its strong out-of-sample predictive performance and appropriate uncertainty quantification. However, because of its computational complexity, full GP modeling is not suitable for “big data” settings. Gramacy and Apley (2015) proposed the local approximate GP (laGP), the core idea of which is to use a subset of the data for inference and for prediction at unobserved inputs. This dissertation provides several extensions of laGP and applies them to several real-life “big data” settings. The first application, detailed in Chapter 3, is to emulate satellite drag from large simulation experiments. Global input information is captured using a small subset of the data, and local prediction is performed subsequently. This method, called “multilevel GP modeling”, is also deployed to synthesize field measurements and computational outputs of solar irradiance across the continental United States, as illustrated in Chapter 4, and to emulate daytime land surface temperatures estimated by satellites, as discussed in Chapter 5.
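
The idea of predicting from a small local subset of the data can be illustrated with a minimal sketch. It is not the dissertation's method or the laGP software: it simply takes the nearest neighbors of the predictive location as the local design (the simplest possible choice, not the design criterion of Gramacy and Apley (2015)), and it fixes the kernel hyper-parameters instead of estimating them. The function names, the squared-exponential kernel, the `lengthscale` and `nugget` values, and the toy test function are all assumptions made for illustration.

```python
import numpy as np

def sq_exp_kernel(A, B, lengthscale=0.3):
    """Squared-exponential correlation between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / lengthscale)

def local_gp_predict(X, y, x_star, n_local=30, lengthscale=0.3, nugget=1e-6):
    """Zero-mean, unit-variance GP prediction at x_star using only the
    n_local nearest neighbors of x_star (hypothetical helper, fixed
    hyper-parameters)."""
    # Local design: indices of the n_local points closest to x_star.
    dists = np.linalg.norm(X - x_star, axis=1)
    idx = np.argsort(dists)[:n_local]
    Xn, yn = X[idx], y[idx]

    # Standard GP predictive equations, applied to the small subset only.
    K = sq_exp_kernel(Xn, Xn, lengthscale) + nugget * np.eye(len(idx))
    k = sq_exp_kernel(Xn, x_star[None, :], lengthscale)[:, 0]
    mean = k @ np.linalg.solve(K, yn)
    var = 1.0 + nugget - k @ np.linalg.solve(K, k)
    return mean, max(var, 0.0)  # clip tiny negative values from round-off

# Toy example: 2000 random inputs in the unit square and a smooth response.
rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))
f = lambda Z: np.sin(2 * np.pi * Z[:, 0]) + Z[:, 1] ** 2
y = f(X)

x_star = np.array([0.3, 0.6])
mean, var = local_gp_predict(X, y, x_star)
truth = f(x_star[None, :])[0]
print(mean, truth, var)
```

Each prediction solves a 30-by-30 linear system rather than a 2000-by-2000 one, which is where the computational saving comes from; the cost is that a separate local design is built for every predictive location.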

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/97245
Date: 03 October 2019
Creators: Sun, Furong
Contributors: Statistics, Gramacy, Robert B., Leman, Scotland C., Ferreira, Marco A. R., Higdon, David
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf, application/pdf, application/pdf
Rights: In Copyright, http://rightsstatements.org/vocab/InC/1.0/
