Return to search

Computer Experimental Design for Gaussian Process Surrogates

With a rapid development of computing power, computer experiments have gained popularity in various scientific fields, like cosmology, ecology and engineering. However, some computer experiments for complex processes are still computationally demanding. A surrogate model or emulator, is often employed as a fast substitute for the simulator. Meanwhile, a common challenge in computer experiments and related fields is to efficiently explore the input space using a small number of samples, i.e., the experimental design problem. This dissertation focuses on the design problem under Gaussian process surrogates. The first work demonstrates empirically that space-filling designs disappoint when the model hyperparameterization is unknown, and must be estimated from data observed at the chosen design sites. A purely random design is shown to be superior to higher-powered alternatives in many cases. Thereafter, a new family of distance-based designs are proposed and their superior performance is illustrated in both static (one-shot design) and sequential settings. The second contribution is motivated by an agent-based model(ABM) of delta smelt conservation. The ABM is developed to assist in a study of delta smelt life cycles and to understand sensitivities to myriad natural variables and human interventions. However, the input space is high-dimensional, running the simulator is time-consuming, and its outputs change nonlinearly in both mean and variance. A batch sequential design scheme is proposed, generalizing one-at-a-time variance-based active learning, as a means of keeping multi-core cluster nodes fully engaged with expensive runs. The acquisition strategy is carefully engineered to favor selection of replicates which boost statistical and computational efficiencies. Design performance is illustrated on a range of toy examples before embarking on a smelt simulation campaign and downstream high-fidelity input sensitivity analysis. / Doctor of Philosophy / With a rapid development of computing power, computer experiments have gained popularity in various scientific fields, like cosmology, ecology and engineering. However, some computer experiments for complex processes are still computationally demanding. Thus, a statistical model built upon input-output observations, i.e., a so-called surrogate model or emulator, is needed as a fast substitute for the simulator. Design of experiments, i.e., how to select samples from the input space under budget constraints, is also worth studying. This dissertation focuses on the design problem under Gaussian process (GP) surrogates. The first work demonstrates empirically that commonly-used space-filling designs disappoint when the model hyperparameterization is unknown, and must be estimated from data observed at the chosen design sites. Thereafter, a new family of distance-based designs are proposed and their superior performance is illustrated in both static (design points are allocated at one shot) and sequential settings (data are sampled sequentially). The second contribution is motivated by a stochastic computer simulator of delta smelt conservation. This simulator is developed to assist in a study of delta smelt life cycles and to understand sensitivities to myriad natural variables and human interventions. However, the input space is high-dimensional, running the simulator is time-consuming, and its outputs change nonlinearly in both mean and variance. An innovative batch sequential design method is proposed, generalizing one-at-a-time sequential design to one-batch-at-a-time scheme with the goal of parallel computing. The criterion for subsequent data acquisition is carefully engineered to favor selection of replicates which boost statistical and computational efficiencies. The design performance is illustrated on a range of toy examples before embarking on a smelt simulation campaign and downstream input sensitivity analysis.

Identiferoai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/99886
Date01 September 2020
CreatorsZhang, Boya
ContributorsStatistics, Gramacy, Robert B., House, Leanna L., Higdon, David, Deng, Xinwei
PublisherVirginia Tech
Source SetsVirginia Tech Theses and Dissertation
Detected LanguageEnglish
TypeDissertation
FormatETD, application/pdf
RightsIn Copyright, http://rightsstatements.org/vocab/InC/1.0/

Page generated in 0.0019 seconds