Return to search

Statistically Efficient Methods for Computation-Aware Uncertainty Quantification and Rare-Event Optimization

The thesis covers two fundamental topics that are important across the disciplines of operations research, statistics and even more broadly, namely stochastic optimization and uncertainty quantification, with the common theme to address both statistical accuracy and computational constraints. Here, statistical accuracy encompasses the precision of estimated solutions in stochastic optimization, as well as the tightness or reliability of confidence intervals. Computational concerns arise from rare events or expensive models, necessitating efficient sampling methods or computation procedures.

In the first half of this thesis, we study stochastic optimization that involves rare events, which arises in various contexts including risk-averse decision-making and training of machine learning models. Because of the presence of rare events, crude Monte Carlo methods can be prohibitively inefficient, as it takes a sample size reciprocal to the rare-event probability to obtain valid statistical information about the rare-event. To address this issue, we investigate the use of importance sampling (IS) to reduce the required sample size. IS is commonly used to handle rare events, and the idea is to sample from an alternative distribution that hits the rare event more frequently and adjusts the estimator with a likelihood ratio to retain unbiasedness. While IS has been long studied, most of its literature focuses on estimation problems and methodologies to obtain good IS in these contexts. Contrary to these studies, the first half of this thesis provides a systematic study on the efficient use of IS in stochastic optimization. In Chapter 2, we propose an adaptive procedure that converts an efficient IS for gradient estimation to an efficient IS procedure for stochastic optimization. Then, in Chapter 3, we provide an efficient IS for gradient estimation, which serves as the input for the procedure in Chapter 2.

In the second half of this thesis, we study uncertainty quantification in the sense of constructing a confidence interval (CI) for target model quantities or prediction. We are interested in the setting of expensive black-box models, which means that we are confined to using a low number of model runs, and we also lack the ability to obtain auxiliary model information such as gradients. In this case, a classical method is batching, which divides data into a few batches and then constructs a CI based on the batched estimates. Another method is the recently proposed cheap bootstrap that is constructed on a few resamples in a similar manner as batching.

These methods could save computation since they do not need an accurate variability estimator which requires sufficient model evaluations to obtain. Instead, they cancel out the variability when constructing pivotal statistics, and thus obtain asymptotically valid t-distribution-based CIs with only few batches or resamples. The second half of this thesis studies several theoretical aspects of these computation-aware CI construction methods. In Chapter 4, we study the statistical optimality on CI tightness among various computation-aware CIs. Then, in Chapter 5, we study the higher-order coverage errors of batching methods. Finally, Chapter 6 is a related investigation on the higher-order coverage and correction of distributionally robust optimization (DRO) as another CI construction tool, which assumes an amount of analytical information on the model but bears similarity to Chapter 5 in terms of analysis techniques.

Identiferoai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/kt5b-1852
Date January 2024
CreatorsHe, Shengyi
Source SetsColumbia University
LanguageEnglish
Detected LanguageEnglish
TypeTheses

Page generated in 0.0081 seconds