Global ETD Search

Return to search

Statistical Design of Sequential Decision Making Algorithms

Sequential decision-making is a fundamental class of problem that motivates algorithm designs of online machine learning and reinforcement learning. Arguably, the resulting online algorithms have supported modern online service industries for their data-driven real-time automated decision making. The applications span across different industries, including dynamic pricing (Marketing), recommendation (Advertising), and dosage finding (Clinical Trial). In this dissertation, we contribute fundamental statistical design advances for sequential decision-making algorithms, leaping progress in theory and application of online learning and sequential decision making under uncertainty including online sparse learning, finite-armed bandits, and high-dimensional online decision making. Our work locates at the intersection of decision-making algorithm designs, online statistical machine learning, and operations research, contributing new algorithms, theory, and insights to diverse fields including optimization, statistics, and machine learning.
 
In part I, we contribute a theoretical framework of continuous risk monitoring for regularized online statistical learning. Such theoretical framework is desirable for modern online service industries on monitoring deployed model's performance of online machine learning task. In the first project (Chapter 1), we develop continuous risk monitoring for the online Lasso procedure and provide an always-valid algorithm for high-dimensional dynamic pricing problems. In the second project (Chapter 2), we develop continuous risk monitoring for online matrix regression and provide new algorithms for rank-constrained online matrix completion problems. Such theoretical advances are due to our elegant interplay between non-asymptotic martingale concentration theory and regularized online statistical machine learning.
 
In part II, we contribute a bootstrap-based methodology for finite-armed bandit problems, termed Residual Bootstrap exploration. Such a method opens a possibility to design model-agnostic bandit algorithms without problem-adaptive optimism-engineering and instance-specific prior-tuning. In the first project (Chapter 3), we develop residual bootstrap exploration for multi-armed bandit algorithms and shows its easy generalizability to bandit problems with complex or ambiguous reward structure. In the second project (Chapter 4), we develop a theoretical framework for residual bootstrap exploration in linear bandit with fixed action set. Such methodology advances are due to our development of non-asymptotic theory for the bootstrap procedure.
 
In part III, we contribute application-driven insights on the exploration-exploitation dilemma for high-dimensional online decision-making problems. Such insights help practitioners to implement effective high-dimensional statistics methods to solve online decisionmaking problems. In the first project (Chapter 5), we develop a bandit sampling scheme for online batch high-dimensional decision making, a practical scenario in interactive marketing, and sequential clinical trials. In the second project (Chapter 6), we develop a bandit sampling scheme for federated online high-dimensional decision-making to maintain data decentralization and perform collaborated decisions. These new insights are due to our new bandit sampling design to address application-driven exploration-exploitation trade-offs effectively.

10.25394/pgs.19668630.v1

Statistics

Decision Making

Sequential decision making

LASSO regression models

Bandit Algorithms

high-dimensional statistics

bootstrap resampling method

exploration-exploitation trade-off

Dynamic pricing

multi armed bandit

Contextual bandits

regularization method

Online Machine Learning

statistical learning methods

martingale

Identifer	oai:union.ndltd.org:purdue.edu/oai:figshare.com:article/19668630
Date	27 April 2022
Creators	Chi-hua Wang (12469251)
Source Sets	Purdue University
Detected Language	English
Type	Text, Thesis
Rights	CC BY 4.0
Relation	https://figshare.com/articles/thesis/Statistical_Design_of_Sequential_Decision_Making_Algorithms/19668630

Page generated in 0.0023 seconds

Statistical Design of Sequential Decision Making Algorithms

Description

Links & Downloads

Tags

Additional Fields