Return to search

Tractable approximation algorithms for high dimensional sequential optimization problems,

Sequential decision making problems are ubiquitous in a number of research areas such as operations research, finance, engineering and computer science. The main challenge with these problems comes from the fact that, firstly, there is uncertainty about the future. And secondly, decisions have to be made over a period of time, sequentially. These problems, in many cases, are modeled as Markov Decision Process (MDP). Most real-life MDPs are ‘high dimensional’ in nature making them challenging from a numerical point of view. We consider a number of such high dimensional MDPs. In some cases such problems can be approximately solved using Approximate Dynamic Programming. In other cases problem specific analysis can be solved to device tractable policies that are near-optimal. In Chapter 2, we presents a novel and practical non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful, dimension-independent approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable replacement to state of the art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by ‘kernelizing’ a recent mathematical program for ADP (the ‘smoothed’ approximate LP) proposed by [Desai et al., 2011]. In Chapter 3, we consider a class of stochastic control problems where the action space at each time can be described by a class of matching or, more generally, network flow polytopes. Special cases of this class of dynamic matching problems include many problems that are well-studied in the literature, such as: (i) online keyword matching in Internet advertising (the adwords problem); (ii) the bipartite matching of donated kidneys from cadavers to recipients; and (iii) the allocation of donated kidneys through exchanges over cycles of live donor-patient pairs. We provide an approximate dynamic program (ADP) algorithm for dynamic matching with stochastic arrivals and departures. Our framework is more general than the methods prevalent in the literature in that it is applicable to a broad range of problems characterized by a variety of action polytopes and generic arrival and departure processes. In Chapter 4, we consider the problem of A-B testing when the impact of the treatment is marred by a large number of covariates. Randomization can be highly inefficient in such settings, and thus we consider the problem of optimally allocating test subjects to either treatment with a view to maximizing the efficiency of our estimate of the treatment effect. Our main contribution is a tractable algorithm for this problem in the online setting, where subjects arrive, and must be assigned, sequentially. We characterize the value of optimized allocations relative to randomized allocations and show that this value grows large as the number of covariates grows. In particular, we show that there is a lot to be gained from ‘optimizing’ the process of A-B testing relative to the simple randomized trials that are the mainstay of A-B testing in the ‘big data’ regime of modern e-commerce applications, where the number of covariates is often comparable to the number of experimental trials.

Identiferoai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/D8JQ10W8
Date January 2016
CreatorsBhat, Nikhil
Source SetsColumbia University
LanguageEnglish
Detected LanguageEnglish
TypeTheses

Page generated in 0.0025 seconds