Global ETD Search

Return to search

Online and offline learning in operations

Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, September, 2020 / Cataloged from PDF version of thesis. / Includes bibliographical references (pages 213-219). / With the rapid advancement of information technology and accelerated development of data science, the importance of integrating data into decision-making has never been stronger. In this thesis, we propose data-driven algorithms to incorporate learning from data in three operations problems, concerning both online learning and offline learning settings. First, we study a single product pricing problem with demand censoring in an offline data-driven setting. In this problem, a retailer is given a finite level of inventory, and faces a random demand that is price sensitive in a linear fashion with unknown parameters and distribution. Any unsatisfied demand is lost and unobservable. The retailer's objective is to use offline censored demand data to find an optimal price, maximizing her expected revenue with finite inventories. / We characterize an exact condition for the identifiability of near-optimal algorithms, and propose a data-driven algorithm that guarantees near-optimality in the identifiable case and approaches best-achievable optimality gap in the unidentifiable case. Next, we study the classic multi-period joint pricing and inventory control problem in an offline data-driven setting. We assume the demand functions and noise distributions are unknown, and propose a data-driven approximation algorithm, which uses offline demand data to solve the joint pricing and inventory control problem. We establish a polynomial sample complexity bound, the number of data samples needed to guarantee a near-optimal profit. A simulation study suggests that the data-driven algorithm solves the dynamic program effectively. Finally, we study an online learning problem for product selection in urban warehouses managed by fast-delivery retailers. We distill the problem into a semi-bandit model with linear generalization. / There are n products, each with a feature vector of dimension T. In each of the T periods, a retailer selects K products to offer, where T is much greater than T or b. We propose an online learning algorithm that iteratively shrinks the upper confidence bounds within each period. Compared to the standard UCB algorithm, we prove the new algorithm reduces the most dominant regret term by a factor of d, and experiments on datasets from Alibaba Group suggest it lowers the total regret by at least 10%.. / by Li Wang. / Ph. D. / Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center

Operations Research Center.

Identifer	oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/129080
Date	January 2020
Creators	Wang, Li,Ph D.Massachusetts Institute of Technology.
Contributors	David Simchi-Levi., Massachusetts Institute of Technology. Operations Research Center., Massachusetts Institute of Technology. Operations Research Center
Publisher	Massachusetts Institute of Technology
Source Sets	M.I.T. Theses and Dissertation
Language	English
Detected Language	English
Type	Thesis
Format	219 pages, application/pdf
Rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided., http://dspace.mit.edu/handle/1721.1/7582

Page generated in 0.0024 seconds

Online and offline learning in operations

Description

Links & Downloads

Tags

Additional Fields