Statistical learning for decision making : interpretability, uncertainty, and inference

Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 183-196).

Data and predictive modeling are an increasingly important part of decision making. Here we present advances in several areas of statistical learning that are important for gaining insight from large amounts of data, and ultimately using predictive models to make better decisions.

The first part of the thesis develops methods and theory for constructing interpretable models from association rules. Interpretability is important for decision makers to understand why a prediction is made. First we show how linear mixtures of rules can be used to make sequential predictions. Then we develop Bayesian Rule Lists, a method for learning small, ordered lists of rules. We apply Bayesian Rule Lists to a large database of patient medical histories and produce a simple, interpretable model that solves an important problem in healthcare, with little sacrifice to accuracy. Finally, we prove a uniform generalization bound for decision lists.

In the second part of the thesis we focus on decision making from sales transaction data. We develop models and inference procedures for using transaction data to estimate quantities such as willingness-to-pay and lost sales due to stock unavailability. We develop a copula estimation procedure for making optimal bundle pricing decisions. We then develop a Bayesian hierarchical model for inferring demand and substitution behaviors from transaction data with stockouts. We show how posterior sampling can be used to directly incorporate model uncertainty into the decisions that will be made using the model.

In the third part of the thesis we propose a method for aggregating relevant information from across the Internet to facilitate informed decision making. Our contributions here include an important theoretical result for Bayesian Sets, a popular method for identifying data that are similar to seed examples. We provide a generalization bound that holds for any data distribution, and moreover is independent of the dimensionality of the feature space. This result justifies the use of Bayesian Sets on high-dimensional problems, and also explains its good performance in settings where its underlying independence assumption does not hold.

by Benjamin Letham.
Ph. D.

Identifier: oai:union.ndltd.org:MIT/oai:dspace.mit.edu:1721.1/98569
Date: January 2015
Creators: Letham, Benjamin
Contributors: Cynthia Rudin., Massachusetts Institute of Technology. Operations Research Center.
Publisher: Massachusetts Institute of Technology
Source Sets: M.I.T. Theses and Dissertation
Language: English
Detected Language: English
Type: Thesis
Format: 196 pages, application/pdf
Rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582
