Global ETD Search

291	Machine learning for problems with missing and uncertain data with applications to personalized medicine Pawlowski, Colin. January 2019 (has links) This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019 / Cataloged from student-submitted PDF version of thesis. / Includes bibliographical references (pages 205-215). / When we try to apply statistical learning in real-world applications, we frequently encounter data which include missing and uncertain values. This thesis explores the problem of learning from missing and uncertain data with a focus on applications in personalized medicine. In the first chapter, we present a framework for classification when data is uncertain that is based upon robust optimization. We show that adding robustness in both the features and labels results in tractable optimization problems for three widely used classification methods: support vector machines, logistic regression, and decision trees. Through experiments on 75 benchmark data sets, we characterize the learning tasks for which adding robustness provides the most value. In the second chapter, we develop a family of methods for missing data imputation based upon predictive methods and formal optimization. / We present formulations for models based on K-nearest neighbors, support vector machines, and decision trees, and we develop an algorithm OptImpute to find high quality solutions which scales to large data sets. In experiments on 84 benchmark data sets, we show that OptImpute outperforms state-of-the-art methods in both imputation accuracy and performance on downstream tasks. In the third chapter, we develop MedImpute, an extension of OptImpute specialized for imputing missing values in multivariate panel data. 
This method is tailored for data sets that have multiple observations of the same individual at different points in time. In experiments on the Framingham Heart Study and Dana Farber Cancer Institute electronic health record data, we demonstrate that MedImpute improves the accuracy of models predicting 10-year risk of stroke and 60-day risk of mortality for late-stage cancer patients. / In the fourth chapter, we develop a method for tensor completion which leverages noisy side information available on the rows and/or columns of the tensor. We apply this method to the task of predicting anti-cancer drug response at particular dosages. We demonstrate significant gains in out-of-sample accuracy filling in missing values on two large-scale anticancer drug screening data sets with genomic side information. / by Colin Pawlowski. / Ph. D. / Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center Operations Research Center.
292	Advances in data-driven models for transportation Ng, Yee Sian. January 2019 (has links) This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019 / Cataloged from student-submitted PDF version of thesis. / Includes bibliographical references (pages 163-176). / With the rising popularity of ride-sharing and alternative modes of transportation, there has been a renewed interest in transit planning to improve service quality and stem declining ridership. However, it often takes months of manual planning for operators to redesign and reschedule services in response to changing needs. To this end, we provide four models of transportation planning that are based on data and driven by optimization. A key aspect is the ability to provide certificates of optimality, while being practical in generating high-quality solutions in a short amount of time. We provide approaches to combinatorial problems in transit planning that scale up to city-sized networks. In transit network design, current tractable approaches only consider edges that exist, resulting in proposals that are closely tethered to the original network. We allow new transit links to be proposed and account for commuters transferring between different services. In integrated transit scheduling, we provide a way for transit providers to synchronize the timing of services in multimodal networks while ensuring regularity in the timetables of the individual services. This is made possible by taking the characteristics of transit demand patterns into account when designing tractable formulations. We also advance the state of the art in demand models for transportation optimization. 
In emergency medical services, we provide data-driven formulations that outperform their probabilistic counterparts in ensuring coverage. This is achieved by replacing independence assumptions in probabilistic models and capturing the interactions of services in overlapping regions. In transit planning, we provide a unified framework that allows us to optimize frequencies and prices jointly in transit networks to minimize total waiting time. / by Yee Sian Ng. / Ph. D. / Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center Operations Research Center.
293	Detecting food safety risks and human trafficking using interpretable machine learning methods Zhu, Jessica H. January 2019 (has links) This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / Thesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019 / Cataloged from student-submitted PDF version of thesis. / Includes bibliographical references (pages 75-80). / Black box machine learning methods have allowed researchers to design accurate models using large amounts of data at the cost of interpretability. Model interpretability not only improves user buy-in, but in many cases provides users with important information. Especially in the case of the classification problems addressed in this thesis, the ideal model should not only provide accurate predictions, but should also inform users of how features affect the results. My research goal is to solve real-world problems and compare how different classification models affect the outcomes and interpretability. To this end, this thesis is divided into two parts: food safety risk analysis and human trafficking detection. The first half analyzes the characteristics of supermarket suppliers in China that indicate a high risk of food safety violations. Contrary to expectations, supply chain dispersion, internal inspections, and quality certification systems are not found to be predictive of food safety risk in our data. The second half focuses on identifying human trafficking, specifically sex trafficking, advertisements hidden amongst online classified escort service advertisements. We propose a novel but interpretable keyword detection and modeling pipeline that is more accurate and actionable than current neural network approaches. 
The algorithms and applications presented in this thesis succeed in providing users with not just classifications but also the characteristics that indicate food safety risk and human trafficking ads. / by Jessica H. Zhu. / S.M. / S.M. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center Operations Research Center.
294	Resource scheduling and optimization in dynamic and complex transportation settings Mellou, Konstantina. January 2019 (has links) This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. / Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019 / Cataloged from student-submitted PDF version of thesis. / Includes bibliographical references (pages 145-151). / Resource optimization has always been a challenge both in traditional fields, such as logistics, and particularly so in most emerging systems in the sharing economy. These systems are by definition founded on the sharing of resources among users, which naturally creates many coordination needs as well as challenges to ensure enough resource supply to cover customer demand. This thesis addresses these challenges in the application of vehicle sharing systems, as well as in the context of multi-operation companies that provide a wide range of services to their users. More specifically, the first part of this thesis focuses on models and algorithms for the optimization of bike sharing systems. Shortage of bikes and docks is a common issue in bike sharing systems, and, to tackle this problem, operators use a fleet of vehicles to redistribute bikes across the network. / We study multiple aspects of these operations, and develop models that can capture all user trips that are performed successfully in the system, as well as algorithms that generate complete redistribution plans for the operators to maximize the served demand, in running times that are fast enough to allow real-time information to be taken into account. 
Furthermore, we propose an approach for the estimation of the actual user demand which takes into account both the lost demand (users that left the system due to lack of bikes or docks) and shifted demand (users that had to walk to nearby stations to find available resources). More accurate demand representations can then be used to inform better decisions for the daily operations, as well as the long-term planning of the system. The second part of this thesis is focused on schedule generation for resources of large companies that must support a complex set of operations. / Different operation types come with a variety of constraints and requirements that need to be taken into account. Moreover, specialized employees with a variety of skills and experience levels are required, along with a heterogeneous fleet of vehicles with various properties (e.g., refrigerator vehicles). We introduce the Complex Event Scheduling Problem (CESP), which captures known problems such as pickup-and-delivery and technician scheduling as special cases. We then develop a unified optimization framework for CESP, which relies on a combination of metaheuristics (ALNS) and Linear Programming. Our experiments show that our framework scales to large problem instances, and may help companies and organizations improve operation efficiency (e.g., reduce fleet size). / by Konstantina Mellou. / Ph. D. / Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center Operations Research Center.
295	Demand uncensored : car-sharing mobility services using data-driven and simulation-based techniques / Car-sharing mobility services using data-driven and simulation-based techniques Fields, Evan (Evan Jerome) January 2019 (has links) Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2019 / Cataloged from PDF version of thesis. / Includes bibliographical references (pages 141-145). / In the design and operation of urban mobility systems, it is often desirable to understand patterns in traveler demand. However, demand is typically unobserved and must be estimated from available data. To address this disconnect, we begin by proposing a method for recovering an unknown probability distribution given a censored or truncated sample from that distribution. The proposed method is a novel and conceptually simple detruncation technique based on sampling the observed data according to weights learned by solving a simulation-based optimization problem; this method is especially appropriate in cases where little analytic information about the unknown distribution is available but the truncation process can be simulated. / The proposed method is compared to the ubiquitous maximum likelihood (MLE) method in a variety of synthetic validation experiments where it is found that the proposed method performs slightly worse than perfectly specified MLE and competitively with slightly misspecified MLE. We then describe a novel car-sharing simulator which captures many of the important interactions between supply, demand, and system utilization while remaining simple and computationally efficient. In collaboration with Zipcar, a leading car-sharing operator in the United States, we demonstrate the usefulness of our detruncation method combined with our simulator via a pair of case studies. 
These tools allow us to estimate demand for round trip car-sharing services in the Boston and New York metropolitan areas, and the inferred demand distributions contain actionable insights. / Finally, we extend the detruncation method to cover cases where data is noisy, missing, or must be combined from different sources such as web or mobile applications. In synthetic validation experiments, the extended method is benchmarked against kernel density estimation (KDE) with Gaussian kernels. We find that the proposed method typically outperforms KDE, especially when the distribution to be estimated is not unimodal. With this extended method we consider the added utility of search data when estimating demand for car-sharing. / by Evan Fields. / Ph. D. / Ph.D. Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center Operations Research Center.
296	Examining financial puzzles from an evolutionary perspective Guo, Kenrick January 2006 (has links) Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2006. / Includes bibliographical references (leaves 74-79). / In this thesis, we examine some puzzles in finance from an evolutionary perspective. We first provide a literature review of evolutionary psychology, and discuss three main findings: the frequentist hypothesis, applications from risk-sensitive optimal foraging theory, and the cheater detection hypothesis. Next we introduce some of the most-researched puzzles in the finance literature. Examples include overreaction, loss aversion, and the equity premium puzzle. Following this, we discuss risk-sensitive optimal foraging theory further and examine some of the financial puzzles using the framework of risk-sensitive foraging. Finally, we develop a dynamic patch selection model which gives the patch selection strategy that maximizes an organism's long-run probability of survival. It is from this optimal patch strategy that we observe loss aversion. Throughout the thesis, we stress the following: humans' behavior in financial markets is neither inherently irrational, nor is it rational. Rather, the puzzles occur as a consequence of evolutionarily-optimal cognitive mechanisms being utilized in environments other than the ancestral domain in which they evolved. / by Kenrick Guo. / S.M. Operations Research Center.
297	UAV mission planning under uncertainty / Unmanned Aerial Vehicles mission planning under uncertainty Sakamoto, Philemon January 2006 (has links) Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2006. / Includes bibliographical references (p. 205-209). / With the continued development of high endurance Unmanned Aerial Vehicles (UAV) and Unmanned Combat Aerial Vehicles (UCAV) that are capable of performing autonomous functions across the spectrum of military operations, one can envision a future military in which Air Component Commanders control forces comprised exclusively of unmanned vehicles. In order to properly manage and fully realize the capabilities of this UAV force, a control system must be in place that directs UAVs to targets and coordinates missions in a manner that provides an efficient allocation of resources. Additionally, a mission planner should account for the uncertainty inherent in the operations. Uncertainty, or stochasticity, manifests itself in most operations known to man. In the battlefield, such unknowns are especially real; the phenomenon is known as the fog of war. A good planner should develop plans that provide an efficient allocation of resources and take advantage of the system's true potential, while still providing ample "robustness" in plans so that they are more likely executable and for a longer period of time. / In this research, we develop a UAV Mission Planner that couples the scheduling of tasks with the assignment of these tasks to UAVs, while maintaining the characteristics of longevity and efficiency in its plans. The planner is formulated as a Mixed Integer Program (MIP) that incorporates the Robust Optimization technique proposed by Bertsimas and Sim [12]. / by Philemon Sakamoto. / S.M. Operations Research Center.
298	No-arbitrage bounds on American Put Options with a single maturity Shah, Premal (Premal Y.) January 2006 (has links) Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2006. / Includes bibliographical references (p. 63-64). / We consider in this thesis the problem of pricing American Put Options in a model-free framework where we do not make any assumptions about the price dynamics of the underlying except those implied by the no-arbitrage conditions. Our goal is to obtain bounds on the price of an American put option with a given strike and maturity directly from the prices of other American put options with the same maturity but different strikes and the current price of the underlying. We proceed by first investigating the structural properties of the price curve of American Put Options of a fixed maturity and derive necessary and sufficient conditions that strike-price pairs of these options must satisfy in order to exclude arbitrage. Using these conditions, we can find tight bounds on the price of the option of interest by solving a very tractable Linear Programming Problem. We then apply the methods developed to real market data. We observe that the quality of bounds that we obtain compares well with the quoted bid-ask spreads in most cases. / by Premal Shah. / S.M. Operations Research Center.
299	Inventory planning for low demand items in online retailing Chhaochhria, Pallav January 2007 (has links) Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2007. / Includes bibliographical references (p. 81). / A large online retailer strategically stocks inventory for SKUs with low demand. The motivations are to provide a wide range of selections and faster customer fulfillment service. We assume the online retailer has the technological capability to manage and control the inventory globally: all warehouses act as one to serve the global demand simultaneously. The online retailer will utilize its entire inventory, regardless of location, to serve demand. We study inventory allocation and order fulfillment policies among warehouses for low-demand SKUs at an online retailer. Thus, given the global demand and an order fulfillment policy, there are tradeoffs involving inventory holding costs, transportation costs, and backordering costs in determining the optimal system inventory level and allocation of inventory to warehouses. For the case of Poisson demand and constant replenishment lead time, we develop methods to approximate the key system performance metrics like transshipment, backorders and average system inventory for one-for-one replenishment policies when warehouses hold exactly one unit of inventory. We run computational experiments to test the accuracy of the approximation. We develop extensions for cases when more than one unit of inventory is held at a warehouse. / We then use these results to develop guidelines for inventory stocking and order fulfillment policies for online retailers. We also compare warehouse allocation policies for conditions when an order arrives but the preferred warehouse does not have stock although there is stock at more than one other location in the system. 
We develop intuition about the performance of these policies and run simulations to verify our hypotheses about these policies. / by Pallav Chhaochhria. / S.M. Operations Research Center.
300	Dynamic planning under uncertainty for theater airlift operations Martin, Kiel M. (Kiel Michael) January 2007 (has links) Thesis (S.M.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2007. / Includes bibliographical references (p. 92-93). / In this thesis, we analyze intratheater airlift operations, and propose methods to improve the planning process. The United States Air Mobility Command is responsible for the air component of the worldwide U.S. military logistics network. Due to the current conflict in Iraq, a small cell within Air Mobility Command, known as Theater Direct Delivery, is responsible for supporting ongoing operations by assisting with intratheater airlift. We develop a mathematical programming approach to schedule airlift missions that pick up and deliver prioritized cargo within time windows. In our approach, we employ composite variables to represent entire missions and associated decisions, with each decision variable including information pertaining to the mission routing and scheduling, and assigned aircraft and cargo. We compare our optimization-based approach to one using a greedy heuristic that is representative of the current planning process. Using measures of efficiency and effectiveness, we evaluate and compare the performance of these different approaches. Finally, we adjust selected parameters of our model and measure the resulting changes in operating performance of our solutions, and the required computational effort to generate the solutions. / by Kiel M. Martin. / S.M. Operations Research Center.
