Framework to Evaluate Entropy Based Data Fusion Methods in Supply Chain Management
Tran, Huong Thi
12 1900
This dissertation explores data fusion methodology for deducing an overall inference from data gathered from multiple heterogeneous sources. If a single source provided reliable and unbiased data, data fusion would not be necessary. Data fusion methodology combines data from multiple diverse sources so that the estimate of the desired quantity, such as the population mean, is improved despite redundancies, inaccuracies, biases, and inflated variability in the data. Examples of data fusion include estimating average demand from similar sources and integrating fatality counts from different media sources after a catastrophe. The approach in this study combines "inputs" from distinct sources so that the information is "fused." Another way of describing this process is "data integration."

Important assumptions are:
1. Several sources provide "inputs" for information used to estimate parameters of a probability distribution.
2. Because the distributions of the data differ across sources, some sources are less reliable than others.
3. Distortions, bias, censorship, and systematic errors may be more prominent in data from certain sources.
4. The sample size of the source data (the number of "inputs") may be very small.

Examples of information from multiple sources are abundant: traffic information from sensors at intersections, economic indicators from various sources, demand data for a product from similar retail stores, polling data from various sources, and fatality counts from different media sources after a catastrophic event.

This dissertation seeks to fill a gap in the operations literature by addressing three research questions regarding entropy-based data fusion (EBDF) approaches to estimation. Three separate but unified essays address these research questions.

Essay 1 provides an overview of supporting literature for the research questions.
A numerical analysis of airline maximum wait time data illustrates the underlying issues involved in EBDF methods. This essay addresses the research question: Why consider alternative entropy-based weighting methods?

Essay 2 introduces 13 data fusion methods. A Monte Carlo simulation study examines the performance of these methods in estimating the mean of a population with either a normal or a lognormal distribution. This essay addresses the following research questions:
1. Can an alternative formulation of Shannon's entropy enhance the performance of Sheu's (2010) data fusion approach?
2. Do symmetric and skewed distributions affect the 13 data fusion methods differently?
3. Do negative and positive biases affect the performance of the 13 methods differently?
4. Do entropy-based data fusion methods outperform non-entropy-based data fusion methods?
5. Which data fusion methods are recommended for symmetric and skewed data sets when no bias is present? What is the recommendation when only a few data sources are available?

Essay 3 explores the use of the data fusion estimates of the population mean in a newsvendor problem. A Monte Carlo simulation study investigates the accuracy of using the estimates from Essay 2 as the parameter estimate for a demand distribution that is exponential. This essay addresses the following research questions:
1. Do data fusion methods that perform relatively well in estimating the population mean also perform relatively well in estimating the optimal demand under a given ratio of overage and underage costs?
2. Do any of the data fusion methods deteriorate or improve with the introduction of positive and negative bias?
3. Do the alternative formulations of Shannon's entropy enhance the performance of the methods on a relative basis?
4. Does the relative rank ordering of the data fusion methods by performance differ between Essay 2 and Essay 3?

The contribution of this research is to introduce alternative EBDF methods and to establish a framework for using EBDF methods in supply chain decision making. A comparative Monte Carlo simulation study provides a basis for investigating the robustness of the proposed data fusion methods for estimating population parameters in a newsvendor problem with a known distribution but an unknown parameter. A sensitivity analysis is conducted to determine the effects of multiple sources, sample size, and distributions.
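
The link between a mean estimate and the newsvendor decision in Essay 3 can be illustrated with a minimal sketch. It assumes the textbook newsvendor rule (order up to the critical ratio of underage cost to total cost) applied to exponentially distributed demand; the function names, the cost values, and the 20% positive bias are hypothetical choices made for illustration and are not taken from the dissertation.

```python
import math

def exponential_newsvendor_quantity(mean_demand, underage_cost, overage_cost):
    """Order quantity Q solving F(Q) = cu / (cu + co) for exponential demand."""
    if mean_demand <= 0 or underage_cost <= 0 or overage_cost <= 0:
        raise ValueError("mean demand and both costs must be positive")
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # Exponential CDF: F(Q) = 1 - exp(-Q / mean), so Q = -mean * ln(1 - ratio).
    return -mean_demand * math.log(1.0 - critical_ratio)

def expected_cost(order_qty, true_mean, underage_cost, overage_cost):
    """Expected overage plus underage cost when demand is exponential(true_mean)."""
    expected_shortage = true_mean * math.exp(-order_qty / true_mean)  # E[(D - Q)+]
    expected_leftover = order_qty - true_mean + expected_shortage     # E[(Q - D)+]
    return overage_cost * expected_leftover + underage_cost * expected_shortage

# Hypothetical values: true mean demand 100, underage cost 3, overage cost 1.
true_mean = 100.0
cu, co = 3.0, 1.0

# Quantity from the true mean versus a mean estimate with +20% bias (assumed).
q_opt = exponential_newsvendor_quantity(true_mean, cu, co)
q_biased = exponential_newsvendor_quantity(1.2 * true_mean, cu, co)

print(q_opt, expected_cost(q_opt, true_mean, cu, co))
print(q_biased, expected_cost(q_biased, true_mean, cu, co))
```

Under these assumptions the order quantity is proportional to the estimated mean, so any relative bias in a fused mean estimate carries over one-for-one to the order quantity, while its effect on expected cost depends on the cost ratio.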