Global ETD Search

1	Defining and predicting fast-selling clothing options Jesperson, Sara January 2019 (has links) This thesis aims to find a definition of fast-selling clothing options and to find a way to predict them using only a few weeks of sale data as input. The data used for this project contain daily sales and intake quantity for seasonal options, with sale start 2016-2018, provided by the department store chain Åhléns. A definition is found to describe fast-selling clothing options as those having sold a certain percentage of their intake after a fixed number of days. An alternative definition based on cluster affiliation is proven less effective. Two predictive models are tested, the first one being a probabilistic classifier and the second one being a k-nearest neighbor classifier, using the Euclidean distance. The probabilistic model is divided into three steps: transformation, clustering, and classification. The time series are transformed with B-splines to reduce dimensionality, where each time series is represented by a vector with its length and B-spline coefficients. As a tool to improve the quality of the predictions, the B-spline vectors are clustered with a Gaussian mixture model where every cluster is assigned one of the two labels fast-selling or ordinary, thus dividing the clusters into disjoint sets: one containing fast-selling clusters and the other containing ordinary clusters. Lastly, the time series to be predicted are assumed to be Laplace distributed around a B-spline and using the probability distributions provided by the clustering, the posterior probability for each class is used to classify the new observations. In the transformation step, the number of knots for the B-splines are evaluated with cross-validation and the Gaussian mixture models, from the clustering step, are evaluated with the Bayesian information criterion, BIC. The predictive performance of both classifiers is evaluated with accuracy, precision, and recall. The probabilistic model outperforms the k-nearest neighbor model with considerably higher values of accuracy, precision, and recall. The performance of each model is improved by using more data to make the predictions, most prominently with the probabilistic model. Apparel retail B-splines Gaussian mixture model Probabilistic prediction Sales prediction Probability Theory and Statistics Sannolikhetsteori och statistik
2	Prediction Performance of Survival Models Yuan, Yan January 2008 (has links) Statistical models are often used for the prediction of future random variables. There are two types of prediction, point prediction and probabilistic prediction. The prediction accuracy is quantified by performance measures, which are typically based on loss functions. We study the estimators of these performance measures, the prediction error and performance scores, for point and probabilistic predictors, respectively. The focus of this thesis is to assess the prediction performance of survival models that analyze censored survival times. To accommodate censoring, we extend the inverse probability censoring weighting (IPCW) method, thus arbitrary loss functions can be handled. We also develop confidence interval procedures for these performance measures. We compare model-based, apparent loss based and cross-validation estimators of prediction error under model misspecification and variable selection, for absolute relative error loss (in chapter 3) and misclassification error loss (in chapter 4). Simulation results indicate that cross-validation procedures typically produce reliable point estimates and confidence intervals, whereas model-based estimates are often sensitive to model misspecification. The methods are illustrated for two medical contexts in chapter 5. The apparent loss based and cross-validation estimators of performance scores for probabilistic predictor are discussed and illustrated with an example in chapter 6. We also make connections for performance. inverse probability of censoring weights absolute error misclassification error ROC curve point prediction probabilistic prediction Statistics
3	Prediction Performance of Survival Models Yuan, Yan January 2008 (has links) Statistical models are often used for the prediction of future random variables. There are two types of prediction, point prediction and probabilistic prediction. The prediction accuracy is quantified by performance measures, which are typically based on loss functions. We study the estimators of these performance measures, the prediction error and performance scores, for point and probabilistic predictors, respectively. The focus of this thesis is to assess the prediction performance of survival models that analyze censored survival times. To accommodate censoring, we extend the inverse probability censoring weighting (IPCW) method, thus arbitrary loss functions can be handled. We also develop confidence interval procedures for these performance measures. We compare model-based, apparent loss based and cross-validation estimators of prediction error under model misspecification and variable selection, for absolute relative error loss (in chapter 3) and misclassification error loss (in chapter 4). Simulation results indicate that cross-validation procedures typically produce reliable point estimates and confidence intervals, whereas model-based estimates are often sensitive to model misspecification. The methods are illustrated for two medical contexts in chapter 5. The apparent loss based and cross-validation estimators of performance scores for probabilistic predictor are discussed and illustrated with an example in chapter 6. We also make connections for performance. inverse probability of censoring weights absolute error misclassification error ROC curve point prediction probabilistic prediction Statistics
4	Data-driven decision support in digital retailing Sweidan, Dirar January 2023 (has links) In the digital era and advent of artificial intelligence, digital retailing has emerged as a notable shift in commerce. It empowers e-tailers with data-driven insights and predictive models to navigate a variety of challenges, driving informed decision-making and strategic formulation. While predictive models are fundamental for making data-driven decisions, this thesis spotlights binary classifiers as a central focus. These classifiers reveal the complexities of two real-world problems, marked by their particular properties. Specifically, binary decisions are made based on predictions, relying solely on predicted class labels is insufficient because of the variations in classification accuracy. Furthermore, prediction outcomes have different costs associated with making different mistakes, which impacts the utility. To confront these challenges, probabilistic predictions, often unexplored or uncalibrated, is a promising alternative to class labels. Therefore, machine learning modelling and calibration techniques are explored, employing benchmark data sets alongside empirical studies grounded in industrial contexts. These studies analyse predictions and their associated probabilities across diverse data segments and settings. The thesis found, as a proof of concept, that specific algorithms inherently possess calibration while others, with calibrated probabilities, demonstrate reliability. In both cases, the thesis concludes that utilising top predictions with the highest probabilities increases the precision level and minimises the false positives. In addition, adopting well-calibrated probabilities is a powerful alternative to mere class labels. Consequently, by transforming probabilities into reliable confidence values through classification with a rejection option, a pathway emerges wherein confident and reliable predictions take centre stage in decision-making. This enables e-tailers to form distinct strategies based on these predictions and optimise their utility. This thesis highlights the value of calibrated models and probabilistic prediction and emphasises their significance in enhancing decision-making. The findings have practical implications for e-tailers leveraging data-driven decision support. Future research should focus on producing an automated system that prioritises high and well-calibrated probability predictions while discarding others and optimising utilities based on the costs and gains associated with the different prediction outcomes to enhance decision support for e-tailers. / <p>The current thesis is a part of the industrial graduate school in digital retailing (INSiDR) at the University of Borås and funded by the Swedish Knowledge Foundation.</p> Digital Retailing Decision Support Probabilistic Prediction Calibration Product Returns Customer Churn Binary Classification Scikit-Learn Other Computer and Information Science Annan data- och informationsvetenskap Computer Sciences Datavetenskap (datalogi) Computer Systems Datorsystem Software Engineering Programvaruteknik Business Administration Företagsekonomi

1

Page generated in 0.0842 seconds