Return to search

Predicting Risk of Delays in Postal Deliveries with Neural Networks and Gradient Boosting Machines / Predicering av risk för förseningar av leveranser med neurala nätverk och gradient boosting machines

This thesis conducts a study on a data set from the Swedish and Danish postal service Postnord, comparing an artificial neural network (ANN) and a gradient boosting machine (GBM) for predicting delays in package deliveries. The models are evaluated based on F1-score for the important class which represents the data points that are delayed and needed to be identified. The GBM is already implemented and tuned using grid search by Postnord, the ANN is tuned using sequential model based optimization with the tree Parzen estimator function. Furthermore, it is trained using dynamic resampling to handle the imbalanced data set. Even with several measures implemented to handle the class imbalance, the ANN performs poorly when tested on unseen data, unlike the GBM. The GBM has high precision (84%) and decent recall (24%), which produces a F1-score of 0.38. The ANN has high recall (62%) but extremely low precision (5%) which gives a F1-score of 0.08, indicating that it is biased to predict sample as delayed when it is in time. The GBM has a natural handling of class imbalance unlike the ANN, and even with measures taken to improve the ANN and its handling of class imbalance, GBM performs better.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-169478
Date January 2020
CreatorsSöderholm, Matilda
PublisherLinköpings universitet, Institutionen för datavetenskap
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0018 seconds