An ongoing challenge for most businesses is identifying potential customers within their audience. This thesis proposes a method that uses user data to distinguish potential customers from random visitors to a website. The method is based on Predictive Lead Scoring, which segments customers by their likelihood of purchasing a product. Our method, however, aims to predict user conversion, that is, whether a user has the potential to become a customer. Six supervised machine learning models are used to carry out the classification task. To account for the high class imbalance in the input data, multiple resampling methods are applied to the training data. The combination of classifier and resampling method with the highest average precision score is selected as the best model. In addition, this thesis tries to quantify the effect of feature weights by evaluating several feature ranking and weighting schemes. Using these schemes, several sets of weights are produced and evaluated by training a KNN classifier on the weighted features. The change in average precision relative to the original KNN (without weighting) is used as the reference for measuring the performance of the ranking and weighting schemes.
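The evaluation idea in the last sentences of the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the toy data, the weight vector `[1.0, 0.0]`, the choice `k=1`, and the helper names `average_precision` and `knn_scores` are all assumptions made for illustration, and the way tied scores are handled here may differ from library implementations of average precision.

```python
import math

def average_precision(y_true, scores):
    """Mean of precision@rank taken at the rank of each positive example."""
    # Rank examples by descending score (ties keep their original order).
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def knn_scores(train_X, train_y, test_X, weights, k=1):
    """Score = share of positives among the k nearest training points,
    using a feature-weighted Euclidean distance."""
    out = []
    for x in test_X:
        dists = sorted(
            (math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, t))), y)
            for t, y in zip(train_X, train_y)
        )
        out.append(sum(y for _, y in dists[:k]) / k)
    return out

# Hypothetical toy data: feature 0 is informative, feature 1 is large-scale noise.
train_X = [(0.0, 0.0), (0.2, 1.0), (1.0, 10.0), (0.8, 9.0)]
train_y = [0, 0, 1, 1]
test_X = [(0.9, 1.0), (0.1, 9.0), (0.0, 0.5)]
test_y = [1, 0, 0]

# Reference: unweighted KNN (all feature weights equal to 1).
ap_base = average_precision(test_y, knn_scores(train_X, train_y, test_X, [1.0, 1.0]))
# Candidate weighting scheme (assumed): keep feature 0, switch off feature 1.
ap_weighted = average_precision(test_y, knn_scores(train_X, train_y, test_X, [1.0, 0.0]))

# The change in average precision is the measure of the weighting scheme.
ap_change = ap_weighted - ap_base
print(ap_base, ap_weighted, ap_change)
```

On this toy data the noisy feature dominates the unweighted distance, so down-weighting it raises the average precision; a real comparison would use held-out data and weights produced by an actual ranking scheme.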
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-176433 |
Date | January 2021 |
Creators | Etminan, Ali |
Publisher | Linköpings universitet, Statistik och maskininlärning |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |