In this thesis, we examine machine learning as a tool for predicting new customers in a B2B sales context. Using only publicly available information, we approach the problem in two ways: 1) a naive clustering-based classifier built on K-means and 2) PU-learning with a random forest adapter. We test these models with different feature sets and evaluate them using statistical measures and a discussion of the business implications. Our main finding is that PU-learning can produce results that are satisfactory for the purpose of improving the sales process, performing 4.8 times better than a random baseline classifier in the best case. The clustering-based classifier, however, was not good enough: even in its best case it produced only marginally better results than a random classifier. We also find that using more variables improved the models, even in high-dimensional spaces with over 60 variables.
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:kth-210256 |
Date | January 2017 |
Creators | Norlin, Patrik, Paulsrud, Viktor |
Publisher | KTH, Skolan för datavetenskap och kommunikation (CSC) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |