Global ETD Search


Churn Prediction

Churn analysis is an important tool for companies because it can reduce the costs related to customer churn. Churn prediction is the process of identifying users before they churn. It is done by applying methods to collected data in order to find patterns that help predict new churners in the future.

The objective of this report is to identify churners using surveys collected from different golf clubs, their members and their guests. This was accomplished by testing several supervised machine learning algorithms to separate the classes and to see which algorithms are most suitable for this kind of data. The success criterion was an accuracy higher than the proportion of the majority class in the dataset. The data was processed using label encoding, one-hot encoding and principal component analysis, and was split into 10 folds: 9 training folds and 1 test fold. Iterating 10 times while rotating the test and training folds gave 10-fold cross-validation. Each algorithm processed the training data to create a classifier, which was then tested on the test data. The classifiers used in the project were k-nearest neighbours, support vector machine, multi-layer perceptron, decision trees and random forest.

The classifiers generally had an accuracy of around 72%, and the best classifier, random forest, had an accuracy of 75%. All classifiers had an accuracy above the success criterion. K-fold cross-validation, confusion matrices, classification reports and other internal validation techniques were applied to the data to ensure the quality of the classifiers. The project was a success, although there is a strong belief that its bottleneck was the quality of the data: new legislation on collecting and storing data resulted in redundant and faulty data.
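The evaluation protocol described in the abstract (10-fold cross-validation, with success defined as beating the majority-class share) can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the toy labels, the single numeric feature and the placeholder `fit`/`predict` pair are all assumptions made for the example, and the thesis itself presumably relied on a machine learning library for the five classifiers it names.

```python
import random
from collections import Counter


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most common class."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(features, labels, fit, predict, k=10):
    """Mean test accuracy over k folds for any fit/predict pair."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data: 70 "stay" and 30 "churn" labels, one numeric feature.
labels = ["stay"] * 70 + ["churn"] * 30
features = [[0.0]] * 70 + [[1.0]] * 30


# Placeholder classifier (assumption): remembers the most common label
# seen for each feature value in the training folds.
def fit(X, y):
    by_value = {}
    for row, label in zip(X, y):
        by_value.setdefault(row[0], Counter())[label] += 1
    return {value: counts.most_common(1)[0][0]
            for value, counts in by_value.items()}


def predict(model, row):
    return model.get(row[0])


baseline = majority_baseline(labels)
accuracy = cross_validated_accuracy(features, labels, fit, predict)
print(f"baseline={baseline:.2f} accuracy={accuracy:.2f} "
      f"passed={accuracy > baseline}")
```

The majority-class share is the natural bar here because a classifier that always answers "stay" already reaches it without learning anything; only accuracy above that share shows the model has picked up a pattern in the survey data.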

http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-41236

Identifier	oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:hh-41236
Date	January 2019
Creators	Åkermark, Alexander, Hallefält, Mattias
Publisher	Högskolan i Halmstad, Akademin för informationsteknologi
Source Sets	DiVA Archive at Uppsala University
Language	English
Detected Language	Swedish
Type	Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format	application/pdf
Rights	info:eu-repo/semantics/openAccess
