Global ETD Search

21	Users & Usage : Hur kan minskad churn uppnås i molntjänster med abonnemangsupplägg genom att analysera användarbeteenden / Users & Usage: How can reduced churn be achieved in subscription-based cloud services by analyzing user behavior WENNEBORG, ERIK January 2015 (has links) The purpose of this study is to propose an iterative process that companies providing subscription-based services can use for continuous learning about their users and usage. The study also suggests ways to use that learning to reduce churn. The main part of the study was conducted as a case study at Storytel. Through narrative interviews with staff, field studies at Storytel, analysis of existing application data, and evaluation of tools, combined with theory on churn and segmentation, proposals were developed for tools and a process to use in Storytel's continued work. The study shows a relationship between churn and usage and suggests how new tools can be integrated into existing platforms. It shows the value of measuring and analyzing user data in more general terms and offers advice for succeeding with projects of this kind. Churn segmentation cloud service mobile application subscription service Economics and Business
22	Measuring and improving the performance of the bitcoin network Imtiaz, Muhammad Anas 26 January 2022 (has links) The blockchain technology promises innovation by moving away from conventional centralized architectures, where trust is placed in a small number of actors, to a decentralized environment where a collection of actors must work together to maintain consensus in the overall system. Blockchain offers security and pseudo-anonymity to its adopters, through the use of various cryptographic methods. While much attention has focused on creating new applications that make use of this technology, equal importance must be given to studying naturally occurring phenomena in existing blockchain ecosystems and mitigating their effects where harmful. In this dissertation, we develop a novel open-source log-to-file system that provides the ability to record information relevant to events as they take place in live blockchain networks. Specifically, our open-source software facilitates in-situ measurements on full nodes in the live Bitcoin and Bitcoin Cash blockchain networks. This measurement framework sheds new light on many phenomena that were previously unknown or scarcely studied. First, we examine the presence and impact of churn, namely nodes joining and leaving, on the behavior of the Bitcoin network. Our data analysis over a two-month period shows that a large number of Bitcoin nodes churn at least once. We perform statistical distribution fitting to this churn and emulate it in our measurement nodes to evaluate the impact of churn on the performance of the Bitcoin protocol. From our experiments, we find that blocks received by churning nodes experience as much as five times larger propagation delay than those received by non-churning nodes. We introduce and evaluate a novel synchronization scheme to mitigate such effects on the performance of the protocol. 
Our empirical evaluation shows that blocks received by churning nodes that synchronize their mempools with peers have roughly half the delay in propagation experienced by those that do not synchronize their mempools. We next evaluate and compare the performance of three block relay protocols, namely the default protocol, and the more recent compact block and Graphene protocols. This evaluation is conducted over full nodes running the Bitcoin Unlimited client (which is used in conjunction with the Bitcoin Cash network). We find that in most scenarios, the Graphene block relay protocol outperforms the other two in terms of the block propagation delay and the amount of total communication associated with block relay. An exception is when nodes churn frequently and spend a significant fraction of time off the network, in which case the compact block relay protocol performs best. In-depth analyses reveal subtle inefficiencies of the protocols. Thus, in the case of frequent churns, the Graphene block relay protocol performs as many as two extra round-trips of communication to recover information necessary to reconstruct blocks. Likewise, an inspection of the compact block relay protocol indicates that the full transactions included in the initial block message are either unnecessary or insufficient for the successful reconstruction of blocks. Finally, we investigate the occurrence of orphan transactions which are those whose parental income sources are missing at the time that they are processed. These transactions typically languish in a local buffer until they are evicted or all their parents are discovered, at which point they may be propagated further. Our data reveals that slightly less than half of orphan transactions end up being included in the blockchain. 
Surprisingly, orphan transactions tend to have fewer parents on average than non-orphan transactions, and their missing parents have a lower fee, a larger size, and a lower transaction fee per byte than all other received transactions. Moreover, the network overhead incurred by these orphan transactions can be significant when using the default orphan memory pool size (i.e., 100 transactions), although this overhead can be made negligible if the pool size is simply increased to 1,000 transactions. In summary, this dissertation demonstrates the importance of characterizing the inner behavior of the peer-to-peer network underlying a blockchain. While our results primarily focus on the Bitcoin network and its variants, this work provides foundations that should prove useful for studying and characterizing other blockchains. Computer engineering Churn Orphan transactions P2P Peer-to-peer
23	Predicting user churn on streaming services using recurrent neural networks / Förutsägande av användarens avbrott på strömmande tjänster med återkommande neurala nätverk Martins, Helder January 2017 (has links) Providers of online services have witnessed a rapid growth of their user base in the last few years. The phenomenon has attracted an increasing number of competitors determined on obtaining their own share of the market. In this context, the cost of attracting new customers has increased significantly, raising the importance of retaining existing clients. Therefore, it has become progressively more important for the companies to improve user experience and ensure they keep a larger share of their users active in consuming their product. Companies are thus compelled to build tools that can identify what prompts customers to stay and also identify the users intent on abandoning the service. The focus of this thesis is to address the problem of predicting user abandonment, also known as "churn", and also detecting motives for user retention on data provided by an online streaming service. Classical models like logistic regression and random forests have been used to predict the churn probability of a customer with a fair amount of precision in the past, commonly by aggregating all known information about a user over a time period into a unique data point. On the other hand, recurrent neural networks, especially the long short-term memory (LSTM) variant, have shown impressive results for other domains like speech recognition and video classification, where the data is treated as a sequence instead. This thesis investigates how LSTM models perform for the task of predicting churn compared to standard nonsequential baseline methods when applied to user behavior data of a music streaming service. 
The thesis also explores how different aspects of the data, such as the distribution between the churning and retained classes, the length of the user event history, and the feature representation, influence the performance of predictive models. The results show that LSTMs perform comparably to random forests for churn detection while being significantly better than logistic regression. Additionally, a framework for creating a dataset suitable for training predictive models is provided, which can be explored further to analyze user behavior and to design retention actions that minimize customer abandonment. / Leverantörer av onlinetjänster har bevittnat en snabb användartillväxt under de senaste åren. Denna trend har lockat ett ökande antal konkurrenter som vill ta del av denna växande marknad. Detta har resulterat i att kostnaden för att locka nya kunder ökat avsevärt, vilket även ökat vikten av att behålla befintliga kunder. Det har därför gradvis blivit viktigare för företag att förbättra användarupplevelsen och se till att de behåller en större andel av användarna aktiva. Företag har därför ett starkt intresse av att bygga verktyg som kan identifiera vad som driver kunder att stanna eller vad som får dem att lämna. Detta arbete fokuserar därför på hur man kan prediktera att en användare är på väg att överge en tjänst, så kallad “churn”, samt identifiera vad som driver detta baserat på data från en onlinetjänst. Klassiska modeller som logistisk regression och random forests har tidigare använts på aggregerad användarinformation över en given tidsperiod för att med relativt god precision prediktera sannolikheten för att en användare kommer överge produkten. Under de senaste åren har dock sekventiella neurala nätverk (särskilt LSTM-varianten Long Short Term Memory), där data istället behandlas som sekvenser, visat imponerande resultat för andra domäner såsom taligenkänning och videoklassificering. 
Detta arbete undersöker hur väl LSTM-modeller kan användas för att prediktera churn jämfört med traditionella icke-sekventiella metoder när de tillämpas på data över användarbeteende från en musikstreamingtjänst. Arbetet undersöker även hur olika aspekter av data påverkar prestandan av modellerna inklusive distributionen mellan gruppen av användare som överger produkten mot de som stannar, längden av användarhändelseshistorik och olika val av användarfunktioner för modeller och användardatan. De erhållna resultaten visar att LSTM har en jämförbar prestanda med random forest för prediktering av användarchurn samt är signifikant bättre än logistisk regression. LSTMs visar sig således vara ett lämpligt val för att förutsäga churn på användarnivå. Utöver dessa resultat utvecklades även ett ramverk för att skapa dataset som är lämpliga för träning av prediktiva modeller, vilket kan utforskas ytterligare för att analysera användarbeteende och för att skapa förbättrade åtgärder för att behålla användare och minimera antalet kunder som överger tjänsten. churn prediction streaming services LSTM RNN Computer Sciences Datavetenskap (datalogi)
24	Reálná úloha dobývání znalostí / Actual role of knowledge discovery in databases Pešek, Jiří January 2012 (has links) The thesis "Actual role of knowledge discovery in databases" is concerned with churn prediction in mobile telecommunications. The work is based on real data from a telecommunications company and covers all steps of the data mining process. In accordance with the CRISP-DM methodology, the work looks thoroughly at the following stages: business understanding, data understanding, data preparation, modeling, evaluation and deployment. IBM SPSS Modeler was selected as the system for knowledge discovery in databases. The introductory chapter of the theoretical part familiarises the reader with so-called churn management, which frames the given assignment; the basic concepts related to data mining are defined in that chapter as well. Attention is also given to the basic types of knowledge discovery tasks and to the algorithms pertinent to the selected assignment (decision trees, regression, neural networks, Bayesian networks and SVM). The methodology describing the phases of knowledge discovery in databases is covered in a separate chapter, in which CRISP-DM is examined in greater detail, since it is the foundation for the solution of the practical assignment. The theoretical part concludes with a survey of commercial and freely available systems for knowledge discovery in databases.
25	Where do you save most money on refactoring? / Var sparar du mest pengar på refaktorering? Siverland, Susanne January 2014 (has links) A mature code base of 1,300,000 LOC was examined over a period of 20 months. This paper investigates whether churn is a significant factor in finding refactoring candidates. In addition, it looks at the variables Lines of Code (LOC), Technical Debt (TD), duplicated lines, and complexity to find out whether any of these indicators can inform a coder as to what to refactor. The result is that churn is the strongest of the studied variables, followed by LOC and TD. / En kodbas på 1 300 000 rader kod har undersökts under 20 månader. Denna uppsats undersöker om kodens användningsfrekvens är en signifikant faktor för att finna refaktoreringskandidater. Uppsatsen tittar även på antal kodrader, teknisk skuld, antal duplicerade kodrader och komplexitet för att undersöka om dessa indikatorer kan informera en programmerare om vad som ska refaktoreras. Resultatet är att kodens användningsfrekvens är den starkaste variabeln följt av antal kodrader samt teknisk skuld. refactoring technical debt TD complexity lines of code LOC legacy code duplicated lines code-churn churn Software Engineering Programvaruteknik
26	[en] A NON-PARAMETRIC PROBABILISTIC COUNTERFACTUAL APPROACH TO ASSESS A RETAILER'S TRANSACTIONAL POTENTIAL / [pt] UMA ABORDAGEM CONTRAFACTUAL PROBABILÍSTICA NÃO PARAMÉTRICA PARA AVALIAR O POTENCIAL TRANSACIONAL DE UM VAREJISTA LEONARDO DOMINGUES 23 August 2022 (has links) [pt] No contexto da indústria de adquirência, uma adquirente é uma empresa que facilita a comunicação entre um varejista (online ou loja física) e os bancos emissores. Para um adquirente, é crucial determinar o potencial transacional de cada varejista para orientar estratégias adequadas de precificação e gestão de risco. Neste trabalho, propomos uma estrutura para avaliar adequadamente o potencial transacional de qualquer varejista usando as transações de seus pares. A estrutura proposta é baseada na construção de um contrafactual probabilístico que usa a regressão não paramétrica do kernel Nadaraya-Watson para modelar diferentes padrões sazonais, tendências e ciclos de negócios. Propomos uma metodologia integrada de processamento de dados para separar e validar os dados não afetados por intervenções para construir nosso modelo contrafactual probabilístico não paramétrico. O framework proposto é um poderoso sistema de suporte à decisão para gestão de receitas de uma adquirente, com aplicações diretas para precificação, detecção de churn e, de forma mais geral, gerenciamento de receita. Os resultados empíricos corroboram a eficácia do método em relação aos benchmarks relevantes. / [en] In the payment industry context, a merchant acquirer is a firm that facilitates communication between a retailer (online or brick-and-mortar store) and the issuing banks. For an acquirer, it is crucial to determine the transactional potential of each retailer to guide proper pricing and risk management strategies. In this work, we propose a framework to properly assess the transactional potential of any retailer using the transactions of its peers. 
The proposed framework is based on the construction of a probabilistic counterfactual that uses non-parametric Nadaraya-Watson kernel regression to model differing seasonal patterns, trends and business cycles. We propose an integrated data processing methodology to separate and validate the data not affected by interventions in order to construct our non-parametric probabilistic counterfactual model. The proposed framework is a powerful decision support system for a merchant acquirer's revenue management, with direct applications to pricing, churn detection and, more generally, revenue management. Empirical results corroborate the effectiveness of the method against relevant benchmarks. [pt] CONTRAFACTUAL [pt] CHURN NAO-CONTRATUAL [pt] ADQUIRENTE [pt] GERENCIAMENTO DE RECEITA [en] COUNTERFACTUAL [en] NON-CONTRACTUAL CHURN [en] ACQUIRER [en] REVENUE MANAGEMENT
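The abstract above names Nadaraya-Watson kernel regression as the building block of the counterfactual. As a rough illustration only, the textbook estimator is a kernel-weighted average of observed values. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth, and uses made-up peer data; none of these choices, nor the function names, come from the thesis.

```python
import math


def gaussian_kernel(u: float) -> float:
    """Unnormalised Gaussian kernel; the constant cancels in the ratio below."""
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x_query, xs, ys, bandwidth):
    """Kernel-weighted average of ys, weighted by how close each x is to x_query."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x_query - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical peer data (assumption): day index and observed transaction volume.
peer_days = [1.0, 2.0, 3.0, 4.0, 5.0]
peer_volume = [100.0, 120.0, 90.0, 110.0, 130.0]

# Smoothed estimate of the expected volume on day 3.
estimate = nadaraya_watson(3.0, peer_days, peer_volume, bandwidth=1.0)
```

A smaller bandwidth follows individual observations more closely, while a larger one smooths toward the overall mean; how the thesis selects it is not stated in the abstract.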
27	Análise preditiva de Churn com ênfase em técnicas de Machine Learning: uma revisão / Predictive churn analysis with an emphasis on Machine Learning techniques: a review Schneider, Pedro Henrique 27 July 2016 (has links) Over the last two decades, the growth of the Internet and its associated technologies has been transforming the relationship between companies and their clients. In general, acquiring a new customer is much more expensive for a company than retaining a current one. Thus, customer retention studies, or churn management, have become more important for companies. This study is a review and classification of the literature on applications of Machine Learning techniques to build predictive models of customer loss, also called churn. The objective was to collect the largest possible number of documents on the subject within the proposed methodology and to classify them by application area, year of publication, Machine Learning techniques applied, journals and repositories used, and level of influence of the documents. 
The study thereby brings to light the existing work in this field, consolidates the state of the art of research in the area, and contributes significantly as a reference for future applications and research. Although this study is not the first literature review on Machine Learning applied to customer loss or retention, it is the first, among those we found, to focus on documents that study, not exclusively, customer loss or retention using Machine Learning techniques, without any kind of restriction. Furthermore, it is the first to classify documents by influence, through the citations among the documents. The final database comprised 80 documents, which were collected and analyzed; the main application areas found were Telecommunications, Financial, Newspapers, and Retail, among others. The Machine Learning techniques most often applied to the problem were Logistic Regression, Decision Trees, and Neural Networks, among others. Based on the results, studies of this kind date back to 2000. / Nas últimas duas décadas, o crescimento da internet e suas tecnologias associadas, vêm transformando a forma de relacionamento entre as empresas e seus clientes. Em geral, a aquisição de um novo cliente custa muito mais caro para uma empresa que a retenção do mesmo. Desta forma, estudos de retenção de clientes, ou gerenciamento do Churn, se tornaram mais importantes para as empresas. O presente trabalho consiste na revisão e classificação da literatura sobre aplicações de técnicas com ênfase em Machine Learning para construir modelos preditivos de perda de clientes, também chamada de Churn. 
O objetivo do trabalho foi reunir o maior número possível de documentos sobre o assunto, dentro da metodologia proposta, e classificá-los quanto às áreas de aplicação, ano de publicação, técnicas de Machine Learning aplicadas, periódicos e repositórios utilizados, nível de influência dos documentos e desta forma trazer à luz os estudos já existentes nesse campo de atuação, consolidando o que há do estado da arte em pesquisas desta área, e de forma significativa contribuir como uma referência para futuras aplicações e pesquisas nesta área. Embora o trabalho não tenha sido o primeiro na literatura de Machine Learning relacionado a perda ou retenção de clientes na linha de revisão literária, foi o primeiro encontrado com foco em documentos que estudam, não exclusivamente, a perda ou retenção de clientes por técnicas de Machine Learning e sem nenhum tipo de restrições. Da mesma forma foi o primeiro a classificar os documentos por influência através das citações entre os documentos. Assim, como base final para o trabalho, analisou-se 80 documentos, onde foram encontradas como principais áreas de aplicação: Telecomunicações, Financeiras, Jornais, Varejo entre outras. Constataram-se como técnicas de Machine Learning mais utilizadas para o problema em questão: Regressão Logística, Árvores de Decisão e Redes Neurais, entre outras. E ainda, de acordo com os resultados obtidos, notou-se que ano 2000 tende a ser um marco para esta pesquisa, pois foi a data mais antiga para a qual foi encontrado um artigo nesse trabalho. Churn Análise Preditiva de Churn Retenção de clientes Machine learning Aprendizagem de máquina Data mining Mineração de dados Revisão Matemática Mineração de dados (Computação) Aprendizado do computador
28	Binary Classification for Predicting Customer Churn Axén, Maja, Karlberg, Jennifer January 2020 (has links) Predicting when a customer is about to turn to a competitor can be difficult, yet extremely valuable from a business perspective. The moment a customer stops being considered a customer is known as churn, a widely researched topic in several industries that deal with subscription services. However, in industries with non-subscription services and products, defining churn can be a daunting task, and the existing literature does not fully cover this field. This thesis can therefore be seen as a contribution to current research, especially for settings without a set definition of churn. A definition of churn, adjusted to DIAKRIT’s business, is created. DIAKRIT is a company working in the real estate industry, which faces many challenges, such as strong seasonality. The prediction was approached as a supervised problem, using three different Machine Learning methods: Logistic Regression, Random Forest and Support Vector Machine. The variables used in the predictions are predominantly activity data. With a relatively high accuracy and AUC score, Random Forest was concluded to be the most reliable model. It is, however, clear that the model cannot separate the classes perfectly. The Random Forest model also produces a relatively high precision. It can therefore be concluded that, even though the model is not flawless, the customers predicted to churn are very likely to churn. / Att prediktera när en kund är på väg att vända sig till en konkurrent kan vara svårt, dock kan det visa sig extremt värdefullt ur ett affärsperspektiv. När en kund slutar vara kund benämns det ofta som kundbortfall eller ”churn”. Detta är ett ämne som är brett forskat på i flertalet olika industrier, men då ofta i situationer med prenumerationstjänster. 
När man inte har en prenumerationstjänst försvåras uppgiften att definiera churn och existerande studier brister i att analysera detta. Denna uppsats kan därför ses som ett bidrag till nuvarande litteratur, i synnerhet i fall där ingen tydlig definition för churn existerar. En definition för churn, anpassad efter DIAKRIT och deras affärsstruktur har skapats i det här projektet. DIAKRIT är verksamma i fastighetsbranschen, en industri som har flera utmaningar, bland annat en extrem säsongsvariation. För att genomföra prediktionerna användes tre olika maskininlärningsmodeller: Logistisk Regression, Random Forest och Support Vector Machine. De variabler som användes är mestadels aktivitetsdata. Med relativt hög noggrannhet och AUC-värde anses Random Forest vara mest pålitlig. Modellen kan dock inte separera mellan de två klasserna perfekt. Random Forest modellen visade sig också generera en hög precision. Därför kan slutsatsen dras att även om modellen inte är felfri verkar det som att kunderna predikterade som churn mest sannolikt kommer churna. Churn Machine Learning Prediction Logistic Regression Random Forest Support Vector Machine Customer Profitability Customer Attrition User Churn User Retention Real estate industry Mathematics Matematik
29	Churn inom SaaS : En fallstudie om betydelsefulla kundattribut inom ett SaaS-företag med B2B kunder / Churn in SaaS : A case study of significant customer attributes in a SaaS company with B2B customers Jonson, Filip, Hedvall, Love January 2021 (has links) Software as a service (SaaS) är en affärsmodell som syftar till att användaren prenumererar på en mjukvara. Mjukvaran levereras över internet vilket medför att användaren inte behöver tänka på mjukvaruuppdateringar och driftunderhåll av servrar. Churn innebär att användaren avslutar sin prenumeration hos ett företag och därmed slutar vara kund. Förvärv av nya kunder är en dyr process, som kan kosta upp till fem gånger mer än att sälja till en redan befintlig kund. Tidigare forskning inom churn har främst varit koncentrerad till telekombolag. Undersökningar har specialiserats på maskininlärningsmetoder för att studera churn. Tidigare studier beskriver att det finns begränsad forskning för churn inom SaaS-företag med B2B kunder. De studier som har undersökt churn har främst varit fallstudier där olika kundattribut har studerats utifrån generella- och beteendekundattribut. Studien har i samarbete med ett SaaS-företag undersökt flera kundattribut på ett lönehanteringssystem. Syftet har varit att undersöka vilka kundattribut som är intressanta att ta ut statistik på när churn studeras. Ovanstående ska medföra att det studerade företaget kan införskaffa insikter och arbeta mer med datadrivna beslut. För att förstå vilka kunder som väljer att avsluta sin prenumeration behövs data samlas in om kunderna. En kvantitativ fallstudie utfördes genom att undersöka flera kundattribut hos de kunder som har churnat. Undersökningen utfördes med modellen CRISP-DM för att genomföra dataanalysen på ett systematiskt tillvägagångssätt. Undersökningen studerade kundattribut utifrån variablerna generella- och beteendekundattribut. Dataanalysen genomfördes med hjälp av Python-kod och resultatet presenterades med grafer och tabeller. 
Studiens resultat visade att vissa värden på följande kundattribut var överrepresenterade vid churn: Kundtyper, Bolagsform, Antal Anställda, Licenser, Antal skickade specifikationer och inloggning. Tidigare forskning har undersökt olika kundattribut och funnit att de kan behöva anpassas för det studerade företaget. / Software as a service (SaaS) is a business model in which the user subscribes to software. The software is delivered over the internet, which means that the user does not have to consider updates and operational maintenance of servers. Churn means that a user cancels their subscription with a company and thereby ceases to be a customer. Acquiring new customers is an expensive process, which can cost up to five times more than selling to an existing customer. Previous research on churn has mainly been concentrated in the telecommunications industry, where churn has long been a problem for companies. Research has concentrated on machine learning methods for studying churn. Previous research describes that there are limited studies of churn with SaaS as a business model. Studies of churn have mainly been case studies in which different attributes have been studied in terms of general and behavioral customer attributes. This study has, in collaboration with a SaaS company, examined several customer attributes in a salary management program. The purpose has been to investigate which customer attributes are worth collecting statistics on when churn is studied. This should enable the studied company to acquire insights and work more with data-driven decisions. To understand which customers unsubscribe, data about the customers needs to be collected. A quantitative case study was performed by examining several customer attributes of the customers who have churned. The study was carried out using the CRISP-DM model to perform the data analysis systematically. 
The study examined customer attributes based on the variables general and behavioral customer attributes. The data analysis was performed using Python code and the results were presented with graphs and tables. The results of the study showed that certain values of the following customer attributes were overrepresented in churn: Customer types, Business type, Number of Employees, Licenses, Number of specifications sent and Login. Previous research has examined various customer attributes and found that they may need to be adapted for the studied company. Churn SaaS CRISP-DM General customer attributes Behavioral customer attributes Churn SaaS CRISP-DM Generella kundattribut Beteendekundattribut Information Systems
30	Customer Churn Prediction for PC Games : Probability of churn predicted for big-spenders using supervised machine learning / Kundchurn prediktering för PC-spel : Sannolikheten av churn förutsagd för spelare som spenderar mycket pengar med övervakad maskininlärning Tryggvadottir, Valgerdur January 2019 (has links) Paradox Interactive is a Swedish video game developer and publisher which has players all around the world. Paradox’s largest platform in terms of number of players and revenue is the PC. The goal of this thesis was to build a churn prediction model that predicts the probability of players churning, in order to know which players to focus on in retention campaigns. Since the purpose of churn prediction is to minimize loss due to customers churning, the focus was on big-spenders (whales) in Paradox PC games. In order to define which players are big-spenders, the spending of players over a 12-month rolling period (from 2016-01-01 until 2018-12-31) was investigated. The players spending more than the 95th percentile of the total spending for each period were defined as whales. Defining when a whale has churned, i.e. stopped being a big-spender in Paradox PC games, was done by looking at how many days had passed since the player bought something. A whale has churned if they have not bought anything for the past 28 days. Once data had been collected about the whales, the data set was prepared for a number of different supervised machine learning methods. Logistic Regression, L1 Regularized Logistic Regression, Decision Tree and Random Forest were the methods tested. Random Forest performed best in terms of AUC, with AUC = 0.7162. The conclusion is that it seems to be possible to predict the probability of churning for Paradox whales. It might be possible to improve the model further by investigating more data and fine-tuning the definition of churn. / Paradox Interactive är en svensk videospelutvecklare och utgivare som har spelare över hela världen. 
Paradox största plattform när det gäller antal spelare och intäkter är PC:n. Målet med detta exjobb var att göra en churn-predikterings modell för att förutsäga sannolikheten för att spelare har "churnat" för att veta vilka spelare fokusen ska vara på i retentionskampanjer. Eftersom syftet med churn-prediktering är att minimera förlust på grund av kunderna som "churnar", var fokusen på spelare som spenderar mest pengar (valar) i Paradox PC-spel. För att definiera vilka spelare som är valar undersöktes hur mycket spelarna spenderar under en 12 månaders rullande period (från 2016-01-01 till 2018-12-31). Spelarna som spenderade mer än 95:e percentilen av den totala spenderingen för varje period definierades som valar. För att definiera när en val har "churnat", det vill säga slutat vara en kund som spenderar mycket pengar i Paradox PC-spel, tittade man på hur många dagar som gått sedan spelarna köpte någonting. En val har "churnat" om han inte har köpt något under de senaste 28 dagarna. När data hade varit samlad om valarna var datan förberedd för ett antal olika maskininlärningsmetoder. Logistic Regression, L1 Regularized Logistic Regression, Decision Tree och Random Forest var de metoder som testades. Random Forest var den metoden som gav bäst resultat med avseende på AUC, med AUC = 0,7162. Slutsatsen är att det verkar vara möjligt att förutsäga sannolikheten att Paradox valar "churnar". Det kan vara möjligt att förbättra modellen ytterligare genom att undersöka mer data och finjustera definitionen av churn. Customer churn prediction whales data analysis machine learning binary classification. Kund churn prediktering valar dataanalys maskinlärning binär klassificering. Mathematics Matematik
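The two labelling rules in the abstract above (a whale spends above the 95th percentile in a period, and a whale has churned after 28 days without a purchase) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the sample data, the percentile interpolation method, and the treatment of exactly 28 days as churned are choices made here, not details confirmed by the thesis.

```python
from datetime import date
from statistics import quantiles


def find_whales(spend_by_player, percentile=95):
    """Return the players whose spending in a period exceeds the given percentile."""
    # quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile.
    cut_points = quantiles(spend_by_player.values(), n=100, method="inclusive")
    threshold = cut_points[percentile - 1]
    return {player for player, spend in spend_by_player.items() if spend > threshold}


def has_churned(last_purchase, today, inactivity_days=28):
    """True if at least `inactivity_days` have passed since the last purchase.

    Counting exactly 28 days as churned is an assumption; the abstract does not
    specify the boundary.
    """
    return (today - last_purchase).days >= inactivity_days


# Hypothetical data (assumption): 100 players spending 1..100 currency units.
spend = {f"p{i}": float(i) for i in range(1, 101)}
whales = find_whales(spend)

# Hypothetical purchase dates checked against the end of the studied window.
churned = has_churned(date(2018, 12, 1), date(2018, 12, 31))
active = has_churned(date(2018, 12, 20), date(2018, 12, 31))
```

In a rolling-period setup such as the one described, `find_whales` would be applied once per 12-month window, and the churn label would then be computed only for the players it returns.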

Search results