  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Multi-Class Classification for Predicting Customer Satisfaction : Application of machine learning methods to predict customer satisfaction at IKEA

Backerholm, Stina, Börjesjö, Malin January 2023
Gaining a comprehensive understanding of the features that contribute to customer satisfaction after contact with IKEA’s Remote Customer Meeting Points (RCMPs) is essential for implementing effective remedial measures in the future. The aim of this project is to investigate whether it is possible to find key features that influence customer satisfaction and to use them to predict customer satisfaction. The task has been approached as a multi-class classification problem, with the objective of classifying the observations into five distinct levels of customer satisfaction. The study evaluated three models: Multinomial Logistic Regression, Random Forest, and Extreme Gradient Boosting. Based on the methods used and the available data, the results indicate that it is currently not feasible to identify key features or predict customer satisfaction with high accuracy.
2

Comparison of Machine Learning Techniques when Estimating Probability of Impairment : Estimating Probability of Impairment through Identification of Defaulting Customers one year Ahead of Time / En jämförelse av maskininlärningstekniker för uppskattning av Probability of Impairment : Uppskattningen av Probability of Impairment sker genom identifikation av låntagare som inte kommer fullfölja sina återbetalningsskyldigheter inom ett år

Eriksson, Alexander, Långström, Jacob January 2019
Probability of Impairment, or Probability of Default, is the share of customers within a segment who are expected not to fulfil their debt obligations and instead go into Default. This is a key metric within banking for estimating the level of credit risk, and the current standard is to estimate Probability of Impairment using Linear Regression. In this paper we show how this metric can instead be estimated through a classification approach with machine learning. By using models based on Neural Networks and Gradient Boosting, trained to find which specific customers will go into Default within the upcoming year, the Probability of Impairment is shown to be estimated more accurately than with Linear Regression. Additionally, these models have numerous practical applications internally within the banking sector. The new important features we found can be used to strengthen the models currently in use, and the ability to identify customers about to go into Default lets banks take necessary actions ahead of time to cover otherwise unexpected risks.
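As a rough illustration of the classification approach described in this abstract, the sketch below aggregates per-customer default probabilities into a segment-level Probability of Impairment. It is a minimal example under stated assumptions, not the authors' method: the function name `segment_pd`, the segment labels, and the probabilities are all hypothetical, and the scores are assumed to come from some already-trained classifier.

```python
from collections import defaultdict


def segment_pd(scored_customers):
    """Average per-customer default probabilities within each segment.

    `scored_customers` is an iterable of (segment, probability) pairs,
    where each probability is a classifier's estimate that the customer
    defaults within the coming year (hypothetical input format).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for segment, probability in scored_customers:
        totals[segment] += probability
        counts[segment] += 1
    return {segment: totals[segment] / counts[segment] for segment in totals}


# Hypothetical scores, made up for illustration only.
scored = [
    ("retail", 0.02),
    ("retail", 0.10),
    ("retail", 0.03),
    ("corporate", 0.01),
    ("corporate", 0.05),
]
pd_by_segment = segment_pd(scored)
print(pd_by_segment)
```

Averaging probabilities is only one plausible aggregation; counting customers whose score exceeds a chosen threshold would be another.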
3

Anomaly Detection in Categorical Data with Interpretable Machine Learning : A random forest approach to classify imbalanced data

Yan, Ping January 2019
Metadata refers to "data about data", which contains the information needed to understand the process of data collection. In this thesis, we investigate whether metadata features can be used to detect broken data and how a tree-based interpretable machine learning algorithm can be used for effective classification. The goal of this thesis is two-fold. Firstly, we apply a classification schema using metadata features for detecting broken data. Secondly, we generate feature importance rates to understand the model’s logic and reveal the key factors that lead to broken data. The task given by the Swedish automotive company Veoneer is a typical problem of learning from an extremely imbalanced data set, with 97 percent of the data being healthy and only 3 percent being broken. Furthermore, the whole data set contains only categorical variables on nominal scales, which poses challenges for the learning algorithm. Handling imbalance in continuous data is relatively well studied, but for categorical data the solution is not straightforward. In this thesis, we propose a combination of tree-based supervised learning and hyper-parameter tuning to identify the broken data in a large data set. Our method is composed of three phases: data cleaning, which eliminates ambiguous and redundant instances; supervised learning with a random forest; and a random search for hyper-parameter optimization on the random forest model. Our results show empirically that a tree-based ensemble method together with a random search for hyper-parameter optimization improves random forest performance in terms of the area under the ROC curve. The model achieved better than acceptable classification results and showed that metadata features are capable of detecting broken data and providing an interpretable result by identifying the key features for the classification model.
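To illustrate why an abstract like this one reports the area under the ROC curve rather than plain accuracy, the sketch below compares the two on a synthetic 97/3 class split matching the imbalance described above. The labels and scores are made up for illustration and are not the thesis data; `roc_auc` and `accuracy` are hypothetical helper names.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0),
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Synthetic data: 97 healthy records (0) and 3 broken records (1).
labels = [0] * 97 + [1] * 3

# A degenerate model that calls every record healthy.
constant_accuracy = accuracy(labels, [0] * 100)
constant_auc = roc_auc(labels, [0.0] * 100)

# A model that scores every broken record above every healthy one.
perfect_auc = roc_auc(labels, [0.1] * 97 + [0.9] * 3)

print(constant_accuracy, constant_auc, perfect_auc)
```

The degenerate model looks strong on accuracy yet has no ranking ability, which is the gap the ROC-based metric exposes on heavily imbalanced data.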
