Return to search

Evaluation of Machine Learning techniques for Master Data Management

In organisations, duplicate customer master data present a recurring problem. Duplicate records can result in errors, complication, and inefficiency since they frequently result from dissimilar systems or inadequate data integration. Since this problem is made more complicated by changing client information over time, prompt detection and correction are essential. In addition to improving data quality, eliminating duplicate information also improves business processes, boosts customer confidence, and makes it easier to make wise decisions. This master’s thesis explores machine learning’s application to the field of Master Data Management. The main objective of the project is to assess how machine learning may improve the accuracy and consistency of master data records. The project aims to support the improvement of data quality within enterprises by managing issues like duplicate customer data. One of the research topics of study is if machine learning can be used to improve the accuracy of customer data, and another is whether it can be used to investigate scientific models for customer analysis when cleaning data using machine learning. Dimension identification, appropriate algorithm selection, appropriate parameter value selection, and output analysis are the four steps in the study's process. As a ground truth for our project, we came to conclusion that 22,000 is the correct number of clusters for our clustering algorithms which represents the number of unique customers. Saying this, the best performing algorithm based on number of clusters and the silhouette score metric turned out the be KMEANS with 22,000 clusters and a silhouette score of 0.596, followed by BIRCH with 22,000 number of clusters and a silhouette score of 0.591.

Identiferoai:union.ndltd.org:UPSALLA1/oai:DiVA.org:his-23239
Date January 2023
CreatorsToçi, Fatime
PublisherHögskolan i Skövde, Institutionen för informationsteknologi
Source SetsDiVA Archive at Upsalla University
LanguageEnglish
Detected LanguageEnglish
TypeStudent thesis, info:eu-repo/semantics/bachelorThesis, text
Formatapplication/pdf
Rightsinfo:eu-repo/semantics/openAccess

Page generated in 0.0015 seconds