In this study, we introduce several approaches for analyzing large volumes of business descriptions with machine learning clustering and classification algorithms. The goal is to classify these descriptions efficiently, narrowing the search scope and supporting better business insight and decision-making. Using unlabeled business description data, we apply the Agglomerative Hierarchical Clustering (AHC), K-means, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithms. Various preprocessing techniques, parameters, and cluster numbers are tried for each method, with the aim of maximizing the overlap and obtaining suitable similarity scores within the resulting clusters. According to the implemented evaluation metrics, AHC yields the highest overlap, followed by K-means and DBSCAN. The conclusions drawn from this project can contribute to the development of automated systems for business description analysis. Furthermore, this research opens the way for further exploration and improvement in applying machine learning techniques to business analytics.
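As a rough illustration of the clustering step described above, the following is a minimal, standard-library-only sketch of agglomerative hierarchical clustering over short text descriptions. It is not the thesis's implementation: the abstract does not state the preprocessing, feature representation, linkage, or parameters used, so the bag-of-words vectors, cosine similarity, average linkage, the function names, and the four toy descriptions are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def vectorize(text):
    # Assumed representation: raw word counts (bag of words), lowercased.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def average_linkage(c1, c2, vectors):
    # Mean pairwise similarity between the members of two clusters.
    sims = [cosine(vectors[i], vectors[j]) for i in c1 for j in c2]
    return sum(sims) / len(sims)


def agglomerative(texts, n_clusters):
    # Start with one cluster per description, then repeatedly merge the
    # two most similar clusters until n_clusters remain.
    vectors = [vectorize(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]
    while len(clusters) > max(n_clusters, 1):
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vectors),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy descriptions, not data from the thesis.
docs = [
    "bakery selling fresh bread and cakes",
    "artisan bakery with bread and pastries",
    "software company building mobile apps",
    "software consultancy for web and mobile apps",
]

result = agglomerative(docs, n_clusters=2)
print(result)
```

With these toy inputs the two bakery descriptions and the two software descriptions end up in separate clusters. On a real corpus, a library implementation with proper text preprocessing would normally be preferred over this quadratic-time sketch.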
Identifier | oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:miun-48770 |
Date | January 2023 |
Creators | Orabi Alkhen, Wisam |
Publisher | Mittuniversitetet, Institutionen för data- och elektroteknik (2023-) |
Source Sets | DiVA Archive at Uppsala University |
Language | English |
Detected Language | English |
Type | Student thesis, info:eu-repo/semantics/bachelorThesis, text |
Format | application/pdf |
Rights | info:eu-repo/semantics/openAccess |