Spelling suggestions: "subject:"gig data clustering"" "subject:"gig data klustering""
1 |
AN EMPIRICAL STUDY OF AN INNOVATIVE CLUSTERING APPROACH TOWARDS EFFICIENT BIG DATA ANALYSISBowers, Jacob Robert 01 May 2024 (has links) (PDF)
The dramatic growth of big data presents formidable challenges for traditional clustering methodologies, which often prove unwieldy and computationally expensive when processing vast quantities of data. This study explores a novel clustering approach exemplified by Sow & Grow, a density-based clustering algorithm akin to DBSCAN developed to address the issues inherent to big data by enabling end-users to strategically allocate computational resources toward regions of noted interest. Achieved through a unique procedure of seeding points and subsequently fostering their growth into coherent clusters, this method significantly reduces computational waste by ignoring insignificant segments of the dataset and provides information relevant to the end user. The implementation of this algorithm developed as part of this research showcases promising results in various experimental settings, exhibiting notable speedup over conventional clustering methods. Additionally, the incorporation of dynamic load balancing further enhances the algorithm's performance, ensuring optimal resource utilization across parallel processing threads when handling superclusters or unbalanced data distributions. Through a detailed study of the theoretical underpinnings of this innovative clustering approach and the limitations of traditional clustering techniques, this research demonstrates the practical utility of the Sow & Grow algorithm in expediting the clustering processes while providing results pertinent to end users.
|
Page generated in 0.1123 seconds