<p> Because of the massive increase in the data streams being produced and made available, data mining and machine learning have become increasingly popular as companies, organizations and industries seek optimal methods and techniques for processing these large data sets. Machine learning is a branch of artificial intelligence that involves creating programs that autonomously perform data mining techniques when exposed to data streams. This study examines two very different domains in an effort to provide a better, more optimized and more widely applicable clustering method than those currently in use: data mining in healthcare and data mining in social media. Testing the proposed technique on these two drastically different domains offers valuable insight into its performance across domains. </p><p> This study reviews existing clustering methods and presents an enhanced k-means clustering algorithm that uses a novel method called Optimize Cluster Distance (OCD), applied to the social media domain. The OCD method maximizes the distance between clusters by pair-wise re-clustering to enhance the quality of the clusters. For the healthcare domain, k-means was applied along with a Self-Organizing Map (SOM) to obtain an optimal number of clusters. The risk of poor centroid positions in k-means was addressed by applying a genetic algorithm to k-means in both the social media and healthcare domains. OCD was then applied again to enhance the quality of the resulting clusters. In both domains, the analysis shows that the proposed k-means is accurate and achieves better clustering performance than conventional k-means, along with valuable insights for each cluster. The approach is unsupervised and scalable, and can be applied to various domains.</p><p>
Identifier | oai:union.ndltd.org:PROQUEST/oai:pqdtoai.proquest.com:10239708 |
Date | 02 February 2017 |
Creators | Alsayat, Ahmed Mosa |
Publisher | Bowie State University |
Source Sets | ProQuest.com |
Language | English |
Detected Language | English |
Type | thesis |