Clustering techniques play an important role in exploratory pattern analysis, unsupervised pattern recognition and image segmentation applications. Clustering algorithms are computationally intensive in nature. This thesis proposes new parallel algorithms for Single Link and Complete Link hierarchical clustering. The parallel algorithms have been mapped on a SIMD machine model with a linear interconnection network. The model consists of a linear array of N (number of patterns to be clustered) processing elements (PEs), interfaced to a host machine and the interconnection network provides inter-PE and PE-to-host/host-to-PE communication. For single link clustering, each PE maintains a sorted list of its first logN nearest neighbors and the host maintains a heap of the root elements of all the PEs. The determination of the smallest entry in the distance matrix and update of the distance matrix is achieved in O(logN) time. In the case of complete link clustering, each PE maintains a heap data structure of the inter pattern distances. This significantly reduces the computation time for the determination of the smallest entry in the distance matrix during each iteration, from O(N2) to O(N), as the root element in each PE gives its nearest neighbor. The proposed algorithms are faster and simpler than previously known algorithms for hierarchical clustering. For clustering a data set with N patterns, using N PEs, the computation time for the single link clustering algorithm is shown to be O(NlogN) and the time complexity for the complete link clustering algorithm is shown to be O(N2). The parallel algorithms have been verified through simulations on the Intel iPSC/2 parallel machine.
Identifer | oai:union.ndltd.org:USF/oai:scholarcommons.usf.edu:etd-1608 |
Date | 08 March 2007 |
Creators | Arumugavelu, Shankar |
Publisher | Scholar Commons |
Source Sets | University of South Flordia |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Graduate Theses and Dissertations |
Rights | default |
Page generated in 0.0052 seconds