Global ETD Search

1	SIMD Algorithms for Single Link and Complete Link Pattern Clustering
Arumugavelu, Shankar, 08 March 2007
Clustering techniques play an important role in exploratory pattern analysis, unsupervised pattern recognition and image segmentation applications. Clustering algorithms are computationally intensive. This thesis proposes new parallel algorithms for Single Link and Complete Link hierarchical clustering. The parallel algorithms have been mapped onto a SIMD machine model with a linear interconnection network. The model consists of a linear array of N processing elements (PEs), where N is the number of patterns to be clustered, interfaced to a host machine; the interconnection network provides inter-PE, PE-to-host and host-to-PE communication. For single link clustering, each PE maintains a sorted list of its first log N nearest neighbors and the host maintains a heap of the root elements of all the PEs. The smallest entry in the distance matrix is determined, and the distance matrix updated, in O(log N) time. For complete link clustering, each PE maintains a heap of the inter-pattern distances. Because the root element in each PE gives its nearest neighbor, this reduces the time needed to determine the smallest entry in the distance matrix during each iteration from O(N^2) to O(N). The proposed algorithms are faster and simpler than previously known algorithms for hierarchical clustering. For clustering a data set with N patterns using N PEs, the computation time of the single link algorithm is shown to be O(N log N) and that of the complete link algorithm is shown to be O(N^2). The parallel algorithms have been verified through simulations on the Intel iPSC/2 parallel machine.
Keywords: Hierarchical clustering; Pattern recognition; Parallel algorithms; Pattern matrix; Proximity matrix; American Studies; Arts and Humanities
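For orientation, the two steps that the abstract says are accelerated (finding the smallest entry in the distance matrix, then updating the matrix after a merge) can be sketched as a naive serial baseline. This is not the thesis's SIMD algorithm: the function name `agglomerate`, the dictionary-based matrix and the Euclidean metric are assumptions made purely for illustration.

```python
import math

def agglomerate(points, linkage="single"):
    """Naive serial hierarchical clustering (illustrative baseline only).

    Returns the merges as (members_a, members_b, distance) tuples.
    """
    # Single link keeps the minimum distance on merge, complete link the maximum.
    combine = min if linkage == "single" else max
    # Active clusters keyed by id; each holds the indices of its patterns.
    clusters = {i: [i] for i in range(len(points))}
    # Proximity matrix stored over unordered pairs (smaller id first).
    # Euclidean distance is an assumption; the abstract names no metric.
    dist = {
        (i, j): math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    }
    merges = []
    while len(clusters) > 1:
        # Step 1: find the smallest entry in the proximity matrix.
        (a, b), d = min(dist.items(), key=lambda kv: kv[1])
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        # Step 2: merge b into a and update distances to all other clusters.
        for c in clusters:
            if c in (a, b):
                continue
            key_a = (min(a, c), max(a, c))
            key_b = (min(b, c), max(b, c))
            dist[key_a] = combine(dist[key_a], dist.pop(key_b))
        del dist[(a, b)]
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges

if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
    print(agglomerate(pts, "single"))
    print(agglomerate(pts, "complete"))
```

In this baseline the search in step 1 scans every remaining pair on each iteration, which is the cost the per-PE sorted lists and heaps described in the abstract are intended to avoid.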
2	Making sense of genotype x environment interaction of Pinus radiata in New Zealand
McDonald, Timothy Myles, January 2009
In New Zealand, a formal tree improvement and breeding programme for Pinus radiata (D.Don) commenced in 1952. A countrywide series of progeny trials was progressively established on over seventy sites, and is managed by the Radiata Pine Breeding Company (RPBC). Diameter at breast height data from the series were used to investigate genotype x environment interaction with a view to establishing the need for partitioning breeding and deployment efforts for P. radiata. With nearly 300,000 measurements, this study is one of the largest ever done on genotype x environment interaction. Bivariate analyses were conducted between all pairs of sites to determine genetic correlations between sites. Genetic correlations were used to construct a proximity matrix by subtracting each correlation from unity. The process of constructing the matrix highlighted issues of low connectivity between sites, whereby meaningful correlations were established for just 5 % of the site pairs. However, nearly two-thirds of these genetic correlations were between -1.0 and 0.6, indicating the presence of strong genotype x environment interactions. A technique known as multiple regression on resemblance matrices was carried out by regressing a number of environmental correlation matrices on the diameter at breast height correlation matrix. Genotype x environment interactions were found to be driven by extreme maximum temperatures (t-statistic of 2.03 against critical t-value of 1.96 at 95 % confidence level). When tested on its own, altitude was significant with genetic correlations between sites at the 90 % confidence level (t-statistic of 1.92 against critical t-value of 1.645). In addition, a method from Graph Theory using proximity thresholds was utilised as a form of clustering.
However, this study highlighted the existence of high internal cohesion within trial series, and high external isolation between trial series. That is, grouping of sites (in terms of diameter) was observed to be a reflection of the series of trials for which each site was established. This characteristic is particularly unhelpful for partitioning sites into regions of similar propensity to genotype x environment interaction, as the genotype x environment effect is effectively overridden by the genotype effect. Better cohesion between past, present and future trial series, and more accurate bioclimatic data, should allow more useful groupings of sites to be extracted from the data. Given this, however, it is clear that there are a large number of interactive families contained in the RPBC dataset. It is concluded that partitioning of New Zealand’s P. radiata breeding programme cannot be ruled out as an advantageous option.
Keywords: genotype x environment interaction; regionalisation; proximity matrix; multiple regression on matrices; Graph Theory; proximity thresholds
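The proximity-matrix construction described in this abstract (one minus each genetic correlation) and a threshold-based grouping can be sketched as follows. The abstract does not say which graph-theoretic method was used; taking connected components of a threshold graph is an assumption here, and the function name `threshold_clusters`, the `None` marker for unestimated pairs and the four-site correlation values are all hypothetical.

```python
def threshold_clusters(corr, threshold):
    """Group sites whose proximity (1 - correlation) is within `threshold`.

    `corr` is a square list of lists; None marks site pairs for which no
    correlation could be estimated (the low-connectivity case).
    """
    n = len(corr)
    # Proximity matrix: subtract each correlation from unity.
    prox = [
        [None if corr[i][j] is None else 1.0 - corr[i][j] for j in range(n)]
        for i in range(n)
    ]
    # Threshold graph: link two sites when their proximity is small enough.
    adj = {
        i: [
            j for j in range(n)
            if j != i and prox[i][j] is not None and prox[i][j] <= threshold
        ]
        for i in range(n)
    }
    # Connected components of the threshold graph act as clusters
    # (an assumed choice; the abstract does not specify the method).
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        groups.append(sorted(group))
    return groups

if __name__ == "__main__":
    # Hypothetical correlations between four sites, not data from the thesis.
    corr = [
        [1.0, 0.9, None, 0.2],
        [0.9, 1.0, None, 0.3],
        [None, None, 1.0, 0.8],
        [0.2, 0.3, 0.8, 1.0],
    ]
    print(threshold_clusters(corr, 0.4))
    print(threshold_clusters(corr, 0.75))
```

Raising the threshold links more sites, so sweeping it from small to large shows how quickly isolated groups coalesce, which is one way to see the internal cohesion and external isolation the abstract reports.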
