Global ETD Search

91	Investigating Selection Criteria of Constrained Cluster Analysis: Applications in Forestry Corral, Gavin Richard 16 December 2014 Forest measurements are inherently spatial. Soil productivity varies spatially at fine scales and tree growth responds by changes in growth-age trajectories. Measuring spatial variability is a prerequisite to more effective analysis and statistical testing. In this study, current techniques of partial redundancy analysis and constrained cluster analysis are used to explore how spatial variables determine structure in a managed, regularly spaced plantation. We will test for spatial relationships in the data and then explore how those spatial relationships are manifested as spatially recognizable structures. The objectives of this research are to measure, test, and map spatial variability in simulated forest plots. Partial redundancy analysis was found to be a good method for detecting the presence or absence of spatial relationships (~95% accuracy). We found that the Calinski-Harabasz method consistently performed better at detecting the correct number of clusters when compared to several other methods. While there is still more work that can be done, we believe that constrained cluster analysis has promising applications in forestry and that the Calinski-Harabasz criterion will be most useful. / Master of Science Simulation Redundancy Analysis Cluster Analysis Forestry
92	Robust clustering algorithms Gupta, Pramod 05 April 2011 One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields, ranging from computational biology to social sciences to computer vision, in part because they are simple and their output is easy to interpret. However, many of these algorithms lack any performance guarantees when the data is noisy, incomplete or has outliers, which is the case for most real world data. It is well known that standard linkage algorithms perform extremely poorly in the presence of noise. In this work we propose two new robust algorithms for bottom-up agglomerative clustering and give formal theoretical guarantees for their robustness. We show that our algorithms can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also extend our algorithms to an inductive setting with similar guarantees, in which we randomly choose a small subset of points from a much larger instance space, generate a hierarchy over this sample, and then insert the rest of the points into it to generate a hierarchy over the entire instance space. We then do a systematic experimental analysis of various linkage algorithms, compare their performance on a variety of real world data sets, and show that our algorithms do much better at handling various forms of noise than other hierarchical algorithms. Robust algorithms Hierarchical clustering Unsupervised learning Clustering Machine learning Cluster analysis Cluster analysis Computer programs Algorithms
93	Hodnocení úspěšnosti koeficientů pro stanovení optimálního počtu shluků ve shlukové analýze / The evaluation of coefficients when determining the optimal number of clusters in cluster analysis Novák, Miroslav January 2014 The objective of this thesis is the evaluation of selected cluster analysis coefficients for determining the optimal number of clusters. The analytical evaluation is performed on 20 independent real datasets. The analysis is carried out in the SYSTAT 13.1 statistical software. The application of the coefficients RMSSTD, CHF, PTS, DB and Dunn's index to real datasets is the main part of this thesis, because the issue of evaluating the results of clustering has not received sufficient attention in scientific publications. The main goal is to determine whether the selected clustering coefficients can be applied in real situations. The second goal is to compare selected clustering methods and their corresponding metrics when determining the optimal number of clusters. In conclusion, it is found that the optimal number of clusters determined by the coefficients mentioned above cannot be considered correct: when applied to the real data, none of the selected coefficients exceeded a success rate of 40%, hence the use of these coefficients in practice is very limited. Based on the practical analysis, the best method for identifying the known number of clusters is average linkage in connection with the Euclidean distance, while the worst is Ward's method in connection with the Euclidean distance.
94	Shluková analýza jako nástroj klasifikace objektů / Cluster analysis as a tool for object classification Vanišová, Adéla January 2012 The aim of this thesis is to examine the ability of cluster analysis to segment a data set using selected methods. The data sets consist of quantitative variables. The basic criterion for the data sets is that the number of classes has to be known, and the next criterion is that the class membership of every object has to be known too. The cluster analysis was carried out using the known number of classes. The assignment of objects to individual clusters was compared with their original classes. The output was the relative success of classification by the selected methods. Cluster analysis methods are not able to determine an optimal number of clusters. Estimating the optimal number of clusters was therefore the second step in the analysis of each data set. The ability of selected criteria to identify the original number of classes was analyzed by comparing the numbers of original classes with the numbers of optimal clusters. The main contribution of this thesis is the validation of the ability of selected cluster analysis methods to identify similar objects and the verification of the ability of selected criteria to estimate the number of clusters corresponding to the actual distribution of the data set. Moreover, this work provides a structured overview of the basic cluster analysis methods and indicators for estimating the optimal number of clusters.
95	Advanced query processing on spatial networks Yiu, Man-lung., 姚文龍. January 2006 Computer Science / Doctoral / Doctor of Philosophy Nearest neighbor analysis (Statistics) Database management. Cluster analysis.
96	Tabu search-based techniques for clustering data sets 黃頌詩, Wong, Chung-sze. January 2001 Mathematics / Master / Master of Philosophy Computer algorithms. Cluster analysis. Data mining. Database searching.
97	Neurocognitive profiles in autism spectrum disorder Wagner, Amanda E. 07 October 2014 The current research project examines the performance of a group of high-functioning young adult males with autism spectrum disorders on standardized measures of neurocognitive functioning to determine whether distinct cognitive profiles of strengths and weaknesses emerge. Neuropsychological test data across various domains (general cognitive ability, visuospatial processing, verbal learning and memory, visual learning and memory, working memory, reasoning, cognitive flexibility, attention, receptive language, expressive language, social and emotional processing, and fine motor skills) were examined. Data were analyzed using cluster analysis to assess for the presence and nature of unique clusters/subgroups based on neuropsychological test performance. Three unique clusters were derived from the analyses. This study highlights the well-documented heterogeneity across the spectrum of autism and suggests a method for parsing a heterogeneous sample of ASD subjects into smaller and more meaningful homogeneous groups using standardized neuropsychological assessments. / text Autism spectrum disorder Neuropsychology Cognitive profile Cluster analysis
98	New algorithms for EST clustering. Ptitsyn, Andrey January 2000 The expressed sequence tag (EST) database is a rich and fast-growing source of data for gene expression analysis and drug discovery. Clustering of raw EST data is a necessary step for further analysis and one of the most challenging problems of modern computational biology. Cluster analysis Data processing Cluster analysis Computer programs Algorithms
99	Predictability associated with the downstream impact of the extratropical transition of tropical cyclones Reeves, Justin Martin. 06 1900 Since an extratropical transition (ET) of a decaying tropical cyclone (TC) often results in a fast-moving, rapidly developing extratropical cyclone and amplification of synoptic-scale systems far downstream, proper forecasting of ET events is critical to forecast accuracy over large ocean regions. Past studies have linked forecast accuracy to the phasing of a decaying TC with favorable midlatitude conditions. Because ET events are sensitive to the analyzed initial conditions, this phasing is examined using 11-member ensemble predictions available four times daily from the National Centers for Environmental Prediction, which were combined into a single 44-member ensemble based on a common forecast verification time. Recurring ET patterns within the 44-member ensemble were objectively identified using a combination of EOF and cluster analysis. Ensemble spread first appears near the point where the TC moves into the midlatitudes and then propagates downstream. Although ensemble spread in the forecast fields was large at extended forecast intervals, the ensemble spread, and the number of ET patterns identified in successive EPS predictions, decreased as the ET process became better defined. Within 48 hours of the ET event, the ensemble prediction system properly identified the ET pattern with a minimum ensemble spread. Similar to Klein et al. (2002), the shifts in the initial position of the TC and the subsequent dynamical coupling can explain differences between weak and strong ET reintensifications. Cyclones Tropics Functions, Orthogonal Cluster analysis Ocean-atmosphere interaction
100	Styles in business process modeling: an exploration and a model Pinggera, Jakob, Soffer, Pnina, Fahland, Dirk, Weidlich, Matthias, Zugal, Stefan, Weber, Barbara, Reijers, Hajo A., Mendling, Jan 07 1900 Business process models are an important means to design, analyze, implement, and control business processes. As with every type of conceptual model, a business process model has to meet certain syntactic, semantic, and pragmatic quality requirements to be of value. For many years, such quality aspects were investigated by centering on the properties of the model artifact itself. Only recently has the process of model creation been considered as a factor that influences the resulting model's quality. Our work contributes to this stream of research and presents an explorative analysis of the process of process modeling (PPM). We report on two large-scale modeling sessions involving 115 students. In these sessions, the act of model creation, i.e., the PPM, was automatically recorded. We conducted a cluster analysis on this data and identified three distinct styles of modeling. Further, we investigated how both task- and modeler-specific factors influence particular aspects of those modeling styles. Based thereupon, we propose a model that captures our insights. It lays the foundations for future research that may unveil how high-quality process models can be established through better modeling support and modeling instruction. (authors' abstract)
