Global ETD Search

211	Investigating Aspects of Visual Clustering in the Organization of Personal Digital Document Collections Badesh, Hoda 13 March 2013 Organizing personal collections of digital documents can be frustrating for two main reasons: first, the effort required to work with the folder system on personal computers, along with the possible misplacement and loss of documents; second, the lack of effective organization and management tools for such collections. The research in this thesis investigated specific visualization and clustering features intended for organizing collections of documents. These features were built into a prototype interface that was compared to a baseline interface from previous research. The results showed that those features helped users with: 1) the initial classification of documents into clusters during the supervised stage; 2) the modification of clusters; 3) the cluster labeling process; 4) the presentation of the final set of organized documents; 5) the efficiency of the organization process; and 6) achieving better accuracy in the clusters created for organizing the documents.
212	Cluster Analyses to Assess Weight Loss Maintenance: An Application of Clustering in Nutrigenomics Wong, Monica 25 August 2011 Within nutrigenomics, clustering of microarray gene expression profiles can be used to identify sub-populations of subjects that respond differently to a given diet intervention. The use of clustering analyses is promising in obesity-related research as personalized nutrition is gaining popularity. This thesis focuses on clustering a human subcutaneous adipose tissue gene expression data set obtained during a low-calorie diet intervention to aid in the prediction of 6-month weight loss maintenance. The aims of the study were (1) to identify the best performing clustering method for clustering samples, (2) to identify differential responders to the low-calorie diet, and (3) to identify the biological pathways affected during the low-calorie diet in weight maintainers and weight regainers. MCLUST performed the best when clustering samples using relative weight change and either fasting insulin or insulin resistance change. Furthermore, it identified differences in the regulation of pathways between weight maintainers and regainers. clustering nutrigenomics weight loss maintenance microarray
213	Cross-Validation for Model Selection in Model-Based Clustering O'Reilly, Rachel 04 September 2012 Clustering is a technique used to partition unlabelled data into meaningful groups. This thesis focuses on the area of clustering called model-based clustering, where it is assumed that data arise from a finite number of subpopulations, each of which follows a known statistical distribution. The number of groups and the shape of each group are unknown in advance, and thus one of the most challenging aspects of clustering is selecting these features. Cross-validation is a model selection technique that is often used in regression and classification because it tends to choose models that predict well and are not over-fit to the data. However, it has rarely been applied in a clustering framework. Herein, cross-validation is applied to select the number of groups and covariance structure within a family of Gaussian mixture models. Results are presented for both real and simulated data. / Ontario Graduate Scholarship Program
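The idea in record 213 of choosing the number of mixture components by cross-validation can be illustrated with a minimal sketch. It is not the thesis's procedure: it assumes one-dimensional data, a plain EM fit with a deterministic quantile initialisation, and 5-fold held-out log-likelihood as the selection score, and it does not cover the choice of covariance structure. All function names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, k, n_iter=100, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic initialisation: means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    spread = sum((x - centre) ** 2 for x in xs) / n
    variances = [max(spread, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances (variance floored at min_var).
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances

def log_likelihood(data, params):
    weights, means, variances = params
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(max(density, 1e-300))
    return total

def cv_score(data, k, n_folds=5):
    """Mean held-out log-likelihood per point over n_folds folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    total = 0.0
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        total += log_likelihood(held_out, fit_gmm_1d(train, k))
    return total / len(data)

def select_k(data, candidates=(1, 2, 3, 4)):
    scores = {k: cv_score(data, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Simulated data: two well-separated groups (an assumption for illustration).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(8.0, 1.0) for _ in range(150)]
rng.shuffle(data)
best_k, scores = select_k(data)
print(best_k, scores)
```

Because the score is computed on held-out points, adding components only helps while they improve prediction, which is the property the abstract attributes to cross-validation.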
214	Symbiotic Evolutionary Subspace Clustering (S-ESC) Vahdat, Ali R. 08 November 2013 Subspace clustering identifies the attribute support for each cluster as well as the location and number of clusters. In the most general case, the attributes associated with each cluster could be unique. A multi-objective evolutionary method is proposed to identify the unique attribute support of each cluster while detecting its data instances. The proposed algorithm, Symbiotic Evolutionary Subspace Clustering (S-ESC), borrows from symbiosis in the sense that each clustering solution is defined in terms of a host, which is formed by a number of co-evolved cluster centroids (or symbionts). Symbionts define clusters and therefore attribute subspaces, whereas hosts define sets of clusters that constitute a non-degenerate clustering solution. The symbiotic representation of S-ESC is the key to making it scalable to high-dimensional datasets, while a subsampling process makes it scalable to large-scale datasets. Performance of the S-ESC algorithm was found to be robust across a common parameterization utilized throughout. Subspace clustering Symbiosis
215	An Investigation of Nano-voids in Aluminum by Small-angle X-ray Scattering Westfall, Luke Aidan 28 April 2008 Small-angle X-ray scattering (SAXS) with synchrotron radiation was used to characterize nano-sized voids in different nominally pure aluminum (Al) alloys produced by quenching. The scattering signal from nano-voids is shown to be predictable from SAXS theory, and the information related to the void population confirms past experiments and reveals new details about quench-void formation in Al. Specifically, voids were produced in 99.97 at.% to 99.9994 at.% Al alloys by infrared heating to 450 – 625 °C followed by controlled rapid quenching at 10^3 to 10^5 °C/s. As processing conditions changed, the size of voids varied between 5 and 11 nm, and the density of voids varied by over an order of magnitude. Results from SAXS were consistent with TEM observations performed on the same specimens, indicating that synchrotron SAXS can be reliably used to characterize nano-voids produced in quenched Al. Factors determined to affect voids were consistent with previous studies, except that the present nano-voids dissolved after only 3 min at 145 °C, indicating that quenched nano-voids are less stable than previously determined. SAXS also showed that void size is sensitive to quench temperature and quench rate. The activation energies for void nucleation and growth were determined to be 0.75 ± 0.10 and 0.19 ± 0.03 eV/at., respectively, confirming that hydrogen and di-vacancies take part in nucleation and growth during quenching. It was concluded that the non-linear tail of the quench curve plays a crucial role in void formation, and that voids form when long-range diffusion is inhibited. This information can be utilized to design new Al alloys that limit incipient void formation, which is detrimental to properties such as formability.
/ Thesis (Master, Mechanical and Materials Engineering) -- Queen's University, 2008-04-25 15:17:30.211 / Natural Sciences and Engineering Research Council of Canada; General Motors of Canada Limited SAXS aluminum nano-voids vacancy clustering
216	Using Cluster Analysis, Cluster Validation, and Consensus Clustering to Identify Subtypes Shen, Jess Jiangsheng 26 November 2007 Pervasive Developmental Disorders (PDDs) are neurodevelopmental disorders characterized by impairments in social interaction, communication and behaviour [Str04]. Given the diversity and varying severity of PDDs, diagnostic tools attempt to identify homogeneous subtypes within PDDs. The Diagnostic and Statistical Manual of Mental Disorders - Fourth Edition (DSM-IV) divides PDDs into five subtypes. Several limitations have been identified with the categorical diagnostic criteria of the DSM-IV. The goal of this study is to identify putative subtypes in the multidimensional data collected from a group of patients with PDDs by using cluster analysis. Cluster analysis is an unsupervised machine learning method that partitions a dataset into subsets sharing common patterns. We apply cluster analysis to data collected from 358 children with PDDs and validate the resulting clusters. Notably, there are many cluster analysis algorithms to choose from, each making certain assumptions about the data and about how clusters should be formed. One way to arrive at a meaningful solution is to use consensus clustering, which integrates the results of several clustering attempts (a cluster ensemble) into a unified consensus answer and can provide robust and accurate results [TJPA05]. In this study, using cluster analysis, cluster validation, and consensus clustering, we identify four clusters that are similar to, and further refine, three of the five subtypes defined in the DSM-IV. This study thus confirms the existence of these three subtypes among patients with PDDs. / Thesis (Master, Computing) -- Queen's University, 2007-11-15 23:34:36.62 / OGS, QGA Cluster analysis Cluster validation Consensus clustering Autism
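As a rough illustration of how several clustering runs can be merged into one consensus answer (record 216), the sketch below uses a co-association matrix: two items are linked when they share a cluster in more than half of the runs, and the consensus clusters are the connected components of those links. This is one simple ensemble technique chosen for illustration, not necessarily the method used in the thesis. The function names, the 0.5 threshold, and the toy labelings are assumptions.

```python
def co_association(labelings):
    """Fraction of clusterings in which each pair of items shares a cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]

def consensus_clusters(labelings, threshold=0.5):
    """Link items that co-cluster in more than `threshold` of the runs,
    then return the connected components as consensus labels."""
    matrix = co_association(labelings)
    n = len(matrix)
    consensus = [-1] * n
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:  # depth-first walk over strongly co-associated items
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and matrix[i][j] > threshold:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus

# Three hypothetical clustering runs over six items; label values are arbitrary per run.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # this run places item 2 with the second group
    [0, 0, 0, 2, 2, 1],  # this run splits item 5 off on its own
]
consensus = consensus_clusters(runs)
print(consensus)  # [0, 0, 0, 1, 1, 1]
```

The individual runs disagree about items 2 and 5, but the majority of runs agree on two groups, so the consensus recovers them.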
217	Identification and application of extract class refactorings in object-oriented systems Fokaefs, Marios-Eleftherios Unknown Date No description available. refactoring software reengineering object-oriented programming clustering
218	LRS Seimo narių grupavimas pagal balsavimą ir balsavimo kitimo aptikimas / Lithuanian Parliament members grouping by their voting behavior and its change detection Bytautas, Kęstutis 20 June 2012 Politicians describe their own conduct in varying ways, so the only way to hold them to account is to observe what they do. This thesis analyses the voting activity of the Parliament of the Republic of Lithuania (LRS, the Seimas). It asks whether information technology tools can determine if a member's affiliation with a party (faction), or with the governing position or the opposition, determines how that member votes. The main objectives are grouping members of the Seimas by their voting behavior and detecting changes in that behavior. The thesis reviews the 2008-2012 parliamentary term, presents a statistical analysis of the votes, and surveys other studies of parliamentary activity. Clustering methods are used to group the members. Clustering can be defined as dividing objects into groups (clusters) such that differences between objects within a group are as small as possible and differences between groups are as large as possible. The thesis reviews various clustering methods and their principles of operation, and describes methods for computing distances between objects and criteria for evaluating quality. The voting data are stored in a MySQL database, so a tool was built to process them. All stages of the work are described: the tools used, the coding of votes, and the division of votes into periods. The methods chosen for the study were k-means and hierarchical clustering with complete (furthest neighbor), average, and single (nearest neighbor) linkage. Euclidean and Manhattan distances are used to measure similarity between objects, and the PURITY, RAND, and NMI measures are used to evaluate clustering quality... [to full text] Informatics Klasterizavimas Balsavimas Analizė Clustering Voting Grouping
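The distance and quality measures named in record 218 can be sketched briefly. The vote coding below (1 = for, -1 = against, 0 = abstained or absent), the member vectors, and the faction labels are hypothetical and are not taken from the thesis; NMI is omitted for brevity.

```python
import math
from collections import Counter
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def purity(clusters, classes):
    """Fraction of items that belong to the majority class of their cluster."""
    by_cluster = {}
    for c, k in zip(clusters, classes):
        by_cluster.setdefault(c, Counter())[k] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(clusters)

def rand_index(clusters, classes):
    """Fraction of item pairs on which the clustering and the reference labels agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(clusters)), 2))
    agree = sum((clusters[i] == clusters[j]) == (classes[i] == classes[j])
                for i, j in pairs)
    return agree / len(pairs)

# Hypothetical coded votes of two members over four ballots.
member_a = [1, 1, -1, 0]
member_b = [1, -1, -1, 1]
print(manhattan(member_a, member_b))   # 3
print(euclidean(member_a, member_b))   # sqrt(5), about 2.236

# Hypothetical clustering of six members compared with their faction labels.
clusters = [0, 0, 0, 1, 1, 1]
factions = ["gov", "gov", "opp", "opp", "opp", "opp"]
print(purity(clusters, factions))      # 5/6
print(rand_index(clusters, factions))  # 10/15
```

Purity rewards clusters dominated by a single faction, while the Rand index also penalises splitting one faction across clusters, so the two measures can rank the same clustering differently.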
219	Modelling severe asthma variation Newby, Christopher James January 2013 Asthma is a heterogeneous disease that is mostly managed successfully using bronchodilators and anti-inflammatory drugs. Around 10%-15% of asthmatics, however, have difficult or severe asthma, which is less responsive to treatment. Asthma, and severe asthma in particular, is now thought of as a description of symptoms that may contain sub-groups with different pathologies, which could be useful for targeting different drugs at different sub-groups. However, little statistical work has been carried out to determine these sub-phenotypes. Studies have partitioned severe asthma variables into a number of sub-groups, but the algorithms used in these studies are not based on statistical inference, and it is difficult to select the best-fitting number of sub-groups using such methods. It is also unclear whether the clusters or sub-groups returned are actual sub-groups or reflect a larger non-normal distribution. In this thesis we developed a statistical model that combines factor analysis, a method used to obtain independent factors to describe processes, allowing for variation over variables, and infinite mixture modelling, a process that involves determining the most probable number of mixtures or clusters, thus allowing for variation over individuals. The resulting model is a Dirichlet process normal mixture latent variable model (DPNMLVN), and it is capable of determining the correct number of mixtures over each factor. The model was tested with simulations and used to analyse two severe asthma datasets and a cancer clinical trial. Sub-groups were found that reflect a high eosinophilic group and an average eosinophilic group, and a late-onset older non-atopic group and a highly atopic younger early-onset group. In the clinical trial data, 3 distinct mixtures were found relating to existing biomarkers not used in the mixture analysis. 616.238
220	Application of Clustering Method based on Orthogonal Procrustes Analysis to Analysis of Questionnaire Data Furuhashi, Takeshi, Yamaga, Shinichiro, Yoshikawa, Tomohiro January 2008 Session ID: TH-A4-3 / Joint 4th International Conference on Soft Computing and Intelligent Systems and 9th International Symposium on advanced Intelligent Systems, September 17-21, 2008, Nagoya University, Nagoya, Japan Orthogonal Procrustes Analysis Clustering Questionnaire Data
