161 |
Algorithms for estimating the cluster tree of a density /Nugent, Rebecca, January 2006 (has links)
Thesis (Ph. D.)--University of Washington, 2006. / Vita. Includes bibliographical references (p. 107-111).
|
162 |
Assessing and quantifying clusteredness: The OPTICS Cordillera / Rusch, Thomas, Hornik, Kurt, Mair, Patrick 22 June 2018 (has links) (PDF)
This article provides a framework for assessing and quantifying "clusteredness" of a data representation.
Clusteredness is a global univariate property defined as a layout diverging from equidistance of points
to the closest neighboring point set. The OPTICS algorithm encodes the global clusteredness as a pair of
clusteredness-representative distances and an algorithmic ordering. We use this to construct an index for
quantification of clusteredness, coined the OPTICS Cordillera, as the norm of subsequent differences over
the pair. We provide lower and upper bounds and a normalization for the index. We show the index captures
important aspects of clusteredness such as cluster compactness, cluster separation, and number of
clusters simultaneously. The index can be used as a goodness-of-clusteredness statistic, as a function over
a grid or to compare different representations. For illustration, we apply our suggestion to dimensionality
reduced 2D representations of Californian counties with respect to 48 climate change related variables.
Online supplementary material is available (including an R package, the data and additional mathematical
details).
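The construction named in the abstract, a norm of successive differences taken along the OPTICS ordering, can be illustrated with a minimal sketch. It assumes the reachability distances are already available as a plain list in OPTICS order. The function name `successive_difference_norm`, the exponent `q`, and the optional cap `d_max` are hypothetical choices made for illustration; the article's bounds and normalization are not reproduced here.

```python
def successive_difference_norm(reachability, q=2, d_max=None):
    """L_q norm of the successive differences of a reachability sequence.

    `reachability` is assumed to be listed in OPTICS order. `d_max` is an
    optional cap (an assumption of this sketch) that limits the influence
    of a few very large distances.
    """
    if d_max is not None:
        reachability = [min(r, d_max) for r in reachability]
    # Absolute jump between each value and the next one in the ordering.
    diffs = [abs(b - a) for a, b in zip(reachability, reachability[1:])]
    return sum(d ** q for d in diffs) ** (1.0 / q)


# Hypothetical sequences: equidistant points versus two well-separated groups.
flat = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
clustered = [0.1, 0.1, 3.0, 0.1, 0.1, 3.0]

flat_score = successive_difference_norm(flat)
clustered_score = successive_difference_norm(clustered)
```

An equidistant layout has no jumps in the sequence, so its score is zero, while alternating small and large reachabilities yield a larger value.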
|
163 |
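Utilização de técnicas de análise de agrupamento do risco de geada no Estado do Paraná para a cultura do milho safrinha [Use of cluster analysis techniques for frost risk in the State of Paraná for the winter corn crop] / Martins, Rogério Mendonça [UNESP] 30 April 2008 (has links) (PDF)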
This work identifies the areas suitable for growing winter corn (milho safrinha) in the State of Paraná. It offers a methodology that contributes to understanding the State's agrometeorological variability, providing annual information by ten-day periods for 22 regions analyzed from a historical temperature database, and diagnosing homogeneous areas in order to identify the regions favorable to winter corn. To this end, cluster analysis was applied to a data set provided by IAPAR – Londrina. The analysis used the agglomerative (bottom-up) hierarchical technique with three clustering methods: nearest neighbor, farthest neighbor, and the unweighted pair-group method with arithmetic means (UPGMA). In summary, the nearest-neighbor and farthest-neighbor methods produced four groups, while the unweighted pair-group method produced five. The profile plot showed that, across the ten simulations, the risk of frost was greater for the later simulations. Clustering identified locations with the same temperature characteristics, and the simulations provided a basis for choosing the best sowing period.
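The three agglomerative methods named in the abstract differ only in how the distance between two clusters is measured. A minimal sketch, assuming one-dimensional temperature-like values and plain absolute differences; the function names and the sample data are hypothetical and are not taken from the thesis:

```python
def linkage_distance(a, b, method):
    """Distance between two clusters of 1-D values under a linkage rule."""
    pairwise = [abs(x - y) for x in a for y in b]
    if method == "single":      # nearest neighbor
        return min(pairwise)
    if method == "complete":    # farthest neighbor
        return max(pairwise)
    if method == "average":     # unweighted pair-group mean (UPGMA)
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown method: {method}")


def agglomerate(values, n_clusters, method="single"):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[v] for v in values]
    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: linkage_distance(clusters[p[0]], clusters[p[1]], method),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical minimum temperatures (degrees Celsius) for six locations.
temps = [2.0, 2.5, 3.0, 9.0, 9.5, 15.0]
groups = {m: agglomerate(temps, 3, m) for m in ("single", "complete", "average")}
```

On well-separated data such as this, all three rules recover the same groups; on less clearly separated data they can disagree, which is consistent with the abstract reporting four groups for two methods and five for the third.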
|
164 |
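Utilização de técnicas de análise de agrupamento do risco de geada no Estado do Paraná para a cultura do milho safrinha [Use of cluster analysis techniques for frost risk in the State of Paraná for the winter corn crop] /Martins, Rogério Mendonça, 1968- January 2008 (has links)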
Abstract: This work identifies the areas suitable for growing winter corn (milho safrinha) in the State of Paraná. It offers a methodology that contributes to understanding the State's agrometeorological variability, providing annual information by ten-day periods for 22 regions analyzed from a historical temperature database, and diagnosing homogeneous areas in order to identify the regions favorable to winter corn. To this end, cluster analysis was applied to a data set provided by IAPAR - Londrina. The analysis used the agglomerative (bottom-up) hierarchical technique with three clustering methods: nearest neighbor, farthest neighbor, and the unweighted pair-group method with arithmetic means (UPGMA). In summary, the nearest-neighbor and farthest-neighbor methods produced four groups, while the unweighted pair-group method produced five. The profile plot showed that, across the ten simulations, the risk of frost was greater for the later simulations. Clustering identified locations with the same temperature characteristics, and the simulations provided a basis for choosing the best sowing period. / Advisor: Sheila Zambello de Pinho / Co-advisor: Sérgio Luiz Gonçalves / Committee: Lidia Raquel de Carvalho / Committee: Maristela Simões do Carmo / Committee: Vaderli Marino Melen / Committee: Vandir Medri / Doctorate
|
165 |
Use of an area sampling frame to identify the spatial distribution of livestock in the Gauteng Province / Von Hagen, Craig 29 January 2009 (has links)
M.Sc. / In South Africa, there are no reliable statistics regarding animal numbers and distribution. The goal, therefore, of this research is to provide the framework and procedure for obtaining these statistics efficiently and accurately. Available sampling methods and sampling frames were investigated and it was decided to carry out a sample survey because the Gauteng Province consists of a large number of holdings (land parcels). In the Gauteng Province, where a complete list of farmers or land owners is not available, it was decided to use an area sampling frame. Once the choice of sample design was made, the survey objectives were defined according to the clients’ needs. The sampling frame was constructed using various land parcel layers. These land parcels were merged, using GIS software, into one continuous layer of land parcels. They were then stratified to reduce the variance of the variable (animals) under study over the entire area, using area of land parcel and land-cover. The sample size was then calculated and the land parcels were selected randomly for survey purposes. The survey was conducted between September and December 1999 and the questionnaires were input into a database for the estimation procedures. The closed estimation procedure was used because it is the only possible option if the data surveyed are referenced to the land parcel (and not to a farm that includes several land parcels). The area frame sampling methodology worked well for cattle, sheep, horses, pigs and dogs/cats and to a lesser extent for goats, donkeys and game. The area frame method did not work well for poultry (because of extremely high values in a few land parcels), ostriches or mules (these are rare in the province). Spatial distributions and density distributions were then interpolated from the animal counts taken in the survey and they give a general idea of the location of animals. The distributions of cattle, sheep, horses, pigs and dogs/cats are reliable. 
The distributions of the rest are distorted due to extreme counts in a few land parcels, but a general idea of concentrations can still be inferred. Considering that no historical data exist and that the overall goal of this research was to get an idea of animal numbers and the distribution of animals in Gauteng province, the research can be considered successful, in that decision-makers now have a reliable source of information from which good decisions can be made.
|
166 |
Some algorithmic studies in high-dimensional categorical data clustering and selection number of clusters / Li, Junjie 01 January 2008 (has links)
No description available.
|
167 |
Using cluster analysis to quantify systematicity in a face image sorting task / Campbell, Alison 29 August 2017 (has links)
Open sorting tasks that include multiple face images of the same person require participants to make identity judgments in order to group images of the same person. When participants are unfamiliar with the identity, natural variation in the images due to changes in lighting, expression, pose, and age leads participants to divide images of the same person into different “identity” piles. Although this task is increasingly used in current research to assess unfamiliar face perception, no previous work has examined whether there is systematicity across participants in how identity groups are composed. A cluster analysis was performed using two variations of the original face sorting task. Results identify groups of images that tend to be grouped together across participants and even across changes in task format. These findings suggest that participants responded to similar signals, such as tolerable change and similarity across images, when ascribing identity to unfamiliar faces. / Graduate
|
168 |
Linear clustering with application to single nucleotide polymorphism genotyping / Yan, Guohua 11 1900 (has links)
Single nucleotide polymorphisms (SNPs) have become increasingly popular for
a wide range of genetic studies. High-throughput genotyping technologies
usually involve a statistical genotype calling algorithm. Most calling
algorithms in the literature, using methods such as k-means and mixture models,
rely on elliptical structures of the genotyping data; they may fail
when the minor allele homozygous cluster is small or absent, or when the
data have extreme tails or linear patterns.
We propose an automatic genotype calling algorithm by further developing
a linear grouping algorithm (Van Aelst et al., 2006). The proposed
algorithm clusters unnormalized data points around lines rather than around
centroids. In addition, we associate a quality value, silhouette width, with
each DNA sample and with the whole plate. This algorithm shows promise
for genotyping data generated from TaqMan technology (Applied Biosystems).
A key feature of the proposed algorithm is that it applies to unnormalized
fluorescent signals when the TaqMan SNP assay is used. The
algorithm could also be potentially adapted to other fluorescence-based SNP
genotyping technologies such as Invader Assay.
Motivated by the SNP genotyping problem, we propose a partial likelihood
approach to linear clustering which explores potential linear clusters
in a data set. Instead of fully modelling the data, we assume only the signed
orthogonal distance from each data point to a hyperplane is normally distributed.
Its relationships with several existing clustering methods are discussed.
Some existing methods to determine the number of components in a
data set are adapted to this linear clustering setting. Several simulated and
real data sets are analyzed for comparison and illustration purposes. We also
investigate some asymptotic properties of the partial likelihood approach.
A Bayesian version of this methodology is helpful if some clusters are
sparse but there is strong prior information about their approximate locations
or properties. We propose a Bayesian hierarchical approach which is
particularly appropriate for identifying sparse linear clusters. We show that
the sparse cluster in SNP genotyping datasets can be successfully identified
after a careful specification of the prior distributions. / Science, Faculty of / Statistics, Department of / Graduate
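The quantity the abstract builds on, the signed orthogonal distance from a data point to a hyperplane, can be sketched as follows. The hyperplane is assumed to be written as {x : normal . x = offset}; the function names, the nearest-hyperplane assignment rule, and the two example lines are hypothetical illustrations and are not the thesis's partial likelihood or Bayesian procedure.

```python
import math


def signed_orthogonal_distance(point, normal, offset):
    """Signed distance from `point` to the hyperplane {x : normal . x = offset}."""
    norm = math.sqrt(sum(a * a for a in normal))
    if norm == 0:
        raise ValueError("normal vector must be non-zero")
    return (sum(a * x for a, x in zip(normal, point)) - offset) / norm


def nearest_hyperplane(point, hyperplanes):
    """Index of the hyperplane with the smallest absolute orthogonal distance."""
    return min(
        range(len(hyperplanes)),
        key=lambda k: abs(signed_orthogonal_distance(point, *hyperplanes[k])),
    )


# Two hypothetical lines in the plane, as (normal, offset): y = x and y = 0.
lines = [((1.0, -1.0), 0.0), ((0.0, 1.0), 0.0)]

label_a = nearest_hyperplane((3.0, 3.1), lines)  # lies close to y = x
label_b = nearest_hyperplane((5.0, 0.2), lines)  # lies close to y = 0
```

Measuring distance orthogonally to each line, rather than to a centroid, is what lets elongated groups of points be treated as single clusters.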
|
169 |
On two tests for multivariate normality / Wong, Hoi-lam 01 January 1993 (has links)
No description available.
|
170 |
Obesity with radiological changes or depression was associated with worse knee outcome in general population: a cluster analysis in the Nagahama study / 膝痛の関連因子を用いた変形性膝関節症のクラスター解析:ながはまスタディ / Nigoro, Kazuya 24 May 2021 (has links)
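Kyoto University / Doctorate by coursework (new system) / Doctor of Medical Science / Kō No. 23379 / Medical Doctorate No. 4748 / 新制||医||1052 (University Library) / Graduate School of Medicine, Kyoto University, Major in Medicine / (Chief examiner) Prof. 石見 拓, Prof. 戸口田 淳也, Prof. 中山 健夫 / Pursuant to Article 4, Paragraph 1 of the Degree Regulations / DFAM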
|