1 |
Dissimilarity Plots. A Visual Exploration Tool for Partitional Clustering. Hahsler, Michael, Hornik, Kurt January 2009
For hierarchical clustering, dendrograms provide a convenient and powerful visualization. Although many visualization methods have been suggested for partitional clustering, their usefulness deteriorates quickly with increasing dimensionality of the data and/or they fail to represent structure between and within clusters simultaneously. In this paper we extend (dissimilarity) matrix shading with several reordering steps based on seriation. Both methods, matrix shading and seriation, have been well known for a long time. However, only recent algorithmic improvements allow seriation to be used for larger problems. Furthermore, seriation is used in a novel stepwise process (within each cluster and between clusters), which leads to a visualization technique that is independent of the dimensionality of the data. A big advantage is that it presents the structure between clusters and the micro-structure within clusters in one concise plot. This not only allows cluster quality to be judged but also makes mis-specification of the number of clusters apparent. We give a detailed discussion of the construction of dissimilarity plots and demonstrate their usefulness with several examples. / Series: Research Report Series / Department of Statistics and Mathematics
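The stepwise reordering described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: a greedy nearest-neighbour chain stands in for a real seriation algorithm, Euclidean distance is assumed as the dissimilarity, and the names `greedy_order`, `dissimilarity_plot_order` and `shade` are hypothetical.

```python
from math import dist

def greedy_order(items, d):
    """Chain items so that each next item is the closest unvisited one.

    A crude stand-in for a seriation algorithm, chosen only for brevity.
    """
    order = [items[0]]
    remaining = set(items[1:])
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda j: (d[last][j], j))  # ties broken by index
        order.append(nxt)
        remaining.remove(nxt)
    return order

def dissimilarity_plot_order(points, labels):
    """Return (object order, reordered dissimilarity matrix) for a given partition."""
    n = len(points)
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    members = {c: [i for i in range(n) if labels[i] == c] for c in sorted(set(labels))}

    # Between-cluster step: order clusters by average inter-cluster dissimilarity.
    def avg(a, b):
        pairs = len(members[a]) * len(members[b])
        return sum(d[i][j] for i in members[a] for j in members[b]) / pairs

    between = {a: {b: avg(a, b) for b in members} for a in members}
    cluster_order = greedy_order(list(members), between)

    # Within-cluster step: order the objects inside each cluster.
    order = []
    for c in cluster_order:
        order.extend(greedy_order(members[c], d))
    reordered = [[d[i][j] for j in order] for i in order]
    return order, reordered

def shade(matrix, ramp="#*+. "):
    """Render the matrix as text; darker characters mean smaller dissimilarity."""
    top = max(max(row) for row in matrix) or 1.0
    last = len(ramp) - 1
    return "\n".join(
        "".join(ramp[min(int(v / top * len(ramp)), last)] for v in row)
        for row in matrix
    )
```

Printing `shade(reordered)` for a well-separated partition should show dark blocks along the diagonal, one per cluster, with lighter off-diagonal blocks between distant clusters.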
|
2 |
Development of a hierarchical k-selecting clustering algorithm – application to allergy. Malm, Patrik January 2007
The objective of this Master’s thesis was to develop, implement and evaluate an iterative procedure for hierarchical clustering with good overall performance which also merges features of certain previously described algorithms into a single integrated package. The resulting tool was then applied to an allergen IgE-reactivity data set. The final algorithm uses a hierarchical approach which illustrates the emergence of patterns in the data. At each level of the hierarchical tree, a partitional clustering method is used to divide the data into k groups, where the number k is decided by applying cluster validation techniques. The cross-reactivity analysis performed with the new algorithm largely arrives at the anticipated cluster formations in the allergen data, which strengthens results obtained in previous studies on the subject. Notably, though, certain unexpected findings presented in the former analysis were aggregated differently by the novel clustering package, and more in line with phylogenetic and protein family relationships.
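The idea of choosing k by cluster validation at each level of a hierarchy can be sketched as follows. This is an illustration under stated assumptions, not the thesis's algorithm: k-means stands in for the unspecified partitional method, the mean silhouette width stands in for the unspecified validation technique, and all function names and the `min_size`, `min_score` and `k_max` parameters are hypothetical.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic start (evenly spaced picks from the sorted data)."""
    pts = sorted(points)
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        updated = []
        for c in range(k):
            group = [p for p, l in zip(points, labels) if l == c]
            # An empty cluster keeps its previous center.
            updated.append(tuple(sum(x) / len(group) for x in zip(*group)) if group else centers[c])
        if updated == centers:
            break
        centers = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        def mean_dist(c):
            ds = [dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            return sum(ds) / len(ds) if ds else None
        a = mean_dist(labels[i])
        if a is None:
            continue
        b = min(mean_dist(c) for c in clusters if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def select_k(points, k_max=4):
    """Try k = 2..k_max and keep the partition with the best validation score."""
    best = None
    for k in range(2, min(k_max, len(points) - 1) + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if best is None or score > best[2]:
            best = (k, labels, score)
    return best

def hierarchical_split(points, min_size=4, min_score=0.5, k_max=4):
    """Recursively split the data; a node stays a leaf when no split validates well."""
    node = {"points": points, "children": []}
    if len(points) < min_size:
        return node
    k, labels, score = select_k(points, k_max)
    if score < min_score or len(set(labels)) < 2:
        return node
    for c in sorted(set(labels)):
        group = [p for p, l in zip(points, labels) if l == c]
        node["children"].append(hierarchical_split(group, min_size, min_score, k_max))
    return node
```

Each inner node of the returned tree records one validated partition, so reading the tree top-down shows how groups emerge level by level.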
|
4 |
Abordagens meta-heurísticas para clusterização de dados e segmentação de imagens [Metaheuristic approaches for data clustering and image segmentation]. Queiroga, Eduardo Vieira 17 February 2017
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / Many computational problems are considered hard due to their combinatorial nature. In such cases, exhaustive search techniques become infeasible for medium and large instances. Some data clustering and image segmentation problems belong to the NP-hard class and require adequate treatment by heuristic techniques such as metaheuristics. Data clustering is a set of problems in pattern recognition and unsupervised machine learning that aims at finding groups (or clusters) of similar objects in a dataset, using a predetermined similarity measure. The partitional clustering problem aims at completely separating the data into disjoint, non-empty clusters. For center-based clustering methods, minimizing the intracluster distance is one of the most commonly used criteria. This work proposes an approach based on the metaheuristic Continuous Greedy Randomized Adaptive Search Procedure (C-GRASP). High-quality results were obtained in comparative experiments between the proposed method and other metaheuristics from the literature. In computer vision, image segmentation is the process of partitioning an image into regions of interest (sets of pixels) without overlap. Histogram thresholding is one of the simplest types of segmentation for grayscale images. Otsu’s method is one of the most popular; it searches for the thresholds that maximize the variance between the segments. For images with a large gray-level depth, exhaustive search techniques are computationally expensive, since the number of possible solutions grows exponentially with the number of thresholds. Metaheuristics have therefore played an important role in finding good-quality thresholds. In this work, an approach based on Quantum-behaved Particle Swarm Optimization (QPSO) was investigated for multilevel thresholding of images available in the literature. A local search based on Variable Neighborhood Descent (VND) was proposed to improve the convergence of the search for the thresholds. A specific application of thresholding to electron microscopy images for microstructural analysis of cementitious materials was investigated, as well as graph
algorithms for crack detection and feature extraction. / Many computational problems are considered hard because of their combinatorial nature. For these problems, exhaustive search techniques become impractical for medium and large instances. When modeled as optimization problems, some data clustering and image segmentation problems belong to the NP-hard class and require adequate treatment by heuristic methods. Data clustering is a broad set of problems in pattern recognition and unsupervised machine learning whose goal is to find groups (or clusters) of similar objects in a dataset, using a pre-established similarity measure. The partitional clustering problem consists of completely separating the data into disjoint, non-empty sets. For clustering methods based on cluster centers, minimizing the sum of intracluster distances is one of the most widely used criteria. To address this problem, an approach based on the metaheuristic Continuous Greedy Randomized Adaptive Search Procedure (C-GRASP) is proposed. High-quality results were obtained in experiments involving the proposed algorithm and other metaheuristics from the literature. In computer vision, image segmentation is the process of partitioning an image into regions of interest (sets of pixels) without overlap. One of the simplest types of segmentation is histogram thresholding for grayscale images. Otsu's method is one of the most popular and searches for the thresholds that maximize the variance between the segments. For images with a large gray-level depth, exhaustive search techniques have a high computational cost, since the number of possible solutions grows exponentially with the number of thresholds. Metaheuristics have therefore played an important role in finding good-quality thresholds. In this work, an approach based on Quantum-behaved Particle Swarm Optimization (QPSO) was investigated for multilevel thresholding of images available in the literature. A local search based on Variable Neighborhood Descent (VND) was proposed to accelerate the convergence of the search for the thresholds. In addition, a specific application of image segmentation to electron microscopy images for microstructural analysis of cementitious materials was investigated, as well as the use of graph algorithms for crack detection and extraction of features of interest.
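The Otsu criterion mentioned in this abstract, and the reason exhaustive search becomes costly, can be sketched briefly. This is the exhaustive baseline only, not the QPSO or VND approach of the thesis; the function names are hypothetical, and a threshold t is assumed to separate gray levels below t from levels at or above t.

```python
from itertools import combinations

def between_class_variance(hist, thresholds):
    """Otsu's criterion: sum over segments of w_k * (mu_k - mu_total)^2."""
    total = sum(hist)
    mu_total = sum(g * h for g, h in enumerate(hist)) / total
    bounds = [0, *thresholds, len(hist)]
    variance = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        weight = sum(hist[lo:hi])
        if weight == 0:
            continue  # an empty segment contributes nothing
        mu = sum(g * hist[g] for g in range(lo, hi)) / weight
        variance += (weight / total) * (mu - mu_total) ** 2
    return variance

def otsu_exhaustive(hist, n_thresholds):
    """Evaluate every increasing tuple of thresholds and keep the best one."""
    levels = range(1, len(hist))
    return max(
        combinations(levels, n_thresholds),
        key=lambda t: between_class_variance(hist, t),
    )
```

For L gray levels and m thresholds this loop evaluates C(L-1, m) candidates, which is already about 2.7 million for L = 256 and m = 3, which is why the abstract turns to metaheuristics for multilevel thresholding.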
|
5 |
Classify part of day and snow on the load of timber stacks : A comparative study between partitional clustering and competitive learning. Nordqvist, My January 2021
In today's society, companies are trying to find ways to utilize all the data they have, which contains valuable information and insights for making better decisions. This includes data used to keep track of timber flowing between forest and industry. The growth of Artificial Intelligence (AI) and Machine Learning (ML) has enabled the development of ML models that automate the measurement of timber on timber trucks based on images. However, to improve the results, information about weather and lighting conditions must be extracted from unlabeled images. The objective of this study is to perform an extensive investigation into classifying unlabeled images into the categories daylight, darkness, and snow on the load. A comparative study between partitional clustering and competitive learning is conducted to investigate which method gives the best results in terms of different clustering performance metrics. It also examines how dimensionality reduction affects the outcome. The algorithms K-means and Kohonen Self-Organizing Map (SOM) are selected for the clustering. Each model is investigated according to the number of clusters, size of dataset, clustering time, clustering performance, and manual samples from each cluster. The results indicate a noticeable clustering performance discrepancy between the algorithms concerning the number of clusters, dataset size, and manual samples. The use of dimensionality reduction led to shorter clustering time but slightly worse clustering performance. The evaluation results further show that the clustering time of Kohonen SOM is significantly higher than that of K-means.
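For readers unfamiliar with competitive learning, the core update can be sketched in a few lines. This is a plain winner-take-all rule on toy 2-D data, not the thesis's pipeline: a Kohonen SOM additionally updates the winner's neighbours on a grid, which is omitted here, and the function names, learning rate and initialisation are assumptions.

```python
from math import dist

def winner(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: dist(weights[u], x))

def train_competitive(data, n_units, epochs=5, lr=0.5):
    """Online winner-take-all learning: only the winning unit moves toward each sample."""
    # Assumption: units are initialised from the first n_units samples.
    weights = [tuple(map(float, p)) for p in data[:n_units]]
    for _ in range(epochs):
        for x in data:
            u = winner(weights, x)
            weights[u] = tuple(w + lr * (xi - w) for w, xi in zip(weights[u], x))
    return weights

def quantization_error(data, weights):
    """Mean distance from each sample to its winning unit."""
    return sum(dist(x, weights[winner(weights, x)]) for x in data) / len(data)
```

Unlike K-means, which recomputes every center from a full pass over the data, this rule updates one unit per sample, so the result depends on the presentation order and the learning rate.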
|