281 |
Análise de agrupamentos para o reconhecimento de padrões de infestação de aracnídeos em zonas urbanas / Biazi, Angelo Henrique. January 2015
Advisor: Fernando Frei / Committee: Jaime de Oliveira Gomes / Committee: Sérgio Nascimento Stampar / Abstract: Arachnids have been successful throughout the evolutionary process thanks to their high adaptability. Within this group, spiders and scorpions are widely distributed across the planet and may affect human health. To evaluate the infestation problems caused by these animals, statistical tools are needed that can assess which factors favor or perpetuate their appearance, presence and proliferation. This work therefore presents cluster analysis as a way to determine infestation patterns, allowing seemingly distinct sites to be grouped into similar clusters, which can benefit health policies. The 25 geographic collection sites used for sampling were grouped into six clusters with distinct characteristics, three of which consisted of a single collection site (outliers). The clusters obtained showed a relationship between their member sites and the environmental characteristics of those sites, indicating that the distribution of arachnid families across the different urban gradients is influenced by environmental conditions. The occupation of urbanized areas by venomous arachnids thus becomes a public health problem, requiring epidemiological control policies based on monitoring and mapping of risk areas, which can be better assessed with cluster analysis. Keywords: Spider, Scorpion, Epidemiology, Cluster Analysis / Master's
|
282 |
Segmentace trhu s využitím metod shlukové analýzy / The Market Segmentation with using Cluster Analysis Methods / MICHALOVÁ, Jitka. January 2008
This diploma thesis is titled "The Market Segmentation with using Cluster Analysis Methods". The segmentation is carried out on the tourism market. Cluster analysis methods are classified as hierarchical or non-hierarchical. The hierarchical methods covered are the nearest neighbour method, the furthest neighbour method and the weighted group method; MacQueen's k-means and Wishart's method are non-hierarchical. The thesis presents a division of the respondents into 5 groups using two methods (the furthest neighbour method and MacQueen's k-means). Each group has its own representative, composed of the most frequent answers. This information is used for classifying clients in tourism.
|
283 |
Seleção de grupos a partir de hierarquias: uma modelagem baseada em grafos / Clusters selection from hierarchies: a graph-based model / Francisco de Assis Rodrigues dos Anjos. 28 June 2018
Cluster analysis is a fundamental task in data mining and machine learning. It aims to find a finite set of categories that reveals the relationships between the objects (records, instances, observations, examples) of a data set of interest. Clustering algorithms can be divided into partitional and hierarchical. One advantage of hierarchical algorithms is that they represent clusters at different levels of granularity while still being able to produce flat partitions like those produced by partitional algorithms. To obtain a flat partition, a cut (for example a horizontal one) must be made through the dendrogram or cluster hierarchy. How to perform this cut is a classic problem that has been investigated for decades. More recently, the problem has gained special importance in the context of density-based hierarchical algorithms, since only more sophisticated cutting strategies, in particular non-horizontal cuts known as local cuts (as opposed to global ones), are able to select clusters of different densities to compose the final solution. Among the main advantages of density-based algorithms are their robustness to anomalous data, which are detected and left out of the final partition, labeled as noise, and their ability to detect clusters of arbitrary shape. The objective of this work was to adapt a variant of the Modularity Q measure, widely used for community detection in complex networks, so that it can be applied to the problem of local cuts through cluster hierarchies. The results show that this adaptation of modularity can be a competitive alternative to the stability measure originally used by the state-of-the-art density-based clustering algorithm HDBSCAN*.
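For readers unfamiliar with the modularity measure mentioned above, the following is a minimal sketch of the standard Newman modularity Q for an undirected, unweighted graph. It is not the adapted variant proposed in the thesis, whose definition is not given in the abstract; the function name `modularity` and the example graph are hypothetical and serve only as illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman modularity Q (illustrative, not the thesis's variant).

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    degree_sum = defaultdict(int)  # total degree per community
    for node, d in degree.items():
        degree_sum[community[node]] += d
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(edges, split), 4))  # 0.3571
```

Placing every node in a single community gives Q = 0, so a positive value indicates more within-group edges than expected by chance for that partition.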
|
284 |
Uso do teste de Scott-Knott e da análise de agrupamentos, na obtenção de grupos de locais para experimentos com cana-de-açúcar / Scott-Knott test and cluster analysis use in the obtainment of placement groups for sugar cane experiments / Cristiane Mariana Rodrigues da Silva. 15 February 2008
The Centro de Tecnologia Canavieira (CTC, Centre of Sugar Cane Technology), located in the city of Piracicaba, is a private civil association created in August 2004 to carry out research and development of new technologies for the agricultural, logistic and industrial activities of the sugar cane and sugar-alcohol sectors, and to develop new sugar cane varieties. Experiments have been conducted for 30 years, mainly in the state of São Paulo, where most of the associated production units are located. In 2004, trials were installed in 11 of these Experimental Units within the state of São Paulo, and there is a need to know whether this number can be reduced for economic reasons. If groups of Units with very similar data are detected, some of these Units can be eliminated, consequently reducing the cost of the research; the Scott-Knott statistical test and cluster analysis are used to establish this similarity. This work aims to apply cluster analysis techniques and the Scott-Knott test to identify groups of Industrial Units, with a view to reducing the number of CTC experiments and, consequently, the operational cost. Multiple comparison methods based on univariate cluster analysis aim to split the treatment means, which in this study were site means, into homogeneous groups by minimizing the variation within groups and maximizing the variation between groups; the Scott-Knott test is one of these procedures. Cluster analysis classifies individuals or objects into mutually exclusive subgroups, seeking in general to maximize the homogeneity of objects or individuals within groups and the heterogeneity between groups; the groups are represented in a tree-structured graph called a dendrogram. The Scott-Knott test is a univariate analysis test and is therefore most suitable when there is only one variable under study; the variable used was TPH5C, because it is calculated from the variables POL, TCH and FIB. Cluster analysis using the Linkage of Means method proved more reliable, since this study had three variables for analysis: TCH (tons of sugar cane per hectare), POL (percentage of sugar) and FIB (percentage of fiber). Comparing the Scott-Knott test with the cluster analysis confirms two pairs of grouped sites: L020 and L076, and L045 and L006. It is therefore concluded that two experimental units can be removed from the experiments, choosing either L020 (Ribeirão Preto) or L076 (Assis), and either L045 (Ribeirão Preto) or L006 (Jaú region); the choice is left to the researcher and allows the operational cost to be reduced. Keywords: Cluster Analysis; Sugar Cane
|
285 |
Cluster Analysis of Discussions on Internet Forums / Klusteranalys av Diskussioner på Internetforum / Holm, Rasmus. January 2016
The growth of textual content on internet forums over the last decade has been immense, which has left users struggling to find relevant information quickly and conveniently. The activity of finding information in large data collections is known as information retrieval, and many tools and techniques have been developed to tackle its common problems. Cluster analysis is a technique for grouping similar objects into smaller groups (clusters) such that the objects within a cluster are more similar to each other than to objects in other clusters. We have investigated two clustering algorithms, Graclus and Non-Exhaustive Overlapping k-means (NEO-k-means), on textual data taken from Reddit, a social network service. One of the difficulties with these algorithms is that both have an input parameter controlling how many clusters to find. We have used a greedy modularity maximization algorithm to estimate the number of clusters that exist in discussion threads. We have shown that it is possible to find subtopics within discussions and that, in terms of execution time, Graclus has a clear advantage over NEO-k-means.
|
286 |
Segmentace zákazníků obchodní společnosti s využitím metod shlukové analýzy / Segmentation of business company customers using cluster analysis methods / Nesrstová, Markéta. January 2015
This thesis discusses the possibilities of using cluster analysis methods for customer segmentation. The theoretical part describes selected cluster analysis methods and explains other concepts related to the topic, such as CRM, segmentation and targeted communication. In the practical part, cluster analysis methods are applied to real data from an unnamed company, with the aim of creating baseline materials useful for planning and implementing targeted communication. The main calculations are carried out in R, and MS Excel is used for editing data and outputs. The thesis concludes by evaluating the applied methods and summarizing the lessons learned from the cluster analysis. Databases that could be useful for marketing decisions were created and characterized for the company.
|
287 |
Připojištění / Riders / Sviták, David. January 2009
Riders are becoming an increasingly important part of insurance markets. The aim of this thesis is to introduce the riders offered by a chosen insurance company in the Czech Republic. The next part studies the structure of the riders arranged in one year. The characteristics of riders and of the main coverages that affect the number of arranged riders are identified using statistical methods. In the last part, clients are classified by the riders they hold using cluster analysis. The thesis also contains some recommendations for creating new riders.
|
288 |
Hodnocení a klasifikace zemí EU s využitím demografických údajů / EVALUATION AND CLASSIFICATION OF THE EUROPEAN UNION COUNTRIES / Brabcová, Petra. January 2010
This diploma thesis classifies the member states of the European Union according to demographic indicators. It also evaluates development in the individual states using absolute demographic indicators. In 2008, fewer children were born and fewer people died in most member states than in 1993. Life expectancy is rising in all countries. Relative demographic indicators are used in the cluster analysis to divide the states into groups according to their similarity. Two hierarchical clustering methods are used, the farthest neighbour method and Ward's method, both with Euclidean distance. The farthest neighbour method divided fifteen states into four clusters in 1995, twenty-five states into six clusters in 2004 and twenty-seven states into six clusters in 2007. Ward's method divided these states into three clusters in 1995, six clusters in 2004 and three clusters in 2007.
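The farthest neighbour (complete linkage) method with Euclidean distance named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function name `complete_linkage` and the sample points are hypothetical, and no data from the thesis are used. Ward's method is not shown.

```python
import math


def complete_linkage(points, n_clusters):
    """Agglomerative clustering with farthest neighbour (complete) linkage.

    points: list of equal-length numeric tuples.
    Returns a sorted list of clusters, each a sorted list of point indices.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: Euclidean distance between the two most
        # distant members of the two clusters.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Merge the pair of clusters with the smallest linkage distance.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical two-indicator data for five units.
sample = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
print(complete_linkage(sample, 3))  # [[0, 1], [2, 3], [4]]
```

Stopping at a chosen number of clusters corresponds to a horizontal cut through the dendrogram; in practice the indicators would normally be standardized before computing distances.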
|
289 |
Analýza způsobu trávení volného času českým spotřebitelem / Czech consumer leisure analysis / Jansa, Hynek. January 2011
The aim of the thesis is to segment the leisure activities market, to uncover important customer segments and to describe the differences in their behavior and characteristics. Knowledge from the social sciences, particularly economics and sociology, is used to describe the relationship between lifestyle and consumer behavior. The data come from research on consumer and media behavior and its relation to the population's lifestyle. The data were subjected to factor and cluster analysis.
|
290 |
Fossil Moles from the Gray Fossil Site, TN: Implications for Diversification and Evolution of North American Talpidae / Oberg, Danielle. 01 May 2018
The Gray Fossil Site (GFS) is one of the richest Cenozoic terrestrial localities in the eastern United States. This study describes the first talpid specimens recovered from the GFS. Using measurements and comparisons of dental and humerus morphology, I identify 4 talpids (Parascalops nov. sp., Quyania cf. Q. europaea, Mioscalops (= Scalopoides) sp., and an unidentified stem desman) occurring at the GFS. Humeral morphology has been used to diagnose talpid species and study relationships. A geometric morphometric analysis showed that humerus shape is highly reflective of locomotor ecology in extant talpids and allows ecological inferences for fossil talpids. Hierarchical cluster analysis using morphometric data allowed examination of similarity among taxa and helped to secondarily verify taxonomic designations for the GFS taxa. The resulting phenogram showed strong similarity to the most up-to-date molecular cladogram and actually matched phylogenetic relationships substantially better than any morphological cladistic analyses to date.
|