1 |
Improving Clustering of Gene Expression Patterns
Jonsson, Per. January 2000
The central question investigated in this project was whether gene expression patterns could be clustered in a more biologically accurate way by giving the clustering technique additional information about the genes as input, besides the expression levels. By biologically accurate we mean that genes should be clustered together not only according to the similarity of their expression profiles but also according to their functional similarity in terms of functional annotation and metabolic pathway. The data were collected at AstraZeneca R&D Mölndal, Sweden, and the computational technique applied was self-organising maps. In our experiments, the input to the self-organising maps was the combination of expression profiles and enzyme classification annotation rather than the expression profiles alone. The results were evaluated both statistically and biologically. The statistical evaluation showed that our method led to a small decrease in compactness and isolation. The biological evaluation showed that our method produced clusters with greater functional homogeneity with respect to enzyme classification, functional hierarchy, and metabolic pathway annotation.
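The idea of feeding a self-organising map with expression profiles plus annotation can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how the enzyme classification was encoded, so the one-hot encoding of the top-level EC class, the `annotation_weight` parameter, the grid size, and the learning schedule are all hypothetical choices, not the author's method.

```python
import math
import random


def combine(expression, ec_class, n_classes=6, annotation_weight=1.0):
    """Concatenate an expression profile with a one-hot EC class (1-based).

    The one-hot encoding and the weight are assumptions for illustration.
    """
    one_hot = [0.0] * n_classes
    one_hot[ec_class - 1] = annotation_weight
    return list(expression) + one_hot


def best_matching_unit(nodes, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(nodes, key=lambda p: sum((a - b) ** 2 for a, b in zip(nodes[p], x)))


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = {(r, c): [rng.random() for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for x in rng.sample(data, len(data)):    # shuffled pass over the genes
            bmu = best_matching_unit(nodes, x)
            for pos, w in nodes.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return nodes
```

Raising `annotation_weight` would make functional annotation dominate the distance computation, while a weight of zero reduces the input to the expression profile alone, which is one way to compare the two settings the abstract describes.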
|
3 |
Using Self-Organizing Maps to Cluster Products for Storage Assignment in a Distribution Center
Davis, Casey J. 13 June 2017
No description available.
|
4 |
Fuzzy C-Means Clustering Approach to Design a Warehouse Layout
Naik, Vaibhav C. 08 July 2004
Products are allocated within a warehouse according to a storage policy. Storage policies fall into three main categories: dedicated, randomized, and class-based. Under a dedicated storage policy each product is assigned a designated slot, whereas under a randomized storage policy an incoming product is randomly assigned a storage location close to the input/output point. Class-based storage is a mixed policy in which products are randomly assigned within their fixed class. Dedicated storage is the policy most commonly used in practice. When a large warehouse layout is being designed, product information such as throughput and storage level is often uncertain or unavailable to the designer. Products therefore cannot be located using the throughput-to-storage ratio method on which the above storage policies rely. To handle this uncertainty in the product data, we propose a fuzzy C-means (FCM) clustering approach.
This research mainly aims to improve efficiency (distance or time traveled) by designing a fuzzy-logic-based layout for a warehouse with a large number of products. The proposed approach looks for similarity in the product data to form clusters, which can be used directly to develop the warehouse layout. The research also investigates whether the FCM approach can take into account other factors, such as product size, similarity, and/or characteristics, to generate layouts that are not only efficient in reducing the distance traveled to store and retrieve products but also effective in terms of retrieval time, space utilization, and/or material control.
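For readers unfamiliar with fuzzy C-means, the standard algorithm can be sketched as below. This is a generic textbook formulation, not the thesis's implementation; the fuzzifier `m = 2`, the random initial memberships, and the two-feature product vectors in the usage note are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means; returns (centers, memberships).

    memberships[k][c] is the degree to which point k belongs to cluster c;
    each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centers are membership-weighted means (weights raised to m).
        centers = []
        for c in range(n_clusters):
            wts = [u[k][c] ** m for k in range(n)]
            total = sum(wts)
            centers.append([sum(w * p[d] for w, p in zip(wts, points)) / total
                            for d in range(dim)])

        # Memberships follow from relative distances to all centers.
        new_u = []
        for p in points:
            dists = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                     for c in centers]
            new_u.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                          for di in dists])

        shift = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u
```

With hypothetical product vectors such as `(throughput, storage_level)`, a product whose membership row is close to `[0.5, 0.5]` sits between two clusters, which is how the fuzzy formulation expresses uncertain product data instead of forcing a hard assignment.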
|
5 |
An Attempt to Classify Turkish District Data: K-means and Self-Organizing Map (SOM) Algorithms
Aksoy, Ece. 01 January 2005
M.S., Department of Geodetic and Geographic Information Systems
Supervisor: Assoc. Prof. Dr. Oğuz Işık
December 2004, 112 pages
No clustering technique is universally applicable for uncovering the variety of structures present in data sets, and no single algorithm or approach is adequate for every clustering problem. Many methods are available, the criteria they use differ, and so different classifications may be obtained for the same data. As larger and larger amounts of data are collected and stored in databases, the need for efficient and effective analysis methods grows. Grouping or classification of measurements is the key element in these data analysis procedures. Many non-spatial clustering techniques exist in various fields; spatial clustering techniques and software, however, are less common.
This thesis is an attempt to classify Turkish district data with two clustering algorithms: K-means clustering and self-organizing maps (SOM). With these two common techniques, the aim is to obtain a clustering of the same data in a GIS environment that can serve different purposes, such as regional policy, constructing statistical integrity, or analyzing the distribution of funds, and to demonstrate how GIS facilitates regional and statistical studies.
All 923 districts of Turkey were chosen as the application area of this thesis. Some constraints, such as population, were specified for clustering Turkey's districts. First, different clustering techniques for spatial classification were reviewed, and the K-means and SOM algorithms were chosen in order to compare different methods on Turkey's district data. Next, a database of Turkey's statistical data was built and analyzed together with geographical data in the GIS environment. Based on the findings of the review of clustering techniques, different clustering software (ArcGIS, CrimeStat, and Matlab) was applied. The Self-Organizing Maps (SOM) algorithm, regarded here as the best and most common spatial clustering algorithm of recent years, and CrimeStat K-means clustering were used as the clustering methods.
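As a rough illustration of the K-means side of this comparison, the sketch below standardizes district attributes and runs Lloyd's algorithm. It does not reproduce CrimeStat's K-means routine; the farthest-point initialization, the z-score scaling, and the example attributes (population and a rate-like indicator) are assumptions chosen to keep the example deterministic.

```python
def standardize(rows):
    """Z-score each column so large-valued attributes do not dominate."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sqdist(p, c) for c in centers))
        centers.append(list(far))

    labels = None
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels
```

The resulting labels could then be joined back to district polygons in a GIS layer by district identifier, which is the kind of combined statistical and geographical analysis the abstract describes.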
|
6 |
Alinhamento múltiplo de seqüências através de técnicas de agrupamento / Multiple alignment of sequences through clustering techniques
Peres, Patrícia Silva. 24 February 2006
Previous issue date: 2006-02-24 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
The simultaneous alignment of many DNA or protein sequences is one of the most common tasks in computational molecular biology. Multiple alignments are important in many applications, such as predicting the structure of new sequences, demonstrating the relationship between new sequences and existing families of sequences, inferring the evolutionary history of a family of sequences, finding the characteristic motifs (core blocks) shared by biological sequences, and assembling fragments in DNA sequencing. Currently, the most popular strategy for solving the multiple sequence alignment problem is progressive alignment. Each step of this strategy may introduce an error, which is expected to be low for closely related sequences but increases as the sequences diverge. Determining the order in which the sequences are aligned is therefore a key step in progressive alignment. In each iteration, traditional approaches consider only the closest pair or group of sequences to be aligned. This strategy minimizes the error introduced at each step but may not be the best way to minimize the final error. Based on that hypothesis, this work studies and applies a global clustering technique that analyzes all sequences beforehand and separates them into groups according to their similarities. These groups then guide the traditional progressive alignment, in an attempt to minimize the overall error introduced by its steps and improve the final result. To assess the reliability of this new strategy, three well-known methods were modified to include the new sequence clustering stage. The accuracy of the new versions was tested on three different reference collections, and the modified methods were compared with their original versions. The experiments show that the new versions with the global clustering stage obtained better alignments than their original versions on all three reference collections and improved on the main methods found in the literature, with an average increase of only 3% in running time.
/ The simultaneous alignment of several DNA or protein sequences is one of the main problems in computational molecular biology. Multiple alignments are important in many applications, such as predicting the structure of new sequences, demonstrating the relationship between new sequences and existing sequence families, inferring the evolutionary history of a sequence family, discovering patterns shared among sequences, and assembling DNA fragments. Currently, the most popular strategy for solving the multiple alignment problem is progressive alignment. Each step of this strategy can introduce an error rate that tends to be low for highly similar sequences but tends to rise as the sequences diverge. Determining the order in which sequences are aligned is therefore a fundamental step in the progressive alignment strategy. At each iteration of progressive alignment, traditional strategies consider only the closest pair or group of sequences to be aligned. Such a strategy minimizes the error rate introduced at each step, but it may not be the best way to minimize the final error rate. Based on this hypothesis, this work studies and applies a global clustering technique that performs a prior analysis of all sequences in order to separate them into groups according to their similarities. These groups then guide the traditional progressive alignment, in an attempt to minimize the overall error rate introduced by the steps of the progressive alignment and to improve the final result.
To evaluate the reliability of this new strategy, three well-known methods were modified to incorporate the new sequence clustering stage. The accuracy of the new versions of the methods was tested on three different reference collections. In addition, the modified methods were compared with their respective original versions. The experimental results show that the new versions of the methods with the global clustering stage did obtain better alignments than their original versions on all three reference collections and improved on the main methods found in the literature, with an average increase of only 3% in execution time.
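The notion of a global clustering stage that guides progressive alignment can be sketched as follows. The abstract does not name the clustering technique or the similarity measure, so everything here is a stand-in: the k-mer Jaccard distance, the threshold-based grouping (connected components), and the `alignment_plan` output format are hypothetical, and the actual pairwise and profile alignment steps are left out.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=3):
    """Jaccard distance between the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def group_sequences(seqs, threshold=0.5, k=3):
    """Group sequence indices: link any pair with distance <= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if kmer_distance(seqs[i], seqs[j], k) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def alignment_plan(seqs, threshold=0.5, k=3):
    """Order of work: align inside each group first, then merge the groups."""
    groups = group_sequences(seqs, threshold, k)
    plan = [("align_within_group", g) for g in groups if len(g) > 1]
    if len(groups) > 1:
        plan.append(("merge_groups", groups))
    return plan
```

The point of the sketch is the ordering: all sequences are inspected once up front, and the resulting groups decide which alignments happen first, rather than picking only the closest pair at each iteration.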
|
7 |
Učení bez učitele / Unsupervised learning
Kantor, Jan. January 2008
The purpose of this work is to describe some techniques commonly used for cluster analysis in unsupervised learning. The thesis consists of two parts. The first part covers the theory of selected algorithms, describing the advantages and disadvantages of each method discussed, and the validation of cluster quality. It reviews the many ways to estimate and compute clustering quality based on internal and external knowledge. A good technique for validating clustering quality is one of the most important parts of cluster analysis. The second part deals with the implementation of different clustering techniques and programs on real datasets and their comparison with the true dataset partitioning and with published related work.
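The distinction between internal and external validation can be made concrete with two widely used measures. The abstract does not say which indices the thesis covers, so the choice of the silhouette coefficient (internal, uses only the data and the labels) and the Rand index (external, compares against a known partitioning) is an assumption for illustration.

```python
import math
from itertools import combinations


def rand_index(labels_a, labels_b):
    """External measure: fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


def silhouette(points, labels):
    """Internal measure: mean silhouette over all points.

    Requires at least two clusters; points in singleton clusters score 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)
            continue
        a = mean_dist(i, own)                       # cohesion
        b = min(mean_dist(i, members)               # separation
                for other, members in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

The Rand index ignores how clusters are named, so a labeling that matches the true partitioning up to a renaming of clusters still scores 1.0.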
|
8 |
Reliability assessment of electric power systems using genetic algorithms
Samaan, Nader Amin Aziz. 15 November 2004
The first part of this dissertation presents an innovative method for the assessment of generation system reliability. In this method, a genetic algorithm (GA) is used as a search tool to truncate the probability state space and to track the most probable failure states. The GA stores in a state array those system states in which generation is insufficient to supply the system maximum load. The given load pattern is then convolved with the state array to obtain adequacy indices.
In the second part of the dissertation, a GA-based method for state sampling of composite generation-transmission power systems is introduced. A binary-encoded GA is used as a state sampling tool for the composite power system network states. A linearized optimization load flow model is used to evaluate the sampled states. The developed approach has been extended to evaluate adequacy indices of composite power systems while considering chronological load at buses. Hourly load is represented by cluster load vectors obtained with the k-means clustering technique. Two different approaches have been developed: GA parallel sampling, and GA sampling for the maximum cluster load vector with series state reevaluation.
The developed GA-based method is used to assess the annual frequency and duration indices of the composite system. The conditional-probability-based method is used to calculate the contribution of sampled failure states to system failure frequency using different component transition rates. The developed GA-based method is also used to evaluate reliability worth indices of composite power systems. The developed GA approach has been generalized to recognize multi-state components such as generation units with derated states. It also considers common mode failure for transmission lines.
Finally, a new method for composite system state evaluation using a real-number-encoded GA is developed. The objective of the GA is to minimize load curtailment for each sampled state. Minimization is based on the dc load flow model. System constraints are represented by fuzzy membership functions. The GA fitness function is a combination of these membership values. The proposed method has the advantage of allowing sophisticated load curtailment strategies, which lead to more realistic load point indices.
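The first part's idea, a GA that hunts for probable failure states and stores them in a state array, can be sketched for a generation-only system as below. This is a toy reading of the abstract, not the dissertation's algorithm: the two-state unit model, the fitness definition, tournament selection, one-point crossover, and all parameter values are assumptions, and the resulting indices are truncated estimates that can only undercount the exact values.

```python
import random


def state_probability(state, forced_outage_rates):
    """Probability of a unit up/down pattern, assuming independent units."""
    p = 1.0
    for up, q in zip(state, forced_outage_rates):
        p *= (1.0 - q) if up else q
    return p


def available_capacity(state, capacities):
    """Total capacity of the units that are up in this state."""
    return sum(c for up, c in zip(state, capacities) if up)


def ga_failure_states(capacities, fors, peak_load, pop_size=20,
                      generations=40, mutation_rate=0.1, seed=0):
    """Collect deficient states in a state array {state: probability}.

    Assumes at least two units (needed for one-point crossover).
    """
    rng = random.Random(seed)
    n = len(capacities)

    def fitness(s):
        # Only states that cannot meet the peak load score; likelier is fitter.
        if available_capacity(s, capacities) < peak_load:
            return state_probability(s, fors)
        return 0.0

    population = [tuple(rng.random() < 0.5 for _ in range(n))
                  for _ in range(pop_size)]

    def pick():
        # Binary tournament selection on the current population.
        a, b = rng.choice(population), rng.choice(population)
        return a if fitness(a) >= fitness(b) else b

    state_array = {}
    for _ in range(generations):
        for s in population:
            f = fitness(s)
            if f > 0.0:
                state_array[s] = f
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n)              # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)          # bit-flip mutation
            children.append(child)
        population = children
    return state_array


def loss_of_load_probability(state_array, capacities, load):
    """Truncated estimate: stored states whose capacity falls below the load."""
    return sum(p for s, p in state_array.items()
               if available_capacity(s, capacities) < load)


def loss_of_load_expectation(state_array, capacities, hourly_loads):
    """Expected number of deficient hours over the given load pattern."""
    return sum(loss_of_load_probability(state_array, capacities, load)
               for load in hourly_loads)
```

Because the state array is built once against the peak load, it already contains every stored state that could be deficient at a lower load, so the hourly pass over the load pattern only needs to re-check capacities rather than rerun the search.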
|
9 |
Efficient Hierarchical Clustering Techniques For Pattern Classification
Vijaya, P A. 07 1900
No description available.
|