Global ETD Search

Comparative analysis of clustering methods for gene expression data

Large-scale approaches, namely proteomics and transcriptomics, will play the most
important role in the so-called post-genomic era. These approaches allow experiments
to measure the expression of thousands of genes in a cell at distinct time points.
Analysis of these data can lead to an understanding of gene function and gene
regulatory networks (Eisen et al., 1998).
There has been a great deal of work on the computational analysis of gene expression
time series, using a variety of gene expression data sets, clustering techniques
and proximity indices. However, most of this work focuses on biological results.
Cluster validation has been applied in only a few studies, and there the emphasis
was on evaluating the proposed validation methodologies (Azuaje, 2002;
Lubovac et al., 2001; Yeung et al., 2001; Zhu & Zhang, 2000). As a result, validity
studies offer few guidelines on which clustering methods or proximity indices are
most suitable for analysing data from gene expression time series.
Thus, this work performs a data-driven comparative study of clustering methods
and proximity indices used in the analysis of gene expression time series (or time
courses). Five clustering methods found in the gene expression analysis literature
are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering,
k-means and self-organizing maps. For proximity indices, versions of
three indices are analysed: Euclidean distance, angular separation and Pearson correlation.
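
As an illustration of the three proximity indices named above, the following minimal Python sketch computes each one for a pair of expression profiles. It uses the textbook definitions, which are not necessarily the "versions" evaluated in the thesis; the function names and the toy profiles are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def angular_separation(x, y):
    """Cosine of the angle between two profiles (similarity in [-1, 1]).

    Assumes neither profile is the all-zero vector.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Angular separation of the mean-centred profiles.

    Assumes neither profile is constant over time.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return angular_separation(
        [a - mean_x for a in x],
        [b - mean_y for b in y],
    )


# Toy time courses (hypothetical values, four time points each).
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]  # same shape as gene_a, twice the magnitude
gene_c = [4.0, 3.0, 2.0, 1.0]  # reversed trend

print(euclidean_distance(gene_a, gene_b))
print(angular_separation(gene_a, gene_b))
print(pearson_correlation(gene_a, gene_c))
```

Note that Euclidean distance is a dissimilarity, whereas the other two are similarities: gene_a and gene_b differ in magnitude, so their Euclidean distance is non-zero, yet they have the same shape, so both angular separation and Pearson correlation rate them as maximally similar.
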
To evaluate the methods, a k-fold cross-validation procedure adapted
to unsupervised methods is applied. The accuracy of the results is assessed by
comparing the partitions obtained in these experiments with gene annotation,
such as protein function and series classification.
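
The comparison of a clustering partition with gene annotation can be sketched as follows. The abstract does not name the agreement measure, so the adjusted Rand index used here is an assumption chosen for illustration, and the k-fold cross-validation step is not shown. The function name and the toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items.

    The adjusted Rand index is an assumed choice of measure; the abstract
    does not state which index the thesis used.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two items are required")

    # Number of item pairs that share a cell of the contingency table,
    # a cluster of the first partition, or a cluster of the second.
    same_in_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_in_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_in_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = same_in_a * same_in_b / comb(n, 2)
    maximum = (same_in_a + same_in_b) / 2
    if maximum == expected:
        # Only happens when both partitions are the same trivial partition
        # (all items together, or every item alone).
        return 1.0
    return (same_in_both - expected) / (maximum - expected)


# Hypothetical example: cluster labels versus functional annotation.
clusters = [0, 0, 1, 1]
annotation = ["ribosomal", "ribosomal", "cell cycle", "cell cycle"]
print(adjusted_rand_index(clusters, annotation))
```

A value of 1 indicates that the partition reproduces the annotation exactly, while values near 0 indicate agreement no better than expected by chance.
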

Validação de agrupamentos

Gene expression

Validation groups

Expressão gênica

Identifier	oai:union.ndltd.org:IBICT/oai:repositorio.ufpe.br:123456789/2538
Date	January 2003
Creators	Gesteira Costa Filho, Ivan
Contributors	de Assis Tenório Carvalho, Francisco
Publisher	Universidade Federal de Pernambuco
Source Sets	IBICT Brazilian ETDs
Language	Portuguese
Detected Language	English
Type	info:eu-repo/semantics/publishedVersion, info:eu-repo/semantics/masterThesis
Source	reponame:Repositório Institucional da UFPE, instname:Universidade Federal de Pernambuco, instacron:UFPE
Rights	info:eu-repo/semantics/openAccess
