Microarray technology enables researchers to measure the expression levels of thousands
of genes simultaneously to understand relationships between genes, extract
pathways, and in general understand a diverse range of biological processes such
as diseases and cell cycles. While microarrays offer a great opportunity to reveal
information about biological processes, it is a challenging task to mine the huge
amount of information contained in the microarray datasets. Generally, since an accurate
model for the data is missing, first a clustering algorithm is applied and then the
resulting clusters are examined manually to find genes that are related to the biological
process under inspection. We need automated methods for this analysis that
can be used to eliminate unrelated genes from the data and mine for biologically important
genes. Here, we introduce a general methodology which makes use of traditional
clustering algorithms and involves integration of the two main sources of biological
information, Gene Ontology and interaction networks, with microarray data to eliminate
unrelated information and find a clustering result containing only genes related
to a given biological process. We applied our methodology successfully to a number
of different cases and organisms. We assessed the results with the Gene Set
Enrichment Analysis method and showed that our final clusters are highly enriched.
We also analyzed the results manually and found that most of the genes in
the final clusters are actually related to the biological process under inspection.
Identifier | oai:union.ndltd.org:METU/oai:etd.lib.metu.edu.tr:http://etd.lib.metu.edu.tr/upload/12614266/index.pdf |
Date | 01 March 2012 |
Creators | Korkmaz, Gulberal Kircicegi Yoksul |
Contributors | Atalay, Mehmet Volkan |
Publisher | METU |
Source Sets | Middle East Technical Univ. |
Language | English |
Detected Language | English |
Type | Ph.D. Thesis |
Format | text/pdf |
Rights | To liberate the content for public access |