A New Reclassification Method for Highly Uncertain Microarray Data in Allergy Gene Prediction

The analysis of microarray data is challenging because of the high dimensionality and small sample sizes involved. Although a few methods are available to address the problem of small sample size, they do not perform well on microarray data with extremely small sample sizes (fewer than about 20 samples). We propose a method that incorporates information from diverse sources into the analysis of microarray data in order to improve the predictability of significant genes. A transformed data set, including statistical parameters, literature mining and gene ontology data, is evaluated. We performed classification experiments to identify potential allergy-related genes. Feature selection is used to identify the effect of features on classifier behaviour.
An exploratory and domain-knowledge analysis was performed on noisy real-life allergy data, and a subset of genes was selected to form the positive and negative classes. A new set of transformed variables, derived from the mean and standard deviation of the data distribution and from other data sources, was identified. Significant allergy- and immune-related genes were selected from the microarray data. Experiments showed that the classification predictability of significant genes can be improved. Important features from the transformed variable set were also identified.

Identifier: oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OGU.10214/3486
Date: 11 April 2012
Creators: Paul, Jasmin
Contributors: Chiu, David
Source Sets: Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada
Language: English
Detected Language: English
Type: Thesis
