
Gene selection for sample sets with biased distribution

Microarray expression data, which contain the expression levels of a large number of simultaneously observed genes, have been used in many scientific research and clinical studies. Because such data are high-dimensional, selecting a small number of genes has been shown to be beneficial for many tasks, such as building prediction models from microarray expression data or discovering gene regulatory networks. Traditional gene selection methods, however, fail to take the class distribution into account in the selection process. In biomedical science, it is very common for microarray expression data to be severely biased, with one class of examples (e.g., diseased samples) having significantly fewer samples than the other classes (e.g., normal samples). Sample sets with such biased distributions require special attention from researchers when identifying the genes responsible for a particular disease. In this thesis, we propose three filtering techniques, Higher Weight ReliefF, ReliefF with Differential Minority Repeat, and ReliefF with Balanced Minority Repeat, to identify genes responsible for fatal diseases from biased microarray expression data. Our solutions are evaluated on five well-known microarray datasets: Colon, Central Nervous System, DLBCL Tumor, Lymphoma, and ECML Pancreas. Experimental comparisons with the traditional ReliefF filtering method demonstrate the effectiveness of the proposed methods in selecting informative genes from microarray expression data with biased sample distributions. / by Abu Hena Mustafa Kamal. / Thesis (M.S.C.S.)--Florida Atlantic University, 2009. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2009. Mode of access: World Wide Web.

Identifier: oai:union.ndltd.org:fau.edu/oai:fau.digital.flvc.org:fau_2878
Contributors: Kamal, Abu Hena Mustafa., College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science
Publisher: Florida Atlantic University
Source Sets: Florida Atlantic University
Language: English
Detected Language: English
Type: Text, Electronic Thesis or Dissertation
Format: x, 98 p. : ill. (some col.)., electronic
Rights: http://rightsstatements.org/vocab/InC/1.0/
