This thesis tested and compared three feature selection methods, LASSO, SLR, and SMLR, on microarray fold change data. Two real datasets were first used to compare how well the algorithms select feature genes from data with two conditions. SMLR was found to be quite sensitive to its parameter and to select differentially expressed genes more accurately than SLR and LASSO. In addition, the model coefficients generated by SMLR were closely related to the magnitude of the fold changes. SMLR was also shown to select differentially expressed genes successfully from data with more than two conditions. The results of the simulation experiments agreed with those of the real dataset experiments. Additionally, the proportion of differentially expressed genes in the data did not affect the performance of LASSO and SLR, whereas the number of genes selected by SMLR increased with the proportion of regulated genes. The number of genes selected by SMLR, both correctly and incorrectly, also increased with the number of replicates used to build the model. Furthermore, SMLR performed best in identifying future treatment samples.
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:BVIV.1828/1308 |
Date | 22 December 2008 |
Creators | Law, Timothy Tao Hin |
Contributors | Lesperance, Mary |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thesis |
Rights | Available to the World Wide Web |