Gene expression studies are of growing importance in medicine. Subtypes within the same disease have been shown to have differing gene expression profiles (Golub et al., 1999). Researchers are often interested in classifying a disease into categories that indicate its progression. For example, it may be of interest to identify genes associated with progression and to accurately predict the state of progression from gene expression data. One challenge in modeling microarray gene expression data is that there are more genes (variables) than observations. In addition, the genes usually exhibit a complex variance-covariance structure. Modeling a categorical variable that reflects disease progression from gene expression data therefore requires methods capable of handling an ordinal outcome in the presence of a high-dimensional covariate space. In this research, we present a method that combines the stereotype regression model (Anderson, 1984) with an elastic net penalty (Friedman et al., 2010) to model an ordinal outcome for high-throughput genomic datasets. Results from applying the proposed method to both simulated and gene expression data will be reported, and the effectiveness of the proposed method compared to a univariable and heuristic approach will be discussed.
Identifier | oai:union.ndltd.org:vcu.edu/oai:scholarscompass.vcu.edu:etd-1146 |
Date | 29 October 2010 |
Creators | Williams, Andre |
Publisher | VCU Scholars Compass |
Source Sets | Virginia Commonwealth University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Theses and Dissertations |
Rights | © The Author |