Stereotype Logit Models for High Dimensional Data

Gene expression studies are of growing importance in medicine. Subtypes within the same disease have been shown to have differing gene expression profiles (Golub et al., 1999). Researchers are often interested in classifying a disease into ordered categories that reflect its progression. For example, it may be of interest to identify genes associated with progression and to accurately predict the state of progression from gene expression data. One challenge in modeling microarray gene expression data is that there are more genes (variables) than observations. In addition, the genes usually exhibit a complex variance-covariance structure. Modeling a categorical variable that reflects disease progression from gene expression data therefore requires methods that can handle an ordinal outcome in the presence of a high-dimensional covariate space. In this research we present a method that combines the stereotype regression model (Anderson, 1984) with an elastic net penalty (Friedman et al., 2010) to model an ordinal outcome in high-throughput genomic datasets. Results from applying the proposed method to both simulated and gene expression data are reported, and its effectiveness is compared with that of a univariable and heuristic approach.
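
For orientation, the two ingredients named above can be written out in their commonly cited forms. This is only a sketch: the abstract does not give the thesis's own parameterization, so the notation here (J ordered categories, intercepts \alpha_j, scale parameters \phi_j, coefficient vector \beta, tuning parameter \lambda, and mixing weight \omega) is assumed for illustration and may differ from what the thesis uses.

```latex
% Stereotype logit model (Anderson, 1984), with category J as the reference:
\log \frac{P(Y = j \mid \mathbf{x})}{P(Y = J \mid \mathbf{x})}
  = \alpha_j + \phi_j \, \boldsymbol{\beta}^{\top} \mathbf{x},
  \qquad j = 1, \dots, J - 1,
% with identifiability and ordering constraints
\qquad 1 = \phi_1 \ge \phi_2 \ge \dots \ge \phi_J = 0, \quad \alpha_J = 0.

% Elastic net penalized estimation (penalty written as in Friedman et al., 2010),
% where \ell is the multinomial log-likelihood implied by the model above
% and n is the number of observations:
\min_{\boldsymbol{\alpha},\, \boldsymbol{\phi},\, \boldsymbol{\beta}}
  \; -\frac{1}{n} \, \ell(\boldsymbol{\alpha}, \boldsymbol{\phi}, \boldsymbol{\beta})
  + \lambda \left[ \frac{1 - \omega}{2} \, \lVert \boldsymbol{\beta} \rVert_2^2
  + \omega \, \lVert \boldsymbol{\beta} \rVert_1 \right],
  \qquad 0 \le \omega \le 1.
```

In this form a single coefficient vector \beta is shared across all categories and only the scalars \phi_j vary, so the L1 part of the penalty can set a gene's coefficient to zero for every category at once. That is what makes the combination plausible when genes outnumber observations.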

gene expression

ordinal outcome

Biostatistics

Physical Sciences and Mathematics

Statistics and Probability

Identifier	oai:union.ndltd.org:vcu.edu/oai:scholarscompass.vcu.edu:etd-1146
Date	29 October 2010
Creators	Williams, Andre
Publisher	VCU Scholars Compass
Source Sets	Virginia Commonwealth University
Detected Language	English
Type	text
Format	application/pdf
Source	Theses and Dissertations
Rights	© The Author
