One problem of interest is relating genes to patients' survival outcomes in order to build regression models that predict future patients' survival from their gene expression data. Applying the semiparametric additive risk model of survival analysis, this thesis proposes a new approach to the analysis of gene expression data that focuses on the model's predictive ability. The method modifies correlation principal component regression to handle the censoring problem of survival data. We also employ the time-dependent AUC and RMSEP to assess how well the model predicts survival time. Furthermore, the proposed method is able to identify significant genes that are related to the disease. Finally, the proposed approach is illustrated with a simulated data set, the diffuse large B-cell lymphoma (DLBCL) data set, and a breast cancer data set. The results show that the model fits both of the data sets very well.
Identifier | oai:union.ndltd.org:GEORGIA/oai:digitalarchive.gsu.edu:math_theses-1050 |
Date | 22 April 2008 |
Creators | Wang, Guoshen |
Publisher | Digital Archive @ GSU |
Source Sets | Georgia State University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Mathematics Theses |