Date of Award


Degree Type

Closed Thesis

Degree Name

Master of Science (MS)


Mathematics and Statistics

First Advisor

Yichuan Zhao - Chair

Second Advisor

Yu-Sheng Hsu

Third Advisor

Jiawei (Jay) Liu

Fourth Advisor

Xu Zhang


One problem of interest is relating genes to patients' survival outcomes in order to build regression models that predict future patients' survival from their gene expression data. Applying the semiparametric additive risk model of survival analysis, this thesis proposes a new approach to analyzing gene expression data that focuses on the model's predictive ability. The method modifies correlation principal component regression to handle the censoring problem in survival data. We also employ the time-dependent area under the ROC curve (AUC) and the root mean squared error of prediction (RMSEP) to assess how well the model predicts survival time. Furthermore, the proposed method is able to identify significant genes that are related to the disease. Finally, the proposed approach is illustrated with a simulated data set, the diffuse large B-cell lymphoma (DLBCL) data set, and a breast cancer data set. The results show that the model fits both of the data sets very well.
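
As a rough illustration of the RMSEP criterion mentioned above, the following minimal sketch computes the root mean squared error of prediction for paired observed and predicted values. The function name `rmsep` and the example numbers are hypothetical. The abstract does not say how censored observations enter the calculation, so this sketch assumes fully observed (uncensored) survival times and is not the thesis's procedure.

```python
import math


def rmsep(observed, predicted):
    """Root mean squared error of prediction over paired values.

    Assumption: every observed value is an uncensored survival time.
    Handling of censored observations is not covered here.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the squared prediction errors, then take the square root.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical survival times (e.g. in years) and model predictions.
example_observed = [1.0, 2.0, 3.0]
example_predicted = [1.0, 2.0, 5.0]
example_rmsep = rmsep(example_observed, example_predicted)
```

A smaller value indicates predictions closer to the observed times. With censored data, an adjusted version of this quantity would be needed.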