Date of Award

Spring 5-7-2011

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Mathematics and Statistics

First Advisor

Yixin Fang


The receiver operating characteristic (ROC) curve is a popular tool for evaluating continuous diagnostic tests. The traditional definition of the ROC curve implicitly incorporates the idea of "hard" thresholding, which is also why empirical ROC curves are step functions. The first topic introduces a novel definition, the soft ROC curve, which incorporates the idea of "soft" thresholding instead. The softness of a soft ROC curve is controlled by a regularization parameter, which can be selected by a cross-validation procedure. A byproduct of this definition is that the corresponding empirical curves are smooth.
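The contrast between hard and soft thresholding can be illustrated with a minimal sketch. The abstract does not state the exact form of the soft threshold, so the logistic smoothing used here, the bandwidth name `h`, and the function names are assumptions made for illustration only, not the dissertation's definition.

```python
import numpy as np

def hard_rates(neg, pos, cutoffs):
    """Empirical (FPR, TPR) using the indicator 1{score > c}: a step function in c."""
    fpr = np.array([(neg > c).mean() for c in cutoffs])
    tpr = np.array([(pos > c).mean() for c in cutoffs])
    return fpr, tpr

def soft_rates(neg, pos, cutoffs, h):
    """Assumed soft version: the indicator is replaced by a logistic function
    of (score - c) / h, where h > 0 plays the role of the regularization parameter."""
    def logistic(z):
        return 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid
    fpr = np.array([logistic((neg - c) / h).mean() for c in cutoffs])
    tpr = np.array([logistic((pos - c) / h).mean() for c in cutoffs])
    return fpr, tpr

# Toy scores for non-diseased (neg) and diseased (pos) subjects.
neg = np.array([0.10, 0.35, 0.40])
pos = np.array([0.45, 0.60, 0.80])
cutoffs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

hard = hard_rates(neg, pos, cutoffs)
smooth = soft_rates(neg, pos, cutoffs, h=0.1)        # smooth in the cutoff
nearly_hard = soft_rates(neg, pos, cutoffs, h=1e-4)  # approaches the step function
```

Under this assumed form, a small `h` reproduces the usual step-function rates, while a larger `h` yields rates that vary smoothly with the cutoff.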

The second topic is the combination of several diagnostic tests to achieve better diagnostic accuracy. We consider the optimal linear combination that maximizes the area under the receiver operating characteristic curve (AUC); the combination coefficients can be estimated via a non-parametric procedure. However, the apparent (re-substitution) estimate of the AUC associated with the estimated coefficients is too optimistic. Several methods are proposed to adjust for this upward bias. Among them, the cross-validation approach is especially advocated, and an approximate cross-validation is developed to reduce the computational cost. Furthermore, the proposed methods can be used for variable selection, that is, to identify the important diagnostic tests.
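The difference between the apparent AUC and a cross-validated AUC can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's procedure: it uses two simulated tests, a simple grid search over directions as a stand-in for the non-parametric estimator, and a stratified 5-fold split; all function names and the simulated data are hypothetical.

```python
import numpy as np

def auc(neg, pos):
    """Empirical AUC (Mann-Whitney statistic), ties counted as 1/2."""
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def combo_auc(X, y, theta):
    """AUC of the linear score cos(theta) * x1 + sin(theta) * x2."""
    s = X @ np.array([np.cos(theta), np.sin(theta)])
    return auc(s[y == 0], s[y == 1])

def fit_theta(X, y, angles):
    """Grid search for the direction that maximizes the empirical AUC."""
    values = [combo_auc(X, y, t) for t in angles]
    return angles[int(np.argmax(values))]

def cv_auc(X, y, angles, k, rng):
    """Stratified k-fold estimate: fit on k-1 folds, evaluate on the held-out fold."""
    fold = np.empty(len(y), dtype=int)
    for label in (0, 1):
        idx = rng.permutation(np.flatnonzero(y == label))
        fold[idx] = np.arange(len(idx)) % k
    held_out = []
    for j in range(k):
        train, test = fold != j, fold == j
        theta = fit_theta(X[train], y[train], angles)
        held_out.append(combo_auc(X[test], y[test], theta))
    return float(np.mean(held_out))

# Simulated data (an assumption for illustration): 40 controls, 40 cases, 2 tests.
rng = np.random.default_rng(0)
n = 40
X = np.vstack([rng.normal(0.0, 1.0, (n, 2)), rng.normal(0.5, 1.0, (n, 2))])
y = np.repeat([0, 1], n)
angles = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)

# Apparent AUC: coefficients are fitted and evaluated on the same data.
apparent = combo_auc(X, y, fit_theta(X, y, angles))
cross_validated = cv_auc(X, y, angles, k=5, rng=rng)
print(f"apparent AUC: {apparent:.3f}, 5-fold CV AUC: {cross_validated:.3f}")
```

Because the apparent AUC is maximized over the same observations used to evaluate it, it tends to overstate performance on new subjects; the held-out folds give an estimate that does not reuse the fitting data.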

However, the best-subset variable selection method described above is not practical when the number of diagnostic tests is large. The third topic therefore develops a LASSO-type procedure for variable selection. To solve the non-convex maximization problem in the proposed procedure, an efficient algorithm is developed based on soft ROC curves, difference convex programming, and the coordinate descent algorithm.


Included in

Mathematics Commons