Date of Award

12-16-2020

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Mathematics and Statistics

First Advisor

Yichuan Zhao

Second Advisor

Jing Zhang

Third Advisor

Yichen Cheng

Fourth Advisor

Jun Kong

Abstract

The empirical likelihood method is a reliable data analysis tool across statistical areas because it combines nonparametric flexibility with the benefits of parametric likelihood. Given this versatility, we investigate its performance under survival and non-survival data structures. Zero-inflated data arise in many areas: a large share of the observations are zero, and the non-zero values are often highly positively skewed. Confidence intervals based on a normal approximation for such zero-inflated data may have low coverage probabilities. We study empirical likelihood (EL) based inference techniques to construct nonparametric confidence intervals for the mean of a zero-inflated population, the mean difference of two zero-inflated skewed populations, and the quantile difference of a zero-inflated population.

We also apply the empirical likelihood method to two different kinds of survival data. First, we consider panel count data. In panel count data, each study subject can only be observed at discrete time points rather than continuously. The total number of events between two consecutive observation times is known, but the exact event times are unknown. Furthermore, the observation times can differ among subjects and carry important information about the underlying recurrent process. The second kind of data comes from cohort studies. Collecting covariate information on all study subjects makes cohort studies very expensive. One way to reduce the cost while keeping sufficient covariate information is to use a case-cohort study design. We consider case-cohort data to make inferences about the regression parameters of semiparametric transformation models. For both kinds of data, an empirical likelihood ratio is formulated, and Wilks' theorem is established.

Extensive simulation studies are carried out to assess all of these methods in various data settings. We compare confidence intervals from the normal approximation (NA) and EL methods in terms of coverage probability and average length. The applicability of the methods is also illustrated with real datasets.
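
As a rough illustration of the kind of interval the abstract refers to, the sketch below computes the standard one-sample empirical likelihood ratio statistic for a mean and compares it with the 0.95 quantile of the chi-square distribution with one degree of freedom (about 3.841), which is the calibration that Wilks' theorem justifies. This is a minimal sketch of the textbook construction, not the dissertation's specific method for zero-inflated data; the function names `el_stat_mean` and `in_el_interval` and the bisection solver are assumptions made for illustration.

```python
import math


def el_stat_mean(x, mu, tol=1e-12, max_iter=200):
    """Return -2 log R(mu), the empirical likelihood ratio statistic for a mean.

    Illustrative helper (hypothetical name). Returns math.inf when mu is not
    strictly inside the range of the data, where the EL ratio is zero.
    """
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    if z_min >= 0 or z_max <= 0:
        return math.inf

    # The Lagrange multiplier lam solves sum(z_i / (1 + lam * z_i)) = 0 and
    # must keep every 1 + lam * z_i positive, so it lies in (lo, hi).
    lo, hi = -1.0 / z_max, -1.0 / z_min
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        g = sum(zi / (1.0 + mid * zi) for zi in z)
        # g is strictly decreasing in lam, so a positive value means the
        # root lies to the right of mid.
        if g > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


def in_el_interval(x, mu, crit=3.841):
    """Check whether mu lies in the approximate 95% EL confidence interval.

    crit is the 0.95 quantile of the chi-square distribution with 1 df.
    """
    return el_stat_mean(x, mu) <= crit


# Hypothetical zero-inflated sample: many zeros plus a few skewed positives.
sample = [0.0] * 6 + [1.2, 3.5, 0.7, 8.1]
sample_mean = sum(sample) / len(sample)
print(el_stat_mean(sample, sample_mean))  # close to zero at the sample mean
print(in_el_interval(sample, 0.5))
```

Scanning `in_el_interval` over a grid of candidate values of `mu` would trace out the interval endpoints; unlike a normal-approximation interval, the resulting interval need not be symmetric about the sample mean.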

DOI

https://doi.org/10.57709/20536407
