Date of Award

Fall 8-12-2014

Degree Type


Degree Name

Doctor of Philosophy (PhD)


Educational Policy Studies

First Advisor

William L. Curlette, Ph.D.

Second Advisor

Theresa A. Sipe, Ph.D.

Third Advisor

Hongli Li, Ph.D.

Fourth Advisor

T. Chris Oshima, Ph.D.

Fifth Advisor

Meltem Alemdar, Ph.D.


Differential item functioning (DIF) occurs when individuals from different groups who have equal levels of a latent trait fail to earn commensurate scores on a testing instrument. A Type I error occurs when DIF-detection methods result in unbiased items being excluded from the test, while a Type II error occurs when biased items remain on the test after DIF-detection methods have been employed. Both errors create potential issues of injustice among examinees and can result in costly and protracted legal action. The purpose of this research was to evaluate two methods for detecting DIF: logistic regression (LR) and Mantel-Haenszel (MH).
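As a rough illustration of the MH approach named above, the sketch below computes the textbook Mantel-Haenszel common odds ratio, the continuity-corrected chi-square, and the ETS delta transformation for one item. It is a minimal sketch under stated assumptions, not the procedure of any study in the meta-analysis: the function name `mantel_haenszel` and the input layout (one 2x2 table per matched total-score level) are hypothetical, and the code assumes at least one stratum has a nonzero denominator and variance.

```python
import math

def mantel_haenszel(tables):
    """Mantel-Haenszel DIF statistics for a single item.

    tables: one (a, b, c, d) tuple per matched score level, where
      a = reference group correct, b = reference group incorrect,
      c = focal group correct,     d = focal group incorrect.
    Returns (common odds ratio, continuity-corrected chi-square, ETS delta).
    Hypothetical helper for illustration only.
    """
    num = den = 0.0
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        n_ref, n_foc = a + b, c + d   # group sizes in this stratum
        m1, m0 = a + c, b + d         # correct / incorrect totals
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_expected += n_ref * m1 / n
        sum_var += n_ref * n_foc * m1 * m0 / (n * n * (n - 1))
    alpha = num / den
    chi2 = (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    delta = -2.35 * math.log(alpha)  # ETS delta scale
    return alpha, chi2, delta
```

An odds ratio near 1 (delta near 0) suggests no uniform DIF; the chi-square is referred to a chi-square distribution with one degree of freedom.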

To accomplish this, meta-analysis was employed to summarize quantitative Monte Carlo studies, both published and unpublished, that used these methods. The two methods were compared on four criteria: Type I error rates; the Type I error proportion, which also served as the Type I error effect size measure; deviation scores; and power rates. Monte Carlo simulation studies meeting the inclusion criteria, each typically contributing 15 Type I error effect sizes, were compared to assess how well the LR and MH statistical methods detect DIF.
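To make the comparison criteria concrete, the sketch below tallies a Type I error proportion and a power rate from the flags produced in one simulated condition. The function name `error_rates` and its inputs are hypothetical. The deviation score is computed here as the observed Type I error proportion minus the nominal alpha, which is an assumption for illustration; the abstract does not state how deviation scores were defined.

```python
def error_rates(flagged, has_dif, nominal_alpha=0.05):
    """Summarize DIF-detection outcomes for one simulated condition.

    flagged: list of bools, True if the method flagged the item as DIF.
    has_dif: list of bools, True if DIF was actually simulated in the item.
    Hypothetical helper; the deviation definition is an assumption.
    """
    false_pos = sum(1 for f, d in zip(flagged, has_dif) if f and not d)
    true_pos = sum(1 for f, d in zip(flagged, has_dif) if f and d)
    n_null = sum(1 for d in has_dif if not d)   # items simulated without DIF
    n_dif = len(has_dif) - n_null               # items simulated with DIF
    type1 = false_pos / n_null if n_null else float("nan")
    power = true_pos / n_dif if n_dif else float("nan")
    return {
        "type1": type1,                       # Type I error proportion
        "power": power,                       # 1 - Type II error rate
        "deviation": type1 - nominal_alpha,   # assumed definition
    }
```

For example, calling `error_rates` with one false flag among four non-DIF items and one detected item among two DIF items gives a Type I error proportion of 0.25 and a power of 0.5.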

Studied variables included DIF magnitude, nature of DIF (uniform or non-uniform), number of DIF items, and test length. I found that MH was better at Type I error control while LR was better at controlling Type II error. This study also provides a valuable summary of existing DIF methods and a summary of the types of variables that have been manipulated in DIF simulation studies with LR and MH. Consequently, this meta-analysis can serve as a resource for practitioners to help them choose between LR and MH for DIF detection with regard to Type I and Type II error control, and can provide insight for parameter selection in the design of future Monte Carlo DIF studies.
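The distinction between uniform and non-uniform DIF can be made concrete with the logistic regression model commonly used for DIF detection in the literature. The formulation below is the standard textbook one and is not necessarily the exact parameterization used in each study summarized here; the symbols \(\theta\) (matching score, typically the total test score) and \(g\) (group indicator) are notation introduced for this illustration.

```latex
\[
  \ln\!\left(\frac{P(u = 1)}{1 - P(u = 1)}\right)
    = \beta_0 + \beta_1 \theta + \beta_2 g + \beta_3 (\theta g)
\]
% u      : item response (1 = correct, 0 = incorrect)
% theta  : matching variable, typically the total test score
% g      : group membership (0 = reference, 1 = focal)
% beta_2 \neq 0 with beta_3 = 0 : uniform DIF
% beta_3 \neq 0                 : non-uniform DIF
```

Under this model, uniform DIF appears as a constant shift in log-odds between groups, while non-uniform DIF appears as a group difference that changes with the matching score.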