Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Educational Policy Studies

First Advisor

Phill Gagne, Ph.D. - Committee Chair

Second Advisor

Carolyn Furlow, Ph.D. - Committee Member

Third Advisor

Deanne Swan, Ph.D. - Committee Member

Fourth Advisor

Philo Hutcheson, Ph.D. - Committee Member

Fifth Advisor

Sheryl Gowen, Ph.D. - Committee Member


When researchers are unable to randomly assign students to treatment conditions, selection bias is introduced into the estimates of treatment effects. Random assignment to treatment conditions, which has historically been the scientific benchmark for causal inference, is often impossible or unethical to implement in educational systems. For example, researchers cannot deny services to those who stand to gain from participation in an academic program. Additionally, students select into a particular treatment group through processes that are impossible to control, such as those that result in a child dropping-out of high school or attending a resource-starved school. Propensity score methods provide valuable tools for removing the selection bias from quasi-experimental research designs and observational studies through modeling the treatment assignment mechanism. The utility of propensity scores has been validated for the purposes of removing selection bias when the observations are assumed to be independent; however, the ability of propensity scores to remove selection bias in a multilevel context, in which group membership plays a role in the treatment assignment, is relatively unknown. A central purpose of the current study was to begin filling in the gaps in knowledge regarding the performance of propensity scores for removing selection bias, as defined by covariate balance, in multilevel settings using a Monte Carlo simulation study. The performance of propensity scores were also examined using a large-scale national dataset. Results from this study provide support for the conclusion that multilevel characteristics of a sample have a bearing upon the performance of propensity scores to balance covariates between treatment and control groups. Findings suggest that propensity score estimation models should take into account the cluster-level effects when working with multilevel data; however, the numbers of treatment and control group individuals within each cluster must be sufficiently large to allow estimation of those effects. Propensity scores that take into account the cluster-level effects can have the added benefit of balancing covariates within each cluster as well as across the sample as a whole.