Date of Award


Degree Type


Degree Name

Doctor of Philosophy (PhD)


Educational Policy Studies

First Advisor

William Curlette

Second Advisor

Janice Fournillier

Third Advisor

Chris Oshima

Fourth Advisor

Kerry Pannell


The application of Bayesian networks within the field of institutional research is explored through the development of a Bayesian network used to predict first- to second-year retention of undergraduates. A hybrid approach to model development is employed, in which formal elicitation of subject-matter expertise is combined with machine learning in designing model structure and specification of model parameters. Subject-matter experts include two academic advisors at a small, private liberal arts college in the southeast, and the data used in machine learning include six years of historical student-related information (i.e., demographic, admissions, academic, and financial) on 1,438 first-year students. Netica 5.12, a software package designed for constructing Bayesian networks, is used for building and validating the model. Evaluation of the resulting model’s predictive capabilities is examined, as well as analyses of sensitivity, internal validity, and model complexity. Additionally, the utility of using Bayesian networks within institutional research and higher education is discussed.

The importance of comprehensive evaluation is highlighted, due to the study’s inclusion of an unbalanced data set. Best practices and experiences with expert elicitation are also noted, including recommendations for use of formal elicitation frameworks and careful consideration of operating definitions. Academic preparation and financial need risk profile are identified as key variables related to retention, and the need for enhanced data collection surrounding such variables is also revealed. For example, the experts emphasize study skills as an important predictor of retention while noting the absence of collection of quantitative data related to measuring students’ study skills. Finally, the importance and value of the model development process is stressed, as stakeholders are required to articulate, define, discuss, and evaluate model components, assumptions, and results.