Degree Name

Doctor of Philosophy (PhD)


Department

Educational Psychology and Special Education

First Advisor

Karen M. Zabrucky - Chair


Researchers interested in metacognition of text comprehension (metacomprehension) have investigated both a knowledge and a monitoring component. Knowledge of comprehension consists of one’s awareness of person, strategy, and task variables and is investigated primarily through interviews and questionnaires. Monitoring of comprehension consists of two equally important abilities: evaluation and regulation. Evaluation involves adults’ ability to assess their understanding during reading, whereas regulation involves their ability to use compensatory strategies to resolve comprehension failures. Monitoring of comprehension is assessed through a variety of paradigms, such as online performance measures, error detection, and calibration. Researchers interested in adults’ evaluation ability have frequently employed a calibration paradigm in which adults take a comprehension test after reading one or more passages and make confidence judgments about their future test performance (predictions) or their past test performance (postdictions). Findings indicate that adults are generally poor at evaluating their comprehension and that a number of variables may influence their performance. However, findings have often been inconsistent, and a clearer picture of adults’ ability is needed.

Item Response Theory (IRT) is a modern psychometric approach that has been successfully applied in psychological and educational research. An IRT-based comprehension test may provide a better measure of comprehension than those used in prior research. The main purpose of this study was to develop an IRT-based comprehension test for use in calibration studies. Students were also asked to report their guessing behavior, which was analyzed to determine whether guessing influenced postdiction accuracy. Undergraduate and graduate students (n = 1,006) completed a comprehension test, made postdictions after each item, and reported their guessing behavior.
Calibration accuracy was measured by comparing students’ test scores and postdictions. Factor analysis and a scree test were used to assess the unidimensionality of the data, and chi-square statistics were used to evaluate item fit. The comprehension test was found to be appropriate for distinguishing students at the low end of the ability continuum, but additional items need to be developed to discriminate among students at higher ability levels. Guessing scores were moderately but significantly (p < .01) correlated with both comprehension performance and postdiction accuracy.
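One simple way to score the kind of item-level comparison described above is to contrast per-item confidence judgments with actual correctness, yielding a bias score (over- vs. underconfidence) and an absolute-accuracy score. This is a hypothetical sketch of such a scheme, not the study's exact measure.

```python
def calibration_scores(postdictions, correct):
    """Compare per-item confidence judgments (0..1) with scored answers (0/1).
    Returns (bias, absolute_accuracy):
      bias > 0 indicates overconfidence, bias < 0 underconfidence;
      absolute_accuracy is mean |confidence - correctness| (0 = perfect)."""
    n = len(postdictions)
    bias = sum(p - c for p, c in zip(postdictions, correct)) / n
    abs_acc = sum(abs(p - c) for p, c in zip(postdictions, correct)) / n
    return bias, abs_acc

# Hypothetical student: confident on items 1-3, unsure on item 4
bias, abs_acc = calibration_scores([1.0, 1.0, 0.8, 0.4], [1, 0, 1, 1])
print(bias, abs_acc)  # slight overconfidence (0.05), mean error 0.45
```

A student can show little overall bias yet still be poorly calibrated item by item, which is why the two scores are reported separately.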