Testlet effects on pass/fail decisions under competing Rasch models.


Access rights

Worldwide access
Access changed 12/4/17


The item response model chosen to estimate ability can influence proficiency classifications, or pass/fail decisions, made about people based on test scores. This poses a potential problem for both examinees and decision makers, because examinees may be misclassified as a result of the item response model used to estimate ability rather than their actual proficiency in the domain of interest. The purpose of this study was to examine the use of an incorrect item response model and its impact on proficiency classification. A Monte Carlo simulation was used to compare the competing models directly when the true structure of the data is known (i.e., testlet conditions). The conditions used in the design (e.g., number of items, testlet-to-item ratio, testlet variance, proportion of items that are testlet-based, and sample size) reflect those found in the applied educational literature. An empirical example was also analyzed for pass/fail decisions with the competing models. Overall, decision consistency (DC) between the two models, the dichotomous Rasch model (DRM) and the testlet Rasch model (TRM), was very high, ranging from 91.5% to 100%. The design factor with the greatest effect on DC was the testlet effect, that is, the testlet variance. Other design factors that affected DC included the number of testlets, an interaction between testlet variance and the percent of total items in testlets, and an interaction between the number of testlets and the percent of total items in testlets. In the empirical example, PISA, which is traditionally calibrated with a DRM, contained 29 items in nine testlets. The classification agreement between the DRM and the TRM was 99.5%. When a testlet structure is present in applied data, the testlet variance is unknown, and as the testlet variance increases, so does the misclassification of examinees. When the measurement model does not align with the structure of the data, additional error is introduced into the parameter estimates. This directly affects the decisions that are made about people.
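
As a rough illustration of the decision consistency (DC) index reported above, the sketch below computes the proportion of examinees who receive the same pass/fail decision from two sets of ability estimates. It is not the study's code: the function name `decision_consistency`, the cut score of 0.0 logits, the rule that an estimate at or above the cut score is a pass, and the ability values are all hypothetical and chosen only for illustration.

```python
def decision_consistency(theta_a, theta_b, cut_score):
    """Proportion of examinees classified the same way (pass or fail)
    by two sets of ability estimates.

    Assumption: an estimate at or above cut_score counts as a pass.
    """
    if len(theta_a) != len(theta_b):
        raise ValueError("Both models must provide one estimate per examinee.")
    if len(theta_a) == 0:
        raise ValueError("At least one examinee is required.")

    # Count examinees whose pass/fail decision agrees across the two models.
    agreements = sum(
        (a >= cut_score) == (b >= cut_score)
        for a, b in zip(theta_a, theta_b)
    )
    return agreements / len(theta_a)


# Hypothetical ability estimates (in logits) for five examinees.
drm_theta = [-1.2, -0.1, 0.05, 0.8, 1.5]  # e.g., from a model ignoring testlets
trm_theta = [-1.1, 0.02, 0.10, 0.7, 1.4]  # e.g., from a model including testlets

dc = decision_consistency(drm_theta, trm_theta, cut_score=0.0)
print(f"Decision consistency: {dc:.1%}")
```

In this made-up example only the second examinee changes classification, which shows why disagreement between models tends to concentrate among examinees whose estimates lie close to the cut score.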



Testlets. Rasch.