Abstract
Multilevel Item Response Theory (IRT) models provide an analytic approach that formally incorporates the hierarchical structure characteristic of much educational and psychological data. This study examined maximum likelihood (ML) estimation, the method most widely used in current applied multilevel IRT analyses, and Bayesian estimation, which has become a viable alternative to ML-based estimation techniques. Item and ability parameter estimates from Bayesian and ML methods were compared using both empirical and simulated data. Bayesian estimation using WinBUGS performed better than ML estimation in all conditions with regard to the item parameter estimates. For the individual (Level 2) variance estimates, penalized quasi-likelihood (PQL) estimation using HLM showed less bias than the other methods. However, Bayesian and ML estimation performed similarly for the group (Level 3) variance parameter estimates.