Abstract

Latent Dirichlet Allocation (LDA) is a probabilistic topic model that has been used in testing to detect the latent themes in examinees’ responses to constructed-response (CR) items. There is not yet clear evidence, however, as to which model selection indices are most accurate under conditions typical of CR answers. In this study, we evaluated the performance of several model selection indices commonly used with topic models, including similarity measures, perplexity using 5-fold cross-validation, information criterion indices, semantic coherence, and exclusivity. Data were simulated with different numbers of latent topics, numbers of answer documents, average answer lengths, and numbers of unique words, all typical of practical measurement conditions. Results suggested that average cosine similarity, perplexity using 5-fold cross-validation, and information criterion indices were the most accurate for model selection across the conditions simulated in this study.
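
As an illustration of one of the indices named above, the following is a minimal sketch of average cosine similarity used for choosing the number of topics. It is not taken from the study: the function names, the use of topic-word probability rows as input, and the rule of preferring the candidate with the lowest average pairwise similarity (that is, the most distinct topics) are all assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_cosine_similarity(topic_word):
    """Mean cosine similarity over all pairs of topic-word rows.

    topic_word is a list of K rows, one per topic, each holding that
    topic's word probabilities (hypothetical input format).
    """
    pairs = list(combinations(topic_word, 2))
    if not pairs:
        raise ValueError("need at least two topics")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


def select_num_topics(candidates):
    """Pick the K whose fitted topics are least similar on average.

    candidates maps each candidate K to its topic-word matrix.
    Assumption: lower average similarity means better-separated topics.
    """
    return min(candidates, key=lambda k: average_cosine_similarity(candidates[k]))
```

In practice the topic-word matrices would come from LDA models fitted at each candidate number of topics; the sketch only covers the comparison step.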
