Abstract
Latent Dirichlet Allocation (LDA) is a probabilistic topic model that has been used in testing contexts to detect latent themes in examinees’ responses to constructed-response (CR) items. It remains unclear, however, which model selection indices are most accurate under conditions typical of CR answers. In this study, we evaluated the performance of several model selection indices commonly used with topic models: similarity measures, perplexity estimated with 5-fold cross-validation, information criterion indices, semantic coherence, and exclusivity. Data were simulated with varying numbers of latent topics, numbers of answer documents, average answer lengths, and numbers of unique words typical of practical measurement conditions. Results suggested that average cosine similarity, perplexity estimated with 5-fold cross-validation, and information criterion indices were the most accurate for model selection across the conditions simulated in this study.
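One of the indices named above, perplexity under 5-fold cross-validation, can be sketched as follows. This is an illustrative implementation using scikit-learn, not the study's actual code; the toy corpus and candidate topic counts are invented for demonstration.

```python
# Hedged sketch: choosing the number of LDA topics by 5-fold
# cross-validated held-out perplexity (lower is better).
# Corpus and candidate grid are illustrative placeholders.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import KFold

docs = [
    "the cell divides during mitosis",
    "mitosis produces two daughter cells",
    "gravity pulls objects toward earth",
    "objects fall because of gravity",
    "the cell membrane controls transport",
    "falling objects accelerate under gravity",
    "daughter cells inherit chromosomes",
    "earth exerts gravitational force on objects",
    "chromosomes condense before cell division",
    "acceleration of falling objects is constant",
]

# Document-term matrix of word counts.
X = CountVectorizer().fit_transform(docs)

def cv_perplexity(X, n_topics, n_splits=5, seed=0):
    """Mean perplexity on held-out folds for an LDA with n_topics."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kf.split(X):
        lda = LatentDirichletAllocation(
            n_components=n_topics, random_state=seed
        ).fit(X[train_idx])
        scores.append(lda.perplexity(X[test_idx]))
    return float(np.mean(scores))

# Fit candidate models and keep the topic count minimizing perplexity.
candidates = [2, 3, 4]
results = {k: cv_perplexity(X, k) for k in candidates}
best_k = min(results, key=results.get)
```

With a realistic corpus, the same loop would simply replace `docs` with the examinees' CR answers and widen the candidate grid.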