Abstract
Automatic scoring has long been a point of interest in educational settings. With recent advancements in Large Language Models (LLMs), this work focuses on automatic scoring strategies that use LLMs while providing explainability for the automatic scoring task. This thesis explores different open-source large language models and their capabilities for automatic scoring. Since automatic scoring directly affects students and their evaluation, the explainability, reliability, and trustworthiness of automatically generated scores are crucial. This research attempts to provide explainability for LLM-generated scores by aligning LLMs with human grading strategies, with the aim of improving automatic scoring accuracy. Finally, this thesis presents the implementation of "GPTest", an end-to-end system that performs automatic scoring with LLMs and can be used in educational settings by teachers and students to streamline scoring and assessment procedures.