error
Kurzer Serviceunterbruch am Donnerstag, 15. Januar 2026, 12 bis 13 Uhr. Sie können in diesem Zeitraum keine neuen Dokumente hochladen oder bestehende Einträge bearbeiten. Das Login wird in diesem Zeitraum deaktiviert. Grund: Wartungsarbeiten // Short service interruption on Thursday, January 15, 2026, 12.00 – 13.00. During this time, you won’t be able to upload new documents or edit existing records. The login will be deactivated during this time. Reason: maintenance work
 

Performance of the pre-trained large language model GPT-4 on automated short answer grading


Loading...

Author / Producer

Date

2024-07-05

Publication Type

Journal Article

ETH Bibliography

yes

Citations

Altmetric

Data

Abstract

Automated Short Answer Grading (ASAG) has been an active area of machine-learning research for over a decade. It promises to let educators grade and give feedback on free-form responses in large-enrollment courses in spite of limited availability of human graders. Over the years, carefully trained models have achieved increasingly higher levels of performance. More recently, pre-trained Large Language Models (LLMs) emerged as a commodity, and an intriguing question is how a general-purpose tool without additional training compares to specialized models. We studied the performance of GPT-4 on the standard benchmark 2-way and 3-way datasets SciEntsBank and Beetle, where in addition to the standard task of grading the alignment of the student answer with a reference answer, we also investigated withholding the reference answer. We found that overall, the performance of the pre-trained general-purpose GPT-4 LLM is comparable to hand-engineered models, but worse than pre-trained LLMs that had specialized training.

Publication status

published

Editor

Book title

Volume

4 (1)

Pages / Article No.

47

Publisher

Springer

Event

Edition / version

Methods

Software

Geographic location

Date collected

Date created

Subject

Automated short answer grading; Large language model; SciEntsBank; Beetle; GPT

Organisational unit

02219 - ETH AI Center / ETH AI Center

Notes

Funding

Related publications and datasets