Implementation of the GPT-3.5 Turbo Model for Automated Essay Scoring in Online Learning Systems (Implementasi Model GPT-3.5 Turbo untuk Otomatisasi Penilaian Esai pada Sistem Pembelajaran Daring)

Ade Suyadi, Sandra Jamu Kuryanti, Cep Adiwihardja, Khaila Anjani, Meutya Febi Santoso

Education

JURNAL RISET KOMPUTER (JURIKOM)

Introduction

This study implements the GPT-3.5 Turbo model to automate essay assessment in online learning systems. The model achieves 94.3% accuracy and strong agreement with human raters, saving time and effort.

Abstract

Essay assessment in online learning requires significant time, effort, and consistency, which are difficult to maintain when grading is done manually. This study explores the use of the large language model GPT-3.5 Turbo as the core of an automated essay scoring system for online learning platforms. The research follows a Research and Development (R&D) approach with the ADDIE development model (Analysis, Design, Development, Implementation, and Evaluation) and adopts the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework for its methodology. The automated essay scoring system using Prompt 4 demonstrated high accuracy and reliability: an accuracy of 94.3%, an F1-Score of 0.955, and a Cohen’s Kappa value of 0.878. This Kappa value indicates very strong agreement between the AI-generated assessments and the gold standard validated by educators, and it surpasses the initial inter-rater agreement among the educators themselves, which was only 0.1157. The superior performance of Prompt 4 is also confirmed by the lowest Mean Absolute Error (MAE) of 30.54 and the highest Area Under the Curve (AUC) of 0.956.
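To make the reported metrics concrete, the following is a minimal sketch of how accuracy, F1-Score, Cohen's Kappa, and MAE can be computed when comparing AI-assigned labels against an educator-validated gold standard. It is not the authors' evaluation code. The binary labels (1 = acceptable, 0 = not acceptable), the numeric score lists, and all function names are assumptions for illustration; the abstract does not state the labeling scheme or the score scale.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items where the prediction matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_score(gold, pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    # Chance agreement: probability both raters pick the same label at random.
    expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


def mean_absolute_error(gold_scores, pred_scores):
    """Average absolute difference between numeric scores."""
    diffs = [abs(g - p) for g, p in zip(gold_scores, pred_scores)]
    return sum(diffs) / len(diffs)


# Toy data (hypothetical, not from the paper).
gold_labels = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
ai_labels = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
gold_scores = [80, 60, 90, 70]
ai_scores = [75, 65, 90, 60]

print("accuracy:", accuracy(gold_labels, ai_labels))
print("F1:", f1_score(gold_labels, ai_labels))
print("kappa:", cohen_kappa(gold_labels, ai_labels))
print("MAE:", mean_absolute_error(gold_scores, ai_scores))
```

Cohen's Kappa is the metric to read alongside accuracy here, because it discounts agreement that would occur by chance. That is why a Kappa of 0.878 against the gold standard and an initial educator-to-educator Kappa of 0.1157 can be compared directly.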

Review

This paper presents a relevant and timely exploration of automated essay assessment in online learning environments, where manual grading is time-intensive and hard to keep consistent. Its core contribution is the implementation of GPT-3.5 Turbo as the foundation of an automated essay scoring system. The abstract reports strong empirical results for accuracy and reliability, which positions the study as a meaningful step toward using large language models to streamline educational processes and reduce educator workload.

The methodological approach, a Research and Development (R&D) strategy that incorporates the ADDIE development model and the CRISP-DM framework, gives the findings a clear structure. The reported performance metrics are impressive: an accuracy of 94.3%, an F1-Score of 0.955, and a Cohen's Kappa value of 0.878. The Kappa value is especially noteworthy, indicating "very strong agreement" between the AI's assessments and the gold standard validated by educators. This agreement far exceeds the initial inter-rater agreement observed among the human educators (0.1157), suggesting that the AI system, specifically when using "Prompt 4," offers a more consistent and potentially more objective evaluation standard. The lowest Mean Absolute Error (MAE) of 30.54 and the highest Area Under the Curve (AUC) of 0.956 further support the model's predictive performance.

The research has clear implications for online education, offering a practical way to improve the efficiency and consistency of essay grading. By automating this process, educators can redirect their efforts toward more personalized student interaction and curriculum development, while students benefit from timely and consistent feedback.

While the abstract strongly emphasizes performance, future work could examine how well "Prompt 4" generalizes across essay topics, disciplines, and student populations, and could explore the interpretability of the AI's grading rationale. Overall, this study offers a promising and empirically supported application of large language models in education, makes a substantial contribution to the field, and warrants serious consideration for publication.
