Evaluating LLMs as Educational Co-authors: A Methodological Framework

Bruno Polonijo - University of Applied Sciences of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia

Sabrina Šuman

Tomislav Car - University of Rijeka, Faculty of Tourism and Hospitality Management, Primorska 46, 51410 Opatija, Croatia

Abstract:

This paper introduces and applies a novel methodological framework for the comprehensive evaluation of Large Language Models (LLMs) as educational co-authors. The purpose of this work is to address the lack of structured, reproducible processes for assessing the quality and practical utility of AI-generated educational content. A controlled experiment was conducted in which a standardised task was assigned to several leading LLMs. The effectiveness and quality of each model were evaluated within a newly developed framework that measures both the quality of the final product and the "refinement effort" required to reach it. The results reveal a non-linear evolution in model capabilities, highlighting persistent shortcomings in complex reasoning. The findings emphasise the importance of a holistic evaluation that considers both the final output and the efficiency of the collaborative process, offering a practical tool for educators and researchers.

International Scientific Multidisciplinary Conference: AI for a Smarter Tomorrow - AI-SMART, September 25-26, 2025

Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-Non-Commercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission.

Suggested Citation

Polonijo, B., Šuman, S., & Car, T. (2025). Evaluating LLMs as educational co-authors: A methodological framework. In International Scientific Multidisciplinary Conference: AI for a Smarter Tomorrow - AI-SMART (pp. 147-155). https://doi.org/10.31410/AI.SMART.2025.147