Evaluation of a Low-Code Retrieval-Augmented Generation System for Improving Conceptual Accuracy in Sociology Learning
DOI: https://doi.org/10.35746/jtim.v8i2.1016
Keywords: Artificial Intelligence, Large Language Model, n8n, Retrieval-Augmented Generation, ROUGE-L
Abstract
The utilization of Large Language Models (LLMs) in higher education offers significant efficiency, yet it introduces critical risks of information hallucination and conceptual bias, particularly in the discipline of Sociology. This study aims to evaluate the performance of a Retrieval-Augmented Generation (RAG) system based on the low-code platform n8n as a robust solution for hallucination mitigation. The system integrates semantic search using Supabase as a vector database and the Gemini 2.5 Flash model to restrict response generation exclusively to verified academic literature. The research employed an Experimental Single-System Evaluation method with a dual-evaluation approach (quantitative and qualitative) across 50 test instruments. Quantitative testing using the ROUGE-L metric recorded a mean score of 0.354, indicating adequate structural similarity despite variations inherent to the paraphrasing nature of LLMs in analytical tasks. Thematic qualitative analysis of evaluator comments revealed 98.2% positive sentiment, with the dominant theme being “Conceptually Accurate”. Ultimately, 100% of the expert panel declared the system suitable for implementation as a reliable supplementary learning medium.
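For readers unfamiliar with the ROUGE-L metric mentioned in the abstract, the following is a minimal sketch of how such a score can be computed from the longest common subsequence (LCS) of a reference answer and a generated answer. It is not the authors' evaluation code. The function names, the lowercase whitespace tokenization, the `beta = 1.0` weighting, and the example sentences are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-measure between two strings.

    Assumption: tokens are lowercase, whitespace-separated words.
    beta weights recall against precision; 1.0 gives a balanced F1.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that is covered
    precision = lcs / len(cand)   # share of the candidate that matches
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical reference/answer pairs; a study would average over its test set.
pairs = [
    ("social facts are external to the individual",
     "social facts exist outside the individual"),
    ("anomie describes a breakdown of social norms",
     "anomie is a state in which shared norms break down"),
]
scores = [rouge_l(ref, cand) for ref, cand in pairs]
mean_score = sum(scores) / len(scores)
print(f"mean ROUGE-L: {mean_score:.3f}")
```

Because the score rewards only in-order word overlap, a faithful paraphrase can score well below 1.0, which is consistent with the abstract's remark that a moderate mean reflects paraphrasing rather than conceptual error.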
License
Copyright (c) 2026 Rusmanto, Jasrino Maulana Putra

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




