Evaluation of a Low-Code Retrieval-Augmented Generation System for Improving Conceptual Accuracy in Sociology Learning

Authors

  • Rusmanto Rusmanto, Information Systems Study Program, Sekolah Tinggi Teknologi Terpadu Nurul Fikri
  • Jasrino Maulana Putra, Information Systems Study Program, Sekolah Tinggi Teknologi Terpadu Nurul Fikri

DOI:

https://doi.org/10.35746/jtim.v8i2.1016

Keywords:

Artificial Intelligence, Large Language Model, n8n, Retrieval-Augmented Generation, ROUGE-L

Abstract

The use of Large Language Models (LLMs) in higher education offers significant efficiency gains, yet it introduces critical risks of hallucinated information and conceptual bias, particularly in the discipline of Sociology. This study evaluates the performance of a Retrieval-Augmented Generation (RAG) system built on the low-code platform n8n as a means of mitigating hallucination. The system integrates semantic search, using Supabase as a vector database, with the Gemini 2.5 Flash model to restrict response generation exclusively to verified academic literature. The research employed an Experimental Single-System Evaluation method with a dual-evaluation approach (quantitative and qualitative) across 50 test instruments. Quantitative testing using the ROUGE-L metric recorded a mean score of 0.354, indicating adequate structural similarity despite the variation inherent in the paraphrasing behavior of LLMs in analytical tasks. Thematic qualitative analysis of evaluator comments revealed 98.2% positive sentiment, with “Conceptually Accurate” as the dominant theme. Ultimately, 100% of the expert panel declared the system suitable for implementation as a reliable supplementary learning medium.
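
To make the quantitative metric concrete, the following is a minimal sketch of a sentence-level ROUGE-L F-measure, based on the longest common subsequence (LCS) between a reference answer and a generated answer. It is an illustration only, not the evaluation code used in the study: the lowercase whitespace tokenization, the default `beta = 1.0`, and the function names `lcs_length` and `rouge_l` are all assumptions made here.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-measure (illustrative sketch).

    Assumptions: tokens are lowercase whitespace-separated words, and
    beta=1.0 weights LCS precision and LCS recall equally.
    """
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)      # share of reference tokens covered by the LCS
    precision = lcs / len(cand)  # share of candidate tokens covered by the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

Because the score rewards only tokens that appear in the same order in both texts, a faithful paraphrase can score well below 1.0. Averaging such per-item scores over a test set yields a mean of the kind reported above.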

Published

2026-05-29

Issue

Section

Articles

How to Cite

[1]
R. Rusmanto and J. M. Putra, “Evaluasi Sistem Retrieval-Augmented Generation Berbasis Low-Code dalam Meningkatkan Akurasi Konseptual Pembelajaran Sosiologi,” jtim, vol. 8, no. 2, pp. 406–416, May 2026, doi: 10.35746/jtim.v8i2.1016.