BUILDING A VIETNAMESE MATH CHATBOT BASED ON RAG AND LLM: SYSTEM DESIGN, IMPLEMENTATION AND EXPERIMENTAL EVALUATION

Pham Van Khanh and Pham Vu Anh Tuan

Abstract

In recent years, large language models (LLMs) and Retrieval-Augmented Generation (RAG) have opened up new opportunities for building intelligent learning assistants. Nevertheless, applying LLMs directly to Vietnamese Mathematics still faces several limitations: hallucination, the lack of a grounded knowledge base, failure to adhere to the Vietnamese education curriculum, and difficulty in processing problems that involve complex images. This paper presents the design, implementation, and experimental evaluation of a Vietnamese Mathematics chatbot based on the RAG architecture combined with an LLM. The system comprises: (i) a pipeline for collecting and standardizing Mathematics data from textbooks, exam questions, and reference materials; (ii) a Milvus vector database that stores embeddings generated by the BGE-m3 model; (iii) a multi-task pipeline coordinated by LangGraph; (iv) an inference component that uses the Qwen3-VL-8B model served via vLLM; and (v) a WebUI that supports multimodal queries (text and image). Experimental results show that the system delivers detailed solutions, maintains the conversation flow, and significantly reduces hallucination compared with a standalone LLM. The system also shows potential for use in teaching and learning Mathematics, especially in situations that require accurate knowledge retrieval and step-by-step explanations. These results point toward the development of specialized learning assistants for Vietnamese education.
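The retrieve-then-generate flow that the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the toy bag-of-words embedding stands in for BGE-m3, the in-memory list stands in for the Milvus collection, and the names `DOCS`, `embed`, `retrieve`, and `build_prompt` are hypothetical. No LLM is called; the sketch stops at the grounded prompt that would be sent to the model.

```python
import math
import re
from collections import Counter

# Hypothetical passages standing in for indexed textbook content.
DOCS = [
    "The quadratic formula solves ax^2 + bx + c = 0 using the discriminant b^2 - 4ac.",
    "The Pythagorean theorem states that a^2 + b^2 = c^2 in a right triangle.",
    "The derivative of x^n is n x^(n-1).",
]


def embed(text):
    """Toy bag-of-words vector; an assumption replacing a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(query, docs, k=1):
    """Return the k passages most similar to the query (stand-in for a vector search)."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


query = "State the Pythagorean theorem for a right triangle"
top = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, top)
```

Constraining the prompt to retrieved passages is the general mechanism by which RAG aims to reduce hallucination; a production system would replace `embed` and `retrieve` with real embedding and vector-database calls.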

DOI: 10.18173/2354-1059.2025-0053
