BUILDING A DATA VERIFICATION AND COMPARISON MODEL USING OPTICAL CHARACTER RECOGNITION
Abstract
Validating certificates by comparing the information on a scanned certificate image with the data stored in a database is a simple and efficient method. In this research, we propose a model that automatically extracts text information from certificates using optical character recognition (OCR), so that the data can be compared before the information is published. We evaluated the model on a data set of 200 applied information technology certificates from the Center for Foreign Languages - Informatics at Can Tho University of Technology and built a data comparison system integrated into the Center's certificate lookup system. Experimental results show that it can digitize and extract information with 89.72% accuracy. Based on these results, we propose the most suitable system and give some recommendations for future research.
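
The comparison step described above can be illustrated with a minimal sketch. It assumes the OCR stage has already produced a dictionary of extracted fields; the field names (`full_name`, `certificate_no`) and the helper functions are hypothetical and are not taken from the system reported here. The sketch only shows one plausible way to flag fields whose OCR text disagrees with the stored record after light normalization.

```python
import unicodedata


def normalize(text: str) -> str:
    """Unify Unicode form, case, and whitespace before comparing.

    Vietnamese diacritics can be encoded in composed or decomposed form,
    so both sides are converted to NFC first.
    """
    text = unicodedata.normalize("NFC", text)
    return " ".join(text.casefold().split())


def compare_record(ocr_fields: dict, db_record: dict) -> dict:
    """Return {field: (ocr_value, stored_value)} for every field that differs.

    Field names are illustrative assumptions, not the actual schema.
    """
    mismatches = {}
    for field, stored in db_record.items():
        extracted = ocr_fields.get(field, "")
        if normalize(extracted) != normalize(stored):
            mismatches[field] = (extracted, stored)
    return mismatches


if __name__ == "__main__":
    ocr = {"full_name": "nguyễn  văn a", "certificate_no": "CT-00123"}
    db = {"full_name": "Nguyễn Văn A", "certificate_no": "CT-00128"}
    print(compare_record(ocr, db))
```

An empty result would mean the scanned certificate agrees with the database record on every stored field; any non-empty result could be routed to manual review before publication. Exact matching after normalization is a deliberate simplification, and a tolerance-based comparison may be more appropriate where OCR noise is expected.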