Vietnamese text recognition in scene images using deep learning

Huynh Van Huy; Nguyen Thi Thanh Tan; Ngo Quoc Tao

Huynh Van Huy, Lac Hong University
Nguyen Thi Thanh Tan, Electric Power University
Ngo Quoc Tao, Institute of Information Technology, Vietnam Academy of Science and Technology

Keywords: Detection; Recognition; Feature; Probability; Accuracy.

Abstract

This article proposes an effective method for recognizing Vietnamese text in scene images. The method combines three processing tasks in a single recognition stage: (i) recognizing (predicting) character sequences from images; (ii) context processing; and (iii) fusion and iterative correction. The method was evaluated on two Vietnamese scene image datasets collected from real-world scenes: VinText and VnSceneText. Experimental results show that the proposed method is capable of detecting text of any shape and size with high and stable accuracy. Specifically, it achieves word-level and character-level accuracies of (81.87%, 93.02%) and (82.56%, 94.33%) on VinText and VnSceneText, respectively.
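
The two reported metrics can be illustrated with a minimal sketch. The abstract does not give the exact definitions used, so the ones below are assumptions: word-level accuracy is taken as the fraction of words predicted exactly, and character-level accuracy as one minus the total edit distance divided by the total number of ground-truth characters. The function names and the sample words are hypothetical. Strings are normalized to NFC first, because Vietnamese diacritics can be encoded either as precomposed or as combining characters.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC so each accented letter counts as one character."""
    return unicodedata.normalize("NFC", text)


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(preds, targets):
    """Assumed definition: share of words predicted exactly."""
    pairs = [(nfc(p), nfc(t)) for p, t in zip(preds, targets)]
    return sum(p == t for p, t in pairs) / len(pairs)


def char_accuracy(preds, targets):
    """Assumed definition: 1 - (total edit distance / total target characters)."""
    pairs = [(nfc(p), nfc(t)) for p, t in zip(preds, targets)]
    total_chars = sum(len(t) for _, t in pairs)
    total_errors = sum(levenshtein(p, t) for p, t in pairs)
    return max(0.0, 1.0 - total_errors / total_chars)


# Hypothetical sample: one word loses its diacritic.
targets = ["phở", "bánh", "mì"]
preds = ["phở", "banh", "mì"]
word_acc = word_accuracy(preds, targets)
char_acc = char_accuracy(preds, targets)
print(f"word-level: {word_acc:.4f}, character-level: {char_acc:.4f}")
```

In this sample a single missing diacritic makes a whole word count as wrong while costing only one character error, which is why character-level accuracy is typically the higher of the two figures.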
