Research on the application of vision-language models (VLMs) in smart land management
Abstract
Land management in Vietnam requires high accuracy and efficiency in processing records, especially red books (Certificates of land use rights, house ownership rights and other assets attached to land). Traditional Optical Character Recognition (OCR) technology has significant limitations, such as high manual labeling costs and low flexibility. Vision-language models (VLMs) have emerged as a new solution that promises to reduce labeling effort and improve contextual understanding. This paper explores the potential of VLMs for recognizing red book information, compares their advantages and disadvantages with those of OCR, and proposes directions for development. Initial experimental results show that the VLM reduces labeling time by 70%, but its accuracy is only 88%, compared with 95% for OCR, on fine print. Recommendations focus on model refinement, hybrid solution development, and pilot implementation in Vietnam.