A RESEARCH ON VIETNAMESE-K’HO LANGUAGE TRANSLATION SYSTEM USING NEURAL MACHINE TRANSLATION

  • Nguyen Thi Luong, La Quoc Thang, Tran Nhat Quang, Duong Bao Ninh, Nguyen Huu Khanh, Phan Thi Thanh Nga, Tran Ngo Nhu Khanh, Tran Thong
Keywords: K'Ho language; Bilingual Corpus; Automatic translation; RNN; OpenNMT

Abstract

The K'Ho language is spoken by the K'Ho ethnic group, who live in the South Central Highlands, especially in the districts of Don Duong, Duc Trong, Di Linh, Da Huoai, and Lac Duong in Lam Dong province. The provincial People's Committee and the Ethnic Minority Committee of Lam Dong province are currently encouraging cadres and officials in the province to learn the K'Ho language so that they can communicate with the K'Ho people and disseminate the guidelines, policies, and laws of the Party and government to them. In this paper, we draw on existing K'Ho language resources and the support of many K'Ho language experts to build a Vietnamese-K'Ho bilingual corpus, contributing to the promotion and preservation of the K'Ho language. The corpus contains more than 16,000 Vietnamese-K'Ho bilingual sentence pairs, which were difficult to collect because K'Ho language resources are limited. We then use the OpenNMT framework to build an automatic translation system from the collected bilingual data. The system reaches an accuracy of 56.54%, which is an acceptable result in the field of automatic translation.

Published
2023-04-07
Section
INFORMATION AND COMMUNICATIONS TECHNOLOGY