TOWARDS CROSS-ATTENTION PRE-TRAINING IN NEURAL MACHINE TRANSLATION

  • Pham Vinh Khang
  • Nguyen Hong Buu Long
Keywords: cross-attention; cross-lingual; natural language processing; neural machine translation; pre-training; language model

Abstract

The advent of pre-training techniques and large language models has significantly improved the performance of many natural language processing (NLP) tasks. However, applying pre-trained language models to neural machine translation remains a challenge, because such models learn little about the interaction between the two languages of a translation pair. In this paper, we investigate training schemes that pre-train the cross-attention module between the encoder and the decoder using large-scale monolingual corpora for each language independently. The experiments show promising results, demonstrating the effectiveness of using pre-trained language models in neural machine translation.

Published
2023-02-02