Text data augmentation techniques for sentiment analysis based on Vietnamese language

  • Hồ Hướng Thiên
Keywords: product comments; text mining; text data augmentation; sentiment analysis; natural language processing

Abstract

Comments on online systems are a data source containing relevant information about customer sentiment, including sentiment toward a product or service. This information is useful to both customers and management when making specific decisions. Building a high-accuracy prediction model requires a large amount of labeled data. In this paper, we investigate a simple approach to augmenting text data for Vietnamese-language comments. Four basic techniques are used to generate new sentences: random insertion, random swap, word replacement, and word deletion. The experimental results show that the proposed approach is efficient.
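
The four operations named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the toy `SYNONYMS` table, the deletion probability `p`, and whitespace tokenization are all assumptions made for illustration. In particular, Vietnamese words often span several syllables, so a real pipeline would likely need a proper word segmenter and a Vietnamese synonym resource.

```python
import random

# Hypothetical toy synonym table (an assumption, not from the paper).
SYNONYMS = {
    "tốt": ["hay", "ổn"],
    "đẹp": ["xinh"],
}

def random_swap(tokens, rng):
    """Swap two randomly chosen positions."""
    out = list(tokens)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(tokens, rng, p=0.2):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def word_replacement(tokens, rng, synonyms=SYNONYMS):
    """Replace one token that has a known synonym, if there is one."""
    out = list(tokens)
    candidates = [i for i, t in enumerate(out) if t in synonyms]
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(synonyms[out[i]])
    return out

def random_insertion(tokens, rng, synonyms=SYNONYMS):
    """Insert a synonym of a random token at a random position."""
    out = list(tokens)
    candidates = [t for t in out if t in synonyms]
    if candidates:
        new_word = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out

def augment(sentence, seed=0):
    """Return one augmented variant per operation (whitespace tokens assumed)."""
    rng = random.Random(seed)
    tokens = sentence.split()
    ops = (random_insertion, random_swap, word_replacement, random_deletion)
    return [" ".join(op(tokens, rng)) for op in ops]
```

Calling `augment("sản phẩm này rất tốt")` would yield four variants of the comment, one per operation, each of which would inherit the sentiment label of the original sentence.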

Author

Hồ Hướng Thiên

Ho Chi Minh City Open University, Vietnam

Published
2022-11-23
Section
Articles