Text data augmentation techniques for sentiment analysis based on Vietnamese language
Keywords:
product comments; text mining; text data augmentation; sentiment analysis; natural language processing
Abstract
Comments from online systems are a data source that contains relevant information about customer sentiment toward a product or service. This information is useful for decision-making by both customers and management. Building a high-accuracy prediction model requires a large amount of labeled data. In this paper, we investigate a simple approach for augmenting text data drawn from Vietnamese-language comments. Four basic techniques are used to generate new sentences: random insertion, random swap, word replacement, and word deletion. The experimental results show that the proposed approach is efficient.
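As an illustration of the four operations named above, the following is a minimal Python sketch, not the implementation used in this paper. It assumes that the comment has already been word-segmented, with multi-syllable Vietnamese words joined by underscores, and that a synonym dictionary is available. The function names, the `synonyms` dictionary, and the deletion probability `p` are hypothetical and chosen only for this example.

```python
import random


def random_insertion(words, vocab, rng):
    """Insert one word drawn from `vocab` at a random position."""
    out = list(words)
    if vocab:
        out.insert(rng.randint(0, len(out)), rng.choice(vocab))
    return out


def random_swap(words, rng):
    """Swap the words at two distinct, randomly chosen positions."""
    out = list(words)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def word_replacement(words, synonyms, rng):
    """Replace one randomly chosen word that has an entry in `synonyms`."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if synonyms.get(w)]
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(synonyms[out[i]])
    return out


def word_deletion(words, rng, p=0.2):
    """Drop each word with probability p (assumed value), keeping at least one."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


def augment(sentence, synonyms, rng=None):
    """Return one new sentence per operation for a whitespace-tokenized input."""
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return []
    # Assumption: inserted words come from the synonym lists, else the sentence itself.
    vocab = [s for syns in synonyms.values() for s in syns] or words
    variants = [
        random_insertion(words, vocab, rng),
        random_swap(words, rng),
        word_replacement(words, synonyms, rng),
        word_deletion(words, rng),
    ]
    return [" ".join(v) for v in variants]


if __name__ == "__main__":
    # Hypothetical toy dictionary; a real one would come from a Vietnamese lexicon.
    toy_synonyms = {"tốt": ["hay", "ổn"]}
    for s in augment("sản_phẩm này rất tốt", toy_synonyms, random.Random(0)):
        print(s)
```

Each generated sentence would keep the sentiment label of the original comment, which is the usual assumption behind this kind of augmentation; whether a given operation preserves the label in practice depends on the sentence.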