An unsupervised approach for sentiment analysis via financial texts

  • Cong Chi Pham
  • Bay Van Nguyen
  • Huy Quoc Nguyen

Abstract

The rapidly increasing volume of textual data has made manual labeling extremely costly and time-consuming. To address this limitation, researchers have increasingly turned to unsupervised learning techniques that enable models to classify text without labeled data. Among these, deep clustering has attracted significant interest. However, most existing deep clustering methods are designed for computer vision tasks. In this paper, we propose modifications to two of the most powerful deep clustering methods, DEKM and DeepCluster, integrating transformer-based models from the Natural Language Processing (NLP) domain so that these methods can handle textual data. With the proposed methods, our best results are an accuracy of 57.71% on the test set of the Financial Phrase Bank (FPB) dataset and 65.58% on the test set of the Twitter Financial News (TFN) dataset. Although these results are still lower than those of traditional supervised deep learning methods, we demonstrate that the performance of the proposed methods can be further improved when they are trained with more data. This highlights the promising potential of deep clustering methods for NLP tasks, especially those where the data is unlabeled or insufficiently labeled.
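
The accuracies reported in the abstract come from an unsupervised setting, where cluster ids carry no inherent class meaning. A common convention in the clustering literature is to score predictions under the best one-to-one mapping between clusters and labels; that this is the protocol used here is an assumption, since the abstract does not state it. A minimal Python sketch, with a hypothetical function name and a toy label encoding:

```python
from itertools import permutations


def clustering_accuracy(true_labels, cluster_ids, n_classes=3):
    """Accuracy under the best one-to-one mapping of cluster ids to labels.

    Hypothetical helper for illustration; not taken from the paper.
    Labels and cluster ids are assumed to be integers in [0, n_classes).
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    if not true_labels:
        raise ValueError("inputs must be non-empty")

    # counts[c][t] = number of samples in cluster c whose true label is t
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, c in zip(true_labels, cluster_ids):
        counts[c][t] += 1

    # Brute force over all mappings; cheap for 3 sentiment classes (3! = 6).
    best = max(
        sum(counts[c][perm[c]] for c in range(n_classes))
        for perm in permutations(range(n_classes))
    )
    return best / len(true_labels)


# Toy data, assumed encoding: 0 = negative, 1 = neutral, 2 = positive.
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [2, 2, 0, 0, 1, 1, 0, 0]  # arbitrary cluster ids
print(clustering_accuracy(y_true, y_pred))
```

For more than a handful of clusters, the brute-force search would normally be replaced by the Hungarian algorithm, which finds the same optimal mapping in polynomial time.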

Authors

Cong Chi Pham
Faculty of Information Technology, Ho Chi Minh City Open University
Bay Van Nguyen
Ho Chi Minh City Open University, Ho Chi Minh City
Huy Quoc Nguyen
Ho Chi Minh City Open University, Ho Chi Minh City
Published
2025-01-13
Section
Articles