MINING TOP-K CORRELATED HIGH UTILITY ITEMSETS ON TRANSACTION DATABASE

  • Mạnh Thiên Lý
  • Nguyễn Thị Thanh Thủy
  • Nguyễn Văn Lễ
Keywords: High utility itemsets, correlation, top-k, transaction database, correlated high utility itemsets, top-k correlated high utility itemsets

Abstract

In this paper, we propose a new research direction: mining top-k Correlated High Utility Itemsets (TCHUI) from a transaction database. The problem combines Correlated High Utility Itemset (CoHUI) mining with Top-K High Utility Itemset mining in order to find the top-k correlated high utility itemsets in the transaction database. To address it, we combine CoHUI mining with threshold raising strategies and propose the TCH algorithm. The algorithm uses the Utility List data structure to store utility information about itemsets, uses the Kulc measure with a minimum threshold to evaluate correlation, and applies several pruning strategies, such as U-Prune, TWU-Prune, and LA-Prune, to reduce the search space. In addition, threshold raising strategies such as RIU, LIU-E, and RUC are applied to mine the TCHUI set efficiently. We run experiments on large datasets, including Chess, Mushroom, Retail, and Chainstore, and compare the proposed algorithm with the state-of-the-art THUI algorithm. The results show that the proposed algorithm outperforms THUI in terms of execution time and memory usage.
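
To make the problem statement concrete, the following is a minimal brute-force sketch, not the TCH algorithm itself. It assumes the commonly used definition of the Kulc measure, kulc(X) = (1/|X|) · Σ sup(X)/sup(i) over the items i in X, and it enumerates every itemset instead of using Utility Lists or the U-Prune, TWU-Prune, LA-Prune, RIU, LIU-E, and RUC strategies. The toy database `DB`, the function names, and the parameters `k` and `min_cor` are all hypothetical and chosen only for illustration.

```python
import heapq
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# (for example, quantity times unit profit) in that transaction.
DB = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "b": 2, "d": 8},
]


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db):
    """Sum of the itemset's item utilities over the transactions containing it."""
    return sum(sum(t[i] for i in itemset)
               for t in db if all(i in t for i in itemset))


def kulc(itemset, db):
    """Kulc measure (assumed definition): mean of sup(X) / sup(i) over items i in X."""
    sup_x = support(itemset, db)
    return sum(sup_x / support((i,), db) for i in itemset) / len(itemset)


def top_k_correlated(db, k, min_cor):
    """Brute-force top-k correlated high utility itemsets (illustration only)."""
    items = sorted({i for t in db for i in t})
    heap = []       # min-heap of (utility, itemset) holding the current top k
    threshold = 0   # internal minimum utility, raised as the heap fills up
    for size in range(1, len(items) + 1):
        for x in combinations(items, size):
            if support(x, db) == 0:
                continue
            u = utility(x, db)
            # The threshold only filters results here. Utility is not
            # anti-monotone, so it cannot prune supersets on its own.
            if u < threshold or kulc(x, db) < min_cor:
                continue
            heapq.heappush(heap, (u, x))
            if len(heap) > k:
                heapq.heappop(heap)
            if len(heap) == k:
                threshold = heap[0][0]  # simple form of threshold raising
    return sorted(heap, key=lambda p: (-p[0], p[1]))


if __name__ == "__main__":
    for u, x in top_k_correlated(DB, k=3, min_cor=0.6):
        print(x, u, round(kulc(x, DB), 3))
```

On this toy database with `k=3` and `min_cor=0.6`, the sketch returns `('a', 'c')` with utility 22, `('a',)` with utility 20, and `('b', 'd')` with utility 16. The itemset `('a', 'b', 'd')` has utility 15 but a Kulc value below 0.6, so the correlation constraint excludes it regardless of its utility.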

Published
2023-04-25
Section
Article