Graph clustering and application to the extraction of main ideas in collection of online forum messages

  • Đỗ Phúc
  • Mai Xuân Hùng
  • Nguyễn Thị Kim Phụng

Abstract

This paper presents a graph clustering system that groups similar messages in the forum of an e-learning system and extracts the main ideas from the collection of messages. A message is a kind of text document, so clustering messages requires a model for representing documents. Traditional approaches use the bag-of-words or vector space model. These models discard important structural information such as word position, the semantic relations between words in a document, and the links between web pages. Several recent works represent documents as graphs instead. In our system, once the documents have been represented as graphs, a Kohonen neural network is used to group the graphs. One advantage of the Kohonen neural network is that it clusters the data without the number of clusters being specified in advance. In addition, the output layer of the Kohonen network forms a document map that can be shown on screen, giving easy access to similar documents. We chose a graph distance based on the maximum common subgraph, and an update operation for the Kohonen network based on the weighted mean of two graphs. We tested the proposed solution on messages from our online forum and analysed the results.
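
The abstract does not give the exact distance formula, so the following is only a minimal sketch of one commonly used distance based on the maximum common subgraph: 1 - |mcs(G1, G2)| / max(|G1|, |G2|). It assumes that each node carries a unique word label, so that the maximum common subgraph reduces to intersecting the node and edge sets, and that graph size is the number of nodes plus the number of edges. The function names `text_to_graph` and `mcs_distance` are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch, not the authors' implementation.
# Assumption: nodes are distinct words, so node labels are unique and the
# maximum common subgraph is found by plain set intersection.

def text_to_graph(text):
    """Build a directed graph: nodes are distinct words, edges link adjacent words."""
    words = text.lower().split()
    nodes = set(words)
    edges = set(zip(words, words[1:]))
    return nodes, edges


def mcs_distance(g1, g2):
    """Return 1 - |mcs(g1, g2)| / max(|g1|, |g2|), where size = nodes + edges."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    # With unique labels, shared nodes and shared edges form the common subgraph.
    common = len(nodes1 & nodes2) + len(edges1 & edges2)
    largest = max(len(nodes1) + len(edges1), len(nodes2) + len(edges2))
    # Two empty graphs are treated as identical.
    return 1.0 - common / largest if largest else 0.0


if __name__ == "__main__":
    a = text_to_graph("graph clustering of forum messages")
    b = text_to_graph("graph clustering of documents")
    print(mcs_distance(a, b))
```

Under these assumptions the distance is 0 for identical messages and 1 for messages that share no words, which is the kind of measure a Kohonen network could use to find the map unit closest to an input graph.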

Published
2008-08-08
Section
ARTICLES