IDEAL 2002

Self-organising maps for hierarchical tree view document clustering using contextual information

By Richard Freeman and Hujun Yin

Abstract

We propose an effective method to cluster documents into a dynamically built taxonomy of topics, directly extracted from the documents. We take into account short contextual information within the text corpus, which is weighted by importance and used as input to a set of independently spun growing self-organising maps (SOM). This work shows an increase in precision and labelling quality in the hierarchy of topics, using these indexing units. The use of the tree structure over sets of conventional two-dimensional maps creates topic hierarchies that are easy to browse and understand, in which the documents are stored based on their content similarity.

Keywords

Content Management, Knowledge Management, Portal Generation, Taxonomy Generation

Bibliographic Details

@inproceedings{freemanIdeal02,
Author = {Freeman, Richard and Yin, Hujun},
Title = {Self-organising maps for hierarchical tree view document clustering using
contextual information},
BookTitle = {Intelligent Data Engineering and Automated Learning-IDEAL 2002.
Third International Conference, 12-14 Aug. 2002},
Series = {Lecture Notes in Computer Science Vol.2412},
Address = {Manchester, UK},
Publisher = {Springer-Verlag},
Pages = {123-128},
Year = {2002}
}