IDEAL 2006 – Richard Freeman, PhD

Topological Tree Clustering of Web Search Results

By Richard Freeman

Abstract

In the knowledge economy, taxonomy generation, information retrieval and portals in intelligent enterprises need to adapt dynamically to changes in enterprise content. To remain competitive and efficient, this has to be done without relying exclusively on knowledge workers to update taxonomies or manually label documents. This paper briefly reviews existing visualisation methods used to present search results retrieved from a web search engine. A method, termed the topological tree, that could be used to automatically organise large sets of documents retrieved from any type of search is presented. The retrieved results, organised using an online version of the topological tree method, are compared to the visual representation of a web search engine that uses a document clustering algorithm. The comparison is discussed against four criteria: representing hierarchical relationships, visual scalability, presenting the underlying topics extracted from the document set, and providing a clear view of the connections between topics. The topological tree has been found to be a superior representation in all cases and well suited for organising web content.

Keywords

Information retrieval, document clustering, search engine, self-organizing maps, topological tree, information access, faceted classification, guided navigation, taxonomy generation, neural networks, post retrieval clustering, enterprise portals, enterprise content management, enterprise search, information management.

Bibliographic Details

@inproceedings{freemanIdeal06,
   Author = {Freeman, Richard T.},
   Title = {Topological Tree Clustering of Web Search Results},
   BookTitle = {Intelligent Data Engineering and Automated Learning-IDEAL 2006. 
   Fifth International Conference, 20-23 Sep. 2006},
   Series = {Lecture Notes in Computer Science Vol.4224},
   Address = {Burgos, Spain},
   Publisher = {Springer-Verlag},
   Pages = {789--797},
   Year = {2006}
}