Abstract
Clustering techniques have proven useful in many real-world applications. In this article, we design a new clustering algorithm and adapt it to Web Information Foraging. The algorithm, named k-MM, takes advantage of both k-means and PAM to satisfy clustering criteria such as effectiveness, efficiency, scalability, and the ability to handle noise and outliers. We evaluated k-MM on several UCI datasets and show that, compared to k-means, PAM, CLARA and CLARANS, it is very effective and efficient. We also tested it on COIL-100 to show its applicability to concrete domains and demonstrate that it outperforms a recent image clustering algorithm from the literature. In a second step, we present an application to Web Information Foraging and compare k-MM with a recent agent-based method. Experiments in this case were performed on a real dynamic website, MedlinePlus, in contrast to the traditional practice of using web logs. We show that k-MM, integrated into Web Information Foraging, discovers authorities more effectively and more efficiently.
