NII Technical Report (NII-2009-017E)

Title A Set Correlation Model for Partitional Clustering
Authors Nguyen Xuan Vinh and Michael E. Houle
Abstract This paper introduces GlobalRSC, a novel formulation for partitional data clustering based on the Relevant Set Correlation (RSC) clustering model. Our formulation resembles that of the K-means clustering model, but uses a shared-neighbor similarity measure instead of the Euclidean distance. Unlike K-means and most other clustering heuristics, which work only with real-valued data and with distance measures taken from specific families, GlobalRSC can work with any distance measure and any data representation. We also discuss various techniques for boosting the scalability of GlobalRSC.
Language English
Published Dec 22, 2009
Pages 14p

NII Technical Reports
National Institute of Informatics