JRC Publications Repository

A Validity Index for Prototype Based Clustering of Data Sets with Complex Cluster Structures

Evaluating how well the extracted clusters fit the true partitions of a data set is one of the fundamental challenges in unsupervised clustering, because the data structure and the number of clusters are unknown a priori. Cluster validity indices are commonly used to select the best partitioning from different clustering results; however, they are often inadequate unless clusters are well separated or have parametric shapes. Prototype-based clustering (finding clusters by grouping the prototypes obtained by vector quantization of the data), which is becoming increasingly important for its effectiveness in the analysis of large, high-dimensional data sets, adds another dimension to this challenge. For validity assessment of prototype-based clusterings, previously proposed indices, mostly devised for the evaluation of point-based clusterings, usually perform poorly. This poor performance worsens when the validity indices are applied to large data sets with complicated cluster structure. In this work we propose a new index, Conn Index, which can be applied to data sets with a wide variety of clusters of different shapes, sizes, densities, or overlaps. We construct Conn Index from the inter- and intra-cluster connectivities of prototypes. Connectivities are defined through a "connectivity matrix", which is a weighted Delaunay graph whose weights indicate the local data distribution. Experiments on synthetic and real data indicate that Conn Index outperforms the existing validity indices used in this study for the evaluation of prototype-based clustering results.
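To make the idea of a connectivity matrix concrete, here is a minimal sketch in Python. The abstract does not say how the weights are computed, so the sketch assumes a common construction of the induced Delaunay graph: the weight of a prototype pair is the number of data points for which those two prototypes are the nearest and second-nearest. The function names `connectivity_matrix` and `intra_cluster_fraction` are hypothetical, and the fraction computed at the end is only an illustrative summary of intra- versus inter-cluster connectivity, not the Conn Index itself.

```python
import numpy as np

def connectivity_matrix(data, prototypes):
    """Count, for each prototype pair (i, j), the data points whose nearest
    and second-nearest prototypes are i and j (assumed weighting scheme)."""
    # Squared Euclidean distances, shape (n_points, n_prototypes).
    d2 = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(d2, axis=1)
    first, second = order[:, 0], order[:, 1]
    k = prototypes.shape[0]
    counts = np.zeros((k, k), dtype=int)
    # Unbuffered accumulation so repeated (first, second) pairs all count.
    np.add.at(counts, (first, second), 1)
    # Symmetrize: the edge weight ignores which prototype was closer.
    return counts + counts.T

def intra_cluster_fraction(conn, labels):
    """Share of total connectivity weight that lies inside clusters
    (illustrative only; not the Conn Index from the paper)."""
    same = labels[:, None] == labels[None, :]
    total = conn.sum()
    return conn[same].sum() / total if total else 0.0

# Toy example: four prototypes grouped into two clusters, six data points.
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
labels = np.array([0, 0, 1, 1])  # cluster label of each prototype
data = np.array([[0.1, 0.0], [0.9, 0.0], [0.4, 0.0],
                 [10.2, 0.0], [10.8, 0.0],
                 [5.4, 0.0]])    # last point bridges the two clusters
conn = connectivity_matrix(data, prototypes)
frac = intra_cluster_fraction(conn, labels)
print(conn)
print(frac)
```

In this toy case most of the connectivity weight falls between prototypes of the same cluster, and only the single bridging point contributes an inter-cluster edge; a partitioning that cut through a densely connected prototype pair would lower the fraction.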
Date: 2011-07-28
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
JRC ID: JRC62772
ISSN: 1083-4419
URL: https://publications.jrc.ec.europa.eu/repository/handle/JRC62772
DOI: 10.1109/TSMCB.2010.2104319
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice