Fast and robust clustering of general-shaped structures with tk-merge

In real-world applications, the group from which a data point originates can be inherently uncertain, the data values can be imprecise, and some of them can be wrong. We handle uncertain, imprecise and noisy data in clustering problems with general-shaped structures. We do so under very weak parametric assumptions, using a two-step hybrid robust clustering algorithm based on trimmed k-means and hierarchical agglomeration. The algorithm has low computational complexity and effectively identifies the clusters even in the presence of data contamination. We also present natural generalizations of the approach, as well as an adaptive procedure to estimate the amount of contamination in a data-driven fashion. Our proposal outperforms state-of-the-art robust, model-based methods in our numerical simulations and in real-world applications related to color quantization for image analysis, human mobility patterns based on GPS data, biomedical images of diabetic retinopathy, and functional data across weather stations.
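The two-step idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`trimmed_kmeans`, `merge_centroids`, `tk_merge_sketch`) are hypothetical, and several choices are assumptions made only for illustration, namely that the first step over-segments the data with more centroids (`k`) than final clusters (`n_groups`), that the centroids are merged with single linkage, and that a deterministic, evenly spaced initialization is used in place of random restarts. The trimming level `alpha` is the fraction of points left unassigned as presumed contamination.

```python
import math


def trimmed_kmeans(points, k, alpha, n_iter=100):
    """Step 1 (sketch): k-means that ignores the ceil(alpha * n) points
    farthest from their nearest centroid at every iteration."""
    n = len(points)
    n_trim = math.ceil(alpha * n)
    # Assumption: evenly spaced starting centroids, for reproducibility.
    centroids = [tuple(points[i * n // k]) for i in range(k)]
    labels = [-1] * n
    for _ in range(n_iter):
        # (distance, index) of the nearest centroid for every point.
        nearest = [min((math.dist(p, c), j) for j, c in enumerate(centroids))
                   for p in points]
        order = sorted(range(n), key=lambda i: nearest[i][0])
        kept = order[: n - n_trim]          # the closest points are kept
        new_labels = [-1] * n               # -1 marks a trimmed point
        for i in kept:
            new_labels[i] = nearest[i][1]
        if new_labels == labels:            # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):                  # update centroids from kept points
            members = [points[i] for i in kept if labels[i] == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, labels


def merge_centroids(centroids, n_groups):
    """Step 2 (sketch): single-linkage agglomeration of the centroids,
    stopped when n_groups groups remain (union-find over sorted edges)."""
    parent = list(range(len(centroids)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = sorted((math.dist(centroids[a], centroids[b]), a, b)
                   for a in range(len(centroids))
                   for b in range(a + 1, len(centroids)))
    remaining = len(centroids)
    for _, a, b in edges:
        if remaining == n_groups:
            break
        ra, rb = find(a), find(b)
        if ra != rb:                        # join the two closest groups
            parent[ra] = rb
            remaining -= 1
    roots = sorted({find(a) for a in range(len(centroids))})
    return [roots.index(find(a)) for a in range(len(centroids))]


def tk_merge_sketch(points, k, n_groups, alpha):
    """Run both steps; return one group label per point, -1 if trimmed."""
    centroids, labels = trimmed_kmeans(points, k, alpha)
    group_of = merge_centroids(centroids, n_groups)
    return [group_of[j] if j >= 0 else -1 for j in labels]
```

Calling `tk_merge_sketch(points, k=4, n_groups=2, alpha=0.04)` on a list of coordinate tuples returns one label per point, with `-1` for the points treated as contamination. Because the merge step operates on centroids rather than on the points themselves, several centroids can jointly trace one elongated or curved group, which is the intuition behind combining the two steps.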
Date: 2024-04-11
Publisher: ELSEVIER SCIENCE INC
JRC identifier: JRC134108
ISSN: 0888-613X (online)
URLs: https://www.sciencedirect.com/science/article/pii/S0888613X24000392, https://publications.jrc.ec.europa.eu/repository/handle/JRC134108
DOI: 10.1016/j.ijar.2024.109152 (online)
Items published in the JRC Publications Repository are protected by copyright, with all rights reserved, unless otherwise indicated. Additional information: https://ec.europa.eu/info/legal-notice_en#copyright-notice