Issues on clustering and data gridding
This contribution addresses clustering issues in the presence of densely populated data points with a high degree of overlap. To avoid the disturbing effects of high-density areas, we suggest a technique that selects a point in each cell of a grid defined along the principal component axes of the data. The selected subsample removes the high-density areas while preserving the general structure of the data. Once the clustering of the gridded data has been produced, the rest of the data can easily be classified, with reliable and stable results. The good performance of the approach is shown on a complex dataset of international trade data.
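The gridding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two-dimensional data, an equal-width grid with `n_bins` cells per axis, and a uniformly random choice of the point kept in each occupied cell. The function name `pca_grid_subsample` and its parameters are hypothetical. The clustering and the classification of the remaining points are not shown.

```python
import math
import random


def pca_grid_subsample(points, n_bins=10, seed=0):
    """Return sorted indices of one point per occupied grid cell.

    `points` is a list of (x, y) pairs. The grid is laid out along the
    two principal component axes of the data (assumption: equal-width
    cells, random pick inside each cell).
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Entries of the 2x2 scatter matrix; the 1/(n-1) factor is omitted
    # because it does not change the principal directions.
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    syy = sum((p[1] - mean_y) ** 2 for p in points)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points)

    # Orientation of the first principal axis of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Coordinates of each centred point on the two principal axes.
    scores = [
        ((p[0] - mean_x) * cos_t + (p[1] - mean_y) * sin_t,
         -(p[0] - mean_x) * sin_t + (p[1] - mean_y) * cos_t)
        for p in points
    ]

    def cell_index(value, lo, hi):
        # Map a score to a bin number in 0 .. n_bins - 1.
        if hi == lo:
            return 0
        return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)

    lo1, hi1 = min(s[0] for s in scores), max(s[0] for s in scores)
    lo2, hi2 = min(s[1] for s in scores), max(s[1] for s in scores)

    # Group point indices by the grid cell they fall into.
    cells = {}
    for i, (s1, s2) in enumerate(scores):
        key = (cell_index(s1, lo1, hi1), cell_index(s2, lo2, hi2))
        cells.setdefault(key, []).append(i)

    # Keep one randomly chosen point per occupied cell.
    rng = random.Random(seed)
    return sorted(rng.choice(members) for members in cells.values())
```

Calling `pca_grid_subsample(points, n_bins=20)` returns at most `20 * 20` indices, however many points crowd into the dense regions, so a clustering method run on the selected points is no longer dominated by those regions.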
HEIKKONEN Jukka;
PERROTTA Domenico;
RIANI Marco;
TORTI Francesca;
2013-04-10
Springer-Verlag
JRC61801
978-3-642-28893-7
1431-8814
https://publications.jrc.ec.europa.eu/repository/handle/JRC61801
10.1007/978-3-642-28894-4_5