|Title:||Robust Methods for Complex Data|
|Authors:||RIANI Marco; CERIOLI Andrea; PERROTTA Domenico; TORTI Francesca|
|Citation:||Atti della XLIV Riunione Scientifica p. 163-170|
|Publisher:||Cooperativa Libraria Editrice Universitaria di Padova - CLEUP|
|Type:||Articles in periodicals and books|
|Abstract:||The goal of this paper is to describe a semi-automatic approach to outlier detection and clustering through the forward search (Atkinson and Riani, 2000; Atkinson, Riani and Cerioli, 2004). After an overview of the basic principles of the forward search, we concentrate on the identification of clusters of points coming from different regression models. The method is motivated by fraud detection in external trade data sets, where robust fitting of mixtures of regression lines is an important open issue (see Bishop, 2006, for non-robust methods). Our tools for outlier detection and clustering are developed from forward plots of residuals computed from searches with random starting points. We also address a number of challenging issues, including selection of the number of groups. We make use of simulation envelopes and distributional results to precisely identify the outliers and the clusters. The performance of the algorithm is shown on several European trade data sets that are relevant to fraud detection and were selected by the antifraud services of the European Commission and of the Member States.|
|JRC Directorate:||Space, Security and Migration|