Title: Robust Methods for Complex Data
Citation: Atti della XLIV Riunione Scientifica p. 163-170
Publisher: Cooperativa Libraria Editrice Universitaria di Padova - CLEUP
Publication Year: 2008
JRC N°: JRC45085
URI: http://publications.jrc.ec.europa.eu/repository/handle/JRC45085
Type: Articles in periodicals and books
Abstract: The goal of this paper is to describe a semi-automatic approach to outlier detection and clustering through the forward search (Atkinson and Riani, 2000; Atkinson, Riani and Cerioli, 2004). After an overview of the basic principles of the forward search, we concentrate on the identification of clusters of points coming from different regression models. The method is motivated in the context of fraud detection in external trade data sets, where robust fitting of mixtures of regression lines is an important open issue (see Bishop, 2006, for non-robust methods). Our tools for outlier detection and clustering are developed from forward plots of residuals computed from searches with random starting points. We also address a number of challenging issues, including selection of the number of groups. We make use of simulation envelopes and distributional results to precisely identify the outliers and the clusters. The performance of the algorithm is shown on several European trade data sets relevant for fraud detection problems selected by the antifraud services of the European Commission and of the Member States.
JRC Directorate: Space, Security and Migration
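
Illustration: the forward search named in the abstract can be sketched for a single regression line as follows. This is a minimal sketch and not the authors' implementation. The function names fit_line and forward_search, the toy data, and the choice to monitor the smallest absolute residual outside the fitting subset are all assumptions made for this example. The simulation envelopes and the clustering step described in the abstract are omitted.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for one regression line.

    Assumes the x values in the subset are not all identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def forward_search(xs, ys, m0=3, start=None, seed=0):
    """Grow a fitting subset one unit at a time (illustrative only).

    start: optional initial subset of indices; if None, a random
    starting subset of size m0 is drawn (hypothetical default).
    Returns the trace of the smallest absolute residual among units
    outside the subset at each step, and the final subset.
    """
    n = len(xs)
    if start is None:
        subset = random.Random(seed).sample(range(n), m0)
    else:
        subset = list(start)
    trace = []
    for m in range(len(subset), n):
        intercept, slope = fit_line([xs[i] for i in subset],
                                    [ys[i] for i in subset])
        # Residuals of ALL units from the fit on the current subset.
        res = [abs(ys[i] - (intercept + slope * xs[i])) for i in range(n)]
        inside = set(subset)
        trace.append(min(res[i] for i in range(n) if i not in inside))
        # The next subset holds the m + 1 units with the smallest residuals.
        subset = sorted(range(n), key=lambda i: res[i])[: m + 1]
    return trace, subset


# Toy data (invented for this sketch): 20 points on y = 1 + 2x,
# followed by 3 points shifted upwards by 30.
xs = [float(i) for i in range(23)]
ys = [1.0 + 2.0 * x for x in xs]
for i in (20, 21, 22):
    ys[i] += 30.0

trace, final = forward_search(xs, ys, start=[0, 1, 2])
for step, value in enumerate(trace, start=3):
    print(step, round(value, 3))
```

In this toy example the monitored residual stays near zero while uncontaminated points remain outside the subset and jumps once only the shifted points are left to enter. A jump of this kind is what a forward plot is meant to reveal. Passing start=None draws a random starting subset instead, in the spirit of the random-start searches mentioned in the abstract.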