Title: Robust Data Imputation
Citation: COMPUTATIONAL BIOLOGY AND CHEMISTRY vol. 33 no. 1 p. 7-13
Publication Year: 2009
JRC N°: JRC49689
ISSN: 1476-9271
URI: http://publications.jrc.ec.europa.eu/repository/handle/JRC49689
DOI: 10.1016/j.compbiolchem.2008.07.019
Type: Articles in Journals
Abstract: Single imputation methods have been a widely discussed topic among researchers in the field of bioinformatics. One major shortcoming of the methods proposed until now is the lack of robustness considerations. Like all data, gene expression data can contain outlying values. The presence of these outliers could have negative effects on the values imputed for the missing entries. Consequently, any statistical analysis of the completed data could lead to incorrect conclusions. Therefore, it is important to consider the possibility of outliers in the data set, and to evaluate how imputation techniques handle these values. In this paper, a simulation study is performed to test existing techniques for data imputation in case outlying values are present in the data. To overcome some shortcomings of the existing imputation techniques, a new robust imputation method that can deal with the presence of outliers in the data is introduced. In addition, the robust imputation procedure cleans the data for further statistical analysis. Moreover, this method can be easily extended towards a multiple imputation approach by which the uncertainty of the imputed values is emphasised. Finally, a classification example illustrates the lack of robustness of some existing imputation methods and shows the advantage of the multiple imputation approach of the new robust imputation technique.
JRC Institute: Institute for the Protection and Security of the Citizen

Items in repository are protected by copyright, with all rights reserved, unless otherwise indicated.