Handling missing values in exploratory multivariate data analysis methods
Authors
Julie Josse
François Husson
Abstract
This paper is a written version of the talk Julie Josse delivered at the 44th Journées de Statistique (Brussels,
2012), on being awarded the Marie-Jeanne Laurent-Duhamel prize for her Ph.D. dissertation by the French Statistical
Society. It presents an overview of results from Julie Josse and François Husson’s papers, as well as new
challenges in the field of handling missing values in exploratory multivariate data analysis methods and especially in
principal component analysis (PCA). First, we describe a regularized iterative PCA algorithm that provides point estimates
of the principal axes and components while overcoming the major issue of overfitting. Then, we give insight into the
variance of the parameters using a nonparametric multiple imputation procedure. Finally, we discuss the problem of
choosing the number of dimensions and detail cross-validation approximation criteria. The proposed methodology
is implemented in the R package missMDA.
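
To make the idea of iterative PCA concrete, the following is a minimal sketch in plain Python, not the missMDA implementation. It assumes a single retained dimension and omits the regularization step mentioned above, so it illustrates only the alternation between fitting PCA on a completed table and re-imputing the missing cells. The function name impute_rank1_pca, its parameters, and the use of None to mark missing cells are hypothetical choices made for this illustration.

```python
import math


def impute_rank1_pca(rows, n_iter=500, tol=1e-10):
    """Fill None cells by alternating a one-component PCA fit and re-imputation.

    Illustrative sketch only: one dimension, no regularization.
    """
    n, p = len(rows), len(rows[0])
    missing = [(i, j) for i in range(n) for j in range(p) if rows[i][j] is None]

    # Initialization: replace each missing cell by the mean of the
    # observed values in its column.
    x = [list(r) for r in rows]
    for j in range(p):
        observed = [rows[i][j] for i in range(n) if rows[i][j] is not None]
        col_mean = sum(observed) / len(observed)
        for i in range(n):
            if x[i][j] is None:
                x[i][j] = col_mean

    v = [1.0 / math.sqrt(p)] * p  # starting guess for the first principal axis
    for _ in range(n_iter):
        # Center the completed table with the current column means.
        means = [sum(x[i][j] for i in range(n)) / n for j in range(p)]
        c = [[x[i][j] - means[j] for j in range(p)] for i in range(n)]

        # First principal axis by power iteration on C'C (warm-started).
        for _step in range(100):
            scores = [sum(c[i][j] * v[j] for j in range(p)) for i in range(n)]
            w = [sum(c[i][j] * scores[i] for i in range(n)) for j in range(p)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0.0:
                break
            v = [a / norm for a in w]
        scores = [sum(c[i][j] * v[j] for j in range(p)) for i in range(n)]

        # Overwrite only the missing cells with their rank-one fitted values.
        change = 0.0
        for i, j in missing:
            fitted = means[j] + scores[i] * v[j]
            change = max(change, abs(fitted - x[i][j]))
            x[i][j] = fitted
        if change < tol:
            break
    return x


# Small table whose centered version has rank one; one cell is missing.
table = [[10 + a, 20 + 2 * a, 30 - a] for a in (-2, -1, 0, 1, 2)]
table[4][1] = None
completed = impute_rank1_pca(table)
```

Observed cells are never modified; only the cells marked None are updated at each pass. On noisy data, an unregularized fit of this kind can overfit the observed entries, which is the issue the regularized algorithm is designed to address.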