Unsupervised detection of atypical observations in quality control: a survey
Abstract
Outlier or anomaly detection is a challenge in many areas. In this article, we focus mainly on quality control and review the literature on unsupervised methods. Throughout this work, the notion of outlyingness follows the definition given by Hawkins (1980), namely that an observation is outlying if it is generated by a different mechanism from the one that generated the bulk of the data. The first section focuses on quality control of electronic components for automotive applications and reviews the common methods used in practice. It appears that mainly univariate methods are integrated into fault detection processes. Only a few multivariate methods, such as the Mahalanobis distance or principal component analysis, are used by some manufacturers. The following sections attempt to summarize the unsupervised methods for outlier detection as well as their implementation in the R software (R Core Team, 2017). A distinction is made between methods designed for standard data, i.e. with more observations than variables, and those adapted to high-dimensional data with a small sample size.