Exploring land use prediction errors from area frame survey data
Abstract
We consider the problem of areal-level land use classification based on point-level databases, such as area frame surveys (the American NRI survey, the EUROSTAT Lucas survey, the French Teruti-Lucas survey), combined with easily accessible covariates. An exploratory analysis highlights the link between the areal-level prediction error and a measure of prediction difficulty given by the Gini-Simpson impurity index. We provide a methodology and R code for exploring the quality of an area frame survey by generating synthetic data.
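For readers unfamiliar with the Gini-Simpson impurity index, the following minimal sketch computes it as one minus the sum of squared category proportions. It is written in Python for illustration only and is not the authors' R code; the function name `gini_simpson` and the sample land use labels are hypothetical.

```python
from collections import Counter


def gini_simpson(labels):
    """Gini-Simpson impurity: 1 - sum_k p_k^2 over category proportions."""
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one observation")
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical point-level land use labels observed in two areal units.
homogeneous = ["forest"] * 10
mixed = ["forest"] * 5 + ["crop"] * 3 + ["urban"] * 2

print(gini_simpson(homogeneous))
print(gini_simpson(mixed))
```

A value of 0 corresponds to an areal unit in which every surveyed point has the same land use class; the index grows as the points are spread more evenly across classes.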