
STOPPING CRITERION: A WAY OF OPTIMISING DATA EDITING AND ASSESSING ITS MINIMAL COST

Pascal Rivière

INSEE
18 Boulevard Adolphe Pinard
75675 Paris Cedex 14
France

There is generally no scientific criterion for stopping the manual checking of a survey: the editing process stops because everything has been checked, because there is no time left for verification, or for other practical reasons. In this paper, we deliberately consider a simplistic goal for the editing process: ensuring, with a certain level of confidence, that the error rate falls below a certain threshold. For that purpose, we calculate approximate confidence intervals for the proportion of errors, and approximate predictive intervals for the number of errors remaining after part of the returns have been checked and fixed. Using the upper bound of the one-sided predictive interval, we can then easily define a stopping criterion: as long as the upper bound of the rate of remaining errors is greater than the target error rate, data editing continues; as soon as it falls below the threshold, we can stop editing. Such an approach makes it possible to reduce the cost of data editing by avoiding unnecessary manual checks. The main results of this paper are in section 5, in which we define a relationship between the cost of editing and four main parameters: target error rate, number of target domains, number of returns, and level of confidence. In the last section, we examine the issues raised by the principle of a stopping criterion, in order to generalise the criterion that we suggested.
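
To make the stopping rule concrete, here is a minimal Python sketch of one possible reading of it. It is not the paper's own formula: it assumes returns are checked in random order, it swaps in a one-sided Wilson upper bound for the error proportion, and it scales that bound by the unchecked share of returns as a rough stand-in for the predictive interval. The names `wilson_upper`, `remaining_rate_upper` and `should_stop` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_upper(k, n, confidence=0.95):
    """One-sided Wilson upper bound for an error proportion.

    k errors were found among n checked returns. This interval is an
    assumption made for illustration, not the one derived in the paper.
    """
    if n == 0:
        return 1.0  # nothing checked yet: no information, worst case
    z = NormalDist().inv_cdf(confidence)
    p_hat = k / n
    centre = p_hat + z * z / (2 * n)
    half = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return min(1.0, (centre + half) / (1 + z * z / n))


def remaining_rate_upper(k, n, total, confidence=0.95):
    """Rough upper bound on the share of all returns still in error.

    Assumes the n checked returns were drawn at random from the `total`
    returns and that every error found among them has been fixed, so
    errors can only remain in the total - n unchecked returns.
    """
    return wilson_upper(k, n, confidence) * (total - n) / total


def should_stop(k, n, total, target, confidence=0.95):
    """True once the upper bound falls below the target error rate."""
    return remaining_rate_upper(k, n, total, confidence) < target
```

With these assumptions, `should_stop(2, 200, 1000, 0.05)` returns `True`, while `should_stop(20, 200, 1000, 0.05)` returns `False`, so editing would continue in the second case. Raising `confidence` or lowering `target` delays the stopping point, which is the trade-off between cost and quality that the abstract describes.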


Pasi Koikkalainen
Fri Oct 18 19:03:41 EET DST 2002