EXAMPLES OF MULTIPLE IMPUTATION IN LARGE-SCALE SURVEYS

N. T. Longford

De Montfort University, Leicester
James Went Building 2-8, De Montfort University, The Gateway,
Leicester LE1 9BH, UK.
Email: NTL@dmu.ac.uk

Missing data are a ubiquitous problem in large-scale surveys. Such incompleteness is usually dealt with either by restricting the analysis to the cases with complete records or by imputing for each missing item an efficiently estimated value. The deficiencies of these approaches will be discussed, especially in the context of estimating a large number of quantities. The main part of the paper will describe two examples of analyses using multiple imputation.

In the first, the ILO employment status is imputed in the British Labour Force Survey by a Bayesian bootstrap method. It is an adaptation of the hot deck method which seeks to fully exploit the auxiliary information. Important auxiliary information is given by the previous ILO status, when available, and the standard demographic variables.
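The idea of a hot deck with a bootstrap step can be illustrated by a minimal sketch. It is not the procedure used for the Labour Force Survey; it implements the approximate Bayesian bootstrap, a simpler relative, under the assumption that imputation classes are formed by cross-classifying the auxiliary variables. The function name `abb_hot_deck`, the field names, and the toy records are all hypothetical.

```python
import random
from collections import defaultdict

def abb_hot_deck(records, key_fields, target, m=5, seed=0):
    """Return m completed copies of `records` (hypothetical helper).

    Within each imputation class (defined by `key_fields`), the observed
    values of `target` are first resampled with replacement, and each
    missing value (None) is then drawn from that resampled donor pool.
    """
    rng = random.Random(seed)

    # Collect the observed target values (the donors) in each class.
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in key_fields)].append(rec[target])

    completed = []
    for _ in range(m):
        # Bootstrap step: a fresh donor pool per class for every imputation.
        pools = {cls: rng.choices(vals, k=len(vals))
                 for cls, vals in donors.items()}
        data = []
        for rec in records:
            new = dict(rec)
            if new[target] is None:
                cls = tuple(rec[k] for k in key_fields)
                if cls not in pools:
                    raise ValueError(f"no donors in class {cls}")
                # Hot-deck step: copy the value of a randomly drawn donor.
                new[target] = rng.choice(pools[cls])
            data.append(new)
        completed.append(data)
    return completed

# Toy records; None marks a missing ILO status (made-up data).
records = [
    {"prev_status": "employed", "age_group": "25-49", "status": "employed"},
    {"prev_status": "employed", "age_group": "25-49", "status": "employed"},
    {"prev_status": "employed", "age_group": "25-49", "status": "unemployed"},
    {"prev_status": "employed", "age_group": "25-49", "status": None},
    {"prev_status": "inactive", "age_group": "50+", "status": "inactive"},
    {"prev_status": "inactive", "age_group": "50+", "status": "unemployed"},
    {"prev_status": "inactive", "age_group": "50+", "status": None},
]
completed = abb_hot_deck(records, ("prev_status", "age_group"), "status",
                         m=5, seed=1)
```

Each of the five completed data sets would be analysed separately and the results combined. Resampling the donor pool before drawing is what lets the variation between the completed data sets reflect the uncertainty about the distribution of the missing values, not only the randomness of the draw.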

Missing data can be interpreted more generally, as in the framework of the EM algorithm. The second example is from the Scottish House Condition Survey, and its focus is on the inconsistency of the surveyors. The surveyors assess the sampled dwelling units on a large number of elements, or features of the dwelling, such as internal walls, roof, and plumbing, which are scored and converted to a summarising 'comprehensive repair cost'. The level of inconsistency is estimated from the discrepancies between the pairs of assessments of doubly surveyed dwellings. The principal research questions are: how much information is lost due to the inconsistency and whether the naive estimators that ignore the inconsistency are unbiased. The problem is solved by multiple imputation, generating plausible scores for all the dwellings in the survey.
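How paired assessments carry information about inconsistency can be shown with a small sketch. It assumes a much simpler model than the one the survey is likely to require: each assessment is the dwelling's true score plus an independent error with zero mean and a common variance, so that the difference between two assessments of the same dwelling has twice that variance. The function name `inconsistency_variance` and the scores are hypothetical.

```python
from statistics import mean

def inconsistency_variance(pairs):
    """Estimate the surveyor error variance from doubly surveyed dwellings.

    Assumes score = true value + error, with independent zero-mean errors
    of equal variance, so Var(first - second) = 2 * error variance.
    """
    squared_diffs = [(first - second) ** 2 for first, second in pairs]
    return mean(squared_diffs) / 2.0

# Made-up pairs of scores for four doubly surveyed dwellings.
pairs = [(10, 12), (5, 5), (8, 6), (20, 17)]
sigma2 = inconsistency_variance(pairs)
print(sigma2)  # 2.125
```

Under this assumed model, an estimate of the error variance is the ingredient needed to draw plausible scores for the dwellings that were surveyed only once.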



Pasi Koikkalainen
Fri Oct 18 19:03:41 EET DST 2002