
HANDLING MISSING DATA BY MULTIPLE IMPUTATIONS IN THE ANALYSIS OF WOMEN PREVALENCE OF CANCER

Ula A. Nur

University of Leeds
Nuffield Institute for Health (Block D)
71-75 Clarendon Road
Leeds LS2 9PL
email: hssuamn@leeds.ac.uk

The UK Women's Cohort Study aims to explore the relationship between diet and cancer incidence and mortality in a group of middle-aged vegetarian women in the UK. Standard statistical methods employed in epidemiological studies are valid only when applied to a representative sample of the population of interest. Even when the sample has been designed to be representative at the outset, the validity of the methods is eroded if some of the subjects are subsequently lost to the survey or fail to complete all items of the questionnaire. Missing data can rarely be avoided in large-scale studies in which subjects are requested to complete questionnaires with many items. Methods for handling missing data have been developed for a variety of contexts. With multiple imputation, the uncertainty about the missing values is represented by the differences among the sets of imputed (plausible) values. Once the plausible values are generated, the remainder of the analysis is as complex as the planned (complete-data) analysis, except that it is conducted on each of the datasets completed by one set of plausible values. The aim of this research is to compare multiple imputation with other methods of handling missing data in a statistical analysis investigating the link between the prevalence of cancer and a number of lifestyle and socio-economic factors.
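
As an illustration of how the analyses of the completed datasets might be brought together, the sketch below pools one estimate per completed dataset using the standard combining rules for multiple imputation (often called Rubin's rules). The abstract does not say which combining rules or software were used, so the function name `pool_estimates` and the numbers in the usage lines are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Pool m complete-data results with the standard combining rules.

    estimates: one point estimate per completed dataset
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    Hypothetical helper for illustration; not taken from the study.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two matching estimates and variances")

    pooled = mean(estimates)       # average of the m point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)

    # The between-imputation term carries the uncertainty about the missing values.
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up estimates of one coefficient from three completed datasets.
pooled, total = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation variance is the numerical counterpart of "the differences among the sets of imputed values": if every completed dataset gave the same estimate, it would be zero and the total variance would reduce to the complete-data variance.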



Pasi Koikkalainen
Fri Oct 18 19:03:41 EET DST 2002