Pasi Koikkalainen
Laboratory of Data Analysis
University of Jyväskylä
P.O.Box 35, FIN-40351 Jyväskylä
Finland
This presentation considers how neural networks can be used for editing and imputation.
In imputation tasks, the promise is that neural networks can
overcome the ``curse of dimensionality'':
dense samples are needed to learn a probability density function (pdf) well,
but dense samples are hard to obtain in high dimensions.
When using neural networks, the problem is not the dimensionality of the data but rather its COMPLEXITY. The self-organizing map, for example, combines dimension reduction and data modelling under a single learning algorithm. This allows us to model multivariate distributions relatively effectively. The imputation model is then obtained by conditioning the modelled distribution on the observed values.
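As a rough illustration of this idea, the following Python sketch fits a small one-dimensional self-organizing map and then imputes missing values by conditioning on the observed ones. It is only a minimal sketch under stated assumptions, not the method of this presentation: the function names `train_som` and `impute`, the learning-rate and neighbourhood schedules, the toy data, and the use of a "hard" conditioning rule (copying the missing entries from the single prototype that lies nearest on the observed entries) are all choices made here for illustration.

```python
import numpy as np


def train_som(data, n_units=10, n_epochs=50, seed=0):
    """Fit a 1-D self-organizing map (illustrative, hypothetical helper).

    Returns a codebook of shape (n_units, n_features).
    """
    rng = np.random.default_rng(seed)
    # Initialise the prototypes from randomly chosen training records.
    start = rng.choice(len(data), size=n_units, replace=False)
    codebook = data[start].astype(float)
    grid = np.arange(n_units)  # unit positions on the 1-D map
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = 0.5 * (1.0 - frac)                          # decaying learning rate
        sigma = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the prototype closest to the record.
            bmu = np.argmin(np.sum((codebook - x) ** 2, axis=1))
            # Gaussian neighbourhood around the winner on the map grid.
            h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
    return codebook


def impute(record, codebook):
    """Fill NaN entries from the prototype nearest on the observed entries."""
    record = np.asarray(record, dtype=float)
    observed = ~np.isnan(record)
    if not observed.any():
        raise ValueError("at least one value must be observed")
    # Distances are computed over the observed dimensions only.
    dist = np.sum((codebook[:, observed] - record[observed]) ** 2, axis=1)
    best = codebook[np.argmin(dist)]
    # Observed values are kept; missing ones are copied from the prototype.
    return np.where(observed, record, best)


# Toy data (an assumption for illustration): second variable is about 2x the first.
rng = np.random.default_rng(1)
t = rng.uniform(0.0, 1.0, size=200)
data = np.column_stack([t, 2.0 * t + rng.normal(0.0, 0.05, size=200)])

codebook = train_som(data)
filled = impute([0.5, np.nan], codebook)  # second entry comes from the nearest prototype
print(filled)
```

A softer variant, also an assumption rather than something stated here, would weight several nearby prototypes instead of using only the closest one, which gives a conditional distribution rather than a single imputed value.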
In editing, neural networks can be used for both strong and weak types of error localization. Strong knowledge assumes that errors can be modelled, while weak knowledge assumes only that we are able to discriminate between acceptable and erroneous observations. The use of weak knowledge is more common in neural systems. The objective is to build a model that explains all clean observations well but gives low matching probabilities to erroneous ones. This can be done in two ways: