
DEMONSTRATION OF THE GRAPHICAL EDITING AND ANALYSIS QUERY SYSTEM

Paula Weir

Department of Energy, EIA
Energy Information Administration, EI-42
1000 Independence Ave., S.W.
Washington, D.C. 20585
USA
email: paula.weir@eia.doe.gov

The Graphical Editing and Analysis Query System (GEAQS) is software developed by EIA as an in-house tool for editing and validating survey data. The graphical approach used in the system is based on the best practices of four other systems that were examined at the start of software development. GEAQS takes a top-down approach: it examines data at the macro (aggregate) level, highlights questionable aggregates, and drills down through lower-level aggregates to identify the potential outlier in the micro (reported) data. The graphical views include anomaly maps organized by the various dimensions of the data, scatter plots, box-whisker plots, and time series graphs. In the recent version of the system, all of these graphical displays are available for both macro and micro views, with complementary metadata provided through accompanying spreadsheets that are linked to the graphs by point-and-click mapping. Outliers are identified by their position relative to the other respondents' values and, in the case of scatter graphs, by their position relative to the fit line. The user can select a linear or log scale for the data to spread out clustered points if necessary. The graphs display reported and imputed data, with the data points colored according to user-selected edit scores or measures of influence, so that users can also evaluate the other edit or imputation rules through the graphic presentation. The original GEAQS PowerBuilder code has recently been rewritten to take advantage of the capabilities of the newer PowerBuilder version and to run on an Oracle database, which provides the efficiency required for the larger datasets that come with larger surveys and longer time series.
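The two outlier views described above (position relative to other respondents' values, and position relative to a scatter-plot fit line) can be illustrated with a minimal Python sketch. This is not GEAQS code: the abstract does not state which rules the system applies, so the Tukey fences (1.5 times the interquartile range), the ordinary least-squares fit line, and the function names `box_whisker_outliers`, `fit_line`, and `scatter_outliers` are all assumptions made for illustration.

```python
import math
from statistics import mean, quantiles


def box_whisker_outliers(values, k=1.5):
    """Return indices of values outside the fences Q1 - k*IQR and Q3 + k*IQR.

    Assumption: Tukey-style fences, as drawn in a typical box-whisker plot.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def scatter_outliers(xs, ys, k=1.5, log_scale=False):
    """Flag points whose vertical distance from the fit line is unusual.

    With log_scale=True both axes are log10-transformed first, which
    spreads out clustered data (all values must then be positive).
    """
    if log_scale:
        xs = [math.log10(x) for x in xs]
        ys = [math.log10(y) for y in ys]
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Assumption: residuals are screened with the same box-whisker rule.
    return box_whisker_outliers(residuals, k)


# Hypothetical data: previous-period vs. current-period reports
# for eight respondents.
previous = [10, 20, 30, 40, 50, 60, 70, 80]
current = [11, 19, 31, 42, 49, 61, 140, 79]
suspects = scatter_outliers(previous, current)
```

Here `suspects` holds the positions of respondents whose current report sits far from the line fitted through all respondents. In a graphical tool these would be the points an analyst clicks on to drill down to the reported record.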



Pasi Koikkalainen
Fri Oct 18 19:03:41 EET DST 2002