

Data Quality Toolkit

The Data Quality Toolkit is an open web-based application that lets OpenUp! data providers, and BioCASE (http://www.biocase.org/products/provider_software/) providers in general, run quality checks on their data. The system integrates a set of distributed quality services into a single, consistent user interface, thereby hiding the complexity of the individual services. In particular, the Quality Toolkit integrates the zoological and botanical quality services developed by WP 4 and WP 5 as well as the data integrity services developed within WP 2.

The Data Quality Toolkit operates directly on a given BioCASE provider service installation. It pages through a subset of records (collection units) specified by the given user query (e.g. for a specific genus) and applies a set of user-selectable quality testing procedures.
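The paging-and-checking loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the toolkit's actual code: `fetch_sample_page`, the record fields (`unit_id`, `scientific_name`, `country`) and the two checks are hypothetical, and an in-memory list stands in for a real BioCASE provider query.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]
Check = Callable[[Record], List[str]]           # returns a list of problem messages
PageFetcher = Callable[[int, int], List[Record]]  # (start, limit) -> records


def page_through(fetch_page: PageFetcher, page_size: int) -> Iterator[Record]:
    """Yield every record by requesting one page at a time until a page is empty."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            break
        yield from page
        start += len(page)


# Hypothetical checks; the real quality services are remote and far richer.
def check_name_capitalised(record: Record) -> List[str]:
    name = record.get("scientific_name", "")
    if name and not name[0].isupper():
        return ["scientific name should start with a capital letter"]
    return []


def check_country_present(record: Record) -> List[str]:
    if not record.get("country"):
        return ["country is missing"]
    return []


def run_checks(fetch_page: PageFetcher, checks: List[Check],
               page_size: int = 2) -> Dict[str, List[str]]:
    """Apply the selected checks to each record; keep only records with problems."""
    problems: Dict[str, List[str]] = {}
    for record in page_through(fetch_page, page_size):
        messages = [msg for check in checks for msg in check(record)]
        if messages:
            problems[record["unit_id"]] = messages
    return problems


# Invented sample data standing in for the result of a user query (e.g. one genus).
SAMPLE = [
    {"unit_id": "B-1", "scientific_name": "Abies alba", "country": "DE"},
    {"unit_id": "B-2", "scientific_name": "abies alba", "country": "DE"},
    {"unit_id": "B-3", "scientific_name": "Abies alba", "country": ""},
]


def fetch_sample_page(start: int, limit: int) -> List[Record]:
    return SAMPLE[start:start + limit]
```

Passing a different list of checks to `run_checks` mirrors the user-selectable testing procedures: records without problems are simply left out of the result.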

The result is returned as an annotated ABCD document containing all unit records that have one or more quality problems. Annotations explaining the problems are placed, in the form of structured comments, directly next to the elements they refer to. Using ABCD as a reporting format has two advantages over a proprietary format: 1) the connection between the data and their annotations is directly visible and does not have to be “explained” using a different structure, and 2) it opens opportunities for the future development of software components that automatically re-insert annotated data into provider databases.
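To illustrate the idea of placing a comment next to the element it refers to, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The XML fragment is a heavily simplified, namespace-free stand-in for an ABCD unit (not schema-valid), and the `annotate` helper and the comment wording are assumptions made for this example; the toolkit's actual comment structure is not specified here.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an ABCD unit record (real ABCD is namespaced and nested deeper).
SIMPLIFIED_UNIT = (
    "<Unit>"
    "<UnitID>B-2</UnitID>"
    "<FullScientificNameString>abies alba</FullScientificNameString>"
    "</Unit>"
)


def annotate(root: ET.Element, tag: str, message: str) -> int:
    """Insert an XML comment directly after every element named `tag`.

    Returns the number of comments inserted.
    """
    inserted = 0
    # Snapshot the elements first so the tree is not mutated while being walked.
    for parent in list(root.iter()):
        # Walk children back to front so insertions do not shift pending indices.
        for index, child in reversed(list(enumerate(parent))):
            if child.tag == tag:
                parent.insert(index + 1, ET.Comment(f" {message} "))
                inserted += 1
    return inserted


def annotated_xml(xml_text: str, tag: str, message: str) -> str:
    """Parse `xml_text`, annotate it, and return the serialised result."""
    root = ET.fromstring(xml_text)
    annotate(root, tag, message)
    return ET.tostring(root, encoding="unicode")
```

Because the annotation sits beside the offending element, a reader (or a later tool) can see which value is affected without consulting a separate report structure.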