Difference between revisions of "Home"

From Data Quality Toolkit
m (moved Main Page to Home)
 
(11 intermediate revisions by one other user not shown)
Line 1: Line 1:
 +
http://services.bgbm.org/DataQualityToolkit/
  
== Overview ==
+
== Data Quality Toolkit ==
  
The Data Quality Toolkit is an open web-based application for OpenUp! data providers and BioCASE providers in general performing data quality checks on their data. The system integrates a set of distributed quality services into a single and consistent user interface, thereby hiding the complexity of the individual services. In particular, the Quality Toolkit integrates the zoological and botanical quality services developed by WP 4 and WP 5 as well as the data integrity services developed within WP 2.
+
The Data Quality Toolkit is an open web-based application for OpenUp! data providers and BioCASE (http://www.biocase.org/products/provider_software/) providers in general performing data quality checks on their data. The system integrates a set of distributed quality services into a single and consistent user interface, thereby hiding the complexity of the individual services. In particular, the Quality Toolkit integrates the zoological and botanical quality services developed by WP 4 and WP 5 as well as the data integrity services developed within WP 2.
  
 
The Data Quality Toolkit operates directly on a given BioCASE provider service installation. It pages through a subset of records (collection units) specified by the given user query (e.g. for a specific genus) and applies a set of user-selectable quality testing procedures.
 
  
 
The result comes as an annotated ABCD-document containing all unit-records with one or more quality problems. Annotations explaining the problems are directly placed in the form of structured comments next to the elements they refer to. Using ABCD as a reporting format has two advantages over a proprietary format: 1) the connection between data and their annotations is directly visible and does not have to be “explained” using a different structure and 2) using ABCD opens opportunities for future developments of software components, which automatically re-inserts annotated data into provider databases.
 

Latest revision as of 14:11, 15 November 2012

http://services.bgbm.org/DataQualityToolkit/

Data Quality Toolkit

The Data Quality Toolkit is an open, web-based application that lets OpenUp! data providers, and BioCASE (http://www.biocase.org/products/provider_software/) providers in general, run quality checks on their data. The system integrates a set of distributed quality services into a single, consistent user interface, thereby hiding the complexity of the individual services. In particular, the Quality Toolkit integrates the zoological and botanical quality services developed by WP 4 and WP 5 as well as the data integrity services developed within WP 2.

The Data Quality Toolkit operates directly on a given BioCASE provider service installation. It pages through the subset of records (collection units) selected by a user query (e.g. all units of a specific genus) and applies a set of user-selectable quality tests to each record.
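
The paging-and-checking loop described above can be sketched roughly as follows. This is a minimal illustration only, not the toolkit's actual implementation: the record layout, the `fetch_page` callback, and the two example checks are hypothetical, and an in-memory list stands in for a BioCASE provider.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

# Hypothetical types for illustration; not the toolkit's real data model.
Record = Dict[str, str]
Check = Callable[[Record], Optional[str]]  # returns a problem message, or None


def page_records(fetch_page: Callable[[int, int], List[Record]],
                 page_size: int = 100) -> Iterator[Record]:
    """Yield records page by page until the provider returns a short page."""
    start = 0
    while True:
        page = fetch_page(start, page_size)
        yield from page
        if len(page) < page_size:
            break
        start += page_size


def run_checks(records: Iterable[Record],
               checks: List[Check]) -> Dict[str, List[str]]:
    """Return {unit_id: [messages]} for records with at least one problem."""
    problems: Dict[str, List[str]] = {}
    for record in records:
        messages = []
        for check in checks:
            message = check(record)
            if message is not None:
                messages.append(message)
        if messages:
            problems[record["unit_id"]] = messages
    return problems


# Two hypothetical, user-selectable checks.
def missing_country(record: Record) -> Optional[str]:
    return None if record.get("country") else "country is missing"


def implausible_year(record: Record) -> Optional[str]:
    year = record.get("year", "")
    if year.isdigit() and len(year) == 4:
        return None
    return f"implausible year: {year!r}"


# In-memory stand-in for the records matching a user query (e.g. one genus).
SAMPLE = [
    {"unit_id": "U1", "genus": "Abies", "country": "DE", "year": "1999"},
    {"unit_id": "U2", "genus": "Abies", "country": "", "year": "1999"},
    {"unit_id": "U3", "genus": "Abies", "country": "FR", "year": "19x9"},
]


def fetch_sample_page(start: int, limit: int) -> List[Record]:
    return SAMPLE[start:start + limit]


report = run_checks(page_records(fetch_sample_page, page_size=2),
                    [missing_country, implausible_year])
```

Only records with at least one problem end up in `report`, which mirrors the idea that the result contains just the problematic units.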

The result is an annotated ABCD document containing all unit records with one or more quality problems. Annotations explaining the problems are placed as structured comments directly next to the elements they refer to. Using ABCD as the reporting format has two advantages over a proprietary format: 1) the connection between the data and their annotations is directly visible and does not have to be “explained” using a different structure, and 2) it opens opportunities for future software components that automatically re-insert annotated data into provider databases.
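
To illustrate what placing an annotation next to the element it refers to could look like, here is a small sketch. It assumes the annotations are plain XML comments, which the text above does not confirm, and it uses a simplified, namespace-free unit record rather than the real ABCD schema. The `quality-problem:` prefix and the `annotate` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a unit record; real ABCD uses namespaces
# and a much richer structure.
SIMPLIFIED_UNIT = (
    "<Unit><UnitID>U2</UnitID><Country></Country><Year>1999</Year></Unit>"
)


def annotate(unit: ET.Element, tag: str, message: str) -> bool:
    """Insert an XML comment directly after the first child named `tag`.

    Returns True if the element was found and annotated, False otherwise.
    """
    for index, child in enumerate(list(unit)):
        if child.tag == tag:
            comment = ET.Comment(f" quality-problem: {message} ")
            unit.insert(index + 1, comment)
            return True
    return False


unit = ET.fromstring(SIMPLIFIED_UNIT)
found = annotate(unit, "Country", "country is missing")
annotated = ET.tostring(unit, encoding="unicode")
print(annotated)
```

Because the comment sits immediately after the `Country` element, a reader of the document sees the data and its annotation side by side without needing a separate report structure.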