Integrated Rules

From Data Quality Toolkit

This page is a working document containing an evolving set of rules that will be continuously implemented in the data integrity service and quality toolkit. So far, only a few examples have been included. The numbering scheme will also be used to specify the set of rules to be applied when using the integrity service.


Atomized Genus elements should start with a single uppercase character followed by a non-empty sequence of lowercase characters

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Zoological/GenusOrMonomial

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/GenusOrMonomial

Regular expression:

[A-Z][a-z]+
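
A minimal Python sketch of this rule, assuming the pattern is meant to match the whole element value rather than a substring. The function name is illustrative and not part of the toolkit.

```python
import re

# Pattern taken from the rule above; fullmatch() anchors it to the whole value.
GENUS_PATTERN = re.compile(r"[A-Z][a-z]+")


def is_valid_genus(value: str) -> bool:
    """Return True if value is one uppercase letter followed by lowercase letters."""
    return GENUS_PATTERN.fullmatch(value) is not None


print(is_valid_genus("Quercus"))  # True
print(is_valid_genus("quercus"))  # False
print(is_valid_genus("Q"))        # False
```

The character classes cover ASCII letters only, so a value containing diacritics or a hyphen would be rejected by this pattern.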


Check whether collection date fields conform to the specification in ABCD 2.06

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeBegin

/DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeEnd

/DataSets/DataSet/Units/Unit/Identifications/Identification/Date/ISODateTimeBegin

/DataSets/DataSet/Units/Unit/Identifications/Identification/Date/ISODateTimeEnd

Regular expression:

\d\d\d\d(\-(0[1-9]|1[012])(\-((0[1-9])|1\d|2\d|3[01])(T(0\d|1\d|2[0-3])(:[0-5]\d){0,2})?)?)?|\-\-(0[1-9]|1[012])(\-(0[1-9]|1\d|2\d|3[01]))?|\-\-\-(0[1-9]|1\d|2\d|3[01])
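
The expression above can be applied as in the following Python sketch. It assumes the whole field value has to match; the function and constant names are illustrative only.

```python
import re

# The rule's expression, split across lines for readability only.
ISO_DATETIME_PATTERN = re.compile(
    r"\d\d\d\d(\-(0[1-9]|1[012])(\-((0[1-9])|1\d|2\d|3[01])"
    r"(T(0\d|1\d|2[0-3])(:[0-5]\d){0,2})?)?)?"
    r"|\-\-(0[1-9]|1[012])(\-(0[1-9]|1\d|2\d|3[01]))?"
    r"|\-\-\-(0[1-9]|1\d|2\d|3[01])"
)


def is_valid_iso_datetime(value: str) -> bool:
    """Return True if the whole value matches the rule's date/time pattern."""
    return ISO_DATETIME_PATTERN.fullmatch(value) is not None


print(is_valid_iso_datetime("2004-05-17T13:45"))  # True
print(is_valid_iso_datetime("--05-17"))           # True
print(is_valid_iso_datetime("2004-13"))           # False
```

The pattern checks syntax only. A value such as 2004-02-31 passes because the day part is not compared against the month length.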


Check the numeric range of the site coordinates latitude value

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LatitudeDecimal

Rule:

-90.0 <= lat <= 90.0


Check the numeric range of the site coordinates longitude value

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LongitudeDecimal

Rule:

-180.0 <= lon <= 180.0
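
The latitude and longitude range rules could be sketched in Python as follows. The sketch assumes the element text arrives as a string and that a value which cannot be parsed as a number fails the rule; the helper names are hypothetical.

```python
def _in_range(text: str, low: float, high: float) -> bool:
    """Parse text as a float and test low <= value <= high."""
    try:
        value = float(text)
    except ValueError:
        return False  # assumption: unparseable text fails the rule
    return low <= value <= high  # NaN compares False, so it is rejected


def is_valid_latitude(text: str) -> bool:
    return _in_range(text, -90.0, 90.0)


def is_valid_longitude(text: str) -> bool:
    return _in_range(text, -180.0, 180.0)


print(is_valid_latitude("48.137"))   # True
print(is_valid_latitude("91"))       # False
print(is_valid_longitude("-180.0"))  # True
print(is_valid_longitude("abc"))     # False
```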


Check syntactical correctness of ABCD elements used for email addresses

ABCD elements:

All elements containing email addresses

Regular expression:

^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9\-\.]+)$
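
A short Python sketch applying the expression above. The function name is illustrative; fullmatch() is used so that a trailing newline is not accepted.

```python
import re

# Expression taken from the rule above.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9\-\.]+)$")


def is_valid_email_syntax(value: str) -> bool:
    """Return True if value has the shape local-part@domain allowed by the rule."""
    return EMAIL_PATTERN.fullmatch(value) is not None


print(is_valid_email_syntax("curator@example.org"))   # True
print(is_valid_email_syntax("curator.example.org"))   # False
print(is_valid_email_syntax("cu rator@example.org"))  # False
```

This is a purely syntactic check. The pattern does not require a dot in the domain part, so curator@localhost is accepted as well.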


Check whether the country element conforms to ISO 3166

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/Country/ISO3166Code

Rule:

Use the 2- or 3-letter ISO country code (ISO 3166-1).
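
One way to sketch this rule in Python is a shape check followed by a lookup. The code set below is a tiny illustrative subset, not the full ISO 3166-1 list, and the sketch assumes codes are stored in uppercase; all names are hypothetical.

```python
import re

# Two or three uppercase ASCII letters (alpha-2 or alpha-3 shape).
CODE_SHAPE = re.compile(r"[A-Z]{2,3}")

# Illustrative subset only; a real check would load the complete code list.
SAMPLE_CODES = frozenset({"DE", "DEU", "FR", "FRA", "BR", "BRA"})


def is_iso3166_code(value: str, known_codes=SAMPLE_CODES) -> bool:
    """Return True if value has the right shape and is in the known code list."""
    return CODE_SHAPE.fullmatch(value) is not None and value in known_codes


print(is_iso3166_code("DE"))       # True
print(is_iso3166_code("DEU"))      # True
print(is_iso3166_code("Germany"))  # False
print(is_iso3166_code("de"))       # False
```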


Check whether scientific name is known by zoological name service

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use Zoological Name Service


Check whether scientific name is known by botanical name service

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use Botanical Name Service


Check whether the field for multimedia object type uses MIME types

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/FileURI

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/Format

Rule:

Use MIME media types as defined in RFC 2046 (http://www.ietf.org/rfc/rfc2046.txt).
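
A rough Python sketch of a syntactic check for this rule. It assumes the value is a bare type/subtype pair without parameters and accepts only the seven top-level types defined in RFC 2046. The subtype pattern is a simplification, and the names are illustrative.

```python
import re

# Top-level media types defined in RFC 2046 (five discrete, two composite).
RFC2046_TOP_LEVEL_TYPES = frozenset(
    {"text", "image", "audio", "video", "application", "message", "multipart"}
)

# Simplified subtype token: starts alphanumeric, then a restricted symbol set.
SUBTYPE_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*")


def is_mime_type(value: str) -> bool:
    """Return True if value looks like 'type/subtype' with a known top-level type."""
    top, sep, sub = value.strip().partition("/")
    return (
        sep == "/"
        and top.lower() in RFC2046_TOP_LEVEL_TYPES  # type names are case-insensitive
        and SUBTYPE_PATTERN.fullmatch(sub) is not None
    )


print(is_mime_type("image/jpeg"))     # True
print(is_mime_type("image/svg+xml"))  # True
print(is_mime_type("JPEG"))           # False
print(is_mime_type("picture/jpeg"))   # False
```

Top-level types registered after RFC 2046 would have to be added to the set if they should be accepted.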


Check whether multimedia object file is available

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/File

Rule:

Send an HTTP HEAD request to the file's URL and check that the server reports the file as available.
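
A minimal sketch of such a request using Python's standard library. It assumes that any 2xx response counts as "available" and that every network or URL error counts as "not available"; the function name and the default timeout are illustrative choices.

```python
import urllib.error
import urllib.request


def is_available(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answers with 2xx."""
    try:
        # Request() raises ValueError for strings that are not URLs.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP 4xx/5xx responses arrive as HTTPError, a subclass of URLError.
        return False
```

A HEAD request asks only for the response headers, so the check does not download the multimedia file itself.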


Check whether multimedia object has an associated copyright statement

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Copyrights/Copyright/Text

Rule:

Copyright element has to be non-empty.
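
The non-empty check might look like this in Python. The sample fragments are invented for illustration and omit the XML namespace that real ABCD documents typically declare, so the path would need a namespace prefix in practice.

```python
import xml.etree.ElementTree as ET

# Path relative to a MultiMediaObject element (no namespace in this sketch).
COPYRIGHT_TEXT_PATH = "IPR/Copyrights/Copyright/Text"


def has_copyright(multimedia_object: ET.Element) -> bool:
    """Return True if at least one copyright Text element has non-blank content."""
    return any(
        (node.text or "").strip()
        for node in multimedia_object.findall(COPYRIGHT_TEXT_PATH)
    )


WITH_TEXT = (
    "<MultiMediaObject><IPR><Copyrights><Copyright>"
    "<Text>(c) Example Museum</Text>"
    "</Copyright></Copyrights></IPR></MultiMediaObject>"
)
BLANK_TEXT = (
    "<MultiMediaObject><IPR><Copyrights><Copyright>"
    "<Text>   </Text>"
    "</Copyright></Copyrights></IPR></MultiMediaObject>"
)

print(has_copyright(ET.fromstring(WITH_TEXT)))   # True
print(has_copyright(ET.fromstring(BLANK_TEXT)))  # False
```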


Check whether rule 7 and rule 8 find the scientific name

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use rule 7 (zoological name service) and rule 8 (botanical name service)


Check whether the value of a measurement or fact is a number

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/Altitude/MeasurementOrFactAtomised/LowerValue

/DataSets/DataSet/Units/Unit/Gathering/Altitude/MeasurementOrFactAtomised/UpperValue

/DataSets/DataSet/Units/Unit/Gathering/Depth/MeasurementOrFactText

/DataSets/DataSet/Units/Unit/Gathering/Height/MeasurementOrFactText

Rule:

Values of measurement-or-fact (MaF) fields have to be numbers.
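
A small Python sketch of the numeric check. It assumes that anything Python's float() can parse as a finite value counts as a number, which includes exponent notation and surrounding whitespace; the function name is illustrative.

```python
import math


def is_number(value: str) -> bool:
    """Return True if value parses as a finite floating-point number."""
    try:
        return math.isfinite(float(value))  # rejects 'nan' and 'inf'
    except ValueError:
        return False


print(is_number("120"))      # True
print(is_number("-3e2"))     # True
print(is_number("ca. 100"))  # False
print(is_number("100 m"))    # False
```

A stricter definition, for example plain decimals only, would need a regular expression instead of float().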


Check whether the record basis is mapped

ABCD elements:

/DataSets/DataSet/Units/Unit/RecordBasis

Rule:

Record basis field has to be mapped