Difference between revisions of "Integrated Rules"

From Data Quality Toolkit
Latest revision as of 12:25, 7 November 2012

This page is a working document containing an evolving set of rules that will be implemented continuously in the data integrity service and quality toolkit. So far, only a few examples have been included. The numbering scheme will also be used to specify the set of rules to be applied when using the integrity service.


Atomized Genus elements should start with a single uppercase character followed by a non-empty sequence of lowercase characters

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Zoological/GenusOrMonomial

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/NameAtomised/Botanical/GenusOrMonomial

Regular expression:

[A-Z][a-z]+
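As a minimal sketch of how this pattern could be applied, the following Python snippet checks a genus value against the regular expression. The function name `is_valid_genus` is hypothetical, and using `fullmatch` (so that the whole value must match) is an assumption about the intended behaviour, since the pattern itself carries no anchors.

```python
import re

# Pattern copied from the rule above. fullmatch() requires the whole
# value to match, which is an assumption about the intended behaviour.
GENUS_PATTERN = re.compile(r"[A-Z][a-z]+")


def is_valid_genus(value: str) -> bool:
    """Return True for one uppercase letter followed by one or more lowercase letters."""
    return GENUS_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("Quercus", "quercus", "Q", "QUercus"):
        print(candidate, is_valid_genus(candidate))
```

Only "Quercus" passes here; the other three values violate either the initial uppercase letter or the lowercase remainder.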


Check whether collection date fields conform to specification in ABCD 2.06

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeBegin

/DataSets/DataSet/Units/Unit/Gathering/DateTime/ISODateTimeEnd

/DataSets/DataSet/Units/Unit/Identifications/Identification/Date/ISODateTimeBegin

/DataSets/DataSet/Units/Unit/Identifications/Identification/Date/ISODateTimeEnd

Regular expression:

\d\d\d\d(\-(0[1-9]|1[012])(\-((0[1-9])|1\d|2\d|3[01])(T(0\d|1\d|2[0-3])(:[0-5]\d){0,2})?)?)?|\-\-(0[1-9]|1[012])(\-(0[1-9]|1\d|2\d|3[01]))?|\-\-\-(0[1-9]|1\d|2\d|3[01])
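The following Python sketch shows one way the date pattern could be applied. The pattern string is copied from the rule; the function name `is_valid_iso_datetime` is hypothetical, and whole-value matching via `fullmatch` is an assumption. The check is purely syntactic: a value such as 2012-02-31 passes because the pattern does not know the calendar.

```python
import re

# Pattern copied verbatim from the rule above.
ISO_DATETIME_PATTERN = re.compile(
    r"\d\d\d\d(\-(0[1-9]|1[012])(\-((0[1-9])|1\d|2\d|3[01])"
    r"(T(0\d|1\d|2[0-3])(:[0-5]\d){0,2})?)?)?"
    r"|\-\-(0[1-9]|1[012])(\-(0[1-9]|1\d|2\d|3[01]))?"
    r"|\-\-\-(0[1-9]|1\d|2\d|3[01])"
)


def is_valid_iso_datetime(value: str) -> bool:
    """Return True if the whole value matches the date pattern."""
    return ISO_DATETIME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    samples = ("2012", "2012-11-07", "2012-11-07T12:25", "--11-07", "---07",
               "2012-13", "07.11.2012")
    for sample in samples:
        print(sample, is_valid_iso_datetime(sample))
```

The first five samples are accepted (full date, date with time, and the truncated forms without a year or month); the last two are rejected.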


Check numeric ranges of site coordinates latitude value

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LatitudeDecimal

Rule:

-90.0 <= lat <= 90.0


Check numeric ranges of site coordinates longitude value

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/SiteCoordinateSets/SiteCoordinates/CoordinatesLatLong/LongitudeDecimal

Rule:

-180.0 <= lon <= 180.0
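The latitude and longitude range rules can be sketched with a single helper in Python. All function names are hypothetical. Treating non-numeric input and NaN as failures is an assumption; the rules themselves only state the bounds.

```python
def in_range(value: str, low: float, high: float) -> bool:
    """Return True if value parses as a number within [low, high]."""
    try:
        number = float(value)
    except ValueError:
        return False
    # NaN compares False against any bound, so it is rejected here.
    return low <= number <= high


def is_valid_latitude(value: str) -> bool:
    return in_range(value, -90.0, 90.0)


def is_valid_longitude(value: str) -> bool:
    return in_range(value, -180.0, 180.0)


if __name__ == "__main__":
    print(is_valid_latitude("52.45"), is_valid_longitude("13.30"))
    print(is_valid_latitude("91"), is_valid_longitude("-180.5"))
```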


Check syntactical correctness of ABCD elements used for email addresses

ABCD elements:

All elements with email-addresses

Regular expression:

^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9\-\.]+)$
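A short Python sketch of this check follows. The pattern is copied from the rule; the function name `is_valid_email` is hypothetical. Note that the pattern is a deliberately loose syntactic test and does not verify that the domain exists or contains a dot.

```python
import re

# Pattern copied from the rule above: local part, "@", domain part.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.\+]+)@([a-zA-Z0-9\-\.]+)$")


def is_valid_email(value: str) -> bool:
    """Return True if the whole value matches the email pattern."""
    # fullmatch() also rejects a trailing newline, which "$" alone would allow.
    return EMAIL_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("curator@example.org", "no-at-sign", "a b@example.org"):
        print(candidate, is_valid_email(candidate))
```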


Check whether country element conforms with ISO3166

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/Country/ISO3166Code

Rule:

Use 2- or 3-letter ISO country code (ISO3166-1).
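A minimal sketch of such a lookup in Python is shown below. The two code sets are a small illustrative subset only; a real check would load the complete ISO 3166-1 list from an authoritative source. The function name `is_iso3166_code` is hypothetical, and requiring uppercase codes is an assumption.

```python
# Illustrative subset only; a real check needs the full ISO 3166-1 list.
SAMPLE_ALPHA2 = frozenset({"DE", "FR", "US", "BR"})
SAMPLE_ALPHA3 = frozenset({"DEU", "FRA", "USA", "BRA"})


def is_iso3166_code(value: str,
                    alpha2: frozenset = SAMPLE_ALPHA2,
                    alpha3: frozenset = SAMPLE_ALPHA3) -> bool:
    """Return True if value is a known 2- or 3-letter country code."""
    if len(value) == 2:
        return value in alpha2
    if len(value) == 3:
        return value in alpha3
    return False


if __name__ == "__main__":
    for candidate in ("DE", "DEU", "Germany", "de"):
        print(candidate, is_iso3166_code(candidate))
```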


Check whether scientific name is known by zoological name service

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use Zoological Name Service


Check whether scientific name is known by botanical name service

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use Botanical Name Service


Check whether field for multimedia object type uses mime types

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/FileURI

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/Format

Rule:

The value has to be a MIME media type as defined in RFC 2046 (http://www.ietf.org/rfc/rfc2046.txt).
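One way to approximate this rule is a syntactic check of the Format value, sketched below in Python. The seven top-level types are those defined in RFC 2046; the subtype pattern is a simplification, experimental "x-" top-level types are not covered, and the function name `is_mime_type` is hypothetical.

```python
import re

# Top-level media types defined in RFC 2046 (five discrete, two composite).
TOP_LEVEL_TYPES = frozenset(
    {"text", "image", "audio", "video", "application", "multipart", "message"}
)

# Simplified "type/subtype" shape; not the full RFC token grammar.
MEDIA_TYPE_PATTERN = re.compile(r"([A-Za-z]+)/([A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*)")


def is_mime_type(value: str) -> bool:
    """Return True if value looks like type/subtype with a known top-level type."""
    match = MEDIA_TYPE_PATTERN.fullmatch(value)
    if match is None:
        return False
    # Media type names are case-insensitive.
    return match.group(1).lower() in TOP_LEVEL_TYPES


if __name__ == "__main__":
    for candidate in ("image/jpeg", "JPEG", "picture/jpeg"):
        print(candidate, is_mime_type(candidate))
```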


Check whether multimedia object file is available

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/File

Rule:

Send an HTTP HEAD request to the file address; the file is considered available if the request succeeds.
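A HEAD request can be sketched with the Python standard library as follows. The function name `is_file_available`, the 10-second timeout, and the decision to count only 2xx responses as "available" are assumptions. The `opener` parameter is a hypothetical hook that lets the check be exercised without network access.

```python
import urllib.error
import urllib.request


def is_file_available(url: str, opener=urllib.request.urlopen,
                      timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to url yields a 2xx status."""
    try:
        # HEAD asks for the headers only, so the file body is not downloaded.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses (unreachable host,
        # 4xx/5xx status); ValueError covers malformed URLs.
        return False
```

Some servers reject HEAD requests even though the file exists, so a negative result may need a follow-up GET before the object is reported as missing.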


Check whether multimedia object has an associated copyright statement

ABCD elements:

/DataSets/DataSet/Units/Unit/MultiMediaObjects/MultiMediaObject/IPR/Copyrights/Copyright/Text

Rule:

Copyright element has to be non-empty.
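As a minimal sketch in Python, the non-empty test can be written as below. The function name `has_copyright_text` is hypothetical, and treating a whitespace-only value as empty is an assumption.

```python
from typing import Optional


def has_copyright_text(text: Optional[str]) -> bool:
    """Return True if the copyright text is present and not just whitespace."""
    return text is not None and text.strip() != ""


if __name__ == "__main__":
    print(has_copyright_text("(c) 2012 Example Museum"))
    print(has_copyright_text("   "))
    print(has_copyright_text(None))
```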


Check whether rule 7 and rule 8 find the scientific name

ABCD elements:

/DataSets/DataSet/Units/Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/FullScientificNameString

Rule:

Use rule 7 (zoological name service) and rule 8 (botanical name service)


Check whether the value for measurement and fact is a number

ABCD elements:

/DataSets/DataSet/Units/Unit/Gathering/Altitude/MeasurementOrFactAtomised/LowerValue

/DataSets/DataSet/Units/Unit/Gathering/Altitude/MeasurementOrFactAtomised/UpperValue

/DataSets/DataSet/Units/Unit/Gathering/Depth/MeasurementOrFactText

/DataSets/DataSet/Units/Unit/Gathering/Height/MeasurementOrFactText

Rule:

Measurement-or-fact (MaF) field values have to be numbers.
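A possible numeric test is sketched below in Python. It assumes that only plain decimal notation counts as a number, so exponent notation such as 1e3 and values with units such as "1200 m" are rejected; the rule itself does not specify this. The function name `is_number` is hypothetical.

```python
import re

# Optional sign, then digits with an optional fractional part.
# Assumption: exponent notation and thousands separators are not accepted.
NUMBER_PATTERN = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_number(value: str) -> bool:
    """Return True if value is a plain decimal number."""
    return NUMBER_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    for candidate in ("1200", "12.5", "-3", "ca. 1200 m", ""):
        print(repr(candidate), is_number(candidate))
```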

Check whether Record basis is mapped

ABCD elements:

/DataSets/DataSet/Units/Unit/RecordBasis

Rule:

Record basis field has to be mapped
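The sketch below shows one reading of "mapped" in Python: the Unit element has a RecordBasis child with non-empty text. This interpretation, the function names, and the sample XML fragment are assumptions. Element names are compared without their namespace so the check does not depend on a particular ABCD namespace URI.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip a leading '{namespace}' from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def record_basis_is_mapped(unit: ET.Element) -> bool:
    """Return True if unit has a RecordBasis child with non-empty text."""
    for child in unit:
        if local_name(child.tag) == "RecordBasis":
            return bool(child.text and child.text.strip())
    return False


if __name__ == "__main__":
    # Hypothetical minimal fragments, not complete ABCD documents.
    mapped = ET.fromstring("<Unit><RecordBasis>PreservedSpecimen</RecordBasis></Unit>")
    unmapped = ET.fromstring("<Unit><UnitID>B 10 0000001</UnitID></Unit>")
    print(record_basis_is_mapped(mapped), record_basis_is_mapped(unmapped))
```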