Database

From Berlin Harvesting and Indexing Toolkit

The default database is MySQL. To use another database system, change the configuration in application.properties and add the corresponding database driver library.


Data is stored twice:

  • First, the original data delivered by the provider is saved in the raw* tables (rawidentification, rawcoordinates, rawoccurrence, rawpreservationtype, rawhigher).
  • Then, after the quality tests have run, the improved data is saved in the tables without the raw prefix (identification, coordinates, occurrence, preservationtype, higher).
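
The two-stage storage can be illustrated with a minimal sketch. It is not the toolkit's own code: it uses an in-memory SQLite database as a stand-in for MySQL so that it runs anywhere, the column names (occurrence_id, latitude, longitude) are hypothetical, and the "quality test" is a toy check invented for the illustration. Only the table names rawcoordinates and coordinates come from the description above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for portability).
conn = sqlite3.connect(":memory:")

# Hypothetical, simplified columns; the real tables are not described here.
conn.execute(
    "CREATE TABLE rawcoordinates (occurrence_id INTEGER, latitude TEXT, longitude TEXT)"
)
conn.execute(
    "CREATE TABLE coordinates (occurrence_id INTEGER, latitude REAL, longitude REAL)"
)


def store_raw(occurrence_id, lat, lon):
    """Stage 1: keep the provider's values exactly as delivered."""
    conn.execute(
        "INSERT INTO rawcoordinates VALUES (?, ?, ?)", (occurrence_id, lat, lon)
    )


def run_quality_test(occurrence_id):
    """Stage 2: toy quality test (invented for this sketch).

    Normalises decimal commas, checks the value ranges, and writes the
    improved values to the non-raw table. Returns True if the record passed.
    """
    row = conn.execute(
        "SELECT latitude, longitude FROM rawcoordinates WHERE occurrence_id = ?",
        (occurrence_id,),
    ).fetchone()
    try:
        lat = float(row[0].replace(",", "."))
        lon = float(row[1].replace(",", "."))
    except ValueError:
        return False
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return False
    conn.execute("INSERT INTO coordinates VALUES (?, ?, ?)", (occurrence_id, lat, lon))
    return True


store_raw(1, "52,5200", "13,4050")   # decimal commas, fixable
store_raw(2, "north", "13.4")        # not a number, stays raw only
results = [run_quality_test(1), run_quality_test(2)]
raw_count = conn.execute("SELECT COUNT(*) FROM rawcoordinates").fetchone()[0]
clean_rows = conn.execute("SELECT * FROM coordinates").fetchall()
```

In this sketch the raw table always keeps what the provider sent, so a record that fails the test is still available for inspection in its original form.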


There are two central tables:

  • The bio_datasource table, which stores every datasource (access point, name, number of records, standard and protocol).
  • The tripleidstore table, which stores every triple ID (unitID, collectionCode, institutionCode) encountered while processing the harvested records.
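
As a rough illustration of how a triple ID store can work, the sketch below records each (unitID, collectionCode, institutionCode) combination once. It makes several assumptions that the description above does not confirm: the lower-case column names, the UNIQUE constraint across the three columns, and the helper register_triple are all hypothetical, and SQLite is used in place of MySQL so the example is self-contained.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for portability).
conn = sqlite3.connect(":memory:")

# Hypothetical schema: column names and the UNIQUE constraint are assumptions.
conn.execute(
    """
    CREATE TABLE tripleidstore (
        unitid          TEXT NOT NULL,
        collectioncode  TEXT NOT NULL,
        institutioncode TEXT NOT NULL,
        UNIQUE (unitid, collectioncode, institutioncode)
    )
    """
)


def register_triple(unit_id, collection_code, institution_code):
    """Store a triple ID; return True if it is new, False if already known."""
    try:
        conn.execute(
            "INSERT INTO tripleidstore VALUES (?, ?, ?)",
            (unit_id, collection_code, institution_code),
        )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a triple that was seen before.
        return False


# Example values are made up for the illustration.
first = register_triple("B 10 0000001", "Herbarium", "B")
repeat = register_triple("B 10 0000001", "Herbarium", "B")
other = register_triple("B 10 0000002", "Herbarium", "B")
stored = conn.execute("SELECT COUNT(*) FROM tripleidstore").fetchone()[0]
```

Letting the database enforce uniqueness keeps the check in one place; a harvester could use the return value to tell newly seen records from ones it has already processed.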