Sample Metadata and Sampling Information

From BGBM Collection Workflows

Nature:

Sample origin data (e.g., locality) and the identifiers associated with them, e.g., labcode, tissue ID (BGT- number), DB number, and voucher ID (herbarium barcode, internal or external, if available). This category also covers sampling lists for subprojects, sampling wish lists, and sample lists obtained from other sources (e.g., collaborators, datasets from herbaria, records downloaded from databases, BoGart exports). Tables that track lab work progress (e.g., status or success of DNA extraction, PCR success, sequencing results) do not fall into this category because they serve a different purpose; they are part of lab methodology data.

Format:

- Spreadsheets (MS Excel): curated, standardised DNA lists and collection data forms (CDF)

- The format should follow an established standard; if data come from external sources, they should be formatted to fit this standard. For locality metadata, the established standard is the JACQ database field structure.

Storage / folder organization:

- Stored on <DRIVENAME>:/DNA-lists and organized by labcode (i.e., each labcode corresponds to a taxonomic group)

- Long-term goal: consolidate all records in a central database or core that effectively links the different data sources

Naming convention:

Example: MASTER_CAR_Caryophyllaceae_DNA_CuratedList_Jan2024.xlsx
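The example file name can be checked automatically. The following is a minimal sketch that assumes the pattern MASTER_<labcode>_<taxon>_DNA_CuratedList_<MonYYYY>.xlsx, inferred from the single example above; the pattern, the function name `parse_list_filename`, and the returned keys are illustrative assumptions, not an official specification.

```python
import re

# English three-letter month abbreviations, as used in the example file name.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

# Hypothetical pattern inferred from the one documented example.
FILENAME_RE = re.compile(
    r"^MASTER_(?P<labcode>[A-Z]+)_(?P<taxon>[A-Za-z]+)_DNA_CuratedList_"
    r"(?P<month>[A-Z][a-z]{2})(?P<year>\d{4})\.xlsx$"
)


def parse_list_filename(name):
    """Return labcode, taxon, year and month, or None if the name does not match."""
    match = FILENAME_RE.match(name)
    if match is None or match.group("month") not in MONTHS:
        return None
    return {
        "labcode": match.group("labcode"),
        "taxon": match.group("taxon"),
        "year": int(match.group("year")),
        "month": MONTHS[match.group("month")],
    }


info = parse_list_filename("MASTER_CAR_Caryophyllaceae_DNA_CuratedList_Jan2024.xlsx")
print(info)
```

A name that does not fit the assumed pattern returns None, which makes it easy to flag files that need renaming.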

Version control:

Save a new version regularly, approximately every three months or after each major update or addition, and include the date in the file name.
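The versioning rule can be sketched in a few lines. This example assumes the file name layout of the Caryophyllaceae example under "Naming convention" and treats "approximately every three months" as three calendar months; the function names and the `interval_months` parameter are hypothetical.

```python
from datetime import date

MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def versioned_name(labcode, taxon, on):
    """Build a dated file name following the layout of the documented example."""
    stamp = f"{MONTH_ABBR[on.month - 1]}{on.year}"
    return f"MASTER_{labcode}_{taxon}_DNA_CuratedList_{stamp}.xlsx"


def new_version_due(last_version, today, interval_months=3):
    """True if at least interval_months calendar months have passed (assumed rule)."""
    elapsed = (today.year - last_version.year) * 12 + (today.month - last_version.month)
    return elapsed >= interval_months


print(versioned_name("CAR", "Caryophyllaceae", date(2024, 1, 15)))
print(new_version_due(date(2024, 1, 15), date(2024, 4, 2)))
```

A major update or addition would trigger a new version regardless of the elapsed time, so the date check only covers the regular schedule.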

Access control:

Access is restricted to maintain data integrity and consistency. One person is responsible for curating each list; others may be granted write permission. Everyone else has read-only access.

Retention:

Retained indefinitely. Old or obsolete versions identified as such by the responsible curator may be deleted.

Publication:

- Sample metadata are published through online portals (e.g., JACQ, GGBN) together with the associated identifiers, so that samples, tissues, and related records remain linked.

- Sample metadata for published sequences are published through GenBank/ENA.