Difference between revisions of "Main Page"
Revision as of 09:43, 21 June 2012
A generic annotation system for biodiversity data
The project
The project's main objective is to develop an exemplary specification for an annotation data repository for networked and highly complex scientific data.
It will be implemented using the example of collection and observation data in the botanical domain provided by the GBIF/BioCASE system (currently over 50.8 million data sets, including 15 million data sets from natural history collection objects).
Analogous to the traditional written annotation of natural history collection objects, e.g. concerning their taxonomic identity, a procedure is being established for data available via the internet. It will allow annotation of single data sets as well as mass annotation of sets of collection objects.
Using the example of natural history collection data in the framework of GBIF-Germany, the project develops solutions for several cross-domain and domain-specific problems and implements them in a pilot system. The issues include:
- categorisation of annotations
- access rights, rights of personality and rights of attribution of annotating scientists
- quality check
- reference and linking of annotations
- conception of a user-friendly system that encourages annotations
- feedback to the distributed data providers
- the potential use of the system in ongoing research projects for filtering usable data from the overall system
- in general, the integration of access to annotation data into the overall system of GBIF, BioCASE and GBIF-Germany
AnnoSys is a three-year project of the Botanic Garden and Botanical Museum Berlin-Dahlem, funded by the LIS programme of the German Research Foundation (DFG). The original project title is “Ein generisches Annotationssystem für Biodiversitätsdaten” (“A generic annotation system for biodiversity data”; project number BE 2283/4-1).
Overall architecture
Principal workflow
The user visits the GBIF or BioCASE portal to access records. To annotate a record, the user enters the annotation system, logs in, and makes the annotation. The annotation is then saved on the annotation server together with the original record; the two are linked via a GUID. Both the annotated version and the original record are then displayed to subsequent users of the data portal.
Annotations can also be searched via the annotation system using specific criteria (e.g. a taxonomic group or a country). Once an annotation has been made, a message system informs the collection manager, as well as users who have subscribed to the information service, about the changes to the record. The collection manager can then decide whether to adopt the annotation in the local database.
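The workflow above can be illustrated with a minimal in-memory sketch. It is not the AnnoSys implementation: the class names, the record layout (a dictionary with a `guid` key) and the `outbox` list standing in for the message system are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredAnnotation:
    annotation_guid: str
    record_guid: str          # links the annotation to the original record
    original_record: dict     # snapshot of the record at annotation time
    changes: dict             # proposed values keyed by (hypothetical) field name
    annotator: str


@dataclass
class AnnotationServer:
    annotations: list = field(default_factory=list)
    subscribers: dict = field(default_factory=dict)  # record GUID -> set of e-mail addresses
    outbox: list = field(default_factory=list)       # (recipient, message) pairs

    def subscribe(self, record_guid, email):
        """Register a user for notifications about one record."""
        self.subscribers.setdefault(record_guid, set()).add(email)

    def annotate(self, record, changes, annotator, manager_email):
        """Store the annotation with a copy of the original record, then notify."""
        stored = StoredAnnotation(
            annotation_guid=str(uuid.uuid4()),
            record_guid=record["guid"],
            original_record=dict(record),
            changes=dict(changes),
            annotator=annotator,
        )
        self.annotations.append(stored)
        # The collection manager and all subscribers of this record are informed.
        recipients = {manager_email} | self.subscribers.get(record["guid"], set())
        for recipient in sorted(recipients):
            self.outbox.append(
                (recipient, f"Record {record['guid']} annotated by {annotator}")
            )
        return stored

    def search(self, **criteria):
        """Return annotations whose original record matches all given field values."""
        return [
            a for a in self.annotations
            if all(a.original_record.get(k) == v for k, v in criteria.items())
        ]
```

A call such as `server.search(country="Germany")` would then return every stored annotation whose original record carries that country value. Whether the collection manager adopts a change in the local database stays outside this sketch, as in the description above.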
Main components
The Annotation System consists of the following components:
- Repository for annotations, annotated documents and user profiles
- Exchange of (Meta-)Data
- Message-System
- Graphical Interface
- Security (Authentication, Authorization)
Data Model
Annotation Data
We represent annotation data as a list of the data elements affected by modifications, the modified values of those elements within the dataset, and/or comments added by the annotating agent. Given the XML-based nature of the envisioned standard dataset formats, each entry of that list will consist of at least the following elements:
- Annotation context selection
- Value suggestion for the selected context elements
- Annotation type (e.g. new determination, typo correction, georeference)
- Free text comment referring to the selected context element(s)
- Annotator's motivation for making the proposal
- Evidence for the proposal given by the annotator
- Annotator's constraints (optional)
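As a rough illustration of the elements listed above, the following sketch models one list entry as a Python dataclass and applies its value suggestion to a copy of an XML record. The field names, the use of an ElementTree path as the context selection, and the toy `<Unit>` document are assumptions for this example; they are not taken from the ABCD or Darwin Core schemas or from the AnnoSys specification.

```python
import copy
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnnotationElement:
    context: str                     # context selection, here an ElementTree path
    suggested_value: Optional[str]   # None for comment-only annotations
    annotation_type: str             # e.g. "new determination", "typo correction"
    comment: str = ""
    motivation: str = ""
    evidence: str = ""
    constraints: Optional[str] = None  # optional, as in the list above


def apply_suggestions(root, elements):
    """Return an annotated copy of the record; the original stays unchanged."""
    annotated = copy.deepcopy(root)
    for element in elements:
        if element.suggested_value is None:
            continue  # nothing to substitute for a pure comment
        for node in annotated.findall(element.context):
            node.text = element.suggested_value
    return annotated


# Hypothetical record with a misspelt country name.
record = ET.fromstring(
    "<Unit><Country>Germny</Country><FullName>Abies alba</FullName></Unit>"
)
fix = AnnotationElement(
    context="./Country",
    suggested_value="Germany",
    annotation_type="typo correction",
    comment="Misspelt country name",
)
annotated = apply_suggestions(record, [fix])
```

Working on a deep copy keeps the original record intact, which matches the idea that the annotated version and the original are kept side by side.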
Annotation Meta Data
To complete the data model the following meta data are needed:
- Reference to
  - the original object (GUID-Triple)
  - the digital original document (GUID)
  - the annotation (GUID on the basis of GUID of digital original document)
- Origin of original document (e.g. download-URL with date)
- Data format of original document (ABCD/Darwin Core)
- Annotator (Name, Email, Institution)
- Annotation date
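The metadata listed above could be held in a record like the one sketched below. The text does not say how the annotation GUID is derived from the document GUID, so the name-based UUID (version 5) used here is only one possible approach, and the composition of the GUID triple (institution, collection, unit identifier) is likewise an assumption.

```python
import uuid
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnnotationMetadata:
    object_triple: tuple        # assumed: (institution, collection, unit ID)
    document_guid: uuid.UUID    # GUID of the digital original document
    annotation_guid: uuid.UUID  # derived from document_guid (see below)
    document_origin: str        # e.g. download URL with date
    document_format: str        # "ABCD" or "Darwin Core"
    annotator_name: str
    annotator_email: str
    annotator_institution: str
    annotation_date: date


def derive_annotation_guid(document_guid, annotator_email, annotation_date):
    """Derive a reproducible annotation GUID within the document's namespace.

    Assumption: annotator e-mail plus date is used as the name; a real system
    would need a finer-grained name to tell apart several annotations made by
    the same person on the same day.
    """
    name = f"{annotator_email}|{annotation_date.isoformat()}"
    return uuid.uuid5(document_guid, name)
```

Because `uuid.uuid5` is deterministic, the same document GUID, annotator and date always yield the same annotation GUID, so the link back to the original document can be recomputed rather than looked up.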
Publications
- Tschöpe, O., Suhrbier, L., Güntsch, A. & Berendsohn, W.G. 2012: “A generic annotation system for biodiversity data” [Poster]. – “GBIF European Regional Nodes Meeting 2012”, 27-29 March, Berlin.
- Tschöpe, O., Suhrbier, L., Güntsch, A. & Berendsohn, W.G. 2012: “AnnoSys: A generic annotation system for biodiversity data”. – “GBIF European Regional Nodes Meeting 2012”, 27-29 March, Berlin. Media:AnnoSys_GBIF_nodes.pdf
People
Project staff
Walter Berendsohn - Principal Investigator
Anton Güntsch - Co-Investigator
Okka Tschöpe - Biologist
Lutz Suhrbier - Computer Scientist
contact: AnnoSys
Funding
The Annotation System project is funded by the Scientific Library Services and Information Systems (LIS) programme of the DFG.