Difference between revisions of "Existing name checking mechanisms"

From TETTRIs
Line 10: Line 10:
 
'''Data updated:''' depending on target dataset<br />
 
'''Data updated:''' depending on target dataset<br />
 
'''Limitation:''' Direct input of list limited to 6000 names. (With file upload not limited) <br />
 
'''Limitation:''' Direct input of list limited to 6000 names. (With file upload not limited) <br />
'''Other:''' Login with GBIF account is recommended (self-registration at [https://www.gbif.org/user/profile https://www.gbif.org/user/profile])<br />
 
 
'''Local ID input returned:''' NO (checked August 16, 2023)<br />
 
'''Local ID input returned:''' NO (checked August 16, 2023)<br />
 
'''Local Name input returned:''' YES<br />
 
'''Local Name input returned:''' YES<br />
 
'''Aggregator name ID returned:''' YES - in download only<br />
 
'''Aggregator name ID returned:''' YES - in download only<br />
 
'''Interactive mode for partial matches:''' NO<br />
 
'''Interactive mode for partial matches:''' NO<br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No<br />
 +
'''Other:''' Login with GBIF account is recommended (self-registration at [https://www.gbif.org/user/profile https://www.gbif.org/user/profile])<br />
  
 
== Catalogue of Life ==
 
== Catalogue of Life ==
Line 29: Line 29:
 
'''Data updated:''' July 2023 (semiannual edition)<br />
 
'''Data updated:''' July 2023 (semiannual edition)<br />
 
'''Limitation:''' Not found - tested with 144.000 records<br />
 
'''Limitation:''' Not found - tested with 144.000 records<br />
'''Other:''' Service can be installed as local copy<br />
 
 
'''Local ID input returned:''' Yes <br />
 
'''Local ID input returned:''' Yes <br />
 
'''Local Name input returned:''' Yes <br />
 
'''Local Name input returned:''' Yes <br />
Line 35: Line 34:
 
'''Interactive mode for partial matches:''' Yes <br />
 
'''Interactive mode for partial matches:''' Yes <br />
 
'''OpenRefine reconciliation API:''' yes
 
'''OpenRefine reconciliation API:''' yes
 +
'''Other:''' Service can be installed as local copy<br />
 +
'''Other:''' R-Package World Flora - see [https://cran.r-project.org/web/packages/WorldFlora/index.html https://cran.r-project.org/web/packages/WorldFlora/index.html]<br />
  
 
==[https://www.eu-nomen.eu/portal/taxamatch.php PESI]==
 
==[https://www.eu-nomen.eu/portal/taxamatch.php PESI]==
Line 43: Line 44:
 
'''Data updated:''' 2014 <br />
 
'''Data updated:''' 2014 <br />
 
'''Limitation:''' 5,000 names <br />
 
'''Limitation:''' 5,000 names <br />
Other: <br />
 
 
'''Local ID input returned:''' No <br />
 
'''Local ID input returned:''' No <br />
 
'''Local Name input returned:''' Yes <br />
 
'''Local Name input returned:''' Yes <br />
 
'''Aggregator name ID returned:''' Yes (as provided by the primary aggregator)<br />
 
'''Aggregator name ID returned:''' Yes (as provided by the primary aggregator)<br />
 
'''Interactive mode for partial matches:''' Yes<br />
 
'''Interactive mode for partial matches:''' Yes<br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No<br />
 +
Other: <br />
  
 
==[https://lifewatch.be/en/e-lab LifeWatch]==
 
==[https://lifewatch.be/en/e-lab LifeWatch]==
Line 61: Line 62:
 
Aggregator name ID returned: <br />
 
Aggregator name ID returned: <br />
 
Interactive mode for partial matches: <br />
 
Interactive mode for partial matches: <br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No <br />
  
 
==[https://verifier.globalnames.org/ Global Names]==
 
==[https://verifier.globalnames.org/ Global Names]==
Line 70: Line 71:
 
'''Data updated:''' Differs for stored datasets<br />
 
'''Data updated:''' Differs for stored datasets<br />
 
'''Limitation:''' 5000 names, at least in interactive mode<br />
 
'''Limitation:''' 5000 names, at least in interactive mode<br />
'''Other:''' Offers a kind of query language that seems to be very flexible<br />
 
 
Local ID input returned: <br />
 
Local ID input returned: <br />
 
Local Name input returned: <br />
 
Local Name input returned: <br />
Line 76: Line 76:
 
Interactive mode for partial matches: <br />
 
Interactive mode for partial matches: <br />
 
'''OpenRefine reconciliation API:''' yes, with step-by-step documentation: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme <br />
 
'''OpenRefine reconciliation API:''' yes, with step-by-step documentation: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme <br />
 +
'''Other:''' Offers a kind of query language that seems to be very flexible<br />
 
'''TETTRIS Notes''' includes<br />  
 
'''TETTRIS Notes''' includes<br />  
 
a 2021 Algabase set (but matching doesn’t work - [8/23]) <br />
 
a 2021 Algabase set (but matching doesn’t work - [8/23]) <br />
 
an (supposedly) up to date Index Fungorum search – the output (HTML, JSON, CSV, TSV) contains a UUID in the „id“ field which is not the „Index Fungorum UUID“, probably a Global Names UUID. But the field „RecordID“ is the „Index Fungorum Registration Identifier“ which in a URL resolves to the name page: http://www.indexfungorum.org/Names/NamesRecord.asp?RecordID=229900 <br />
 
an (supposedly) up to date Index Fungorum search – the output (HTML, JSON, CSV, TSV) contains a UUID in the „id“ field which is not the „Index Fungorum UUID“, probably a Global Names UUID. But the field „RecordID“ is the „Index Fungorum Registration Identifier“ which in a URL resolves to the name page: http://www.indexfungorum.org/Names/NamesRecord.asp?RecordID=229900 <br />
 +
  
 
==[http://namematch.science.kew.org/ Kew (IPNI)] <br />==
 
==[http://namematch.science.kew.org/ Kew (IPNI)] <br />==
Line 86: Line 88:
 
'''Data updated:''' current <br />
 
'''Data updated:''' current <br />
 
'''Limitation:''' Not found - tested with 144.000 records<br />
 
'''Limitation:''' Not found - tested with 144.000 records<br />
'''Other:''' OpenRefine Interface? <br />
 
 
'''Local ID input returned:''' YES <br />
 
'''Local ID input returned:''' YES <br />
 
'''Local Name input returned:''' YES<br />
 
'''Local Name input returned:''' YES<br />
 
'''Aggregator name ID returned:''' YES: IPNI-LSID<br />
 
'''Aggregator name ID returned:''' YES: IPNI-LSID<br />
 
'''Interactive mode for partial matches:''' NO<br />
 
'''Interactive mode for partial matches:''' NO<br />
'''OpenRefine reconciliation API:''' Yes - documentation: [https://data1.kew.org/reconciliation/help https://data1.kew.org/reconciliation/help]
+
'''OpenRefine reconciliation API:''' Yes - documentation: [https://data1.kew.org/reconciliation/help https://data1.kew.org/reconciliation/help]<br />
 +
'''Other:'''<br />
  
 
==[https://tnrs.biendata.org/ TNRS Taxonomic Name Resolution Service] <br />==
 
==[https://tnrs.biendata.org/ TNRS Taxonomic Name Resolution Service] <br />==
Line 99: Line 101:
 
'''Data updated:''' 2023<br />
 
'''Data updated:''' 2023<br />
 
'''Limitation:''' Pasting 5000 names; API-processing unlimited (in batches of 5000)<br />
 
'''Limitation:''' Pasting 5000 names; API-processing unlimited (in batches of 5000)<br />
'''Other:''' R package available<br />
 
 
'''Local ID input returned:''' No <br />
 
'''Local ID input returned:''' No <br />
 
'''Local Name input returned:''' YES<br />
 
'''Local Name input returned:''' YES<br />
 
'''Aggregator name ID returned:''' NO<br />
 
'''Aggregator name ID returned:''' NO<br />
 
'''Interactive mode for partial matches:''' YES<br />
 
'''Interactive mode for partial matches:''' YES<br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No<br />
 +
'''Other:''' R package available<br />
  
 
==[https://legacy.tropicos.org/NameMatching.aspx Tropicos]==
 
==[https://legacy.tropicos.org/NameMatching.aspx Tropicos]==
Line 112: Line 114:
 
Data updated: <br />
 
Data updated: <br />
 
Limitation: <br />
 
Limitation: <br />
Other: <br />
 
 
'''Local ID input returned:''' Yes<br />
 
'''Local ID input returned:''' Yes<br />
 
'''Local Name input returned:''' Yes<br />
 
'''Local Name input returned:''' Yes<br />
 
'''Aggregator name ID returned:''' Yes: Tropicos-ID<br />
 
'''Aggregator name ID returned:''' Yes: Tropicos-ID<br />
 
'''Interactive mode for partial matches:''' No<br />
 
'''Interactive mode for partial matches:''' No<br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No<br />
 +
Other: <br />
  
 
==[https://www.gbif.org/tools/species-lookup GBIF Taxonomic Backbone]==
 
==[https://www.gbif.org/tools/species-lookup GBIF Taxonomic Backbone]==
Line 125: Line 127:
 
'''Data updated:''' Current<br />
 
'''Data updated:''' Current<br />
 
'''Limitation:''' 6000 records<br />
 
'''Limitation:''' 6000 records<br />
Other: <br />
 
 
'''Local ID input returned:''' Yes <br />
 
'''Local ID input returned:''' Yes <br />
 
'''Local Name input returned:''' Yes<br />
 
'''Local Name input returned:''' Yes<br />
 
'''Aggregator name ID returned:''' No<br />
 
'''Aggregator name ID returned:''' No<br />
 
'''Interactive mode for partial matches:''' No<br />
 
'''Interactive mode for partial matches:''' No<br />
'''OpenRefine reconciliation API:''' No
+
'''OpenRefine reconciliation API:''' No<br />
 +
Other: <br />
 +
 
 +
==[https://www.anbg.gov.au/apni/ Australian Plant Name Index]==
 +
'''Scope:''' Australian plants<br />
 +
Software updated: <br />
 +
Codebase/Documentation:<br />
 +
Data updated: <br />
 +
Limitation:<br />
 +
Local ID input returned: <br />
 +
Local Name input returned: <br />
 +
Aggregator name ID returned: <br />
 +
Interactive mode for partial matches: <br />
 +
OpenRefine reconciliation API:<br />
 +
'''Other:''' [https://www.biorxiv.org/content/10.1101/2024.02.02.578715v1 APCalign: an R package workflow and app for aligning and updating flora names to the Australian Plant Census]<br />

Revision as of 22:19, 11 February 2024

(Testing and documentation in progress)

Terminology

In the context of TETTRIs, we distinguish three types of "target aggregators", i.e. online databases of organism names that offer services for the use cases defined in the TETTRIs project: primary target aggregators, secondary target aggregators, and repositories ("lookup target aggregators"). See [1] for details.

Checklist Bank (GBIF & Catalogue of Life)

Type: Lookup
Scope: All taxa or specific groups, also geographically restricted, depending on the target dataset chosen
Software updated: May 23 (frontend), June 23 (backend) (checked August 15, 2023)
Codebase/Documentation: https://api.checklistbank.org/
Data updated: depending on target dataset
Limitation: Direct list input is limited to 6000 names; file upload is not limited.
Local ID input returned: NO (checked August 16, 2023)
Local Name input returned: YES
Aggregator name ID returned: YES - in download only
Interactive mode for partial matches: NO
OpenRefine reconciliation API: No
Other: Login with GBIF account is recommended (self-registration at https://www.gbif.org/user/profile)

Catalogue of Life

Type: Secondary
Scope: potentially all taxa, but incomplete for some zoological groups (genera only) and for algae
Software: Editions are integrated into Checklist Bank (see above) and Global Names (see below).

World Flora Online WFO Plant List

Type: Primary
Scope: Plants
Software updated: ongoing Sept. 2023 (not stated on website)
Codebase/Documentation: GraphQL API, Name Matching REST API, Reconciliation API
Data updated: July 2023 (semiannual edition)
Limitation: None found (tested with 144,000 records)
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes - WFO-ID
Interactive mode for partial matches: Yes
OpenRefine reconciliation API: Yes
Other: Service can be installed as local copy
Other: R package WorldFlora - see https://cran.r-project.org/web/packages/WorldFlora/index.html

PESI

Type: Secondary
Scope: European taxa
Software updated: 2011?
Codebase/Documentation: By reference to the components used (Taxamatch algorithm and scientific name parser)
Data updated: 2014
Limitation: 5,000 names
Local ID input returned: No
Local Name input returned: Yes
Aggregator name ID returned: Yes (as provided by the primary aggregator)
Interactive mode for partial matches: Yes
OpenRefine reconciliation API: No
Other:

LifeWatch

Scope:
Software updated:
Codebase/Documentation:
Data updated:
Limitation:
Other:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
OpenRefine reconciliation API: No

Global Names

Type: Lookup
Scope: defined by stored datasets - option to restrict matching to individual source dataset
Software updated: active Nov. 2023
Codebase/Documentation: https://github.com/gnames
Data updated: Differs for stored datasets
Limitation: 5000 names, at least in interactive mode
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
OpenRefine reconciliation API: yes, with step-by-step documentation: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme
Other: Offers a kind of query language that seems to be very flexible
TETTRIs notes: The service includes
a 2021 AlgaeBase set (but matching does not work, as of 8/23)
a (supposedly) up-to-date Index Fungorum search. The output (HTML, JSON, CSV, TSV) contains a UUID in the "id" field, which is not the "Index Fungorum UUID" but probably a Global Names UUID. The "RecordID" field, however, holds the "Index Fungorum Registration Identifier", which resolves to the name page when used in a URL: http://www.indexfungorum.org/Names/NamesRecord.asp?RecordID=229900
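
The relationship between the RecordID field and the name page noted above can be sketched in Python. The helper name is hypothetical, and the URL pattern is taken solely from the example link in the note; that it holds for every record, and that RecordIDs are always numeric, are assumptions.

```python
from urllib.parse import urlencode

# Base URL copied from the example link in the note above.
INDEX_FUNGORUM_NAME_PAGE = "http://www.indexfungorum.org/Names/NamesRecord.asp"


def index_fungorum_url(record_id):
    """Build the name-page URL for an Index Fungorum RecordID (hypothetical helper)."""
    record = str(record_id).strip()
    # Assumption: RecordIDs are purely numeric, as in the example 229900.
    if not record.isdigit():
        raise ValueError(f"RecordID should be numeric, got {record_id!r}")
    return f"{INDEX_FUNGORUM_NAME_PAGE}?{urlencode({'RecordID': record})}"


print(index_fungorum_url(229900))
```

Applied to the RecordID column of a CSV or TSV download, such a helper would give each matched name a link that can be checked by hand.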


Kew (IPNI)

Scope: Vascular plants (POWO - IPNI offered but not working)
Software updated: ?
Codebase/Documentation: ?
Data updated: current
Limitation: None found (tested with 144,000 records)
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES: IPNI-LSID
Interactive mode for partial matches: NO
OpenRefine reconciliation API: Yes - documentation: https://data1.kew.org/reconciliation/help
Other:
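
For services that expose an OpenRefine reconciliation API, a batch of names is normally sent as a `queries` form parameter holding a JSON object of keyed queries. The following Python sketch only builds that request body and makes no network call. The function names and the default `limit` are illustrative assumptions, and the endpoint URL is deliberately left out: it should be taken from the documentation linked above.

```python
import json
from urllib.parse import urlencode


def build_queries(names, limit=3):
    """Map each name to a keyed reconciliation query (q0, q1, ...)."""
    return {
        f"q{i}": {"query": name, "limit": limit}
        for i, name in enumerate(names)
    }


def encode_body(queries):
    """Form-encode the queries as the `queries` parameter of the request."""
    return urlencode({"queries": json.dumps(queries)})


# Example names chosen for illustration only.
queries = build_queries(["Quercus robur", "Poa annua"])
body = encode_body(queries)
print(body)
```

The keys (q0, q1, ...) come back in the response, so they can be used to join the results to the local list of names.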

TNRS Taxonomic Name Resolution Service

Scope: Plants (WFO) and vascular plants (WCVP); more datasets could potentially be included
Software updated: v. 5.0 Feb. 24, 2021
Codebase/Documentation: https://github.com/ojalaquellueva/TNRSapi
Data updated: 2023
Limitation: Pasting 5000 names; API-processing unlimited (in batches of 5000)
Local ID input returned: No
Local Name input returned: YES
Aggregator name ID returned: NO
Interactive mode for partial matches: YES
OpenRefine reconciliation API: No
Other: R package available
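
The batch limit of 5000 names means a longer list has to be split before it is submitted. A minimal Python sketch of that splitting step follows; the function name and the placeholder names are illustrative, and the actual submission to the API is not shown.

```python
def batches(names, size=5000):
    """Yield consecutive slices of `names`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield names[start:start + size]


# Illustrative input: 12,000 placeholder strings, not real taxon names.
names = [f"Genus{i} species{i}" for i in range(12000)]
sizes = [len(chunk) for chunk in batches(names)]
print(sizes)  # [5000, 5000, 2000]
```

The same splitting applies to any service in this list with a per-request cap, with `size` set to that cap.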

Tropicos

Scope: Plants
Software updated:
Codebase/Documentation:
Data updated:
Limitation:
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes: Tropicos-ID
Interactive mode for partial matches: No
OpenRefine reconciliation API: No
Other:

GBIF Taxonomic Backbone

Scope: All taxa (GBIF taxonomic backbone)
Software updated:
Codebase/Documentation: see Species API
Data updated: Current
Limitation: 6000 records
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: No
Interactive mode for partial matches: No
OpenRefine reconciliation API: No
Other:

Australian Plant Name Index

Scope: Australian plants
Software updated:
Codebase/Documentation:
Data updated:
Limitation:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
OpenRefine reconciliation API:
Other: APCalign: an R package workflow and app for aligning and updating flora names to the Australian Plant Census