Difference between revisions of "Existing name checking mechanisms"

Revision as of 07:23, 26 September 2024

The following lists, in alphabetical order, large-scale aggregators of taxonomic information that either provide name matching services themselves or are indirectly accessible for name checking by means of a “lookup” aggregator.

(Testing and documentation in progress)

Terminology

In the context of TETTRIs, we distinguish three types of "target aggregators", i.e. online databases of organism names that offer services for the use cases defined in the TETTRIs project: primary target aggregators, secondary target aggregators, and repositories ("lookup target aggregators"). See https://wiki.bgbm.org/tettris/Global_or_regional_aggregators for details.

See What is name matching? for general discussion on terminology and intent.

AlgaeBase

Type: primary
Scope: Global algae
Currently no name matching, but included in OBIS and Global Names. Name matching could probably be implemented using the AlgaeBase API (https://www.algaebase.org/api/).

Australian Plant Name Index

Type: primary
Scope: Australian plants
Software updated: ?
Codebase/Documentation: ?
Data updated: ?
Limitation: Not stated; a test with nearly 21,000 names ended in a server error (23 May 2024)
Local ID input returned: No
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches: No
OpenRefine reconciliation API: No
Other: APCalign: an R package workflow and app for aligning and updating flora names to the Australian Plant Census (https://www.biorxiv.org/content/10.1101/2024.02.02.578715v1)

Catalogue of Life

Type: secondary
Scope: potentially all taxa but incomplete for some zoological groups (genera only) and for Algae
Software: Editions are integrated into Checklist Bank and Global Names (see the corresponding sections below).

Checklist Bank (GBIF & Catalogue of Life)

Type: lookup
Scope: All taxa or specific groups, also geographically restricted, depending on the target dataset chosen
Software updated: May 16, 2024 (frontend), May 21, 2024 (backend) (checked May 23, 2024)
Codebase/Documentation: https://api.checklistbank.org/
Data updated: depending on target dataset
Limitation: Direct list input is limited to 6000 names; file upload with asynchronous response is not limited
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES - in download only
Interactive mode for partial matches: NO
OpenRefine reconciliation API: No (but use from OpenRefine is possible via the REST services)
Other: Login with GBIF account is recommended (self-registration at https://www.gbif.org/user/profile)

Euro+Med PlantBase

Type: primary
Scope: European plants
Currently no name matching, but included in PESI

GBIF Taxonomic Backbone

Type: secondary
Scope: All taxa (GBIF taxonomic backbone)
Software updated:
Codebase/Documentation: see the Species API (https://www.gbif.org/developer/species)
Data updated: Current
Limitation: 6000 records
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: No
Interactive mode for partial matches: No
OpenRefine reconciliation API: No
Other:
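
For scripted use, the Species API referenced above offers a name matching endpoint. The following is a minimal sketch that only builds the request URL and does not call the service. The base URL (public GBIF API v1) and the `name` and `verbose` query parameters are assumptions that are not stated on this page and should be checked against the Species API documentation; the function name `gbif_match_url` is purely illustrative.

```python
from urllib.parse import urlencode

# Assumption: name matching endpoint of the public GBIF API (v1).
GBIF_MATCH_ENDPOINT = "https://api.gbif.org/v1/species/match"


def gbif_match_url(name: str, verbose: bool = False) -> str:
    """Build a request URL for matching one scientific name.

    `verbose` is assumed to ask the service for alternative candidates.
    """
    params = {"name": name}
    if verbose:
        params["verbose"] = "true"
    # urlencode escapes spaces and special characters in the name.
    return f"{GBIF_MATCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(gbif_match_url("Puma concolor"))
```

The resulting URL can be opened in a browser or passed to any HTTP client; the response format is documented with the Species API.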

Global Names Verifier

Type: lookup
Scope: defined by stored datasets - option to restrict matching to individual source dataset
Software updated: active Nov. 2023
Codebase/Documentation: https://github.com/gnames
Data updated: Differs for stored datasets
Limitation: 5000 names, at least in interactive mode
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
OpenRefine reconciliation API: yes, with step-by-step documentation: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme
Other: Offers a kind of query language that seems to be very flexible
TETTRIs notes: the service includes
a 2021 AlgaeBase set (but matching does not work - [8/23])
a (supposedly) up-to-date Index Fungorum search. The output (HTML, JSON, CSV, TSV) contains a UUID in the "id" field, which is not the Index Fungorum UUID but probably a Global Names UUID. The "RecordID" field, however, is the Index Fungorum Registration Identifier, which resolves to the name page when used in a URL: http://www.indexfungorum.org/Names/NamesRecord.asp?RecordID=229900
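
The identifier handling described in the notes above can be sketched as follows. This is an illustration only: the flat record layout with the keys "id" and "RecordID" is an assumption based on the field names mentioned above (the actual nesting of the Global Names output may differ), and `index_fungorum_url` is a hypothetical helper name. The URL pattern is the one quoted in the notes.

```python
from urllib.parse import urlencode

# URL pattern of the Index Fungorum name page, as quoted in the notes above.
INDEX_FUNGORUM_NAME_PAGE = "http://www.indexfungorum.org/Names/NamesRecord.asp"


def index_fungorum_url(record: dict) -> str:
    """Return the Index Fungorum name page URL for one result record.

    Assumption: `record` is a flat mapping holding the "RecordID" field.
    The "id" field is deliberately ignored, since it appears to be a
    Global Names UUID rather than an Index Fungorum identifier.
    """
    query = urlencode({"RecordID": record["RecordID"]})
    return f"{INDEX_FUNGORUM_NAME_PAGE}?{query}"


if __name__ == "__main__":
    # Hypothetical record; only the RecordID value is taken from the notes.
    example = {"id": "not-an-index-fungorum-uuid", "RecordID": 229900}
    print(index_fungorum_url(example))
```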

Index Fungorum

Type: primary
Scope: Global fungi
Currently no name matching, but included in Catalogue of Life and Global Names

International Plant Name Index (IPNI)

Type: primary
Scope: Vascular plants (matching against POWO; IPNI is offered as a target but not working)
Software updated: ?
Codebase/Documentation: ?
Data updated: current
Limitation: Not found - tested with 144,000 records
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES: IPNI-LSID
Interactive mode for partial matches: NO
OpenRefine reconciliation API: Yes - documentation: https://data1.kew.org/reconciliation/help
Other:

LifeWatch

Type: lookup
Scope:
Software updated:
Codebase/Documentation:
Data updated:
Limitation:
Other:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches:
OpenRefine reconciliation API: No

PESI / eu-nomen

Type: secondary
Scope: European taxa
Software updated: 2011?
Codebase/Documentation: By reference to the components used (Taxamatch algorithm and scientific name parser)
Data updated: 2014
Limitation: 5,000 names
Local ID input returned: No
Local Name input returned: Yes
Aggregator name ID returned: Yes (as provided by the primary aggregator)
Interactive mode for partial matches: Yes
OpenRefine reconciliation API: No
Other:
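
Where a service returns the submitted local name but not the local ID (as recorded above for PESI), the IDs can be re-attached after matching by joining on the submitted name. The following is a minimal sketch under stated assumptions: the result rows are plain dictionaries, and the key "submitted_name" and the function `reattach_local_ids` are hypothetical, not part of any service's actual output format.

```python
from collections import defaultdict


def reattach_local_ids(submitted, results, name_key="submitted_name"):
    """Join local IDs back onto match results via the submitted name.

    submitted: iterable of (local_id, name) pairs sent to the service.
    results:   iterable of dicts returned by the service; each is assumed
               to echo the submitted name under `name_key`.
    Returns one merged dict per (local ID, result) combination, so that
    duplicate names in the input list keep all of their local IDs.
    """
    ids_by_name = defaultdict(list)
    for local_id, name in submitted:
        ids_by_name[name.strip()].append(local_id)

    merged = []
    for row in results:
        for local_id in ids_by_name.get(row[name_key].strip(), []):
            merged.append({"local_id": local_id, **row})
    return merged


if __name__ == "__main__":
    submitted = [("A1", "Bellis perennis"), ("B7", "Quercus robur")]
    results = [
        {"submitted_name": "Bellis perennis", "aggregator_id": "X1"},
        {"submitted_name": "Quercus robur", "aggregator_id": "X2"},
    ]
    for row in reattach_local_ids(submitted, results):
        print(row)
```

The join relies on the service echoing the name exactly as submitted (apart from surrounding whitespace); if a service normalises the input name, the join key would need the same normalisation.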

TNRS Taxonomic Name Resolution Service

Type: lookup
Scope: Plants, WFO and vascular plants WCVP - potentially more datasets could be included
Software updated: v. 5.0 Feb. 24, 2021
Codebase/Documentation: https://github.com/ojalaquellueva/TNRSapi
Data updated: 2023
Limitation: Pasting 5000 names; API-processing unlimited (in batches of 5000)
Local ID input returned: No
Local Name input returned: YES
Aggregator name ID returned: NO
Interactive mode for partial matches: YES
OpenRefine reconciliation API: No
Other: R package available
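
The batch limit noted above (5000 names per request) means that longer lists have to be split before they are sent to the API. The following is a minimal sketch of that splitting step only; it does not call the TNRS API, and the function name `batches` is illustrative. The default batch size is taken from the limitation stated above.

```python
from typing import Iterator, List, Sequence


def batches(names: Sequence[str], size: int = 5000) -> Iterator[List[str]]:
    """Yield consecutive slices of `names` with at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


if __name__ == "__main__":
    # Hypothetical input list, longer than one batch.
    names = [f"Name {i}" for i in range(12001)]
    for number, batch in enumerate(batches(names), start=1):
        # Each batch would be submitted as one request.
        print(number, len(batch))
```

The same helper applies to the other services listed here by passing their documented limit as `size`.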

Tropicos

Type: primary
Scope: Plants
Software updated:
Codebase/Documentation:
Data updated:
Limitation:
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes: Tropicos-ID
Interactive mode for partial matches: No
OpenRefine reconciliation API: No
Other:

World Flora Online WFO Plant List

Type: primary
Scope: Plants
Software updated: ongoing Sept. 2023 (not stated on website)
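Codebase/Documentation: GraphQL API (https://list.worldfloraonline.org/gql_index.php), Name Matching REST API (https://list.worldfloraonline.org/matching_rest.php), Reconciliation API (https://list.worldfloraonline.org/reconcile_index.php)
Data updated: July 2023 (semiannual edition)
Limitation: Not found - tested with 144,000 records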
Local ID input returned: Yes
Local Name input returned: Yes
Aggregator name ID returned: Yes - WFO-ID
Interactive mode for partial matches: Yes
OpenRefine reconciliation API: yes: https://list.worldfloraonline.org/reconcile_index.php
Other: Service can be installed as local copy
Other: R package WorldFlora - see https://cran.r-project.org/web/packages/WorldFlora/index.html

WoRMS (World Register of Marine Species)

Type: primary
Scope: Marine species
Software updated: not stated
Codebase/Documentation:
Data updated: current
Limitation: 1500 names
Local ID input returned: NO
Local Name input returned: YES
Aggregator name ID returned: YES
Interactive mode for partial matches: NO
OpenRefine reconciliation API: No
Other: Did not work on 2 April 2024