Name matching services

From TETTRIs

These are online services that let users compare their lists of scientific organism names against reference datasets. Because user needs vary widely, both the dataset and the name-matching service should be chosen carefully before settling on a workflow. See also What is name matching? for a general discussion of terminology and intent.
The following results from the TNLS (Taxonomic Name Linking Services) TETTRIs Satellite Project provide an important overview of name service functionality:
Overview of input parameters of aggregator services
Overview of output fields of aggregator services
We distinguish between services that act upon repositories of taxonomic datasets and single-dataset services.

Repository services

Checklist Bank (GBIF & Catalogue of Life)

Taxonomic scope: All taxa or specific groups depending on the target dataset chosen
Geographic scope: Global or specific areas, depending on the target dataset chosen
Software updated: current (last checked June 26, 2025)
Codebase/Documentation: https://www.checklistbank.org/about/API
Data updated: depending on target dataset
Limitation: Direct list input is limited to 6000 names; file upload with asynchronous response is not limited
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES - in download only
Interactive mode for partial matches: NO
OpenRefine reconciliation API: NO (but OpenRefine can use the REST services)
Other: Login with GBIF account is recommended, required for file upload (self-registration at https://www.gbif.org/user/profile)
Other: Datasets in the repository can also be matched against one another.

Global Names Verifier

Taxonomic scope: defined by stored datasets - option to restrict matching to individual source dataset
Geographic scope: global (cross datasets or with global datasets) or restricted by choice of dataset
Software updated: active Feb. 2025
Codebase/Documentation: https://github.com/gnames
Codebase/Documentation: https://resolver.globalnames.org/api
Data updated: Differs for stored datasets
Limitation: 5000 names, at least in interactive mode
Local ID input returned: NO
Local Name input returned: YES
Aggregator name ID returned: YES (may be a taxon ID)
Interactive mode for partial matches: NO
OpenRefine reconciliation API: YES, with step-by-step documentation: https://github.com/gnames/gnverifier/wiki/OpenRefine-readme
Other: Offers a kind of query language that seems to be very flexible
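
Where a service does not return the local ID (as noted for this and several other services), the IDs have to be re-attached on the user's side. The following is a minimal sketch of one way to do that, assuming the service echoes back the submitted name string (which the "Local Name input returned: YES" entry suggests). The function names and the (input name, matched name) result shape are hypothetical simplifications, not part of any service's API.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Collapse runs of whitespace so equal names compare equal."""
    return " ".join(name.split())


def build_lookup(records):
    """Map each normalized name string to the local IDs that use it."""
    lookup = defaultdict(list)
    for local_id, name in records:
        lookup[normalize(name)].append(local_id)
    return dict(lookup)


def reattach_ids(lookup, results):
    """Join service results back to local IDs via the echoed input name.

    `results` is an iterable of (input_name, matched_name) pairs; this
    shape is a hypothetical simplification of a real service response.
    """
    joined = []
    for input_name, matched_name in results:
        for local_id in lookup.get(normalize(input_name), []):
            joined.append((local_id, input_name, matched_name))
    return joined


# Local records: (local ID, name as recorded). Two records share a name.
records = [
    ("spec-001", "Bellis perennis"),
    ("spec-002", "Bellis  perennis"),
    ("spec-003", "Quercus robur"),
]
lookup = build_lookup(records)

# Submit each distinct name only once; keep the lookup for the join.
unique_names = sorted(lookup)
```

Submitting only the distinct names also helps to stay below the input limits listed for the individual services.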

TNRS (Taxonomic Name Resolution Service); also: http://tnrs.iplantcollaborative.org/

Taxonomic scope: Plants (WFO) and vascular plants (WCVP); potentially more datasets could be included
Geographic scope: Global
Software updated: v. 5.0 Feb. 24, 2021
Codebase/Documentation: https://github.com/ojalaquellueva/TNRSapi and https://github.com/iPlantCollaborativeOpenSource/TNRS/
Data updated: 2023 (2024)
Limitation: Pasting 5000 names; API-processing unlimited (in batches of 5000)
Local ID input returned: NO
Local Name input returned: YES
Aggregator name ID returned: NO
Interactive mode for partial matches: YES
OpenRefine reconciliation API: NO
Other: API and R package available
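
Processing a long list in batches of 5000, as mentioned under "Limitation", can be sketched as follows. This shows only the splitting step; how each batch is sent to the API is left out, and the function name and the placeholder names are illustrative assumptions.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 5000  # batch limit stated in the entry above


def batches(names: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `names`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


# Placeholder strings standing in for a real name list.
names = [f"Name {i}" for i in range(12001)]
sizes = [len(batch) for batch in batches(names)]
print(sizes)  # [5000, 5000, 2001]
```

The same helper works for the other limits listed on this page by passing a different `size`.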


Single-dataset name matching services

Australian Plant Name Index

Taxonomic scope: Plants
Geographic scope: Australia
Software updated: ?
Codebase/Documentation: ?
Data updated: ?
Limitation: Not stated; a test with nearly 21,000 names ended in a server error (23 May 2024)
Local ID input returned: NO
Local Name input returned:
Aggregator name ID returned:
Interactive mode for partial matches: NO
OpenRefine reconciliation API: NO
Other: APCalign: an R package workflow and app for aligning and updating flora names to the Australian Plant Census

GBIF Taxonomic Backbone

Taxonomic scope: All taxa
Geographic scope: global
Software updated: current
Codebase/Documentation: see Species API
Data updated: August 28, 2023 (no further updates, but will stay online; will most probably be replaced by COL eXtended edition)
Limitation: 6000 records
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: NO
Interactive mode for partial matches: NO
OpenRefine reconciliation API: NO
Other: "Multi-taxonomy" mode in preparation - will allow matching against other taxonomies (e.g. COL eXtended edition). The matching service, including data, will be downloadable as a Docker image.
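
For programmatic use, single names can also be matched through the Species API. The sketch below only builds the request URL and classifies a response; it assumes the public v1 endpoint `https://api.gbif.org/v1/species/match` with its `name`, `kingdom` and `strict` parameters and the `matchType` response field. The helper names are hypothetical, and the current API documentation should be checked before relying on it.

```python
from urllib.parse import urlencode

# Assumption: public GBIF API base URL and the v1 species match endpoint.
GBIF_MATCH_URL = "https://api.gbif.org/v1/species/match"


def match_url(name: str, kingdom: str = "", strict: bool = False) -> str:
    """Build a species match URL for one scientific name."""
    params = {"name": name}
    if kingdom:
        params["kingdom"] = kingdom  # optional hint to disambiguate homonyms
    if strict:
        params["strict"] = "true"  # ask the service not to fall back to higher ranks
    return f"{GBIF_MATCH_URL}?{urlencode(params)}"


def needs_review(response: dict) -> bool:
    """Flag anything other than an exact match for manual checking."""
    return response.get("matchType") != "EXACT"


url = match_url("Puma concolor", kingdom="Animalia")
print(url)
```

The URL can be fetched with any HTTP client; the JSON response can then be passed to `needs_review` to separate exact matches from fuzzy, higher-rank or missing ones.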

International Plant Name Index (IPNI)

Taxonomic scope: Vascular plants (source: POWO; IPNI offered but not working)
Geographic scope: Global
Software updated: ?
Codebase/Documentation: ?
Data updated: current
Limitation: None found - tested with 144,000 records
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES: IPNI-LSID
Interactive mode for partial matches: NO
OpenRefine reconciliation API: YES - documentation: https://data1.kew.org/reconciliation/help
Other:

LifeWatch

Refers to Global Names for name matching.

PESI / eu-nomen

Taxonomic scope: All taxa
Geographic scope: Europe
Software updated: 2011?
Codebase/Documentation: By reference to components used (Taxamatch algorithm and scientific name parser)
Data updated: 2014
Limitation: 5,000 names
Local ID input returned: NO
Local Name input returned: YES
Aggregator name ID returned: YES (as provided by the primary aggregator)
Interactive mode for partial matches: YES
OpenRefine reconciliation API: NO
Other:

ROpenSci taxize

Taxonomic scope: All taxa or specific groups, depending on dataset used
Geographic scope: Global or regional, depending on dataset used
Software updated: Feb 2025
Codebase/Documentation: https://github.com/ropensci/taxize/
Data updated:
Limitation:
Local ID input returned:
Local Name input returned:
Aggregator name ID returned: YES
Interactive mode for partial matches: NO
OpenRefine reconciliation API: n/a
Other:

Tropicos

Taxonomic scope: Plants
Geographic scope: Global
Software updated:
Codebase/Documentation:
Data updated: Current
Limitation:
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES: Tropicos-ID
Interactive mode for partial matches: NO
OpenRefine reconciliation API: NO
Other:

World Flora Online (WFO) Plant List

Taxonomic scope: Plants
Geographic scope: Global
Software updated: ongoing June 2025 (not stated on website)
Codebase/Documentation: GraphQL API, Name Matching REST API, Reconciliation API
Data updated: December 2024 (semiannual edition)
Limitation: None found - tested with 144,000 records
Local ID input returned: YES
Local Name input returned: YES
Aggregator name ID returned: YES - WFO-ID
Interactive mode for partial matches: YES
OpenRefine reconciliation API: YES: https://list.worldfloraonline.org/reconcile_index.php
Other: Service can be installed as local copy
Other: R package WorldFlora - see https://cran.r-project.org/web/packages/WorldFlora/index.html
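
Services marked YES under "OpenRefine reconciliation API" can in principle also be queried by scripts. The sketch below encodes a batch of names in the way the generic Reconciliation Service API expects (a `queries` form parameter holding a JSON object of keyed queries). That this service accepts exactly this form is an assumption; the endpoint URL is deliberately not hard-coded and should be taken from the service's own documentation. The function name is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode


def reconciliation_body(names, limit=3):
    """Encode a batch of names as the form body of a reconciliation request.

    Each name becomes one keyed query (q0, q1, ...), so that the answers
    can be matched back to the submitted names by key.
    """
    queries = {
        f"q{i}": {"query": name, "limit": limit}
        for i, name in enumerate(names)
    }
    return urlencode({"queries": json.dumps(queries)})


body = reconciliation_body(["Bellis perennis", "Quercus robur"])

# Round trip: decode the body again to show what the service would receive.
decoded = json.loads(parse_qs(body)["queries"][0])
print(decoded["q1"]["query"])  # Quercus robur
```

The body would be sent as `application/x-www-form-urlencoded` in a POST request; the keys `q0`, `q1`, ... reappear in the response and identify which candidates belong to which name.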

WoRMS (World Register of Marine Species)

Taxonomic scope: Marine species
Geographic scope: Global
Software updated: not stated
Codebase/Documentation:
Data updated: current
Limitation: 1500 names
Local ID input returned: NO
Local Name input returned: YES
Aggregator name ID returned: YES (AphiaID)
Interactive mode for partial matches: NO
OpenRefine reconciliation API: NO

R-packages that include name matching

Grenié & al. (2022) cover this subject in detail, identifying a number of packages that provide direct or indirect access to online taxonomic datasets. They also point to an application (“taxharmonizeexplorer”) intended to help R users select tools and datasets. The app's website lists and graphically depicts the relationships between taxonomic datasets and R packages (as of the end of July 2025, it covers 68 packages). If this tool continues to be updated, it should be the primary source for R programmers looking for name-matching packages, which can then be applied in the workflow detailed in Grenié & al.’s paper.