Wish list for name matching services
From TETTRIs
Revision as of 15:50, 22 May 2024 by WalterBerendsohn (talk | contribs)
Terminology
- candidates are returned partial matches, as opposed to exact matches
- canonical name
- asynchronous output
General
Input
- Allow input of a pasted column of names
- Allow upload of a table with names (with a dialogue to select the name-bearing column[s])
- Allow upload of any other column
Interactive mode
Output
- Avoid returning absurd candidates (names that are completely improbable)
- In asynchronous mode, return matched records and candidates in a single table
- If the name is entered as a string, parse the name and match the name components independently.
- Allow wildcards
- Allow input of name components in separate fields (at least in uploads):
- For ICNAFP: Monomial / genus component; infrageneric rank; infrageneric epithet; species epithet; infraspecific rank; infraspecific epithet; basionym author (team); combination author (team); year of publication.
- For ICZN: work in progress.
- Exactly match the rank of the name, if unambiguous in the input – i.e. do not return a subspecies for a variety, a genus name for a family, etc.
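The component fields and the rank rule above could be sketched as follows. This is only an illustration under stated assumptions: the class `NameComponents`, its field names and the helper `filter_by_rank` are hypothetical and not part of any existing service, and rank markers are compared literally (no mapping of "ssp." to "subsp.").

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NameComponents:
    """Hypothetical container for the ICNAFP components listed above."""
    genus: str                                  # monomial / genus component
    infrageneric_rank: Optional[str] = None
    infrageneric_epithet: Optional[str] = None
    species_epithet: Optional[str] = None
    infraspecific_rank: Optional[str] = None
    infraspecific_epithet: Optional[str] = None
    basionym_author: Optional[str] = None       # author or author team
    combination_author: Optional[str] = None    # author or author team
    year: Optional[int] = None                  # year of publication

    def rank(self) -> Optional[str]:
        """Return the rank implied by the filled-in fields, or None if ambiguous."""
        if self.infraspecific_epithet:
            return self.infraspecific_rank      # None if no rank marker was given
        if self.species_epithet:
            return "species"
        if self.infrageneric_epithet:
            return self.infrageneric_rank
        return None                             # a bare monomial may be a genus, family, etc.


def filter_by_rank(query: NameComponents,
                   candidates: List[NameComponents]) -> List[NameComponents]:
    """Keep only candidates of the query's rank; keep all if the rank is ambiguous."""
    wanted = query.rank()
    if wanted is None:
        return list(candidates)
    return [c for c in candidates if c.rank() == wanted]
```

With such a filter, a query for a variety would not return a subspecies that shares the same epithets, while a query without an unambiguous rank leaves the candidate list untouched.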
Parameters for exact matches
- optional (ICNAFP): accept IPNI, Tropicos and full spaced author abbreviations as exact matches
- optional (ICNAFP): ignore ex authors (the author or team preceding the ex)
- optional (ICNAFP): ignore the hybrid symbol in the name ("×", or an "x" followed by a space, or an "x" with a space on either side)
- optional (ICNAFP): ignore authors in autonyms
- optional: ignore endings in epithets (ICNAFP) / species or subspecies names (ICZN)
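Two of the optional parameters above (ignoring the hybrid symbol and ignoring ex authors) could be applied as a normalization step before exact comparison. The following is a minimal sketch, not a definitive implementation: the function names `normalize_name` and `strip_ex_authors` are hypothetical, and the sketch assumes that the name and the authorship arrive as separate strings.

```python
import re


def normalize_name(name: str, ignore_hybrid: bool = True) -> str:
    """Normalize the name part (without authorship) before exact comparison."""
    if ignore_hybrid:
        # Drop the multiplication sign wherever it occurs in the name.
        name = name.replace("×", " ")
        # Drop a standalone "x" or "X" that is followed by whitespace,
        # at the start of the name or between two name parts.
        name = re.sub(r"(?<!\S)[xX]\s+", "", name)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(name.split())


def strip_ex_authors(authorship: str) -> str:
    """Remove the author or team preceding each 'ex', keeping parentheses."""
    return re.sub(r"[^()\s][^()]*?\s+ex\s+", "", authorship)
```

Under these assumptions, "Mentha × piperita" and "Mentha x piperita" both normalize to "Mentha piperita", and "(L. ex Willd.) DC. ex Spreng." is reduced to "(Willd.) Spreng.". A genus name that merely starts with "X" is left alone, because the letter is not followed by a space.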
For candidate matches
- exactly match the rank of the name, if unambiguous in the input
- weigh probabilities hierarchically:
- e.g., in a species name, a full or near-full match on the genus name is more important than a match on the species epithet (epithets may be used in many genera).
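One way to picture the hierarchical weighting described above is a weighted sum of per-component similarities. The sketch below is an assumption-laden illustration: the weights 0.7 and 0.3 are arbitrary placeholders, the function names are hypothetical, and the similarity measure is the generic `difflib.SequenceMatcher` ratio from the Python standard library rather than any measure prescribed by this wish list.

```python
from difflib import SequenceMatcher
from typing import Tuple

# Assumed weights: the genus counts for more than the species epithet.
GENUS_WEIGHT = 0.7
EPITHET_WEIGHT = 0.3


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_score(query: Tuple[str, str], candidate: Tuple[str, str]) -> float:
    """Score a (genus, epithet) candidate against a (genus, epithet) query."""
    q_genus, q_epithet = query
    c_genus, c_epithet = candidate
    return (GENUS_WEIGHT * similarity(q_genus, c_genus)
            + EPITHET_WEIGHT * similarity(q_epithet, c_epithet))
```

With these placeholder weights, a query for ("Quercus", "robur") ranks the candidate ("Quercus", "rubra") above ("Rubus", "robur"): an exact genus match with a differing epithet outweighs an exact epithet match in a different genus.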