End-user workflows for name matching

Revision as of 13:45, 23 May 2024

The workflows that end-users follow vary significantly based on the specific use case, ranging from checking a single name to uploading a regional or monographic checklist with thousands of names. There are three main types of name-checking processes:

  1. Direct Use of the Aggregator's Name Matching Mechanisms: Utilizing the tools provided directly by the aggregator.
  2. Using Third-Party Tools: Leveraging tools such as OpenRefine (https://openrefine.org/) that access the aggregator's name matching services.
  3. Using Local Tools: Downloading the aggregator's data and using local tools to perform the matching.

The choice of method depends mainly on the expected result, but also on the number of records to be matched and on the technical in-house expertise available to the user. A type 3 process usually requires some expertise in biodiversity data management; TETTRIs provides links to download sites for the aggregator's data. For type 2, TETTRIs will provide example use cases that have been successfully tested. For type 1 (direct use of the aggregator's services), the respective documentation will be listed in parallel with the list of general capabilities of aggregators.

The process itself can, in principle, be divided into four phases:

  • Preparing the data

In all cases, a list of names is needed in text-only format. It can be created from a spreadsheet column or from a table that contains the names. Each line must contain exactly one name.
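
As an illustration of this preparation step, the following minimal sketch extracts one column from a table exported as CSV and produces a text with one name per line. The column name `scientificName`, the sample rows, and the decision to normalise whitespace and drop empty or repeated entries are assumptions made for this example, not requirements stated by any aggregator.

```python
import csv
import io


def names_from_csv(csv_text, column):
    """Return the names found in one column, cleaned and in original order.

    Whitespace is normalised; empty cells and repeated names are skipped
    (an assumption of this sketch, not an aggregator requirement).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    names = []
    for row in reader:
        # Collapse runs of whitespace and strip leading/trailing blanks.
        name = " ".join((row.get(column) or "").split())
        if name and name not in seen:
            seen.add(name)
            names.append(name)
    return names


def to_name_list(names):
    """Format the names as plain text with exactly one name per line."""
    return "\n".join(names) + "\n"


# Hypothetical sample data standing in for a spreadsheet export.
sample = (
    "id,scientificName\n"
    "1,Quercus robur L.\n"
    "2,  Fagus   sylvatica L. \n"
    "3,\n"
    "4,Quercus robur L.\n"
)
names = names_from_csv(sample, "scientificName")
text = to_name_list(names)
print(text, end="")
```

In practice the CSV text would be read from a file exported from the spreadsheet, and the resulting text saved as a plain-text file for submission.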

  • Submitting the data

How the data are submitted depends on the type of checking process.

  • Getting and interpreting the results

Essentially, for process types 1 and 2, the results are provided as a list of exact matches and possible candidates, i.e. names that match the input to a certain extent. Interpretation means assessing the candidates and, if appropriate, selecting one of them as the correct match.
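
To make the distinction between exact matches and candidates concrete, here is a small sketch that compares input names against a local reference list using Python's standard `difflib` module. It only illustrates the shape of such a result; the function name `match_name`, the similarity cutoff, and the reference names are assumptions, and the aggregators' own matching services may work quite differently.

```python
import difflib


def match_name(name, reference_names, max_candidates=3, cutoff=0.8):
    """Classify one input name as an exact match or return close candidates.

    The cutoff of 0.8 is an arbitrary choice for this sketch.
    """
    if name in reference_names:
        return {"input": name, "exact": name, "candidates": []}
    # get_close_matches returns the most similar names first.
    candidates = difflib.get_close_matches(
        name, reference_names, n=max_candidates, cutoff=cutoff
    )
    return {"input": name, "exact": None, "candidates": candidates}


# Hypothetical reference list standing in for downloaded aggregator data.
reference = ["Quercus robur", "Quercus rubra", "Fagus sylvatica"]

exact = match_name("Fagus sylvatica", reference)
fuzzy = match_name("Fagus silvatica", reference)  # misspelled input
unmatched = match_name("Abcdefg xyz", reference)

for result in (exact, fuzzy, unmatched):
    print(result)
```

A name with an empty candidate list and no exact match needs manual checking; a name with candidates still requires a decision by the user about which candidate, if any, is correct.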

  • Incorporating the results locally

On the one hand, this refers to local corrections of names made as a result of candidate matching. On the other hand, once matches have been made unambiguously, the aggregator's name ID can be incorporated into the local dataset to allow linkage to the aggregator and (if such functionality is made available) interaction with the aggregator.
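
Both aspects of this last phase can be sketched as follows. The record layout, the field names `original_name` and `aggregator_name_id`, and the identifier values are hypothetical and chosen only for illustration; real identifiers come from the aggregator that was used.

```python
def incorporate_matches(records, accepted):
    """Return copies of the local records updated with accepted matches.

    records:  list of dicts, each with a 'name' key
    accepted: dict mapping an input name to (matched_name, name_id)
    """
    updated = []
    for record in records:
        new_record = dict(record)  # leave the original record untouched
        match = accepted.get(record["name"])
        if match is not None:
            matched_name, name_id = match
            if matched_name != record["name"]:
                # Keep the old spelling so the correction stays traceable.
                new_record["original_name"] = record["name"]
                new_record["name"] = matched_name
            new_record["aggregator_name_id"] = name_id
        updated.append(new_record)
    return updated


# Hypothetical local records and hypothetical aggregator identifiers.
local_records = [
    {"id": 1, "name": "Fagus silvatica"},
    {"id": 2, "name": "Quercus robur"},
    {"id": 3, "name": "Abcdefg xyz"},
]
accepted_matches = {
    "Fagus silvatica": ("Fagus sylvatica", "agg:0001"),
    "Quercus robur": ("Quercus robur", "agg:0002"),
}
result = incorporate_matches(local_records, accepted_matches)
for row in result:
    print(row)
```

Records without an accepted match are passed through unchanged, so they remain visible for later review.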