Downloads from aggregators

From TETTRIs

ChecklistBank (CLB)

CLB was developed by GBIF and the Catalogue of Life. It holds a huge number of individual datasets, ranging from large checklists (such as Catalogue of Life editions or World Flora Online) to data extracted from individual publications (taxonomic treatments). The latter, provided by PLAZI, form the bulk of the submissions (as of February 2025, nearly 55,000 of a total of about 57,500 datasets). The functioning of the portal is documented in a tutorial for users, and the code is managed on GitHub. All data can be downloaded, either in the original format or in various other formats. Downloading requires a login with a free GBIF user account. Parts of a checklist can be downloaded by selecting a root taxon (e.g. a genus within the checklist) or a specific taxonomic rank.

Catalogue of Life (CoL) Downloads

The entire CoL in its latest version can be downloaded from https://www.catalogueoflife.org/data/download in ColDP Archive, Darwin Core Archive, ACEF Archive, or TextTree format. CoL-ChecklistBank also offers partial downloads (in various formats, with DOI); this requires a free GBIF user account.

World Flora Online (WFO) Plant List Downloads

Downloads of WFO data are available for the published versions (currently updated every six months and published via Zenodo, with DOI). The current version can be found at https://zenodo.org/records/8079052. If a later version is available, this is indicated at the top of the page.
The following formats are available (example for the December 2022 version):

  • wfo_plantlist_2022-12.zip The Catalogue of Life Data Package of the WFO Plant List. This is the most expressive standards-based form of the list.
  • plant_list_2022-12.json.zip JSON-formatted version of the WFO Plant List. This has been designed for direct import into a schemaless instance of a SOLR index and is used to drive the WFO Plant List API (https://list.worldfloraonline.org), which in turn drives the WFO Plant List in the portal. This is recommended if you want a local, read-only version of the list rather than using the API.
  • plant_list_2022-12.sql.gz This is the complete production database (minus logging data and API keys) as a MySQL backup file. It can be restored directly to a MySQL 5.7 or later instance if you require the list in SQL format.
  • ipni_to_wfo.csv.gz A file mapping all the IPNI IDs we track to their associated WFO IDs.
  • families_dwc.tar.gz Individual Darwin Core Archive files for each of 718 recognized families. If you want a single family in DwC but can't load the whole list, download and expand this file. Family and genus files are also available for download through the portal.
  • DwC_backbone_R.zip A single Darwin Core Archive file containing non-deprecated names and taxa for use in the existing R package.
  • _uber.zip A single Darwin Core Archive file containing all names and taxa, even those that are deprecated, along with some extra columns.

Weekly updated DwC-Archive files for all families and for the _uber.zip are available at https://list.worldfloraonline.org/rhakhis/api/downloads/dwc/
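
Several of the downloads on this page are Darwin Core Archives (DwC-A), i.e. zip files in which a meta.xml descriptor names the core data file and its columns. The following is a minimal Python sketch for peeking at the first rows of such an archive. It assumes that meta.xml sits at the top level of the zip and follows the Darwin Core text guidelines; the function name read_core_rows is hypothetical, and individual archives may deviate from this layout.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by meta.xml (Darwin Core text guidelines)
NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core_rows(archive, limit=5):
    """Return up to `limit` rows (as dicts) from the core table of a DwC-A.

    `archive` is a path or a file-like object for the zipped archive.
    Assumption: meta.xml is at the top level of the zip.
    """
    with zipfile.ZipFile(archive) as zf:
        core = ET.fromstring(zf.read("meta.xml")).find(NS + "core")
        location = core.find(NS + "files/" + NS + "location").text.strip()
        # meta.xml stores the delimiter escaped, e.g. the two characters \t
        delimiter = core.get("fieldsTerminatedBy", ",").replace("\\t", "\t")
        quote = core.get("fieldsEnclosedBy", '"')
        skip = int(core.get("ignoreHeaderLines", "0"))

        # Column index -> short term name (last segment of the term URI)
        columns = {}
        id_field = core.find(NS + "id")
        if id_field is not None and id_field.get("index") is not None:
            columns[int(id_field.get("index"))] = "id"
        for field in core.findall(NS + "field"):
            if field.get("index") is not None:
                term = field.get("term").rsplit("/", 1)[-1]
                columns[int(field.get("index"))] = term

        with zf.open(location) as raw:
            text = io.TextIOWrapper(
                raw, encoding=core.get("encoding", "utf-8"), newline=""
            )
            if quote:
                reader = csv.reader(text, delimiter=delimiter, quotechar=quote)
            else:
                # An empty fieldsEnclosedBy means the values are not quoted
                reader = csv.reader(
                    text, delimiter=delimiter, quoting=csv.QUOTE_NONE
                )
            rows = []
            for number, record in enumerate(reader):
                if number < skip:
                    continue  # header line(s) declared in meta.xml
                if len(rows) >= limit:
                    break
                rows.append(
                    {name: record[i] for i, name in columns.items()
                     if i < len(record)}
                )
            return rows
```

Calling read_core_rows("some_family.zip") on a downloaded family archive (the file name here is only a placeholder) would return a short list of dictionaries keyed by Darwin Core term names, which is usually enough to check which columns a given download provides before loading it in full.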

World Checklist of Vascular Plants (WCVP)

According to Rafael Govaerts (pers. comm., July 17, 2024), the primary source for the WCVP dataset is PoWo, under "DATA": WCVP. Both text (CSV) and Darwin Core downloads are available, with citation and other metadata provided in a readme.txt and the eml.xml file, respectively. The latest version of the WCVP data can also be checked and downloaded via ChecklistBank (the metadata there are less precise).

Integrated Taxonomic Information System - ITIS

Up to 32,727 records of a specific taxonomic group can be downloaded in Taxonomic Workbench format or as DwC-A - see https://www.itis.gov/access.html.

(The) Paleobiology Database

All records can be downloaded in several formats using the Download Generator.

NCBI Taxonomy

The full taxonomy database can be downloaded, along with files associating nucleotide and protein sequence records with their taxonomy IDs, from https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/
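
As a rough illustration of working with the downloaded files, the sketch below parses lines in the ".dmp" layout used by the NCBI taxonomy dump. It assumes the commonly documented format of names.dmp (fields separated by tab, pipe, tab, with the taxonomy ID, the name, a unique-name variant, and the name class as the first four fields); this layout is not described on this page and should be checked against the readme shipped with the dump. The function names are hypothetical.

```python
def parse_dmp_line(line):
    """Split one line of an NCBI taxonomy .dmp file into its fields."""
    line = line.rstrip("\n")
    # Each line is terminated by a trailing tab + pipe
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def scientific_names(lines):
    """Map taxonomy ID -> scientific name from names.dmp lines.

    Assumed field order: tax_id, name, unique name, name class.
    """
    names = {}
    for line in lines:
        fields = parse_dmp_line(line)
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names
```

The lines can come from any text source, for example a names.dmp file extracted from the downloaded dump and opened with open(path, encoding="utf-8"). Keeping only the "scientific name" class gives one name per taxonomy ID; synonyms and common names use other name classes and are skipped here.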