BeginnersGuide

From BioCASe Provider Software

What is the BioCASe Provider Software?

The BioCASe Provider Software is an XML data binding middleware for publishing data from a relational database to an information network. After installing BioCASe and configuring it for a given database, the published information will be accessible as a BioCASe web service, which means it can be retrieved with BioCASe protocol requests. The Provider Software is agnostic of the data model used for data publication and can be used in conjunction with any conceptual schema.

The Provider Software is a suite of tools that needs to be installed on the data provider’s web server. The core component is the PyWrapper, an XML/CGI database interface written in Python that provides standardized access to a variety of database management systems and arbitrarily structured databases. Grouped around it are a number of tools for configuring, testing and debugging a BioCASe installation or web service.

Even though BioCASe can be used for any conceptual XML data schema, its main field of application is the publication of occurrence data from specimen or observational databases to primary biodiversity information networks such as the BioCASe network and the Global Biodiversity Information Facility.

Requirements for Using the BPS

Before using the BioCASe Provider Software, four requirements need to be checked:

  • The data to be published needs to be stored in an SQL-compliant relational database management system (DBMS). The data model should be well documented, or at least someone should know it well enough to explain it. Non-SQL DBMS are not supported. Currently, the BioCASe Provider Software can connect to the following databases: Microsoft Access, Microsoft SQL Server, MySQL, Postgres, Oracle, FoxPro, Sybase, 4D, DB2, Firebird.
  • To install BioCASe, you need a web server (Apache or Microsoft IIS) that is permanently connected to the Internet. This can be an institutional server or a hosting service (although obtaining the privileges to install software on a hosting service is often difficult). Dial-up connections won’t work.
  • You must be able to install Python (if it is not installed yet) and Python packages on the web server in question. The IT staff can do this for you, but make sure beforehand that no institutional policies prevent it. If you want to create DarwinCore Archives, a Java Runtime Environment is required as well.
  • It must be possible to establish a direct connection from the web server that will run the BPS to the database to be published, either because there is no firewall in between or because the port used by the DBMS is open (e.g. 3306 for MySQL, 1433 for SQL Server).
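
The last requirement can be checked from the web server before installing anything. The following is a minimal sketch, not part of the BioCASe Provider Software: it only tests whether a TCP connection to the DBMS port can be opened. The function names `can_reach` and `check_ports` and the host name used below are assumptions made for illustration.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises an OSError subclass on refusal,
        # timeout or failed name resolution.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str, ports: dict) -> dict:
    """Map each label in `ports` (label -> port number) to a reachability flag."""
    return {label: can_reach(host, port) for label, port in ports.items()}
```

Run on the web server, a call such as `check_ports("db.example.org", {"MySQL": 3306, "SQL Server": 1433})` (with the hypothetical host replaced by your database server) shows which ports are reachable. A `False` result usually points to a firewall or to a DBMS that does not accept remote connections. A `True` result only proves that the port is open; it does not prove that the database login will succeed.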

BioCASe can now be deployed as a Docker container, which makes it very easy to install, update and remove BioCASe. (Docker is an operating-system-level virtualization software that allows easy software deployment; if you don't know yet what Docker is, you should read the Get Started Guide.)

Steps

Once you’ve made sure the BioCASe Provider Software is the right choice for you (contact the BioCASe team if you are unsure about your decision), you need to complete the following steps:

  1. Think about which information you want to publish to the network. This involves settling the question of ownership of the data as well as the terms of use, license and usage restrictions for the published data. You should also consider which information should not be published because it might put endangered species at risk (options are to block certain occurrences from publication or to blur the exact locality information).
  2. Prepare the database for publication as described in the Preparation tutorial. This includes the decision on whether to publish the live database or a snapshot, the creation of a table for metadata and, in certain cases, some data transformation.
  3. Follow the Installation Tutorial to install the BioCASe Provider Software.
  4. Set up a BioCASe Datasource and connect it to your database, then configure the database structure as described in the DatasourceSetup tutorial.
  5. Follow the ABCD2Mapping tutorial to create a mapping for the schema you want to use.
  6. Test and debug your BioCASe web service. Read the Debugging tutorial to learn how to do this.
  7. Once you’re done with setting up and configuring the BioCASe Provider Software, you can register the BioCASe web service with the information network you want to publish your data to, for example with GBIF. We strongly recommend contacting the BioCASe team to have your web service checked first, as we might be able to foresee problems that are more difficult to resolve after registration.
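
The two options mentioned in step 1 for sensitive occurrences, blocking and blurring, can be illustrated with a short sketch. This is not a feature of the BioCASe Provider Software; it shows one way such a filter could be applied when preparing the data, for example while creating a snapshot. The record layout (`scientific_name`, `latitude`, `longitude`), the function name `filter_record` and the rounding precision are all assumptions made for this example.

```python
from typing import Optional


def filter_record(record: dict, blocked: set, blurred: set,
                  decimals: int = 1) -> Optional[dict]:
    """Return the record to publish, or None if it must be withheld.

    blocked: taxon names whose occurrences are never published.
    blurred: taxon names whose coordinates are rounded to `decimals`
             decimal places (one decimal place is roughly 11 km of latitude).
    """
    name = record["scientific_name"]
    if name in blocked:
        return None  # option 1: block the occurrence entirely
    if name in blurred:
        # option 2: publish a copy with coarsened coordinates
        out = dict(record)
        out["latitude"] = round(record["latitude"], decimals)
        out["longitude"] = round(record["longitude"], decimals)
        return out
    return record  # not sensitive: publish unchanged
```

Working on a copy keeps the source record untouched, so the precise coordinates remain available in the original database.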

Support

In case you run into difficulties during the installation and setup, you should

  • have a look at the FAQ, which might already answer your question,
  • ask other BioCASe data providers for help,
  • get in contact with the BioCASe team. Please describe your environment exactly (operating system, DBMS used), the steps that cause the problem, and any error messages and hints you get. Screenshots are a big plus here.
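
The environment details requested above can be collected with a few lines of Python on the web server. This is a minimal sketch using only the standard library; the function name `environment_report` is an assumption, and the DBMS name and version have to be supplied by hand because they cannot be detected this way.

```python
import platform


def environment_report(dbms: str) -> str:
    """Summarize the details a support request should mention."""
    lines = [
        f"Operating system: {platform.platform()}",
        f"Python version: {platform.python_version()}",
        f"DBMS: {dbms}",
    ]
    return "\n".join(lines)
```

The output of `print(environment_report("MySQL 8.0"))`, for example, can be pasted into the support request next to the error messages and screenshots.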