Data upload to reBiND framework
Uploading data to the reBiND portal
The previous data preparation step ensures that the data is in ABCD format. Once the data has been prepared, an XML file conforming to the ABCD schema can be exported from the BioCASE Provider software. ABCD files can, however, be prepared using any software. Other tools, for example Pentaho Kettle, use a metadata-based approach to extract data from CSV files and relational databases and transform it into XML. Furthermore, the reBiND software has been designed to work with any XML file, not just ABCD: the validation and correction tools can be configured for any XML file, provided an associated XML schema is available.
In addition to at least one XML data file, the project should also contain a metadata file. Additional files can also be uploaded, such as the original data file (from which the XML data file was generated), images, other multimedia objects or PDF files. For the sake of clarity, image and multimedia objects should be placed into dedicated subcollections within the data project collection, such as images/. A data project can contain more than one XML data document, but only one metadata file.
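The layout rules above can be checked locally before uploading. The following is a minimal sketch, not part of the reBiND software: the function name check_project_layout, the metadata file name metadata.xml and the list of image suffixes are all assumptions made for illustration, since the text does not specify how the metadata file is named or which image formats are expected.

```python
from pathlib import PurePosixPath

# Assumption: file suffixes treated as images; adjust to your own data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def check_project_layout(paths, metadata_name="metadata.xml"):
    """Return a list of problems found in a project's file list.

    `paths` are file paths relative to the project collection.
    `metadata_name` is a hypothetical name for the single metadata file.
    """
    problems = []
    files = [PurePosixPath(p) for p in paths]

    metadata = [f for f in files if f.name == metadata_name]
    data_xml = [
        f for f in files
        if f.suffix.lower() == ".xml" and f.name != metadata_name
    ]

    # At least one XML data file is required.
    if not data_xml:
        problems.append("no XML data file")

    # Exactly one metadata file is expected.
    if len(metadata) != 1:
        problems.append(
            f"expected exactly one metadata file, found {len(metadata)}"
        )

    # Images should live in an images/ subcollection.
    for f in files:
        if f.suffix.lower() in IMAGE_SUFFIXES and "images" not in f.parts[:-1]:
            problems.append(f"image outside images/: {f}")

    return problems
```

An empty list means the file list satisfies the three rules; otherwise each entry names one rule that was broken.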
The figures below show a step-by-step guide to the process of uploading data to the reBiND portal. Once these steps are completed, the user can continue to the validation and automated correction steps.
Logging onto reBiND
The homepage of the reBiND portal has links to the published data sets and to further information on the project website. In figure 1 below, the login form is shown in the left-hand margin. Login is required in order to submit data to the reBiND system. For information on how to set up user accounts for login see this page.
Creating and viewing unpublished projects
After logging into the reBiND portal, the user is presented with a left-hand side panel which lists the 'Unpublished' and 'Published' projects. The panel also contains a 'Create Project' icon, which is used to create a new unpublished project into which XML files and other data can be imported. In the screenshot below, the unpublished project 'ClemensHBG' has been selected and a summary of the files associated with this project can be seen in the right-hand panel. Below this are icons to upload further data.
To create an entirely new project, the user should click 'Create Project' in the left-hand side panel and supply a unique name for the new project in the pop-up form. The project should have a clear, descriptive name, but the name must not contain special characters, digits, spaces or dashes. In the figure below we have used 'Puffinus' to identify the project, which holds data on Puffinus creatopus, the pink-footed shearwater.
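The naming rule above can be checked before the form is submitted. The sketch below is an illustration only and does not reproduce the portal's own validation: it assumes that "no special characters, digits, spaces or dashes" means the name consists solely of ASCII letters, and the function name is_valid_project_name is hypothetical.

```python
import re

# Assumption: only ASCII letters are accepted. The portal's real rule
# may be more permissive (for example, accented letters).
_PROJECT_NAME = re.compile(r"[A-Za-z]+")


def is_valid_project_name(name, existing=()):
    """Return True if `name` is letters-only and not already in `existing`."""
    if _PROJECT_NAME.fullmatch(name) is None:
        return False
    # Project names must be unique among the projects already created.
    return name not in existing
```

For example, 'Puffinus' passes this check, whereas 'Puffinus-2014' fails because it contains a dash and digits.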
Importing XML and other data files from file system
After the new project has been created, its name (in this example 'Puffinus') should appear in the list of unpublished projects in the left-hand side panel.
Clicking on the project name takes you to the list of files associated with the project. In this case there are no files associated with the project yet, as it is a new, empty project. The 'Upload File' icon can be used to upload any file type from the file system. Several files of various types (e.g. XML, PDF and images) and folders can be added to a project. 'Upload from BioCASE' enables the user to connect to a specific ABCD file stored in the BioCASE Provider software, if the URL is known. However, this is currently restricted to files of at most 700 records (abcd:Units).
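To see in advance whether an ABCD document falls within the 700-record limit for 'Upload from BioCASE', the Unit elements can be counted locally. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the code matches elements by their local name 'Unit' regardless of namespace, which is a simplification rather than a statement about how reBiND counts records.

```python
import xml.etree.ElementTree as ET

MAX_UNITS = 700  # limit stated for 'Upload from BioCASE'


def count_units(xml_text):
    """Count elements whose local name is 'Unit', ignoring the namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree writes namespaced tags as '{namespace-uri}LocalName'.
    return sum(
        1 for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "Unit"
    )


def within_biocase_limit(xml_text, limit=MAX_UNITS):
    """Return True if the document holds no more than `limit` units."""
    return count_units(xml_text) <= limit
```

A file that exceeds the limit can still be added to the project with 'Upload File', which accepts any file from the file system.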
Depending on the file type, different options are offered for the current file. All file types can be viewed or downloaded in their native form. Text-based file types can also be edited online. XML files can be validated against their schema (if it is registered with the reBiND software) and can be corrected or modified by running automated corrections on them. Below, the details of the data file 'reBiND_Puffinus.xml' are shown together with the list of available actions. 'View XML' and 'View Data' link to an XML view and a tabular view of the data respectively. In the next section we describe the remaining actions in turn, going into detail on how to run the validation and correction actions.