Difference between revisions of "Data upload to rebind framework"

From reBiND Documentation

Revision as of 02:02, 19 November 2014

Uploading data to the reBiND portal

The previous data preparation step ensures the data is in ABCD format. Once the data has been prepared, an XML file (conforming to the ABCD schema) can be exported from the Biocase Provider software. ABCD files can, however, be prepared using any software. Other tools use a metadata-based approach to extract data from CSV files and relational databases and transform it into XML format, for example Pentaho Kettle. Furthermore, the reBiND software has been designed to work with any XML file, not just ABCD, provided there is an associated XML schema available.

In addition to at least one XML data file, the project should also contain a metadata file. Additional files can also be uploaded, such as the original data file (from which the XML data file was generated), images, other multimedia objects or PDF files. For the sake of clarity, image and multimedia objects should be placed into special sub-collections within the data project collection, such as images/. A data project can contain more than one XML data document, but only one metadata file.

The figures below show a step-by-step guide to the process of uploading data to the reBiND portal. Once these steps are completed, the user can continue to the validation and automated correction steps.

Logging onto reBiND

The homepage of the reBiND portal has links to the published data-sets and to further information on the project web-site. In figure 1 below, the left-hand margin shows the login form. Login is required in order to submit data to the reBiND system. For information on how to set up user accounts for login, see this page.


Rebind portal logon.PNG

Creating and viewing unpublished projects

After logging into the reBiND portal, the user is presented with a left-hand side panel which lists the 'Unpublished' and 'Published' projects. The 'Create Project' icon is used to create a new unpublished project into which XML files and other data can be imported. In the screenshot below, the unpublished project 'ClemensHBG' has been selected and a summary of the files associated with this project can be seen in the right-hand panel. Below this summary are icons for uploading further data.


ReBIND portal project overview.PNG


To create an entirely new project, the user should click 'Create Project' in the left-hand side panel and supply a unique name for the new project in the pop-up form. The project should have a clear, descriptive name, but the name must not contain special characters, digits, spaces or dashes. In the figure below we have used 'Puffinus' to identify the project, which holds data on Puffinus creatopus, the pink-footed shearwater.
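
The naming rule can be checked before filling in the form. The following is a minimal sketch in Python, assuming that "no special characters" means the name consists of ASCII letters only; the portal's actual rule may be more permissive. The function name is_valid_project_name is hypothetical and is not part of the reBiND software.

```python
import re

# Assumption: a valid name consists solely of ASCII letters (A-Z, a-z).
# This excludes digits, spaces, dashes and any other special characters.
_NAME_PATTERN = re.compile(r"[A-Za-z]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if the whole name matches the letters-only pattern."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Print each candidate name together with the result of the check.
    for candidate in ("Puffinus", "Puffinus-2014", "Puffinus creatopus"):
        print(candidate, is_valid_project_name(candidate))
```

Under this assumption 'Puffinus' and 'ClemensHBG' pass the check, while names containing a dash, a digit or a space are rejected.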


ReBIND portal project create project.PNG

Importing XML and other data files from file system

After creating the new project, the project name (in this example 'Puffinus') should appear in the list of unpublished projects in the left-hand side panel.

Clicking on the project name takes you to the list of files associated with the project. In this case there are no files associated with the project yet, as it is a new, empty project. The 'Upload File' option can be used to upload any file type from the file system. Several files of various types (e.g. XML, PDF and images) and folders can be added to a project. 'Upload from BioCASE' enables the user to connect to a specific ABCD file stored in the Biocase Provider software, if the URL is known. However, this option is currently restricted to files containing a maximum of 700 records (abcd:Units).

ReBIND portal project new project.PNG

The screenshot below shows the 'Upload from BioCASE' option.

Upload data biocase.PNG
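
Because 'Upload from BioCASE' is limited to 700 records, it can be useful to count the units in an exported ABCD file before attempting the import. The following is a minimal sketch using only the Python standard library. It assumes that each record is an element whose local name is exactly Unit, and it ignores the namespace so that it does not depend on a particular ABCD version. The function names, the namespace URI and the sample document are illustrative only; the sample is a simplified fragment, not a schema-valid ABCD file.

```python
import xml.etree.ElementTree as ET

MAX_BIOCASE_UNITS = 700  # limit stated for 'Upload from BioCASE'


def local_name(tag: str) -> str:
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def count_units(xml_text: str) -> int:
    """Count elements whose local name is exactly 'Unit'."""
    root = ET.fromstring(xml_text)
    return sum(1 for element in root.iter() if local_name(element.tag) == "Unit")


def fits_biocase_upload(xml_text: str) -> bool:
    """Return True if the document holds at most MAX_BIOCASE_UNITS units."""
    return count_units(xml_text) <= MAX_BIOCASE_UNITS


# Simplified, illustrative fragment (not a complete ABCD document).
SAMPLE = """<abcd:DataSets xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:DataSet>
    <abcd:Units>
      <abcd:Unit><abcd:UnitID>1</abcd:UnitID></abcd:Unit>
      <abcd:Unit><abcd:UnitID>2</abcd:UnitID></abcd:Unit>
    </abcd:Units>
  </abcd:DataSet>
</abcd:DataSets>"""

if __name__ == "__main__":
    print(count_units(SAMPLE), fits_biocase_upload(SAMPLE))
```

Files that exceed the limit can still be added to the project with 'Upload File' from the file system.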


Depending on the file type, different options are offered for the current file. All file types have the option to view or download the file in its native form. Text-based file types have the option to edit the file online. XML files can be validated against their schema (if it is registered with the reBiND software) and can be corrected or modified by running automated corrections on them. The details of the data file 'reBiND_Puffinus.xml' and the list of available actions are shown below. 'View XML' and 'View Data' link to an XML view and a tabular view of the data respectively. In the next section we describe the remaining actions in turn, going into detail on how to run the validation and correction actions.


ReBIND portal project upload file actions.PNG