Difference between revisions of "Manual review of data"

From reBiND Documentation
Latest revision as of 02:08, 19 November 2014

Manual Review and Corrections

The results of the automated correction can be manually reviewed and edited via the web interface. It is especially important to review the errors and warnings, which can be caused either by technical issues or by the content of the document itself. More minor changes are flagged as 'info' messages. If there are any errors or warnings, it is advisable for the contributing scientist and the content administrator to go through the review together. Errors in the document can be fixed by modifying the correction configuration and re-running the automated correction with the new configuration file.

The eXist release includes an online XML editor called eXide; an online demo is available at the eXist homepage. The eXide editor is a modification of the online source code editor Ace, the Cloud9 editor (http://ace.ajax.org/). With eXide it is possible to edit documents stored in the database directly, with features such as syntax highlighting and code folding.

The eXide editor that comes packaged with eXist has been modified to create the reBiND editor.

Reviewing corrections in the reBiND Editor

The screenshot below shows the results of the correction on the reBiND_Puffinus.xml data file:


[Screenshot: Correction output review.png]


The left-hand panel displays a list of 'Issues' and the main editor window displays the data file. Clicking on any 'Issue' in the left-hand panel takes the user to the corresponding change in the data file. In the example shown, the first issue in the list has been clicked. This expands the 'Issue' and shows the 'Old Content' and the 'New Content'.

In this case the problem was that the XML schema required the content to be of the type xs:dateTime, but the old content gave only a range of years. The automated correction was of a type called 'Element Text Replacer'. This type of correction replaces a specific pattern (a regular expression) at a specified position within the XML file with some other text. The technical documentation details the different types of correction and how to modify the correction modules and specify a different configuration. In this example the lower year of the range is taken as the year, the date is assumed to be 1 January of that year, and the time is assumed to be midnight. If this change is acceptable to the reviewer, they can click the checkbox to indicate that they agree with the change.
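The kind of replacement described above can be illustrated with a minimal Python sketch. It is not the reBiND 'Element Text Replacer' module itself: the function name `replace_year_range`, the regular expression, and the sample element names `Unit`, `DateTime` and `Note` are all assumptions made for illustration. Only the rule (lower year, 1 January, midnight) is taken from the text.

```python
import re
import xml.etree.ElementTree as ET

# Assumed pattern: a range of years such as "1998-2001" or "1998 - 2001".
YEAR_RANGE = re.compile(r"^\s*(\d{4})\s*-\s*(\d{4})\s*$")


def replace_year_range(root, tag):
    """Rewrite year-range text in every <tag> element.

    Returns a list of (old_content, new_content) pairs, one per change,
    similar to what a reviewer sees when an issue is expanded.
    """
    changes = []
    for element in root.iter(tag):
        match = YEAR_RANGE.match(element.text or "")
        if match is None:
            continue  # text is not a year range; leave it untouched
        # Rule from the text: lower year, 1 January, midnight.
        lower_year = min(int(match.group(1)), int(match.group(2)))
        new_text = f"{lower_year:04d}-01-01T00:00:00"
        changes.append((element.text, new_text))
        element.text = new_text
    return changes


# Hypothetical sample document; the element names are invented.
SAMPLE = "<Unit><DateTime>1998-2001</DateTime><Note>1998-2001</Note></Unit>"
root = ET.fromstring(SAMPLE)
changes = replace_year_range(root, "DateTime")
for old, new in changes:
    print(f"Old Content: {old!r} -> New Content: {new!r}")
```

Restricting the replacement to one element name stands in for the "specified position within the XML file": the `Note` element contains the same text but is left unchanged.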

If the change is not acceptable, another set of corrections should be run on the original data file. To do this, the user needs to change the correction configuration or add new correction modules, in consultation with a technical administrator.

A modified GUI could also display only changes of a certain type (class) or changes carried out by a certain module, and could hide reviewed changes from the list. Although this might be a good way for the Content Administrator and the Technical Administrator to review the changes, the Contributing Scientist is still confronted with the XML document and required to work with it. At some point in the future a better interface for the Contributing Scientist might therefore be desirable (e.g. automatically generated web forms for editing the data), but for the general infrastructure described in this text the online XML editor is sufficient.
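The filtering that such a modified GUI might offer can be sketched in Python as follows. This is only an illustration of the idea: the `Issue` class, its fields, the `visible_issues` function and the module name 'Whitespace Cleaner' are hypothetical and do not come from the reBiND code base.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    severity: str           # 'error', 'warning' or 'info'
    module: str             # name of the correction module that made the change
    message: str
    reviewed: bool = False  # set once the reviewer has ticked the checkbox


def visible_issues(issues, severity=None, module=None, hide_reviewed=True):
    """Return the issues a filtered list would show, in their original order."""
    result = []
    for issue in issues:
        if hide_reviewed and issue.reviewed:
            continue
        if severity is not None and issue.severity != severity:
            continue
        if module is not None and issue.module != module:
            continue
        result.append(issue)
    return result


# Invented example data.
issues = [
    Issue("error", "Element Text Replacer", "Year range replaced by xs:dateTime"),
    Issue("info", "Whitespace Cleaner", "Trailing spaces removed", reviewed=True),
    Issue("warning", "Element Text Replacer", "Empty element filled"),
]

for issue in visible_issues(issues, module="Element Text Replacer"):
    print(issue.severity, "-", issue.message)
```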

It is not necessary to review all the changes at once. The document can be stored at any time and the review resumed at a later point. For example, after the correction one of the administrators can review the changes caused by technical issues or by the XML format used, leaving only the changes related to the data for the Contributing Scientist to review.


After the correction is finished, the file can be validated again. This time, if the correction modules have been able to fix the original errors, the file should be valid. The screenshot below shows the validation being re-run on reBiND_Puffinus.xml after the correction step.


[Screenshot: Validation final.PNG]
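Re-validation in reBiND is done against the XML schema through the web interface. As a rough illustration of what one such check involves, the Python sketch below re-tests only the single constraint discussed earlier, namely that an element's text is a well-formed xs:dateTime value. It is not a schema validator; the function name `invalid_datetimes` and the sample element names are assumptions, and the pattern is deliberately simplified (four-digit years only, no support for the 24:00:00 form).

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

# Simplified xs:dateTime lexical form: YYYY-MM-DDThh:mm:ss, with optional
# fractional seconds and an optional time zone (Z or +hh:mm / -hh:mm).
XS_DATETIME = re.compile(
    r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?$"
)


def invalid_datetimes(xml_text, tag):
    """Return the text of every <tag> element that is not a valid dateTime."""
    root = ET.fromstring(xml_text)
    bad = []
    for element in root.iter(tag):
        text = (element.text or "").strip()
        if not XS_DATETIME.match(text):
            bad.append(text)
            continue
        try:
            # Reject impossible dates such as month 13 or 30 February.
            datetime.strptime(text[:19], "%Y-%m-%dT%H:%M:%S")
        except ValueError:
            bad.append(text)
    return bad


# Invented documents representing the file before and after correction.
before = "<Unit><DateTime>1998-2001</DateTime></Unit>"
after = "<Unit><DateTime>1998-01-01T00:00:00</DateTime></Unit>"

print("before correction:", invalid_datetimes(before, "DateTime"))
print("after correction:", invalid_datetimes(after, "DateTime"))
```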