Edit the parser, index new elements

From Berlin Harvesting and Indexing Toolkit
Revision as of 11:46, 2 February 2016 by PatriciaKelbert (talk | contribs) (Add a repeatable or a more complex element or group of elements (ie. it will need a new table in the database))

We recommend using Eclipse (for example, version Luna). You will need the Maven extension.
Import the B-HIT project into Eclipse (File/New/Maven Project -> enter the location of the directory where you downloaded B-HIT).

Add a simple element (i.e. it can be added to an existing table in the database)

  1. Choose where you want to save it

Add a repeatable or a more complex element or group of elements (i.e. it will need a new table in the database)

1. Choose where you want to save it.
Example: add a reference group (ABCD 2.06): http://www.bgbm.org/tdwg/codata/schema/ABCD_2.06/HTML/ABCD_2.06.html#complexType_Reference_Link031A69A8
A Reference is made of 3 elements: TitleCitation, CitationDetail and URI. As a Reference can be linked to several ABCD concepts, it might make more sense to link the Reference(s) to the concept than to the whole Unit.

A) Create a new table in the database for the references, with an auto-increment ID. Insert an empty placeholder row (referenceID 1, titleCitation null, citationDetail null, URI null): records without a Reference will point to it through the foreign key, which makes the rest easier.
B) Add a new column (e.g. fk_referenceID) to the table of the concept that will have a Reference, and configure it as a foreign key with the default value 1 for all the old records.
C) Record the changes in the file with the "SQL changes".
D) Edit the Java code!
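Steps A and B could be written as the following MySQL statements, held here in a small Java class so they can be run through JDBC. This is only a sketch: the table names abcd_reference and preparation, the constraint name and the column types are assumptions, not the actual B-HIT schema.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

/** Sketch of steps A and B; every table, column and constraint name is hypothetical. */
class ReferenceSchemaSketch {

    // Step A: new table with an auto-increment primary key.
    static final String CREATE_TABLE =
        "CREATE TABLE abcd_reference ("
        + " referenceID INT NOT NULL AUTO_INCREMENT PRIMARY KEY,"
        + " titleCitation TEXT NULL,"
        + " citationDetail TEXT NULL,"
        + " URI TEXT NULL)";

    // Step A: empty placeholder row with ID 1, used by records without a Reference.
    static final String INSERT_PLACEHOLDER =
        "INSERT INTO abcd_reference (referenceID, titleCitation, citationDetail, URI)"
        + " VALUES (1, NULL, NULL, NULL)";

    // Step B: foreign-key column on the concept table; old records default to 1.
    static final String ADD_FOREIGN_KEY =
        "ALTER TABLE preparation"
        + " ADD COLUMN fk_referenceID INT NOT NULL DEFAULT 1,"
        + " ADD CONSTRAINT fk_preparation_reference"
        + " FOREIGN KEY (fk_referenceID) REFERENCES abcd_reference (referenceID)";

    /** The placeholder row must exist before the foreign key is added. */
    static List<String> statementsInOrder() {
        return Arrays.asList(CREATE_TABLE, INSERT_PLACEHOLDER, ADD_FOREIGN_KEY);
    }

    /** Runs the three statements on an open connection. */
    static void apply(Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String sql : statementsInOrder()) {
                statement.executeUpdate(sql);
            }
        }
    }
}
```

The order matters: if the foreign key were added before the placeholder row exists, the default value 1 would point to a missing row.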

  • Have a look at the src/org/binhum/abcd/Multimedia.java class. The new class could look like this:
// $Id$
/***************************************************************************
  * Copyright 2015 Global Biodiversity Information Facility Secretariat and Botanic Garden and Botanical Museum Berlin-Dahlem
  * Licensed under the Apache License, Version 2.0 (the "License"); you may not
  * use this file except in compliance with the License. You may obtain a copy of
  * the License at
  * http://www.apache.org/licenses/LICENSE-2.0
  * Unless required by applicable law or agreed to in writing, software
  * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
  * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
  * License for the specific language governing permissions and limitations under
  * the License.

***************************************************************************/
package org.binhum.abcd;

import java.util.Map;

import org.dom4j.Document;


public class Reference extends XMLutil {

   private static final long serialVersionUID = 2096267214610763427L;
   private String URI;
   private String title;
   private String detail;
   private String standard; //abcd or abcd21
   
   private Map<String, String> namespaceMap;
   private Document xmlDocument;
   
   Reference(Map<String, String> namespaceMap, String standard) {
       this.namespaceMap=namespaceMap;
       this.standard=standard;
   }
   /**
    * @return the uRI
    */
   public String getURI() {
       return URI;
   }
   /**
    * @param uRI the uRI to set
    */
   public void setURI(String uRI) {
       URI = uRI;
   }
   /**
    * @return the title
    */
   public String getTitle() {
       return title;
   }
   /**
    * @param title the title to set
    */
   public void setTitle(String title) {
       this.title = title;
   }
   /**
    * @return the detail
    */
   public String getDetail() {
       return detail;
   }
   /**
    * @param detail the detail to set
    */
   public void setDetail(String detail) {
       this.detail = detail;
   }
   public Document getXmlDocument() {
       return xmlDocument;
   }
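   /**
    * Stores the document and extracts the three Reference fields from it.
    * The XPath prefix is the name of the standard (abcd or abcd21),
    * which is expected to be declared in namespaceMap.
    */
   public void setXmlDocument(Document xmlDocument) {
       this.xmlDocument = xmlDocument;
       String referencePath = "//" + standard + ":Reference/" + standard + ":";
       detail = getTextValue(xmlDocument, referencePath + "CitationDetail", namespaceMap);
       title = getTextValue(xmlDocument, referencePath + "TitleCitation", namespaceMap);
       URI = getTextValue(xmlDocument, referencePath + "URI", namespaceMap);
   }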
}
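To try out the XPath expressions used in setXmlDocument without the rest of B-HIT, something like the sketch below may help. It uses the XPath API of the JDK instead of dom4j and XMLutil.getTextValue, so the behaviour of the real helper may differ. The sample record is made up, and the namespace URI is assumed to be the ABCD 2.06 one.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Stand-alone sketch; not the B-HIT XMLutil implementation. */
class ReferenceXPathSketch {

    // Made-up sample record with an assumed namespace URI.
    static final String SAMPLE =
        "<abcd:Reference xmlns:abcd=\"http://www.tdwg.org/schemas/abcd/2.06\">"
        + "<abcd:TitleCitation>Flora of X</abcd:TitleCitation>"
        + "<abcd:CitationDetail>p. 12</abcd:CitationDetail>"
        + "<abcd:URI>http://example.org/ref/1</abcd:URI>"
        + "</abcd:Reference>";

    /** Parses an XML string into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for prefixed XPath steps
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Returns the text of the first node matching the expression, or "" if none matches. */
    static String textValue(Document doc, String expression, final Map<String, String> namespaceMap) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                String uri = namespaceMap.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String namespaceURI) {
                return null; // reverse lookup is not needed for evaluation
            }
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyList().iterator();
            }
        });
        try {
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }
}
```

With a map that binds the prefix abcd to the namespace URI of the sample, the expression "//abcd:Reference/abcd:TitleCitation" selects the title element, in the same way as the expression built from the standard variable in the class above.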
  • Save the new Reference and get its ID (e.g. referenceid = occurrenceDao.createOrUpdateReference(referenceObj);)

--> Create the createOrUpdateReference method in org/binhum/harvest/util/jdbc/dao/OccurrenceDao.java (abstract declaration) and org/binhum/harvest/util/jdbc/dao/OccurrenceDaoImpl.java (implementation).
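A possible body for createOrUpdateReference is sketched below with plain JDBC. It is an assumption about how the method could work, not the existing B-HIT code: it returns the placeholder ID 1 for an empty Reference, reuses an identical row if one exists, and otherwise inserts a new row and returns the generated key. The table name abcd_reference is hypothetical, the SQL uses the MySQL null-safe comparison <=>, and the three fields are passed as strings here to keep the sketch self-contained.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Sketch only; table and column names are hypothetical. */
class ReferenceDaoSketch {

    static final int PLACEHOLDER_ID = 1;

    static final String FIND_SQL =
        "SELECT referenceID FROM abcd_reference"
        + " WHERE titleCitation <=> ? AND citationDetail <=> ? AND URI <=> ?";

    static final String INSERT_SQL =
        "INSERT INTO abcd_reference (titleCitation, citationDetail, URI) VALUES (?, ?, ?)";

    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    static int createOrUpdateReference(Connection connection, String title, String detail, String uri)
            throws SQLException {
        // Nothing to store: point to the empty placeholder row.
        if (isBlank(title) && isBlank(detail) && isBlank(uri)) {
            return PLACEHOLDER_ID;
        }
        // Reuse an identical reference if it is already stored.
        try (PreparedStatement find = connection.prepareStatement(FIND_SQL)) {
            find.setString(1, title);
            find.setString(2, detail);
            find.setString(3, uri);
            try (ResultSet rows = find.executeQuery()) {
                if (rows.next()) {
                    return rows.getInt(1);
                }
            }
        }
        // Otherwise insert it and read back the auto-increment ID.
        try (PreparedStatement insert =
                 connection.prepareStatement(INSERT_SQL, Statement.RETURN_GENERATED_KEYS)) {
            insert.setString(1, title);
            insert.setString(2, detail);
            insert.setString(3, uri);
            insert.executeUpdate();
            try (ResultSet keys = insert.getGeneratedKeys()) {
                if (keys.next()) {
                    return keys.getInt(1);
                }
            }
        }
        throw new SQLException("No generated key returned for the new reference");
    }
}
```

In OccurrenceDaoImpl the method would more likely take the Reference object and read the values through getTitle(), getDetail() and getURI().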

  • Have a look at the class of the concept you want to link it to.
  • Add this ID to the concept you want to link it to, usually in the Unit.java class, for example:
    Preparation prepa = new Preparation();
    prepa.setTripleidstoreid(triplestoreid);
    prepa.setPreparationDate(extractionDate);
    prepa.setPreparationStaff(extractionStaff);
    prepa.setPreparationMaterials(extractionMethod);
    prepa.setPreparationType(preparationType);
    prepa.setReferenceid(referenceid);
    savePreparation(prepa);

-> Do it for each standard (get them from the public int parse(String accesspoint) method in Unit.java).
-> You will have to create the methods setReferenceid and getReferenceid in the class Preparation.java.
!! For each concept, first reset the referenceID to 1, or the reference from a previous element might get attached to the current concept !!
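The reset warning can be illustrated with a small stand-alone sketch. The class and method names are made up for the illustration. The input list holds, for each concept in turn, the ID of its saved Reference, or null when the concept has no Reference element.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration of the reset rule; not part of B-HIT. */
class ReferenceResetSketch {

    static final int NO_REFERENCE = 1; // ID of the empty placeholder row

    /** Returns the reference ID that ends up stored for each concept. */
    static List<Integer> assign(List<Integer> parsedReferenceIds, boolean resetPerConcept) {
        List<Integer> stored = new ArrayList<Integer>();
        int referenceid = NO_REFERENCE;
        for (Integer parsed : parsedReferenceIds) {
            if (resetPerConcept) {
                referenceid = NO_REFERENCE; // forget the previous concept's reference
            }
            if (parsed != null) {
                referenceid = parsed;
            }
            stored.add(referenceid);
        }
        return stored;
    }
}
```

For the input [7, null, 9], the version with the reset stores [7, 1, 9]. Without the reset it stores [7, 7, 9], so the second concept wrongly inherits reference 7.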
Also, you have to update the savePreparation method (i.e. occurrenceDao.createOrUpdatePreparation(prepa);), which means editing the following elements in the OccurrenceDaoImpl.java class:

* createPreparation: add ps.setInt(6, prepa.getReferenceid());
* CREATE_PREPA_SQL: add the table column and a question mark to the SQL statement
* UPDATE_PREPA_SQL: add the table column with its question mark to the SQL statement
* createOrUpdatePreparation: add ps.setInt(6, prepa.getReferenceid()); and change ps.setInt(6, prepa.getId()); to ps.setInt(7, prepa.getId());

!! Check the field order and the number of fields/columns !!
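One way to check the number of fields is to count the question marks in each statement and compare the result with the highest index passed to the ps.set... calls. The statements below are only guesses at what CREATE_PREPA_SQL and UPDATE_PREPA_SQL might look like after the change. The table and column names are hypothetical, chosen so that the reference is parameter 6 and the ID moves to parameter 7, as in the list above.

```java
/** Sketch; the real SQL constants live in OccurrenceDaoImpl and may differ. */
class PreparationSqlSketch {

    static final String CREATE_PREPA_SQL =
        "INSERT INTO preparation (tripleidstoreid, preparationDate, preparationStaff,"
        + " preparationMaterials, preparationType, fk_referenceID)"
        + " VALUES (?, ?, ?, ?, ?, ?)";

    static final String UPDATE_PREPA_SQL =
        "UPDATE preparation SET tripleidstoreid = ?, preparationDate = ?,"
        + " preparationStaff = ?, preparationMaterials = ?, preparationType = ?,"
        + " fk_referenceID = ? WHERE id = ?";

    /** Naive count: assumes the statement has no '?' inside a string literal. */
    static int countPlaceholders(String sql) {
        int count = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == '?') {
                count++;
            }
        }
        return count;
    }
}
```

Here the insert has 6 placeholders and the update has 7, which matches ps.setInt(6, prepa.getReferenceid()) and ps.setInt(7, prepa.getId()).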


The new field will now be stored at the next harvesting. If you want to force the indexing of the new field (the original data might not have changed since the last indexing), you will have to delete the old records from the database (use the Management tab for it).


These new fields also have to be cleaned up at the next update: at the end of the public ArrayList<Integer> cleanOccurrences(String dataset, String elementType, int fk_datasourceid) method in OccurrenceDaoImpl, call a new function/SQL query that deletes all references that are no longer used (except ID 1). For an example, look at the function deleteFromSecundaryBasedOnTripleidstoreid.
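The cleanup query could be built as in the sketch below. The table name abcd_reference, the column fk_referenceID and the helper itself are assumptions. The list must contain every table that has a foreign key to the references, otherwise references that are still in use would be deleted.

```java
import java.util.List;

/** Sketch of a cleanup query; all table and column names are hypothetical. */
class ReferenceCleanupSketch {

    static final int PLACEHOLDER_ID = 1;

    /**
     * Builds a DELETE that removes every reference no listed table points to,
     * keeping the placeholder row. Table names must be trusted constants,
     * never user input, because they are concatenated into the SQL.
     */
    static String deleteUnusedReferencesSql(List<String> referencingTables) {
        if (referencingTables.isEmpty()) {
            throw new IllegalArgumentException("At least one referencing table is required");
        }
        StringBuilder sql = new StringBuilder(
            "DELETE FROM abcd_reference WHERE referenceID <> " + PLACEHOLDER_ID);
        for (String table : referencingTables) {
            sql.append(" AND NOT EXISTS (SELECT 1 FROM ").append(table)
               .append(" t WHERE t.fk_referenceID = abcd_reference.referenceID)");
        }
        return sql.toString();
    }
}
```

The resulting string can then be executed with a Statement at the end of cleanOccurrences.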

Add a new data standard