Revision as of 09:23, 9 February 2016

Welcome to the Wiki of the GGBN Portal Software

Further information about the software will follow soon.

Introduction

The GGBN Portal Software (GPS) has been developed by the Botanic Garden and Botanical Museum Berlin-Dahlem for the Global Genome Biodiversity Network (GGBN). The project was funded by the German Research Foundation (DFG) from 01/2014 to 03/2016. The software is written in PHP using the Yii framework and is fully open source. The GPS allows customization and can therefore also be used for other special interest networks. It is optimized for an underlying SOLR instance; data are harvested with B-HIT.

Workflow

Data are provided through BioCASe and IPT. The harvester B-HIT stores all original data in a MySQL database. To find out more about B-HIT, please check out the B-HIT Wiki. After harvesting, data are cleaned and can be matched against different taxonomic backbones. A SOLR instance is created on top of the MySQL database. The GPS itself requires access to the SOLR index as well as to the MySQL database.

Installation

You will need

Apache (tested with Apache/2.4.7 (Ubuntu), Apache/2.2.22 (Debian), Apache/2.4.10 (Win64)) with libapache2-mod-php5.
PHP 5 (version 5.5 or later; tested with 5.5.9, 5.5.15 and 5.5.30). You might also need to install php5-xsl, php5-gd and php5-mysql.
A SOLR instance (tested with Apache SOLR 4.9.0). Contact us to obtain the appropriate SOLR configuration files.
A MySQL database (tested with MySQL 5.5.46). It will contain the data harvested with B-HIT.

Yii (version 2) is fully contained in the Subversion code of the portal software.

Copy a working version from the svn

Create a destination folder in your web folder (e.g. /var/www/, or *WINPATH*). For the rest of this documentation we will call it "my_portal".
Check out http://ww2.biocase.org/svn/dnabank/Dnabank_Portal/ggbn_portal/trunk/ into "my_portal" (for example with svn checkout).

Configuration

TBD yii, Apache, rights, SOLR (SOLR config files also & Java)
First, let's say we want to access the portal under the URL http://localhost/ggbn_portal.

Check that the Apache user has write permissions on the whole "my_portal" folder.
Edit the file "my_portal/frontend/config/main.php".
- Configure the portal URL: $baseUrl = str_replace ( '/SUBDIR_AFTER_YOUR_WEB_FOLDER/my_portal/frontend/web', '/ggbn_portal', (new Request ())->getBaseUrl () );
- Configure the MySQL database credentials

    'db' => [
        'class' => '\yii\db\Connection',
        'dsn' => 'mysql:host=IP;dbname=DBNAME',
        'username' => 'USER',
        'password' => 'PASS',
        'charset' => 'utf8'
    ],

Edit the file "my_portal/frontend/config/params.php".
- e-mails
  - Configure the admin e-mail 'adminEmail' => 'ADMIN@DOMAIN',. The admin will get error/bug messages.
  - Configure the support e-mail(s) 'supportEmail' => ['MAIL1','MAIL2'..],.
  - Configure the feedback e-mail(s) 'feedbackMail' => ['MAIL1','MAIL2'..],. They will get the messages sent with the feedback function.
  - Configure the no-reply e-mail (the sender address for e-mails sent to the users) 'noreplyMail'=>'MAIL'
- SOLR cores: set the names of the search, preview and details cores

    $searchname="SEARCH_CORE";
    $previewname="PREVIEW_CORE";
    $detailsname="DETAILS_CORE";


There is also a core for the statistics pages (ggbn_stats).

Set the temporary folder 'tmpFolder'=>'TEMP_FOLDER'. Check the access rights: write access is required.
Configure the options
'useLogin'=>true_OR_false: whether the login is enabled or disabled
'noBack'=>true_OR_false: a special backbone is used
'shopping'=>true_OR_false: whether the shopping system is enabled or disabled
'newsService'=>true_OR_false: uses the GGBN news services. If it has to be used with a different wiki news, edit the SiteController.php
'viewsCountsService'=>true_OR_false: uses the GGBN specific views containing some counts (number of species, DNA, ...). See the GGBN specific configuration for the SQL statements.
'displayMore'=>true_OR_false: displays the categories with internal links on the details page
'displayExternals'=>true_OR_false: displays the categories with external links (GBIF, BOLD, NCBI...) on the details page
'ABCDdownloable'=>true_OR_false: whether ABCD files are downloadable
'useAnnosys'=>true_OR_false: links to AnnoSys (annotation system) on the details page
'annosysAP'=>'https://annosys.bgbm.fu-berlin.de/AnnoSys/AnnoSys?providerURL=': root URL for AnnoSys
'siteName'=>'my_portal': the portal name (used in the URL)
'siteTitle'=>'GGBN Portal': the default site name/title
'catalogNumberClickable'=>true: whether the catalogNumber/UnitID is clickable on the details page

Edit the Apache configuration file, for example:

Alias /ggbn_portal "/var/www/my_portal/frontend/web"
<Directory "/var/www/my_portal/frontend/web">
   Options All
   AllowOverride All
</Directory>

Personalisation

If you want to change the icon (favicon), it is located in frontend/web.

If you want to change the slider images on the home page, they are located in frontend/web/images/slider. You will then have to modify the view in frontend/views/site/index.php to load the pictures you chose.

If you want to change the menu icons, they are located in frontend/web/images/icons.



The CSS and JS files are also in the frontend/web folder.

Adding new search fields or fields to display

Adding them in SOLR

Edit the schema.xml file in SOLR and add the new fields you want to have:
- If you want to search for them, edit the file for the search core, *_search/conf/schema.xml.
- If you want to display them in the preview, edit the file for the preview core, *_preview/conf/schema.xml.
- If you need them on the details page, edit the file for the details core, *_details/conf/schema.xml.
Generally: a field you want to search for (search core) has to be indexed, but does not have to be stored. A field you want to display (preview/details) has to be stored, but does not have to be indexed. The tripleidstoreid or the occurrenceid *always* has to be indexed.
Restart Tomcat.

Important: always restart Tomcat (service tomcat8 restart) when you modify the schema.xml file.
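
The indexed/stored rule can be summarised in a short sketch. The Java snippet below (Java to match the indexer code) prints the schema.xml field declaration that each core would need for a new field. The field name recordbasis, the type string and the class SchemaFieldSketch are illustrative assumptions, not taken from the GGBN SOLR configuration files.

```java
// Hypothetical helper: renders the <field> declaration for one core.
final class SchemaFieldSketch {

    /** Search core fields are indexed; preview/details fields are stored. */
    static String fieldFor(String core, String name, String type) {
        boolean indexed = core.equals("search");
        boolean stored = !indexed; // preview and details only display the value
        return String.format(
                "<field name=\"%s\" type=\"%s\" indexed=\"%b\" stored=\"%b\"/>",
                name, type, indexed, stored);
    }

    public static void main(String[] args) {
        // One line per core, prefixed with the schema file it belongs to.
        for (String core : new String[] {"search", "preview", "details"}) {
            System.out.println("*_" + core + "/conf/schema.xml: "
                    + fieldFor(core, "recordbasis", "string"));
        }
    }
}
```

The sketch covers ordinary fields only. The tripleidstoreid or occurrenceid is the exception and stays indexed in every core.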

Edit the Java file corresponding to the core that needs to be updated. For example, if you want to index or store the recordbasis, add the following method:

// Adds the recordbasis of the occurrences listed in tripleidstoreids to the core.
private void addRecordbasis(int datasourceid) throws ClassNotFoundException, SQLException {
    System.out.println("RUN RECORDBASIS ");
    // Skip the query when there is nothing to look up (an empty IN () list is invalid SQL).
    if (tripleidstoreids.size() > 0) {
        String queryTID =
                "SELECT distinct fk_tripleidstoreid as tripleidstoreid,  "
                        + "IFNULL(recordbasis,'N/A') as recordbasis "
                        + " FROM recordbasis "
                        + " JOIN occurrence ON recordbasisid = fk_recordbasisid "
                        + " WHERE fk_tripleidstoreid in (" + StringUtils.join(tripleidstoreids, ", ") + ") "
                        + " AND fk_datasourceid =" + datasourceid
                        + " ORDER BY fk_tripleidstoreid ";
        // Run the query and push the values into the SOLR documents.
        update(queryTID, tripleidstoreids.size());
    }
    System.gc();
}

Then add a call to that method in the handleDocuments method:

    occurrencesDone.clear();
    multivalueForbidden=false;
    addRecordbasis(datasourceid);

Existing documents cannot be updated in SOLR in our case (that would only work if every field were stored). So first delete the old data from the core. Depending on how many records there are, you might have to delete the data for a specific repository only:
solrServer:solrPort/solr/coreName_SEARCH_OR_PREVIEW_OR_DETAILS/update?stream.body=<delete><query>YOUR_LIMITATION_FIELD:*</query></delete>&commit=true
This "limitation field" must be an indexed value in the core. If you don't have a lot of data, you can use <query>*:*</query>, as the generation of the new SOLR index will be fast enough (5-10 min per core for 180,000 records).
By default, the new field is indexed for each datasource. If it is still missing at the end for some datasources, or if you only want to index it for specific datasources, edit the loop in the main method: for (int i=0;i<166;i++)
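
The delete request described above contains XML inside a URL, so it has to be URL-encoded when it is sent from a script instead of typed into a browser. The following is a minimal sketch of how such a URL could be assembled. The class SolrDeleteUrl, the host and port http://localhost:8983 and the core name ggbn_search are hypothetical placeholders; replace them with your own SOLR server and core.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the delete-by-query URL for one core.
final class SolrDeleteUrl {

    /**
     * @param solrBase scheme, host and port of the SOLR server (assumed value in main)
     * @param core     name of the search, preview or details core
     * @param query    SOLR query selecting the documents to delete
     */
    static String build(String solrBase, String core, String query) {
        // Escape XML special characters so the query cannot break the <delete> message.
        String safeQuery = query.replace("&", "&amp;")
                                .replace("<", "&lt;")
                                .replace(">", "&gt;");
        String body = "<delete><query>" + safeQuery + "</query></delete>";
        // URL-encode the XML body (the Charset overload requires Java 10 or later).
        String encoded = URLEncoder.encode(body, StandardCharsets.UTF_8);
        return solrBase + "/solr/" + core + "/update?stream.body=" + encoded + "&commit=true";
    }

    public static void main(String[] args) {
        // "*:*" deletes everything in the core; use an indexed field to limit the deletion.
        System.out.println(build("http://localhost:8983", "ggbn_search", "*:*"));
    }
}
```

Passing a query on an indexed field instead of *:* limits the deletion to one repository, as described above.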

Adding them in the portal

Edit the SearchController file (controllers). Additions can be made in the CommonSearchController; deletions only in the specific SearchController if you have several portals running in parallel with the same installation.
If it's a dropdown list, add the name of the field to the variable $knownLParams.
If it's a simple term or a suggestion list, add the name of the field to the variable $knownParams.
Edit the SearchForm file (models) (same rule as above) and edit the variable $_parameters. A field can be a text, a select (dropdown list), a suggest, a radio or a range. Also add a public variable with the same name (e.g. public $sampleavailability;) and an entry such as

    'sampleavailability' => [
        'type' => 'radio',
        'name' => 'Sample availability (loan)',
        'config' => [
            'size' => 285,
            'maxlength' => 150,
            'name' => 'sampleavailability'
        ],
        'hideable' => 'true'
    ]

The field is then displayed automatically in the search form. The hideable parameter specifies whether it is displayed by default or only in the accordion.
If needed, edit the attributeLabels function, e.g. 'sampleavailability' => 'Sample availability (loan)'

Edit the listView.php file (views) if the field has to be displayed there, or the recordViewTabs.php file if it has to be displayed on the details page. Here again, add it to $suggestFields, $defaultFields, $selectFields or $buttonFields, depending on its type.
Edit the file CreateFormManager (vendor) only if it's a radio button (look at how it's done for cites and sampleavailability).
Edit the file SOLRQueryManager (vendor). For example, if the query has to use the case-insensitive value in SOLR, do as for

          case 'fullScientificName' :
          return $field . '_nc:' . $value;
          break;

 If it's case sensitive,

          case 'collectioncode' :
          return $field . ':"' . $value . '"';
          break;
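
The two cases follow one pattern: a case-insensitive search queries the companion field with the _nc suffix, while a case-sensitive search quotes the value for an exact match. The sketch below restates that pattern in Java (the language of the indexer) so that it can be tried in isolation. The class SolrClauseSketch and the default branch are assumptions for illustration; the portal itself does this in the PHP file SOLRQueryManager.

```java
// Hypothetical Java rendering of the two PHP cases shown above.
final class SolrClauseSketch {

    /** Builds a single field:value clause for a SOLR query string. */
    static String clause(String field, String value) {
        switch (field) {
            case "fullScientificName":
                // Case-insensitive: query the companion field with the _nc suffix.
                return field + "_nc:" + value;
            case "collectioncode":
                // Case-sensitive: quote the value for an exact match.
                return field + ":\"" + value + "\"";
            default:
                // Assumption: any other field is passed through unquoted.
                return field + ":" + value;
        }
    }

    public static void main(String[] args) {
        System.out.println(clause("fullScientificName", "quercus"));
        System.out.println(clause("collectioncode", "DNA Bank"));
    }
}
```

Like the PHP cases, the sketch does not escape quotes or other special characters in the value.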

Running multiple GPS portals in parallel

Example of two portals running in parallel: the specific files are located in "ggbnportal" and "worldfloraonline". All other folders are used by both portal instances.

In common/config/bootstrap: the names of all folders that Yii2 should go through (for imports and the like).
In environments/index.php: the names of the folders that will be exposed, and therefore where Yii should be authorized to write logs and handle the CSS files.

GGBN specific configuration

Several views are required for the counts box on the start page.

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `counts` AS
    select 
        count(`ro`.`occurrenceid`) AS `counts`, 'DNA' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'DNA') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`, 'Tissues' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'tissue') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`, 'Cultures' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'Culture') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`,
        'eVouchers' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'eVoucher') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`,
        'Specimens' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'specimen') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`, 'Unknown' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'unknown') 
    union select 
        count(`ro`.`occurrenceid`) AS `counts`, 'Enviros' AS `kind`
    from
        (`ggbn_index`.`rawoccurrence` `ro`
        join `ggbn_index`.`unitkind` `uk` ON ((`ro`.`fk_kindofunitid` = `uk`.`unitkindid`)))
    where
        (`uk`.`kindofunit_clean` = 'environmental sample') 
    union select 
        count(`grouped_fullscientificname`.`fullscientificname`) AS `counts`,
        'Taxa' AS `kind`
    from
        `ggbn_index`.`grouped_fullscientificname` 
    union select 
        count(distinct `families_view`.`family`) AS `counts`,
        'Families' AS `kind`
    from
        `ggbn_index`.`families_view` 
    union select 
        count(distinct `genera_view`.`genus`) AS `counts`,
        'Genera' AS `kind`
    from
        `ggbn_index`.`genera_view` 
    union select 
        count(distinct `species_view`.`species`) AS `counts`,
        'Species' AS `kind`
    from
        `ggbn_index`.`species_view`

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `families_view` AS
    select distinct
        `f`.`family` AS `family`
    from
        ((`backbone`.`family` `f`
        join `backbone`.`name` `n` ON ((`f`.`familykey` = `n`.`familyKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`gbifKey` = `n`.`acceptedKey`))) 
    union select distinct
        `f`.`family` AS `family`
    from
        ((`backbone_col`.`family` `f`
        join `backbone_col`.`name` `n` ON ((`f`.`familykey` = `n`.`familyKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`colKey` = `n`.`acceptedKey`))) 
    union select distinct
        `f`.`family` AS `family`
    from
        ((`backbone_ncbi`.`family` `f`
        join `backbone_ncbi`.`name` `n` ON ((`f`.`familykey` = `n`.`familyKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`ncbiKey` = `n`.`acceptedKey`)))

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `genera_view` AS
    select distinct
        `f`.`genus` AS `genus`
    from
        ((`backbone`.`genus` `f`
        join `backbone`.`name` `n` ON ((`f`.`genuskey` = `n`.`genusKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`gbifKey` = `n`.`acceptedKey`))) 
    union select distinct
        `f`.`genus` AS `genus`
    from
        ((`backbone_col`.`genus` `f`
        join `backbone_col`.`name` `n` ON ((`f`.`genuskey` = `n`.`genusKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`colKey` = `n`.`acceptedKey`))) 
    union select distinct
        `f`.`genus` AS `genus`
    from
        ((`backbone_ncbi`.`genus` `f`
        join `backbone_ncbi`.`name` `n` ON ((`f`.`genuskey` = `n`.`genusKey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`ncbiKey` = `n`.`acceptedKey`)))

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `grouped_fullscientificname` AS
    select 
        `identification`.`fullScientificName` AS `fullscientificname`
    from
        `identification`
    group by `identification`.`fullScientificName`

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `species_view` AS
    select distinct
        `n2`.`canonicalName` AS `species`
    from
        ((`backbone`.`name` `n`
        join `backbone`.`name` `n2` ON ((`n`.`acceptedKey` = `n2`.`namekey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`gbifKey` = `n`.`acceptedKey`)))
    where
        ((`n`.`rank` = 'SPECIES')
            and (`n2`.`rank` = 'SPECIES')) 
    union select distinct
        `n2`.`canonicalName` AS `species`
    from
        ((`backbone_col`.`name` `n`
        join `backbone_col`.`name` `n2` ON ((`n`.`acceptedKey` = `n2`.`namekey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`colKey` = `n`.`acceptedKey`)))
    where
        ((`n`.`rank` = 'SPECIES')
            and (`n2`.`rank` = 'SPECIES')) 
    union select distinct
        `n2`.`canonicalName` AS `species`
    from
        ((`backbone_ncbi`.`name` `n`
        join `backbone_ncbi`.`name` `n2` ON ((`n`.`acceptedKey` = `n2`.`namekey`)))
        join `ggbn_index`.`identification` `i` ON ((`i`.`ncbiKey` = `n`.`acceptedKey`)))
    where
        ((`n`.`rank` = 'SPECIES')
            and (`n2`.`rank` = 'SPECIES'))

The institution has to be filled in in bio_datasource (city and institution, for the rows where harvester_factory is like '%Harvester%'), but first the datasource has to be added (metadata update). It will be used for the repository/registry field in the GGBN portal.
The parentInstitution table is used for the contacts and must be filled manually.
The GGBN registry is based on the NCD software and requires an additional database connection in the config file. Contact us for more information.
For user management (login/shopping), a few extra tables are required. Follow the instructions from the file common/controllers/CommonPermissionController.php to create and populate them if you do not already have them from the SQL-Dump. You will also need an extra view for the statistics:

CREATE 
    ALGORITHM = UNDEFINED 
    DEFINER = `root`@`localhost` 
    SQL SECURITY DEFINER
VIEW `shopping_view` AS
    select distinct
        `oc`.`fk_tripleidstoreid` AS `tripleidstoreid`,
        `bs`.`id` AS `data_source_id`,
        `pit`.`parentInstitutionID` AS `parentInstitutionID`,
        `pit`.`institutionShort` AS `institutionShort`,
        `pit`.`logoURL` AS `logoURL`
    from
        ((`bio_datasource` `bs`
        join `occurrence` `oc` ON ((`bs`.`id` = `oc`.`fk_datasourceid`)))
        join `parentInstitution` `pit` ON ((`bs`.`fk_parentInstitutionid` = `pit`.`parentInstitutionID`)))

Debug

If something does not work as expected, have a look in the catalina.out file (generated by Tomcat) on the machine where SOLR is running. If Tomcat is running through XAMPP, you might have a part of the logs in the catalina.bat window. You will see the last queries sent to SOLR at the end of the file.
The Yii log file runtime/log/app.log might also help. It is located in the Yii folder on the machine running the portal (e.g. advanced/runtime....).
