Version 8 (modified by Twan Goosen, 4 years ago)

Testing harvest + import of individual endpoints

Objective: harvest a specific endpoint and include the harvested result in the alpha instance of the VLO for testing purposes.

Running the test harvest

  1. Clone https://github.com/clarin-eric/test-oai-harvest in a location with write access on the server (or locally if you prefer)
  2. Prepare a harvester XML configuration file for the harvesting scenario to test. The repository contains several configuration files that can be used as an example or template
  3. Optionally: customise configuration (see the test-oai-harvest project's README for instructions)
  4. Finally, run the harvest.

    The following example assumes that you have made a configuration file my-provider.xml.
    In this case (assuming default configuration settings), output will go to test-oai-harvest/output/my-provider.

    It's probably a good idea to run the script in the background:
    cd test-oai-harvest
    nohup ./harvest.sh my-provider &
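
    A harvest started this way keeps running after you log out, so it helps to keep its log and process ID at hand. The following is a minimal sketch, not part of the test-oai-harvest project: run_in_background is a hypothetical helper name and harvest.log is an assumed log file name.

```shell
# Hypothetical helper (not part of test-oai-harvest): start a command with
# nohup, send its stdout and stderr to a log file, and print its process ID.
# $1: log file; remaining arguments: the command to run
run_in_background() {
  log_file="$1"
  shift
  nohup "$@" > "$log_file" 2>&1 &
  echo "$!"
}

# Possible usage from within the test-oai-harvest directory:
#   pid="$(run_in_background harvest.log ./harvest.sh my-provider)"
#   tail -f harvest.log                          # follow the harvest output
#   kill -0 "$pid" 2>/dev/null && echo running   # check whether it is still running
```

    Redirecting the output explicitly means the log location does not depend on nohup's default nohup.out file.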
    

Importing the result into the alpha VLO

After the harvest has completed, you can import the results into the VLO.

  1. Make the harvest output (e.g. test-oai-harvest/output/my-provider) available in a location that can be read by the deploy user on the VLO host.
    We will call this location $HARVEST_OUTPUT_SOURCE (e.g. HARVEST_OUTPUT_SOURCE=/tmp/output/my-provider)
  2. Confirm that there exists a directory ${HARVEST_OUTPUT_SOURCE}/output/test/results
    (if not, you may be able to find and use ${HARVEST_OUTPUT_SOURCE}/workdir/test-test/results)
  3. Become the deploy user (sudo su deploy)
  4. Copy or move the result directory to the parent directory of the test data roots:
    cp -r "${HARVEST_OUTPUT_SOURCE}/output/test" "/home/deploy/vlo-data/test/my-provider"
    
  5. Add a dataroot definition to /home/deploy/vlo/dataroots/testing-dataroots.xml. Only originName and rootFile need to be customised for the specific case. For example:
    <dataRoots>
        ...
        <DataRoot>
            <originName>My provider</originName>
            <rootFile>/srv/vlo-data/test/my-provider/results/cmdi/</rootFile>
            <prefix>http://alpha-vlo.clarin.eu/data/</prefix>
            <tostrip>/srv/vlo-data/</tostrip>
            <deleteFirst>false</deleteFirst>
        </DataRoot>
        ...
    </dataRoots>
    
  6. Restart the VLO compose project:
    (cd /home/deploy && ./control.sh vlo restart)
    
  7. After restart, you can run an import:
    (cd /home/deploy && ./control.sh vlo run-import)
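
The check in step 2 and the copy in step 4 can be combined in a small script. This is only a sketch under the assumptions stated in those steps: find_results_parent is a hypothetical helper name, and the two candidate locations are the ones mentioned in step 2.

```shell
# Hypothetical helper: print the directory that contains the harvest
# "results" directory, trying the two locations mentioned in step 2.
# $1: harvest output location (HARVEST_OUTPUT_SOURCE in the steps above)
find_results_parent() {
  src="$1"
  for candidate in "$src/output/test" "$src/workdir/test-test"; do
    if [ -d "$candidate/results" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "No results directory found under $src" >&2
  return 1
}

# Possible usage as the deploy user (target path taken from step 4):
#   parent="$(find_results_parent "$HARVEST_OUTPUT_SOURCE")" \
#     && cp -r "$parent" "/home/deploy/vlo-data/test/my-provider"
```

Because the function returns a non-zero status when neither location exists, the copy in the usage example is skipped instead of creating an incomplete data root.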