Changes between Version 3 and Version 4 of OAIHarvester


Timestamp:
11/04/15 09:30:45 (9 years ago)
Author:
Menzo Windhouwer
Comment:

More on future plans

  • OAIHarvester

    v3 v4  
    130130
    131131Planning and roadmap:
    132 * switch to the !ListRecords scenario
    133 * get rid of OCLC harvester2 library, which prevents timeouts etc. for specific endpoints
     132* switch to the !ListRecords scenario, where batches of records are requested from the providers
     133* get rid of the OCLC harvester2 library, which prevents setting specific timeouts etc. per endpoint
    134134* get rid of always building a DOM, which blows up memory consumption
    135135* create a new OAI harvester viewer
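The first and third roadmap items can be illustrated together. The following is only a minimal sketch, not the harvester's actual code: it assumes a provider response is available as a byte stream, and the function name `scan_list_records` and the sample data are hypothetical. It reads one !ListRecords batch with a streaming parser instead of a full DOM and returns the resumption token needed to request the next batch.

```python
import io
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 protocol elements, in Clark notation.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def scan_list_records(stream):
    """Stream over one ListRecords response (hypothetical helper).

    Returns (identifiers, resumption_token). The token is None when the
    response carries no token or an empty one, i.e. the list is complete.
    """
    identifiers = []
    token = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == OAI_NS + "record":
            ident = elem.find(f"{OAI_NS}header/{OAI_NS}identifier")
            if ident is not None and ident.text:
                identifiers.append(ident.text.strip())
            # Discard the record's children so memory use stays roughly flat.
            elem.clear()
        elif elem.tag == OAI_NS + "resumptionToken":
            token = (elem.text or "").strip() or None
    return identifiers, token


# Made-up sample response with two records and a token for the next batch.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>batch-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, token = scan_list_records(io.BytesIO(SAMPLE))
```

A real harvester would repeat the request with the returned token until it comes back as None, and would write each record out rather than only collecting identifiers.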
     136
     137=== The new OAI harvester viewer ===
     138
     139This new viewer should provide some advantages over the current viewer:
     140
     141* paged listing of records harvested
     142* jump from a record to the OAI request it originates from (in the !ListRecords scenario a request can contain multiple records, which the current viewer cannot handle)
     143* keep statistics over past runs, so a warning can be sent when a 'sudden' drop in the number of records occurs
     144* can also provide access to the archived harvests
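The statistics item could be sketched as follows. This is an illustration under stated assumptions, not a planned design: the function name `sudden_drop`, the use of the median as baseline, and the 50% threshold are all hypothetical choices.

```python
from statistics import median


def sudden_drop(history, current, threshold=0.5):
    """Return True when `current` is below `threshold` times the median
    record count of past runs (hypothetical rule; the threshold is an assumption).
    """
    if not history:
        # No past runs yet, so there is nothing to compare against.
        return False
    baseline = median(history)
    return current < threshold * baseline


# Made-up record counts from three earlier harvests of one endpoint.
past_counts = [1000, 1020, 990]
warn = sudden_drop(past_counts, 400)
```

The median is used here so that a single unusual earlier run does not shift the baseline much; a mean or a comparison with only the previous run would be equally easy to swap in.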
     145
     146Additionally, the viewer can also be an access point for tools that assess the quality of the CMD records:
     147* run XSD validation on the records/a record
     148* run Schematron rules against the records/a record, e.g., to check against best practices
     149* run the VLO importer to see if the records/a record would be included in the VLO and which facet values it would deliver
     150* check the profiles used, e.g., in CMDI 1.2 one could check whether deprecated profiles are used, or, already now, how well the profiles cover the VLO facets
     151* calculate a quality score (see [http://www.lrec-conf.org/proceedings/lrec2014/pdf/1011_Paper.pdf LREC 2014 paper])
     152
     153These tools could be run by default on all records, on a specific record selected for checking, or on an uploaded record.
     154
     155Closer to the OAI domain, we could also trigger a run against an OAI validator, e.g., [http://validator.oaipmh.com/], and/or allow triggering a harvest for a specific endpoint. The latter might need a separate setup/installation so that it does not interfere with the periodic CLARIN harvest.
    136156
    137157----