Changes between Version 5 and Version 6 of VLO/Meetings/20180904


Timestamp:
09/04/18 14:59:53 (6 years ago)
Author:
Twan Goosen
Comment:

Notes: curation section

  • VLO/Meetings/20180904

    v5 v6  
    4343Critical tasks:
    4444* De-duplication ([https://github.com/clarin-eric/VLO/issues/113 #113])
    45 * Decoupling of importer logic from importer application 'shell', specifically to make this available to the curation module ([https://github.com/clarin-eric/VLO/issues/118 #118], also [https://github.com/clarin-eric/VLO/issues/50 #50])
    46 * Landing page prominence ([https://github.com/clarin-eric/VLO/issues/153 #153]), also in search results (part of [https://github.com/clarin-eric/VLO/issues/164 #164])
     45 * Front end and back end aspects can be developed more or less separately
     46 * At least for testing/exploration, also allow for grouping by other fields than the custom 'similarity signature'
     47* Decoupling of importer logic from importer application 'shell', specifically to make this available to the curation module ([https://github.com/clarin-eric/VLO/issues/181 #181], also [https://github.com/clarin-eric/VLO/issues/50 #50])
     48* Landing page prominence ([https://github.com/clarin-eric/VLO/issues/153 #153])
     49 * Also in search results (part of [https://github.com/clarin-eric/VLO/issues/164 #164])
    4750 * Some related issues:
    48     * [https://github.com/clarin-eric/VLO/issues/173 #173]: Resource PID prominence, actionability
    49     * [https://github.com/clarin-eric/VLO/issues/184 #184]: Record PID prominence
     51   * [https://github.com/clarin-eric/VLO/issues/173 #173]: Resource PID prominence, actionability
     52   * [https://github.com/clarin-eric/VLO/issues/184 #184]: Record PID prominence
    5053
    51 Furthermore some minor, non-blocking tasks including Mopinion based reporting/feedback and some small bug fixes and UI/UX improvements (see milestone)
     54Additionally, some minor, non-blocking tasks, including Mopinion-based reporting/feedback and some small bug fixes and UI/UX improvements (see milestone)
    5255
    5356=== Curation ===
     57
     58==== Link checker ====
     59
     60A first full link checking run has been completed.
     61
     62MPI-PL have reported high server loads due to 'aggressive' polling by the link checker (see comments of #1052). A delay is being introduced and another full link checking operation will be carried out.
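A minimal sketch of the kind of per-host delay described above, written in Python purely for illustration (it is not the link checker's code). The class name `HostThrottle`, the `min_interval` parameter and the injectable `clock`/`sleep` hooks are all assumptions made for this example.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum interval between requests to the same host.

    Hypothetical helper: the name and parameters are illustrative only.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests per host
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last_request = {}           # host -> time of the last request

    def wait(self, url):
        """Pause if the URL's host was contacted too recently.

        Returns the number of seconds paused (0.0 if no pause was needed).
        """
        host = urlsplit(url).hostname or ""
        now = self._clock()
        last = self._last_request.get(host)
        pause = 0.0
        if last is not None:
            pause = max(0.0, self.min_interval - (now - last))
        if pause > 0:
            self._sleep(pause)
        # Record when this request is actually allowed to go out.
        self._last_request[host] = now + pause
        return pause
```

Calling `throttle.wait(url)` before each request spreads the load on any single server, while requests to different hosts are not held back by each other.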
     63
     64The results and performance of the first few test runs will be analysed and reported on (informally) at CLARIN 2018. At a later stage, the intention is to implement support for parsing and respecting `robots.txt` directives (note: a [https://github.com/pandzel/RobotsTxt 3rd party Java library] exists for this). Exact parameters, such as an upper bound for acceptable request intervals, have to be determined.
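To illustrate what respecting `robots.txt` could involve, here is a rough sketch using Python's standard `urllib.robotparser` module. It does not use the Java library mentioned above. The function names, the user agent string in the usage note and both delay constants are hypothetical; the real bounds are still to be determined.

```python
from urllib.robotparser import RobotFileParser

DEFAULT_DELAY_SECONDS = 1.0   # hypothetical fallback when no Crawl-delay is given
MAX_DELAY_SECONDS = 30.0      # hypothetical upper bound on an accepted interval


def build_policy(robots_txt):
    """Parse the text of a robots.txt file into a policy object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def request_delay(parser, user_agent):
    """Return the delay to apply, capped at the (assumed) upper bound."""
    delay = parser.crawl_delay(user_agent)
    if delay is None:
        return DEFAULT_DELAY_SECONDS
    return min(float(delay), MAX_DELAY_SECONDS)


def may_check(parser, user_agent, url):
    """Return True if robots.txt allows this user agent to request the URL."""
    return parser.can_fetch(user_agent, url)
```

For example, given a file containing `User-agent: *`, `Crawl-delay: 5` and `Disallow: /private/`, `request_delay(policy, "vlo-link-checker")` yields 5.0 and URLs under `/private/` are skipped.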
     65
     66Some improvements will be implemented with respect to the presentation of the results, both overall and per collection. Aggregated counts per 'response category' (probably based on response code, but also timeouts and other response issues) will be provided. Sample URLs for large collections (>50 records) should be selected to be informative and representative, and should primarily give a sample of the problematic cases. In the long term, a view on the entire URL set should be available. Currently the full report XML is available for closer inspection.
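The aggregation described above could look roughly like the following Python sketch. The category names, the `(collection, url, status, error)` tuple layout and the rule that anything other than `"ok"` counts as problematic are assumptions for illustration, not the report's actual schema.

```python
from collections import Counter, defaultdict


def categorize(status, error=None):
    """Map one check outcome to a coarse response category (hypothetical names)."""
    if error is not None:
        return error              # e.g. "timeout"
    if status is None:
        return "no_response"
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "other"


def summarize(results):
    """Count categories overall and per collection.

    `results` is an iterable of (collection, url, status, error) tuples.
    """
    overall = Counter()
    per_collection = defaultdict(Counter)
    for collection, _url, status, error in results:
        category = categorize(status, error)
        overall[category] += 1
        per_collection[collection][category] += 1
    return overall, per_collection


def sample_problem_urls(results, collection, limit=5):
    """Return up to `limit` URLs from a collection whose category is not "ok"."""
    problems = [
        url
        for coll, url, status, error in results
        if coll == collection and categorize(status, error) != "ok"
    ]
    return problems[:limit]
```

Taking the first `limit` problematic URLs is the simplest possible sampling; a more representative selection (for instance, one URL per category) would be a natural refinement.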
    5467
    5568=== Beyond 4.6 ===
     
    6073* Aiming for 4.6.0 release 2018-11-10
    6174* Next joint meeting: at CLARIN 2018
     75* Wolfgang will visit the CLARIN office in Nijmegen on 2018-09-07 to work on [https://github.com/clarin-eric/VLO/issues/181 #181]