20180904

Timestamp: 09/04/18 14:59:53
Author: Twan Goosen
Comment: Notes: curation section

VLO/Meetings/20180904

-                      v5
+                      v6
 Critical tasks:
 * De-duplication ([https://github.com/clarin-eric/VLO/issues/113 #113])
+* Decoupling of importer logic from importer application 'shell', specifically to make this available to the curation module ([https://github.com/clarin-eric/VLO/issues/118 #118], also [https://github.com/clarin-eric/VLO/issues/50 #50])
+* Landing page prominence ([https://github.com/clarin-eric/VLO/issues/153 #153]), also in search results (part of [https://github.com/clarin-eric/VLO/issues/164 #164])
+ * Front end and back end aspects can be developed more or less separately
+ * At least for testing/exploration, also allow for grouping by other fields than the custom 'similarity signature'
+* Decoupling of importer logic from importer application 'shell', specifically to make this available to the curation module ([https://github.com/clarin-eric/VLO/issues/181 #181], also [https://github.com/clarin-eric/VLO/issues/50 #50])
+* Landing page prominence ([https://github.com/clarin-eric/VLO/issues/153 #153])
+ * Also in search results (part of [https://github.com/clarin-eric/VLO/issues/164 #164])
  * Some related issues:
     * [https://github.com/clarin-eric/VLO/issues/173 #173]: Resource PID prominence, actionability
     * [https://github.com/clarin-eric/VLO/issues/184 #184]: Record PID prominence
+   * [https://github.com/clarin-eric/VLO/issues/173 #173]: Resource PID prominence, actionability
+   * [https://github.com/clarin-eric/VLO/issues/184 #184]: Record PID prominence
 Furthermore some minor, non-blocking tasks including Mopinion based reporting/feedback and some small bug fixes and UI/UX improvements (see milestone)
+Additionally some minor, non-blocking tasks including Mopinion based reporting/feedback and some small bug fixes and UI/UX improvements (see milestone)
 === Curation ===
+==== Link checker ====
+A first full link checking run has been completed.
+MPI-PL have reported high server loads due to 'aggressive' polling by the link checker (see comments of #1052). A delay is being introduced and another full link checking operation will be carried out.
+The results and performance of the first few test runs will be analysed and reported on (informally) at CLARIN 2018. At a later stage, the intention is to implement support for parsing and respecting `robots.txt` directives (note: a [https://github.com/pandzel/RobotsTxt 3rd party Java library] exists for this). Exact parameters, such as the upper bound for acceptable request intervals, have yet to be determined.
+Some improvements will be implemented with respect to the presentation of the results, both overall and per collection. Aggregated counts per 'response category' (probably based on response code, but also covering timeouts and other response issues) will be provided. Sample URLs for large collections (>50 records) should be selected to be informative and representative, and should primarily give a sample of the problematic cases. In the long term a view on the entire URL set should be available. Currently the full report XML is available for closer inspection.
 === Beyond 4.6 ===
 …
 * Aiming for 4.6.0 release 2018-11-10
 * Next joint meeting: at CLARIN 2018
+* Wolfgang will visit the CLARIN office in Nijmegen on 2018-09-07 to work on [https://github.com/clarin-eric/VLO/issues/181 #181]