Changes between Initial Version and Version 1 of Taskforces/Curation/Meeting20160511


Timestamp:
06/13/16 09:23:58 (8 years ago)
Author:
davor.ostojic@oeaw.ac.at

  • Taskforces/Curation/Meeting20160511

    v1 v1  
     1[[PageOutline]]
     2
     3= Curation TF - Centre Meeting 10-12 May, 2016 =
     4{{{
     5#!html
     6 <table style="border-collapse: collapse;" border="1">
     7  <tr>
     8    <td>Date</td>
     9    <td>11. 05. 2016</td>
     10  </tr>
     11  <tr>
     12    <td>Time</td>
     13    <td>15:10 - 17:00</td>
     14  </tr>
     15  <tr>
     16    <td>Location</td>
     17    <td>SURF Offices, Utrecht, NL</td>
     18  </tr>
     19  <tr>
     20    <td rowspan="8">Participants</td>
     21    <td>Claus Zinn</td>
     22  </tr>
     23  <tr><td>Davor Ostojic</td></tr>
     24        <tr><td>Francesca Frontini</td></tr>
     25        <tr><td>Henk van den Heuvel</td></tr>
     26        <tr><td>Menzo Windhouwer</td></tr>
     27        <tr><td>Mitchell Seaton</td></tr>
     28        <tr><td>Neeme Kahusk</td></tr>
     29        <tr><td>Pavel Stranak</td></tr>
     30</table>
     31}}}
     32
     33= Agenda =
     34
     35* Curation Module
     36* Availability / License Facet
     37* Normalisation Workflow
     38* !ResourceClass Facet
     39
     40
     41= Notes =
     42
     43Remark: This document gives an overview of the topics discussed during the meeting.
     44The order of the topics in this document may not correspond to their order in the meeting.
     45Some additional resources may have been added.
     46
     47== [wiki:"Curation Module"] ==
     48
     49Menzo: add a column to the collection view that lists the profiles with their scores, and link each profile to its view page. This will help the Meertens Institute identify critical profiles.
     50
     51Pavel: OLAC has a similar tool, and its presentation of results is very good. It offers hints on how to improve the score.
     52
     53Davor: there is a plan to develop such an application ([https://moqups.com/ostojic.davor@gmail.com/L9GiyDLr mockups])
     54
     55Davor: For profile assessment, both the XSD and the XML of the profile are needed. What happens when the schemaLocation attribute points to a different location than the component registry? In that case only the XSD is available, but the URL for fetching the XML can be reconstructed from the profile ID in the CMDI header. Which schema should be used then: the one from the component registry or the one originally specified? (example: http://hdl.handle.net/11858/00-203Z-0000-0027-536B-3)
     56
     57Menzo: this situation should be considered invalid, since the schemas could be out of sync. Penalise the score for it. The schema from the component registry should be used.
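
The check discussed above could look roughly like the following Python sketch. It is not the curation module's actual code: the registry base URL, the `/xsd` suffix, the digit-only profile ID pattern and the function name `check_schema_location` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
# Assumption: base URL of the component registry REST interface.
REGISTRY_BASE = "https://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/"
# Assumption: profile IDs look like clarin.eu:cr1:p_<digits>.
PROFILE_ID = re.compile(r"clarin\.eu:cr\d+:p_\d+")


def check_schema_location(cmdi_xml, registry_base=REGISTRY_BASE):
    """Compare the declared schema location with the registry URL for the profile."""
    root = ET.fromstring(cmdi_xml)

    # xsi:schemaLocation holds (namespace, location) pairs; take the first location.
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    declared = tokens[1] if len(tokens) >= 2 else None

    # Find MdProfile by local name so the CMDI namespace version does not matter.
    md_profile = next(
        (el.text or "" for el in root.iter()
         if el.tag.rsplit("}", 1)[-1] == "MdProfile"),
        "",
    )
    match = PROFILE_ID.search(md_profile)
    profile_id = match.group(0) if match else None

    # Reconstruct the registry schema URL from the profile ID in the header.
    registry_xsd = registry_base + profile_id + "/xsd" if profile_id else None

    in_sync = bool(
        declared and profile_id
        and declared.startswith(registry_base)
        and profile_id in declared
    )
    return {
        "profile_id": profile_id,
        "declared": declared,
        "registry_xsd": registry_xsd,
        "penalise": not in_sync,  # schema location differs from the registry
    }
```

A record whose `penalise` flag is set would then be validated against `registry_xsd` rather than the declared location, in line with Menzo's suggestion.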
     58
     59Davor: There are cases where the <!MdProfile> element contains a URL (a component registry one, by the way) instead of a profile ID (clarin.eu:cr1:p_...). It is hard to handle all of them.
     60
     61Menzo: There are no strict rules on what is allowed in the <!MdProfile> element, but the ID is preferred. The tool should report such cases. Schematron could be used for this kind of validation.
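
As an illustration of the reporting step, the following Python sketch classifies the content of the profile element. It stands in for the Schematron rule mentioned above and is not taken from any existing validator; the function name `classify_md_profile` and the digit-only ID pattern are assumptions.

```python
import re

# Assumption: profile IDs look like clarin.eu:cr1:p_<digits>.
PROFILE_ID = re.compile(r"clarin\.eu:cr\d+:p_\d+")


def classify_md_profile(value):
    """Return (kind, profile_id) for the text content of the MdProfile element."""
    value = (value or "").strip()
    if PROFILE_ID.fullmatch(value):
        return ("id", value)             # preferred form: a bare profile ID
    match = PROFILE_ID.search(value)
    if match:
        return ("url", match.group(0))   # ID embedded in a URL: usable, but report it
    return ("invalid", None)             # no profile ID recoverable
```

Only the `"id"` case would pass without a remark; the `"url"` case still yields a usable profile ID, so the tool can continue the assessment while reporting the issue.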
     62
     63Davor: Should Schematron be considered for the curation module?
     64
     65Menzo: why not, it is an ISO standard. The [[browser:CMDIValidator|CMDI Validator]] by Oliver Schonefeld already uses it. Validation rules can be externalised.
     66
     67Davor: Score calculation still needs to be coded (maybe this can be externalised as well)
     68
     69
     70== [wiki:Taskforces/Curation/ValueNormalization/License "Availability / License Facet"] ==
     71
     72Participants were informed about what has changed:
     73* [https://docs.google.com/document/d/13__Kde_iOxvZsYpKZkmFuint5whzPda806VENFvxHUc/edit?usp=sharing google doc]
     74* [wiki:Taskforces/Curation/Meeting20151015 CAC2015]
     75
     76
     77Pavel: it is very confusing which values to use for availability / license. It would be nice to have guidelines.
     78
     79Claus: CLARIN provides a licence category calculator.
     80
     81Pavel reported problems with LINDAT’s data regarding this facet. It needs to be investigated whether the VLO importer misinterprets the original values or whether LINDAT does not specify the necessary fields. It seems that the curation instance of the VLO (with the old mappings) gives the correct result.
     82
     83The agreement is to fix LINDAT’s data together and then use the experience gathered to draft guidelines or best practices for others.
     84
     85
     86Email from Florian Schiel sent on 29/04/2016:
     87
     88''The module report a missing facet 'rightsHolder' on instances of profile 'media-corpus-profile', but the profile (and the instances I tested) contains the element <Owner> which is linked to concept http://hdl.handle.net/11459/CCR_C-2956_519a4aab-2f76-0fd3-090e-f0d6b81a7dbb''
     89
     90
     91Email from Hanna Hedeland sent on 02/05/2016:
     92
     93''Since I put this facet in the facetConcepts.xml sometime during the VLO-TF time, I was quite sure we should have this information, but in fact it's now represented by http://hdl.handle.net/11459/CCR_C-2956_519a4aab-2f76-0fd3-090e-f0d6b81a7dbb in our profiles, and this concept is not in facetsConcepts. On the other hand, I didn't know that there was a decision to use this facet (listed among the test facets), but I won't tell anyone if we just include it ;) So, if it should be regarded, could you please add this forth concept to the list in facetsConcepts or discuss it at the next meeting or however things are handled right now?''
     94
     95__Ticket is issued: #931__
     96
     97Email from Hanna Hedeland sent on 09/05/2016:
     98
     99''I'm afraid at least some of the issues I described after the launch of the latest VLO release are still there, I think they could be defined as mapping problems?
     100Even if http://hdl.handle.net/11459/CCR_C-5439_98bb103d-476a-7f62-54b4-bf9de24d2229 is provided, there are still entries with double values for distribution type (which should not be possible?), e.g. for https://vlo.clarin.eu/record?q=hzsk&docId=http_58__47__47_hdl.handle.net_47_11022_47_0000-0000-534B-F.
     101It is not clear how to achieve that VLO entries get the laundry tag "PLAN", and entries with seemingly identical CMDI "legally relevant" metadata don't seem to receive the identical set of laundry tags (https://vlo.clarin.eu/record?q=hzsk&docId=http_58__47__47_hdl.handle.net_47_11022_47_0000-0000-631F-F and https://vlo.clarin.eu/record?q=hzsk&docId=http_58__47__47_hdl.handle.net_47_11022_47_0000-0000-5C5F-0).''
     102
     103__Menzo: we need to check mappings and importer.__
     104 
     105
     106Pavel: a superprofile should be considered, i.e. a profile containing mandatory fields such as author, title, license and resourceType, from which all other profiles are derived.
     107
     108== Normalisation workflow ==
     109
     110
     111Davor / Henk: How to discover values that haven’t been normalised?
     112
     113Menzo: Developers can query SOLR and obtain values.
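
A facet query of this kind could be sketched in Python as follows. The request parameters are standard Solr faceting parameters, but the Solr URL, the field name and both function names are hypothetical, and the set of already normalised values is assumed to come from the normalisation map.

```python
from urllib.parse import urlencode


def facet_query_url(solr_select_url, field):
    """Build a Solr query that returns every value of one facet field with counts."""
    params = {
        "q": "*:*",
        "rows": 0,          # no documents, only facet counts
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.limit": -1,  # do not truncate the list of values
    }
    return solr_select_url + "?" + urlencode(params)


def unmapped_values(solr_response, field, known_values):
    """Return facet values (with counts) that are missing from the normalisation map."""
    flat = solr_response["facet_counts"]["facet_fields"][field]
    # Solr's default JSON format lists facets as [value, count, value, count, ...].
    pairs = zip(flat[0::2], flat[1::2])
    return {value: count for value, count in pairs if value not in known_values}
```

The counts make it possible to deal with the most frequent non-normalised values first.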
     114
     115Davor: collect them during the import process.
     116
     117Davor: How do we maintain and update the controlled vocabularies and ensure that the VLO always uses the latest version?
     118
     119Menzo: Github might be a suitable solution. It gives a nice preview of CSV files. Push unmapped values to the file on Github (the maintainer will be notified automatically).
     120
     121Francesca: Github provides users with an in-browser editor.
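
The workflow discussed in this section (apply a CSV map during import, collect values without a mapping, add them to the file for the maintainer) might be sketched in Python as below. The two-column layout `original,normalised` and all function names are assumptions; pushing the updated file to the repository is left out.

```python
import csv
import io


def load_map(csv_text):
    """Read 'original,normalised' rows into a dict (the header row is skipped)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {row["original"]: row["normalised"] for row in rows}


def normalise(values, mapping):
    """Map values seen during import; collect those without a usable mapping."""
    normalised, unmapped = [], []
    for value in values:
        target = mapping.get(value)
        if target:
            normalised.append(target)
        else:
            normalised.append(value)  # keep the original value for now
            if value not in unmapped:
                unmapped.append(value)
    return normalised, unmapped


def append_unmapped(csv_text, unmapped):
    """Add unmapped values as rows with an empty 'normalised' column."""
    known = load_map(csv_text)
    out = io.StringIO()
    out.write(csv_text if csv_text.endswith("\n") else csv_text + "\n")
    writer = csv.writer(out, lineterminator="\n")
    for value in unmapped:
        if value not in known:  # do not add a pending value twice
            writer.writerow([value, ""])
    return out.getvalue()
```

Rows with an empty second column then serve as the maintainer's to-do list, and they can be filled in with the in-browser editor.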
     122
     123== [wiki:Taskforces/Curation/ValueNormalization/ResourceType ResourceType] ==
     124
     125
     126Davor: The normalisation map created by Jan Odijk is still not in use. Should we use it?
     127
     128No agreement on this (again).
     129
     130Davor: how about creating a concept, linking it to a controlled vocabulary and recommending it for resType?
     131
     132Menzo: CCR TF was thinking about doing something similar for all facets. Another option is to create a component.
     133
     134Henk: too many values without a hierarchy make no sense, but creating a hierarchy could lead to display issues (it might not fit on the screen).
     135
     136= Documents =
     137
     138* [[https://docs.google.com/document/d/1u0bS_PNP6R9dySHSoPHxyepNBPIwOqZJbb9bW3qkX9U/edit#|gdoc notes]]
     139