= Joint CMDI & Curation taskforces meeting 2017-09-18 =

----

 * What?
    * CMDI 1.2, best practices
 * Who?
    * Members of the [[Taskforces/CMDI|CMDI]] or the [[Taskforces/Curation|Curation]] taskforces and other people involved/interested in the specification of best practices for CMDI
 * When?
    * 18 September 2017, 11.30 - 13.00 CEST
 * Where?
    * See: [https://www.clarin.eu/content/programme-clarin-annual-conference-2017 programme CAC17]

== Documents ==

 - Best practices
    - [https://office.clarin.eu/v/CE-2017-1076-CMDI-Best-Practices.pdf first (draft) version of the CMDI Best Practice guide] as a CE document
    - [https://docs.google.com/document/d/1Cr_UqoQFPLt3ucovUuCxne7XVSMZ4awUupev2bTzewc/edit?usp=sharing draft CMDI Best Practices doc] on Google Docs
    - [attachment:CAC17-CMDI-BP-final.pdf final version of the CAC 17 abstract]
    - [https://trac.clarin.eu/wiki/Taskforces/CMDI/Meeting20170118#Documents existing (draft) best practices documents]
 - [https://trac.clarin.eu/wiki/Taskforces/CMDI/Meeting20161025-28 Notes from the Task Forces meeting @ CAC 2016]
 - [https://docs.google.com/document/d/1EAzx633Oq1P-Uplehy58jVmp4h0cYGNaN0rAxEHH9W4/edit?usp=sharing Live meeting notes] (final version of notes below).

== Preparation ==

== Agenda ==

 * Agenda
 * CLAVAS vocabularies
    * Status of proposed new vocabularies
 * CLARIN Concept Registry
    * Handling of new concept requests
 * Helpdesk responsibilities and workflows
    * Status/experience update - see [[Meeting20161025-28#VLO|last year's meeting notes]]
 * '''Best practices document'''

== Notes ==

=== CLAVAS ===

 * Production instance @ Meertens on the way. Depends on release of OpenSKOS 2.1 (framework for vocabularies), code of which needs to be merged. `[TODO]`
    * -> vocabularies.clarin.eu (not yet the final version)
 * ISO639-3 vocabulary is in there, but an old version - aligned with component registry

==== Vocabularies in preparation ====

===== IANA media types (mime types) vocabulary =====

        * Can be retrieved in XML format. This can be converted into a vocabulary (work by Menzo @ Meertens).
        * (Nearly) ready to be included in CLAVAS in principle, only descriptions and URIs/handles (internal vs ‘natural’) need to be decided on.
        * Add narrower concepts for e.g. TEI (more specific resource types than ‘plain’ media types allow for) by using skos:inScheme and skos:broader
        * On basis of this a component can be created for all media types
        * `[TODO]` Subcollections need some thought/work, API does not know about such subsets

===== Licences =====

        * Norway has a licences spreadsheet; Menzo has converted this to SKOS. Not all information is represented there (yet).
            * Concept scheme per licence family (e.g. GNU, CLARIN, CC)
            * `[TODO]` Properties (PUB/ACA/RES + BY, NC, SA etc) to be encoded - probably we will have to define our own vocabulary for this
        * Merging/aligning with curation TF’s licence vocabulary
            * `[TODO]` This is the plan but has to be worked out
        * Vocabulary ownership (‘content admin’)
            * Legally: Oddrun talked to Gunn Inger. Ownership is now with the university. Should be ok to base a SKOS vocab on it.
            * A person or group of persons has to be responsible for the content. Obvious candidates would be curation task force and/or legal issues committee
            * `[TODO]` Discussion with experts (legal issues committee, ..) still pending

===== Organisations =====

        * Menzo started creating a vocabulary based on the ‘old’ organisations vocabulary from the first version of CLAVAS, which got out of sync and was lacking in (linked) information
        * New vocabulary bootstrapped from information in the VLO
        * Added information
            * Spelling variations, multilingual labels
            * Also typos -> hidden labels
        * Intended applications
            * Organisations component(s) - primarily as an open vocabulary
            * Harmonisation of spelling variations
        * Collaborative curation effort needed; ideally directly in XML, sharing the file on GitHub
        * Look up/sync with authority file (e.g. viaf.org), use URI found there

===== Resource types(?) =====

        * Curation efforts: Coverage of resource type in VLO is bad. Could be improved by deriving from resource type -> cross facet mapping, will be integrated in VLO.
        * `[TODO]` Contact Jan Odijk about this?
        * Workshop to get to such a unified vocabulary? (see other business)
        * A lot of information is already there

=== CCR ===

 * Importance stressed in best practices
 * Not much has happened recently, some contact among coordinators
 * New pending sets of proposals:
    * IDS submitted proposals for 2 concepts
    * Jan Odijk is proposing concepts based on CLARIN-NL…
    * Concepts mapping directly to VLO facets
 * (Required) action by coordinators
    * Discussion -> approval; then Menzo can import them
    * Menzo, Oddrun: will attempt to get the procedure on track again (Ineke is now involved again)
    * “Coordination of coordinators” is lacking
        * When would coordinators meet? Raise as issue to ERIC. `[TODO]`
    * CCR coordinators can meet during CAC (over lunch?) --> `[Update]` Meeting took place in Budapest, Sept. 19. Results:
        * We'll decide on the currently proposed (2) categories within 1 week (until Sept 29)
        * Default reaction time of coordinators to proposed new categories is 2 weeks
        * Former discussion on concepts that map to VLO facets will be resumed
        * Next meeting: early October
        * [FYI] CCR coordinators Wiki page: https://trac.clarin.eu/wiki/CCR-Coordinators

=== Helpdesk ===

 * Nothing happened: no support requests in the past 100 days.
 * Requests before that were handled well

=== Best practices ===

 * Draft at https://www.clarin.eu/content/cmdi-best-practice-guide
 * Will be presented at Bazaar and paper session
    * `[TODO]` Can ask centre committee representative to mention this in plenary “Short reports on Committee meetings, by the respective chairs” slot
    * Task force members are requested to be present to talk CMDI (BP) :)
 * Schematron implementation of some best practices
    * By Menzo, incorporated in toolkit (dev branch, see https://github.com/clarin-eric/cmdi-toolkit/tree/develop/src/main/resources/toolkit/sch)
        * Can be used to evaluate component files or records, e.g. using oXygen
    * Implementation brings additional insight into best practices, could be reflected in next editing iteration (e.g. overlapping BPs)
    * `[TODO]` To be integrated into curation module

=== Other business ===

 * Workshop(s)/hackathon towards proposal for curation workflow and practices (see CLAVAS)
    * Start with (for example) resource types - vocabulary + mapping definitions
    * Small group of people (4-5) willing to work on this
        * Scopes: content and workflow (separate + combined sessions)
    * `[TODO]` To be organised/hosted by Vienna
        * Within the next quarter/half year

== TODOs ==

 * {'''!Meertens/Menzo'''} Complete OpenSKOS release (requires merge with OpenSKOS 2.1 code)
 * {'''!Meertens/Menzo/...'''} Media type subcollections in CLAVAS need some thought/work, API does not know about such subsets
 * {'''Curation TF/Menzo/...'''} Properties (PUB/ACA/RES + BY, NC, SA etc) to be encoded in licences vocabulary - probably we will have to define our own vocabulary for this
 * {'''Curation TF/Oddrun/Matej'''} Merging/aligning of licences vocabulary with curation TF’s licence vocabulary has to be worked out.
 * {'''Curation TF/Oddrun/Matej'''} Discuss licences vocabulary responsibilities ('ownership') with experts (legal issues committee, ..); still pending
 * {'''Matej'''} Organise workshop/hackathon to focus on creating a resource type vocabulary
 * {'''!Susanne/Oddrun/...'''} Push for meeting of CCR coordinators
 * {'''!Susanne/Oddrun/...'''} Report on meeting of coordinators
    * [Update SH:] Done
 * {'''!Twan/Menzo'''} Ask centre committee representative to mention this in plenary “Short reports on Committee meetings, by the respective chairs” slot
 * {'''Curation TF'''} Integrate best practices schematron into curation module