Changes between Version 2 and Version 3 of Taskforces/CMDI/Meeting20181008


Timestamp: 10/25/18 08:29:22
Author: Twan Goosen
Comment: notes

  • Taskforces/CMDI/Meeting20181008

    v2 v3  
    35350. CLAVAS vocabularies
    36360. Metadata curation
     37
     38== Notes ==
     39
     40Present: Menzo Windhouwer, Twan Goosen, Matej Durco, Ineke Schuurman, Neeme Kahusk, Hanna Hedeland, Thomas Eckart, Marcin Oleksy, Alexander König, Susanne Haaf, Oddrun Pauline Ohren, Francesca Frontini, Jussi Piitulainen, Mitchell Seaton, Jozef Misutka
     41
     42=== Best practices ===
     43
     44Current version (see https://www.clarin.eu/content/cmdi-best-practices-guide)
     45* Current state
     46    * Overleaf: draft version
     47    * Version 1.1.1 published (via GitHub)
     48* Recommended components/general info
     49    * Current state: still starting up initiative, looking for more input and collaborators (perhaps at bazaar)
     50    * There is some synergy with the resource families initiative
     51        * Talk to Jakob and Darja about their experience in developing the resource families and their ideas for recommended components
     52        * Note Twan: perhaps we can meet early December in Vienna on this?
     53    * Also ISO/DIS 24622-3 (CMDI standard part 3)
     54        * In the initial phase. Alignment might be desirable.
     55        * Update: we had a brief meeting with Maria Gavrilidou (and Thorsten, Daan and Penny) at the conference. She will present to the ISO people (?) in November but will keep it quite general. We can have a meeting not too long after that to synchronise.
     56    * Separate (sub)task force
     57* Common use cases
     58    * Current status:
     59        * Being started up
     60        * Actual writing was detached from best practices guide
     61    * Next step after recommended components
     62    * Purpose changed over time; idea now is to gather information on what to document in addition to ‘general info’
     63    * Recommendations complementary to recommended components. Should refer to recommended components (depending on use case) but can/should also consist of prose description
     64    * Resource families as input/use case?
     65        * Note Twan: perhaps we can meet early December in Vienna on this? (see ‘recommended components’ above)
     66    * Multiple use cases can run in parallel as ‘pilot’
     67        * A template is developed for domain experts based on these pilots
     68    * Organisational structure:
     69        * Coordinators (current TFs/best practices group)
     70        * Domain experts
     71            * Do not have to be involved in coordination, topics outside their domain or broader discussions
     72
     73=== CLARIN Concept Registry ===
     74
     75Coordinators used (metadata in) VLO to investigate concepts, facets
     76* Generic vs specific concepts
     77    * Presentation Menzo: https://datashare.mpcdf.mpg.de/s/PnxhV7NKmzrMLwO
     78        * Identification of generic concepts on basis of VLO facets?
     79            * A lot of implicit context
     80            * Black/white listing of context (already in VLO)
     81                * Acceptable/rejectable context
     82                * Works, although sometimes recursive context is needed to do it properly
     83            * More facets need to be evaluated
     84                * Expectation is that the number of concepts will stabilise
     85            * VLO needs to be adapted to evaluate broader (multiple levels of) context
     86            * Modellers will have to be educated on use of generic concepts
     87                * Perhaps multiple concept links on one item are needed
     88        * https://datashare.mpcdf.mpg.de/s/tn1tLvcSCUYG1oN
     89    * Specific approach
     90        * E.g. language definition - how broad a definition (e.g. also dialects, sociolects)
     91        * Facets in VLO do not have to directly correspond to a single concept
     92    * Generic vs specific:
     93        * Suggestion Matej: in VLO user should perhaps be able to choose between generic and specific?
     94* Feedback on VLO tooltips
     95    * Resource families can serve as input/use cases
     96    * More or less open question: who is actually in charge of the tooltips?
     97* What goes into the CCR, what in CLAVAS, what 'nowhere'?
     98    * https://legacy.gitbook.com/book/cmdi-taskforce/cmdi-best-practices/discussions/26
     99    * Conclusion/policy proposal (Menzo): instances do not go into CCR. There could be a concept scheme (vocabulary) in CLAVAS
     100    * Some (potential) vocabularies are made up of concepts (e.g. resource type; ‘book’ would be a concept, a specific title would not; ‘century’ would but ‘16th century’ wouldn’t)
     101    * Not always clear but in principle CCR’s judgement
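
The black/white listing of context noted under ‘Generic vs specific concepts’ can be illustrated with a minimal Python sketch. This is not how the VLO implements it; the function name `accepts`, the representation of a path as a list of component names, and the `depth` parameter are all assumptions made for illustration. It shows why looking only at the direct parent can differ from evaluating several levels of context.

```python
def accepts(path, accept=(), reject=(), depth=None):
    """Decide whether a concept occurrence at `path` should feed a facet.

    path   -- component names from root to the element carrying the concept
              link, e.g. ["Project", "Contact", "Address", "Name"]
    accept -- acceptable context names (empty means: no whitelist)
    reject -- rejectable context names
    depth  -- how many ancestor levels to inspect (None means all of them)

    All names and the data layout are hypothetical.
    """
    context = path[:-1]  # the ancestors of the element itself
    if depth is not None:
        context = context[max(0, len(context) - depth):]
    if any(name in reject for name in context):
        return False
    if accept:
        return any(name in accept for name in context)
    return True


path = ["Project", "Contact", "Address", "Name"]
# Only the direct parent is inspected, so the rejectable "Contact" is missed.
shallow = accepts(path, reject={"Contact"}, depth=1)
# All ancestors are inspected, so the occurrence is rejected.
deep = accepts(path, reject={"Contact"})
```

With these inputs `shallow` is `True` and `deep` is `False`, which is the kind of case where broader (multiple levels of) context matters.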
     102
     103=== CLAVAS vocabularies ===
     104
     105Vocabularies
     106* Ownership?
     107    * Bound to a person?
     108        * Should it even be allowed?
     109        * Perhaps… In any case there has to be a handover procedure
     110    * Delegation
     111        * Primary owner - likely the Curation task force
     112        * Actual work/discussion e.g. by legal issues committee
     113* Would be important to improve human readability/explorability of CLAVAS (UI like CCR)
     114* General procedure
     115    * Owner (TF) submits a proposal for a (version of a) vocabulary to SCCTC
     116    * SCCTC reviews and approves
     117    * TODO: suggest this workflow to SCCTC
     118* Concrete vocabs
     119    * Media types
     120        * Already initiated
     121        * Requires use of schemes and collections
     122    * Licences
     123        * Already initiated
     124        * Effort on compiling licence vocabulary has to be revived
     125        * Further discussion required for a full vocabulary with detailed properties.
     126            * Simple version without complicated/controversial properties?
     127        * https://trac.clarin.eu/wiki/Taskforces/Curation/Meeting20170628
     129    * Resource type vocab
     130        * Review and finalise list drafted by curation TF
     131* Option of custom/3rd party (open) vocabularies (EURAC/Alex case)?
     132    * If there are use cases, support for other vocab services could be considered
     133    * We would need a well-defined API so that others can run a compliant vocab service
     134    * Keeping the option open, but nothing decided
     135
     136=== Metadata curation ===
     137
     138* Report on work/status (Matej)
     139    * GitHub based ‘curation through value mapping’ workflow
     140    * Work on resource type
     141        * Within facet
     142        * Cross-facet (from profile)
     143    * (Semi-)automatic curation (mapping suggestions)
     144        * Compare to current values (Resource type)
     145        * Similarity analysis (Organisation)
     146        * Decomposition/“deconcatenation” (Modality)
     147    * Link checking
     148        * Statistics in curation module
     149* (Other) facets
     150    * Organisation
     151        * Prepared by Alex; we also have a CLAVAS vocab (work in progress)
     152          * https://trac.clarin.eu/ticket/1069
     153          * https://github.com/acdh-oeaw/VLO-mapping/blob/master/value-maps/organisation.csv
     154          * 991 values
     155    * Modality
     156      * Suggestion: Mapping based approach
     157       * https://trac.clarin.eu/ticket/1073
     158       * https://github.com/acdh-oeaw/VLO-mapping/blob/master/value-maps/modality.csv
     159       * 162 values
     160* Can we improve coverage automatically?
     161    * Noise in other facets e.g. genre
     162    * Perhaps some profile to modality mapping (speech corpus?)
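
The (semi-)automatic curation steps listed above (value mapping, “deconcatenation” and similarity-based mapping suggestions) could be sketched roughly as follows in Python. This is an illustration under assumptions, not the task force's actual tooling: the map entries, the separator characters and the function names are made up, and the similarity measure is simply the one provided by the standard library's `difflib`.

```python
import difflib
import re

# Hypothetical value map: raw facet value (lower-cased) -> normalised value.
# The real maps live in the VLO-mapping repository; these rows are invented.
VALUE_MAP = {
    "spoken": "Spoken",
    "speech": "Spoken",
    "written": "Written",
    "text": "Written",
    "gesture": "Gestures",
}


def deconcatenate(raw):
    """Split a concatenated value such as 'speech; gesture' into its parts."""
    parts = re.split(r"[;,|]", raw)  # assumed separators
    return [p.strip() for p in parts if p.strip()]


def normalise(raw, value_map=VALUE_MAP):
    """Map every part of a raw value; return (mapped, unmapped) lists."""
    mapped, unmapped = [], []
    for part in deconcatenate(raw):
        target = value_map.get(part.lower())
        if target is None:
            unmapped.append(part)
        elif target not in mapped:
            mapped.append(target)
    return mapped, unmapped


def suggest(raw, known, cutoff=0.8):
    """Propose close known values for an unmapped one (similarity analysis)."""
    return difflib.get_close_matches(raw.lower(), known, n=3, cutoff=cutoff)


mapped, unmapped = normalise("Speech; gesture, unknown")
suggestions = suggest("writen", list(VALUE_MAP))
```

Here `mapped` is `['Spoken', 'Gestures']`, `unmapped` is `['unknown']`, and `suggestions` is `['written']`. Suggestions of this kind would still need review by a human curator before they are added to a map.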
     163
     164== Post-meeting: hands-on curation session ==
     165
     166https://github.com/acdh-oeaw/VLO-mapping
     167* Should be applied in: https://vlo.acdh.oeaw.ac.at/
     168* [https://github.com/acdh-oeaw/VLO-mapping/blob/master/value-maps/resourceClass_tf-extended.csv Resource type map] (already in use)
     169* [https://github.com/acdh-oeaw/VLO-mapping/blob/master/value-maps/modality.csv Modality map]
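
As a rough illustration of how such a map might be applied and how its coverage could be measured, the Python sketch below loads a value map from CSV and computes the share of value occurrences it covers. The two-column `source,target` layout, the sample rows and the function names are assumptions; the CSV files in the repository may be structured differently.

```python
import csv
import io
from collections import Counter

# Hypothetical two-column layout with invented rows.
SAMPLE_CSV = """source,target
speech,Spoken
spoken language,Spoken
text,Written
"""


def load_value_map(handle):
    """Read a CSV value map into a dict keyed by the lower-cased raw value."""
    reader = csv.DictReader(handle)
    return {row["source"].strip().lower(): row["target"].strip() for row in reader}


def coverage(value_counts, value_map):
    """Fraction of value occurrences whose raw value has an entry in the map."""
    total = sum(value_counts.values())
    if total == 0:
        return 0.0
    covered = sum(
        count
        for value, count in value_counts.items()
        if value.strip().lower() in value_map
    )
    return covered / total


value_map = load_value_map(io.StringIO(SAMPLE_CSV))
# Invented occurrence counts of raw facet values.
observed = Counter({"Speech": 6, "text": 2, "unknown": 2})
share = coverage(observed, value_map)
```

With these invented counts, 8 of 10 occurrences are covered, so `share` is 0.8.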
     170
     171=== Modality facet ===
     172* Trac issue #1073
     173* [https://vlo.clarin.eu/values/modality Live VLO value selection]
     174* Previous work by CLARIN-D: [[VLO-Taskforce/RecommendationsForFacets#Modality|recommendations for facet values]]
     175
     176* [https://github.com/acdh-oeaw/VLO-mapping/blob/modality-handson/value-maps/modality.csv modality-handson branch]
     177  * [https://github.com/acdh-oeaw/VLO-mapping/compare/master...acdh-oeaw:modality-handson#diff-450f33a704b4f7ec439794e3600cde80 Diff to master]
     178
     179=== Outcome ===
     180(Preliminary)
     181* We have jointly created a [https://github.com/acdh-oeaw/VLO-mapping/blob/42a103b4f09d25de0d7eb7034fc57d46114aec8b/value-maps/modality.csv first version] of a values map
     182* Further discussion about degree of interpretation is required
     183* Next steps: ??