
Joint CMDI & Curation taskforces meeting 2018-10-08 at CLARIN 2018


Documents

Preparation

Agenda

See (Google doc) for details and up-to-date information.

  1. Agenda
  2. Best practices
  3. CLARIN Concept Registry
  4. CLAVAS vocabularies
  5. Metadata curation

Notes

Present: Menzo Windhouwer, Twan Goosen, Matej Durco, Ineke Schuurman, Neeme Kahusk, Hanna Hedeland, Thomas Eckart, Marcin Oleksy, Alexander König, Susanne Haaf, Oddrun Pauline Ohren, Francesca Frontini, Jussi Piitulainen, Mitchell Seaton, Jozef Misutka

Best practices

Current version (see https://www.clarin.eu/content/cmdi-best-practices-guide)

  • Current state
    • Overleaf: draft version
    • Version 1.1.1 published (via GitHub)
  • Recommended components/general info
    • Current state: the initiative is still starting up; looking for more input and collaborators (perhaps at the bazaar)
    • There is some synergy with the resource families initiative
      • Talk to Jakob and Darja about their experience in developing the resource families and their ideas for recommended components
      • Note Twan: perhaps we can meet early December in Vienna on this?
    • Also ISO/DIS 24622-3 (CMDI standard part 3)
      • In the initial phase. Alignment might be desirable.
      • Update: we had a brief meeting with Maria Gavrilidou (and Thorsten, Daan and Penny) at the conference. She will present to the ISO people (?) in November but will keep it quite general. We can have a meeting not too long after that to synchronise.
    • Separate (sub)task force
  • Common use cases
    • Current status:
      • Being started up
      • Actual writing was detached from best practices guide
    • Next step after recommended components
    • Purpose changed over time; the idea now is to gather information on what to document in addition to ‘general info’
    • Recommendations are complementary to recommended components. They should refer to recommended components (depending on use case) but can/should also consist of prose descriptions
    • Resource families as input/use case?
      • Note Twan: perhaps we can meet early December in Vienna on this? (see ‘recommended components’ above)
    • Multiple use cases can run in parallel as ‘pilot’
      • A template is developed for domain experts based on these pilots
    • Organisational structure:
      • Coordinators (current TFs/best practices group)
      • Domain experts
        • Do not have to be involved in coordination, topics outside their domain or broader discussions

CLARIN Concept Registry

Coordinators used (metadata in) the VLO to investigate concepts and facets

  • Generic vs specific concepts
    • Presentation Menzo: https://datashare.mpcdf.mpg.de/s/PnxhV7NKmzrMLwO
      • Identification of generic concepts on basis of VLO facets?
        • A lot of implicit context
        • Black/white listing of context (already in VLO)
          • Acceptable/rejectable context
          • Works although sometimes recursive context is needed to do it properly
        • More facets need to be evaluated
          • Expectation is that the number of concepts will stabilise
        • VLO needs to be adapted to evaluate broader (multiple levels of) context
        • Modellers will have to be educated on use of generic concepts
          • Perhaps multiple concept links on one item are needed
      • https://datashare.mpcdf.mpg.de/s/tn1tLvcSCUYG1oN
    • Specific approach
      • E.g. language definition - how broad a definition (e.g. also dialects, sociolects)
      • Facets in VLO do not have to directly correspond to a single concept
    • Generic vs specific:
      • Suggestion Matej: in VLO user should perhaps be able to choose between generic and specific?
  • Feedback on VLO tooltips
    • Resource families can serve as input/use cases
    • More or less open question: who is actually in charge of the tooltips?
  • What goes into the CCR, what in CLAVAS, what 'nowhere'?
    • https://legacy.gitbook.com/book/cmdi-taskforce/cmdi-best-practices/discussions/26
    • Conclusion/policy proposal (Menzo): instances do not go into CCR. There could be a concept scheme (vocabulary) in CLAVAS
    • Some (potential) vocabularies are made up of concepts (e.g. resource type; ‘book’ would be a concept, a specific title would not; ‘century’ would but ‘16th century’ wouldn’t)
    • Not always clear-cut, but in principle this is the CCR’s judgement
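
The black/white listing of concept context noted under ‘Generic vs specific concepts’ could be sketched roughly as follows. This is only an illustration of the idea, not the VLO implementation: the component names, the two lists and the function `context_accepted` are all hypothetical. It shows why looking only at the direct parent can miss a relevant context, while walking up all ancestors (the ‘recursive context’) can find it.

```python
# Hypothetical accept/reject lists of ancestor component names for one facet.
# These names are assumptions for illustration, not taken from the VLO.
ACCEPT = {"Resource", "GeneralInfo"}
REJECT = {"Contact", "Project"}


def context_accepted(path, accept=ACCEPT, reject=REJECT, recursive=True):
    """Decide whether a concept occurrence at `path` should feed the facet.

    `path` lists component names from the root down to the element itself.
    With recursive=False only the direct parent is inspected; with
    recursive=True every ancestor is inspected, nearest first.
    """
    ancestors = list(reversed(path[:-1]))
    if not recursive:
        ancestors = ancestors[:1]
    for name in ancestors:
        if name in reject:
            return False
        if name in accept:
            return True
    # No listed context found: this sketch chooses not to map the value.
    return False


# The direct parent "Titles" is on neither list, so only the recursive
# check reaches the accepted "Resource" context.
print(context_accepted(["Resource", "Titles", "name"], recursive=False))
print(context_accepted(["Resource", "Titles", "name"], recursive=True))
```

Letting the nearest listed ancestor win is one possible design choice; other precedence rules would be equally plausible.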

CLAVAS vocabularies

Vocabularies

  • Ownership?
    • Bound to a person?
      • Should it even be allowed?
      • Perhaps… In any case there has to be a handover procedure
    • Delegation
      • Primary owner - likely the Curation task force
      • Actual work/discussion e.g. by legal issues committee
  • Would be important to improve human readability/explorability of CLAVAS (UI like CCR)
  • General procedure
    • Owner (TF) submits a proposal for a (version of a) vocabulary to SCCTC
    • SCCTC reviews and approves
    • TODO: suggest this workflow to SCCTC
  • Concrete vocabs
    • Media types
      • Already initiated
      • Requires use of schemes and collections
    • Licences
    • Resource type vocab
      • Review and finalise list drafted by curation TF
  • Option of custom/3rd party (open) vocabularies (EURAC/Alex case)?
    • If there are use cases, support for other vocab services could be considered
    • We would need a well-defined API so that others can run a compliant vocab service
    • Keeping the option open, but nothing decided

Metadata curation

Post-meeting: hands-on curation session

https://github.com/acdh-oeaw/VLO-mapping

Modality facet

Outcome

(Preliminary)

  • We have jointly created a first version of a values map
  • Further discussion about degree of interpretation is required
  • Next steps: ??
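
For readers unfamiliar with the idea, a values map for a facet can be pictured as a lookup from raw metadata values to normalised facet values. The sketch below is an assumption-laden illustration only: the entries, the name `VALUE_MAP` and the function `normalise` are hypothetical and do not reproduce the map created in the session or the format used in the VLO-mapping repository. Passing unmapped values through unchanged is likewise an assumption.

```python
# Hypothetical entries for illustration; not the map created in the session.
VALUE_MAP = {
    "spoken": "Spoken",
    "speech": "Spoken",
    "written": "Written",
    "text": "Written",
}


def normalise(raw, value_map=VALUE_MAP):
    """Map a raw facet value to its normalised form.

    Whitespace is collapsed and case is ignored for the lookup.
    Values without an entry are returned stripped but otherwise unchanged.
    """
    key = " ".join(raw.split()).lower()
    return value_map.get(key, raw.strip())


print(normalise("  Speech "))
print(normalise("gesture"))
```

An entry such as "speech" mapping to "Spoken" already involves a judgement about what the provider meant, which is the kind of interpretation the discussion point above refers to.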
Last modified on 10/25/18 11:49:28