Joint CMDI & Curation taskforces face to face meeting 2019-09-30


  • What?
    • All ongoing and planned CMDI & curation activities and strategy
  • Who?
    • Members of the CMDI or the Curation taskforces and other people involved/interested in the specification of best practices for CMDI
  • When?
    • Monday 30 September 2019, 14:00 - 16:00 CEST
  • Where?

Documents

Agenda

  1. Brief summary of state of affairs
  2. Recommended components: ISO task (CMDI pt3)
  3. ‘General info’ components and common use cases
  4. CMDI Documentation
  5. Curation activities
  6. Software stack
  7. Integrating external vocabularies
  8. CMDI task force ‘strategy’

Notes

Attendees: Oddrun Pauline Ohren (NatLib of Norway), Krista Liin (CELR), Alexander König (Eurac Research/CLARIN-IT), Thomas Eckart (Uni Leipzig/CLARIN-D), Marcin Oleksy (Wrocław University of Science and Technology/CLARIN-PL), Andreas Nolda (BBAW), Maria Gavriilidou (ILSP/Athena RC, CLARIN:EL), Twan Goosen, Dieter Van Uytvanck, Matej Ďurčo

Brief summary of state of affairs

Main activities and developments since last meetings (CAC 2018 and Centre Meeting 2019):

  • Refinements of Best Practices
  • Questionnaire on Common use cases (Andreas)
  • CLARIN Concept Registry - stalled (both Ineke and Menzo can’t contribute)
  • No progress on CLAVAS
  • Discussion and first investigations following the ‘future of CMDI’ discussion and the review of CMDI against the FAIR principles (principle I2, “use of FAIR vocabularies”)
  • ISO part 3 work - recommended components

Recommended components: ISO recommended components task (ISO 24622-3)

ISO 24622-3 is the third part of the standard defining the component metadata model, the component specification language and recommended components. The plan thus far has been to have a document with normative and informative sections, where the normative section would define or refer to concrete components for various domains and use cases. Components and their fields would be either obligatory, recommended or optional. The group has discussed and tried to define criteria for recommending components. The primary criteria so far: most used, and perspective/community.

It is unclear whether completing this task in the short to mid term is feasible. There are deadlines coming from the ISO process.

Post-meeting note: as Maria had to leave the meeting before the strategy discussion (see below), we followed up with Maria and Penny (ILSP, ISO 24622-3 coordinators) later during the conference, and together we concluded that we should re-evaluate the need to pursue the completion of this part of the CMDI ISO standard within the originally envisioned timeframe, and try to prioritise efforts in favour of the development of components that directly serve the needs of the community, i.e. work on core metadata and development of recommended components for common resource types or other use cases. The exact course of actions in relation to these activities will have to be discussed with the ISO committee in the upcoming weeks/months.

‘General info’ components and common use cases

This overlaps with the CMDI ISO recommended components (see above). There are ideas on how to approach this, which form a main part of the strategy discussion. See strategy section below.

Common use cases questionnaire: tinyurl.com/common-use-cases (RO)

CMDI Documentation

The focus has been very much on best practices over the last 1-2 years, but the consensus is that we could put this in ‘maintenance mode’ for a while to focus on other activities. See also strategy section below.

Curation activities

Not discussed in detail

Software stack

(TG: there was one question/discussion item (about the curation module??) but I didn’t take a note so if anyone remembers please feel free to leave a summary)

Integrating external vocabularies

Collecting usable vocabularies from the wider metadata landscape for concepts and controlled vocabularies, and investigating the integration of these into the CMD infrastructure, are considered as an option to 1) make CMDI metadata more FAIR and 2) reduce dependency on CCR and CLAVAS.

This will be further investigated over the next few months. Aim is to complete a proposal around the end of the year.

Discussion required: what and where are the “FAIR vocabularies”?

Relevant work carried out previously/elsewhere: inventorisation of vocabularies both in Parthenos (D5.4 Reference Resources) and SSHOC (D3.1 Report on SSHOC data interoperability problems), DDI-CV

See also strategy below.

CMDI task force ‘strategy’ discussion

Several of the ongoing discussion items and organisational circumstances can/should come together into a strategy for the task force(s) aimed at producing output (improvements and new solutions) that is of immediate value to our community and infrastructure.

Matej suggests: don’t invest in easier modelling, but in easier metadata creation/authoring. Users don’t have to model their own profile, as long as they can select an existing one based on resource type.

Dieter: go towards core recommendations. Aim to answer the question: what to do, e.g. which profile to use? Task forces should do less but do it better, perhaps one task at a time in smaller, achievable tasks.

Concepts and vocabularies

In many cases there are alternatives to using the CCR, e.g. Wikidata and other external resources. Matej: extended collaborative efforts are not getting off the ground. Should we reuse what we can from others?

‘Core’ metadata

This should be one of the first tasks. DataCite schema v4 can be used as a basis. Other information sources for strategic decisions on core metadata:

In particular the task of creating metadata for the resources in the Resource Families can serve as a good testbed/pilot to gather requirements for core metadata and at the same time investigate options of using existing (non-CLARIN) vocabularies and concepts.

Profiles for specific resource classes

There are a few obvious main profile groups to look at

  • Text corpus
    • NaLiDa
    • META-SHARE
    • CLARIN D-Space
    • CLARIN-DK
  • Lexical resource
    • NaLiDa
    • META-SHARE
    • Bamdes
    • DLU
  • Speech corpus
    • NaLiDa
    • HZSK
    • DLU
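Matej's suggestion above (select an existing profile by resource type rather than model a new one) could be supported by a simple lookup over these groups. The following is only a minimal sketch, not an existing CLARIN tool: the helper name `candidate_profiles` and the way the type label is normalised are assumptions made for illustration, while the profile names are copied from the list above.

```python
# Candidate profile groups per resource type, as listed in these notes.
PROFILE_GROUPS = {
    "text corpus": ["NaLiDa", "META-SHARE", "CLARIN D-Space", "CLARIN-DK"],
    "lexical resource": ["NaLiDa", "META-SHARE", "Bamdes", "DLU"],
    "speech corpus": ["NaLiDa", "HZSK", "DLU"],
}


def candidate_profiles(resource_type):
    """Return the profile groups listed for a resource type.

    Hypothetical helper: lower-cases the label and collapses whitespace
    before the lookup; unknown types yield an empty list.
    """
    key = " ".join(resource_type.lower().split())
    return list(PROFILE_GROUPS.get(key, []))
```

A call such as `candidate_profiles("Speech corpus")` would then give a metadata creator the short list of profiles to choose from, instead of a modelling task.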

A metadata harmonisation workshop could be organised, aiming to bring creators/advocates of different profiles for one resource type together to discuss and analyse them, and to compare their commonalities with generic schemata such as DataCite. On the basis of that [union of different profiles], decide what information should be optional, recommended or mandatory.
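As input for such a workshop, the union of fields across profiles could be tabulated beforehand. The sketch below is one possible way to do this, under stated assumptions: the field sets are invented placeholders (not the real content of any profile), the function name `classify_fields` is hypothetical, and the share thresholds are arbitrary starting points that the workshop itself would have to decide on.

```python
from collections import Counter


def classify_fields(profiles, mandatory_share=1.0, recommended_share=0.5):
    """Label each field in the union of profiles by how widely it is used.

    profiles: mapping of profile name -> iterable of field names.
    The thresholds are assumptions, not agreed task force criteria.
    """
    if not profiles:
        return {}
    # Count in how many profiles each field occurs (once per profile).
    counts = Counter(f for fields in profiles.values() for f in set(fields))
    total = len(profiles)
    levels = {}
    for field, n in sorted(counts.items()):
        share = n / total
        if share >= mandatory_share:
            levels[field] = "mandatory"
        elif share >= recommended_share:
            levels[field] = "recommended"
        else:
            levels[field] = "optional"
    return levels


# Hypothetical field sets for three profiles of one resource type.
example = {
    "profile_a": {"title", "language", "licence", "size"},
    "profile_b": {"title", "language", "licence"},
    "profile_c": {"title", "language", "annotation"},
}
levels = classify_fields(example)
```

Frequency alone is a crude criterion. The result is meant as a starting point for discussion, to be checked against a generic schema such as DataCite, not as a decision procedure.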

For a complete ‘off the shelf’ workflow, we also need to be able to recommend an editor. COMEDI works as a general purpose editor but doesn’t support CMDI 1.2 yet.

Last modified on 10/28/19 10:09:55