
Version 13 (modified by Twan Goosen, 10 years ago)

This page is a subpage of CMDI 1.2

External vocabularies in CMDI 1.2: Executive summary

This page provides an executive summary of the issue and proposed solution fully described in CMDI 1.2/Vocabularies.

Issue description

The objective is to utilise external vocabularies as value domains for CMDI elements and CMDI attributes. While the solution on the model level should be generic and service-agnostic, it will be designed specifically to work with the OpenSKOS-based CLAVAS vocabulary service.

The desired workflow is that metadata modellers can associate a vocabulary (identified by its URI) with an element in their components and profiles. The metadata creator will then be able to pick values from the specified vocabulary or (in the case of open vocabularies) still choose to use a custom value that does not appear in the vocabulary. Editors like Arbil need to be extended to access the CLAVAS public API for retrieving potential values from vocabularies.
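The lookup step in an editor can be sketched as follows. This is only an illustration under stated assumptions: the `CONCEPTS` list stands in for items already retrieved from a vocabulary service, and the `suggest` function, the example URIs and the labels are hypothetical rather than part of the CLAVAS API.

```python
from typing import Dict, List, Tuple

# Stand-in for concepts retrieved from a vocabulary service (made-up data).
CONCEPTS: List[Dict[str, str]] = [
    {"uri": "http://example.org/vocab/languages/nld", "prefLabel": "Dutch"},
    {"uri": "http://example.org/vocab/languages/deu", "prefLabel": "German"},
    {"uri": "http://example.org/vocab/languages/dan", "prefLabel": "Danish"},
]


def suggest(typed: str, concepts: List[Dict[str, str]] = CONCEPTS) -> List[Tuple[str, str]]:
    """Return (label, uri) pairs whose label starts with the typed text.

    Matching is case-insensitive and the result is sorted by label.
    """
    needle = typed.casefold()
    return sorted(
        (concept["prefLabel"], concept["uri"])
        for concept in concepts
        if concept["prefLabel"].casefold().startswith(needle)
    )
```

An editor would offer the returned labels to the metadata creator and keep the URI of whichever item is picked. For an open vocabulary, text that matches no item could still be accepted as a custom value.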

CLAVAS

OpenSKOS is a vocabulary service through which users can publish, manage and use vocabulary data in the SKOS format. The data can be accessed through a set of publicly available RESTful APIs. Some Dublin Core elements are also included, but not indexed. CLAVAS is CLARIN-NL's application of OpenSKOS, hosted by the Meertens Institute (contact is Hennie Brugman?). The vocabularies will be available through a set of web services hosted at Meertens, supporting exploration, search and autocomplete. More information is available in the CLAVAS work plan document.

Currently, the following vocabularies are available in CLAVAS:

  • Languages
  • Organisations
  • All public ISOcat categories (only simple ones?)

Description of proposed solution

This section describes the solution as proposed at the Taskforce Meeting in Utrecht on 2014-02-12.

There are two ways of using the OpenSKOS vocabularies in CMDI:

  • Importing vocabularies as closed value domains for a CMD_Element or Attribute. Since the vocabulary items are enumerated explicitly as a choice list in the element or attribute in question, validation is possible.
  • Using one OpenSKOS vocabulary, or a combination of them, for dynamic lookup and retrieval of values for a CMD_Element or Attribute. Here a non-exclusive (open) use of items from the vocabulary must be assumed, as validation against such external vocabularies is not practicable.
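The difference between the two modes can be expressed as a validation rule. The sketch below is illustrative only: the `Vocabulary` class, its fields, the `is_valid` function and the example URIs and items are hypothetical names chosen for this example, not part of the proposal.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Vocabulary:
    """Hypothetical in-memory view of a vocabulary attached to an element."""

    uri: str
    # Enumerated items for an imported (closed) vocabulary; None for an
    # external (open) vocabulary that is only looked up dynamically.
    items: Optional[FrozenSet[str]] = None

    @property
    def is_closed(self) -> bool:
        return self.items is not None


def is_valid(value: str, vocabulary: Vocabulary) -> bool:
    """Check closed vocabularies by membership; open ones accept any value."""
    if vocabulary.is_closed:
        return value in vocabulary.items
    return True


# Example URIs and items are made up for illustration.
closed = Vocabulary("http://example.org/vocab/languages", frozenset({"Dutch", "English"}))
open_vocab = Vocabulary("http://example.org/vocab/organisations")
```

With these definitions, `is_valid("Dutch", closed)` holds, a value outside the enumeration is rejected for `closed`, and every value passes for `open_vocab`.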

Schema changes

The following changes to the General Component Schema accommodate vocabulary use for both CMD_Element and Attribute:

  • New element Vocabulary in ValueScheme replaces enumeration
  • Vocabulary optionally has an enumeration element. If present, it defines an internal, closed vocabulary (imported or locally specified). If enumeration is not present, the Vocabulary is considered external and open, and tools that support this should access it through the API.

Attributes for Vocabulary

  • @URI (mandatory)
  • @ValueProperty (mandatory; which field of the vocabulary items to return, typically prefLabel)
  • @ValueLanguage (optional; preferred language of the item field value)
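A minimal sketch of how a tool might construct and classify such a declaration, using Python's standard `xml.etree.ElementTree`. The element and attribute names follow the lists above; the `item` child elements inside enumeration, the helper function names and the example URI are assumptions made for this illustration and are not confirmed by the proposal.

```python
import xml.etree.ElementTree as ET


def build_value_scheme(uri, value_property, value_language=None, items=None):
    """Build a ValueScheme element holding a Vocabulary.

    Passing `items` adds an enumeration (closed vocabulary); omitting it
    leaves the vocabulary external and open.
    """
    value_scheme = ET.Element("ValueScheme")
    vocabulary = ET.SubElement(
        value_scheme, "Vocabulary", {"URI": uri, "ValueProperty": value_property}
    )
    if value_language is not None:
        vocabulary.set("ValueLanguage", value_language)
    if items is not None:
        enumeration = ET.SubElement(vocabulary, "enumeration")
        for label in items:
            # Assumption: enumerated values are held in <item> children.
            ET.SubElement(enumeration, "item").text = label
    return value_scheme


def is_closed(value_scheme):
    """A Vocabulary with an enumeration child is internal and closed."""
    return value_scheme.find("Vocabulary/enumeration") is not None


# The URI below is a made-up placeholder.
open_scheme = build_value_scheme("http://example.org/vocab/languages", "prefLabel", "en")
closed_scheme = build_value_scheme(
    "http://example.org/vocab/languages", "prefLabel", items=["Dutch", "English"]
)
print(ET.tostring(closed_scheme, encoding="unicode"))
```

The presence check in `is_closed` mirrors the rule stated above: a tool only needs to look for the enumeration child to decide whether to validate locally or to query the vocabulary service.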

Instance changes

An attribute ValueConceptLink (in the CMDI namespace) will be allowed on fields that have a vocabulary linked. It holds the URI of the selected value, semantically marking the chosen option.
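As an illustration, an instance field carrying this attribute could be produced as in the sketch below. The namespace URI and the `cmd` prefix are placeholders to be replaced by the actual CMDI namespace, and the `make_field` helper, the field name and the concept URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace and prefix; substitute the real CMDI namespace URI.
CMD_NS = "http://example.org/cmd-namespace-placeholder"
ET.register_namespace("cmd", CMD_NS)


def make_field(name, value, concept_uri=None):
    """Create an instance field, optionally linked to a vocabulary item."""
    field = ET.Element(name)
    field.text = value
    if concept_uri is not None:
        # ElementTree writes namespaced attributes as {namespace}localname.
        field.set(f"{{{CMD_NS}}}ValueConceptLink", concept_uri)
    return field


# A value picked from the vocabulary keeps the URI of the selected item.
linked = make_field("Language", "Dutch", "http://example.org/vocab/languages/nld")
# A custom value in an open vocabulary has no item URI to record.
custom = make_field("Language", "My own value")
print(ET.tostring(linked, encoding="unicode"))
```

Leaving the attribute off for custom values is one possible reading of "allowed"; the proposal does not state what a tool should do in that case.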

Impact on tools

  • Metadata editors must facilitate vocabulary lookup. Arbil, as the most generic editor, should be prioritized.
  • The Component Registry must facilitate import of vocabularies. The interface for specifying value domains for elements and attributes must be updated.
  • Discovery services (the VLO among others) could assist users through vocabularies, for example by offering vocabulary-based browsing.