Changes between Version 11 and Version 12 of CmdiClavasIntegration


Timestamp:
08/21/13 12:11:38 (11 years ago)
Author:
twagoo
Comment:

added ticket query for related tickets

  • CmdiClavasIntegration

    v11 v12  
    168168
    169169Vocabularies from ISOCat will be provided to OpenSKOS (the exact details of how this will be done still have to be worked out); Arbil and ComponentRegistry will query these vocabularies through OpenSKOS.
     170
     171== Related tickets ==
     172[[TicketQuery(keywords~=clavas,format=table,col=component|summary|milestone)]]

= Integration of CLAVAS into CMDI =
     173
     174{{{
     175#!html
     176<h3>Contents</h3>
     177}}}
     178[[PageOutline(1-5, , inline)]]
     179
     180== Introduction ==
     181
     182CLAVAS is a CLARIN-NL project that provides a vocabulary service. It is hosted by the Meertens Institute; the contact person is [[henniebrugman|Hennie Brugman]]. The vocabularies will be available through a set of web services hosted at Meertens. CLAVAS is based on [http://openskos.org OpenSKOS], where vocabularies map to 'concept schemes' in OpenSKOS. More information is available in the [attachment:CLAVAS-Workplan-1.1.docx CLAVAS work plan document].
     183
     184This page covers integration of CLAVAS with [[CmdiIndex|CMDI]]. The idea, roughly speaking, is that metadata modelers can associate a vocabulary (identified by its URI) with an element in their components and profiles. The metadata creator will then be able to pick values from the specified vocabulary or (in the case of open vocabularies) still choose to use a custom value that does not appear in the vocabulary. Editors like [http://tla.mpi.nl/tools/arbil Arbil] need to be extended to access the CLAVAS public API for retrieving potential values from vocabularies.
     185
     186== Overview ==
     187
     188=== Status of CLAVAS ===
     189
     190==== Public API ====
     191
     192A [http://editor.openskos.org/api public API] that allows one to search for collections and concepts within a scheme (vocabulary) is available.
     193
     194The [http://openskos.org/api#autocomlete autocomplete API] can be used to retrieve all items of a vocabulary, or a subset of them, as JSON or RDF. The vocabulary is identified by passing its URI as a parameter.
     195
     196Eventually a CLAVAS-specific (as opposed to openskos.org) service will become available.
     197
     198==== Management interface and editor ====
     199
     200A [http://editor.openskos.org/editor concept scheme editor] is available but not publicly accessible.
     201
     202==== Available data ====
     203
     204There are working examples that suffice for use during development, e.g. the languages vocabulary. (URL!)
     205
     206A vocabulary of institutes will become available shortly.
     207
     208=== Specifying vocabularies in CMDI specifications, schemata and instances ===
     209
     210==== Specifying vocabularies in CMDI instances ====
     211
     212===== Open vocabularies =====
     213
     214Each concept within a vocabulary is identified by an OpenSKOS specific URI and optionally has a reference to a 'source' URI (e.g. ISOCat). For fields that link to vocabularies as open vocabularies, we want to store one of these URIs as an attribute in CMDI metadata '''instances''', e.g.:
     215
     216{{{
     217#!xml
     218<language cmd:ValueConceptLink="http://cdb.iso.org/lg/CDB-00138580-001">Dutch</language>
     219}}}
     220
     221Note the ''cmd'' prefix, which disambiguates the attribute and prevents clashes with custom attributes of the same name.
     222
     223As in the example above, each item in the vocabulary has an OpenSKOS URI and optionally a 'source URI' (which will generally come from the primary data source, e.g. ISOCat). Arbil will use a deterministic fallback mechanism: it chooses the source URI if available, and the CLAVAS URI otherwise.
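
The fallback rule can be sketched in a few lines of Python. This is only an illustration, not Arbil code: the `Concept` class, its field names and the CLAVAS URIs in the sample data are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Concept:
    """Hypothetical in-memory view of one vocabulary item."""
    clavas_uri: str                   # OpenSKOS/CLAVAS URI, always present
    source_uri: Optional[str] = None  # e.g. an ISOCat URI, may be missing


def choose_concept_link(concept: Concept) -> str:
    """Prefer the source URI; fall back to the CLAVAS URI."""
    return concept.source_uri or concept.clavas_uri


# The CLAVAS URIs below are made up for the example.
dutch = Concept(
    clavas_uri="http://openskos.org/example/dutch",
    source_uri="http://cdb.iso.org/lg/CDB-00138580-001",
)
local_only = Concept(clavas_uri="http://openskos.org/example/local")
```

For `dutch` the function returns the ISOCat URI; for `local_only`, which has no source URI, it returns the CLAVAS URI.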
     224
     225The value that serves as the text content should come from one of the child elements of the concept definition. Typically this will be the preferred label as specified in the vocabulary item returned from the API, but it could also come from another element (e.g. to choose between the item's full name and its code). Which path to use should be determined in the component specification (possibly as part of the vocabulary URI).
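
A minimal sketch of this selection step, assuming a concept is available as a flat mapping from property name to value. The mapping layout and the behaviour for a missing property (raising an error) are assumptions of this sketch; the actual API response format is not specified here.

```python
from typing import Mapping

DEFAULT_VALUE_PROPERTY = "prefLabel"


def select_text_value(concept: Mapping[str, str],
                      value_property: str = DEFAULT_VALUE_PROPERTY) -> str:
    """Return the property that becomes the text content of the element."""
    try:
        return concept[value_property]
    except KeyError:
        # Assumption: a missing property is an error rather than a silent fallback.
        raise KeyError(f"concept has no property {value_property!r}") from None


# Hypothetical concept with a full name and an ISO 639-3 code.
dutch = {"prefLabel": "Dutch", "iso-639-3": "nld"}
```

With the default property the text content is the preferred label; passing `"iso-639-3"` selects the code instead.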
     226
     227===== Closed vocabularies =====
     228
     229Closed vocabularies will be no different from standard CMDI closed vocabularies on the instance level. All required information will be available from the schema due to the vocabulary import (see below).
     230
     231==== Specifying vocabularies in CMDI component specifications ====
     232
     233Taking the above into account, an element specification in a CMDI component or profile could look something like:
     234
     235===== Open vocabularies =====
     236
     237Open vocabularies are specified using attributes on CMD_Element:
     238* !ValueScheme has to be string. For this we add a schematron rule to the general component schema.
     239* The vocabulary URI (which can be a URL or a URN) is specified in the Vocabulary attribute.
     240* The value property is optional (default=prefLabel). It is specified either as a parameter on the URI (first example) or as a separate attribute VocabValueProperty (second example). TODO: DECIDE
     241 * The use case is a language vocabulary that provides several versions of ISO-639 per item.
     242 * It would be nice if we could pass the 'label' selection on to the API so that a pre-selection can happen server side, returning it in a specially marked element or attribute (making the processing of the response more uniform).
     243
     244Example:
     245
     246{{{
     247#!xml
     248<CMD_Element
     249    name="Institution"
     250    CardinalityMax="1"
     251    CardinalityMin="1"
     252    ValueScheme="string"
     253    Vocabulary="http://openskos.org/institutions?label=name"
     254/>
     255}}}
     256
     257OR
     258
     259{{{
     260#!xml
     261<CMD_Element
     262    name="Institution"
     263    CardinalityMax="1"
     264    CardinalityMin="1"
     265    ValueScheme="string"
     266    Vocabulary="http://openskos.org/institutions"
     267    VocabValueProperty="name"
     268/>
     269}}}
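
Until the choice between the two notations is made, a client could accept both. The sketch below is an illustration only: the function name is hypothetical, and the precedence (explicit attribute first, then the `label` URI parameter, then prefLabel) is an assumption, not a decision recorded on this page.

```python
import xml.etree.ElementTree as ET
from typing import Tuple
from urllib.parse import parse_qs, urlsplit, urlunsplit

DEFAULT_VALUE_PROPERTY = "prefLabel"


def resolve_vocabulary(element_xml: str) -> Tuple[str, str]:
    """Return (vocabulary URI without query, value property) for a CMD_Element."""
    element = ET.fromstring(element_xml)
    vocabulary = element.get("Vocabulary")
    if vocabulary is None:
        raise ValueError("CMD_Element has no Vocabulary attribute")

    parts = urlsplit(vocabulary)
    label_from_uri = parse_qs(parts.query).get("label", [DEFAULT_VALUE_PROPERTY])[0]
    # Assumed precedence: separate attribute, then URI parameter, then default.
    value_property = element.get("VocabValueProperty") or label_from_uri

    # Drop the query string so both notations yield the same vocabulary URI.
    base_uri = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return base_uri, value_property


uri_parameter_form = (
    '<CMD_Element name="Institution" CardinalityMax="1" CardinalityMin="1" '
    'ValueScheme="string" Vocabulary="http://openskos.org/institutions?label=name"/>'
)
separate_attribute_form = (
    '<CMD_Element name="Institution" CardinalityMax="1" CardinalityMin="1" '
    'ValueScheme="string" Vocabulary="http://openskos.org/institutions" '
    'VocabValueProperty="name"/>'
)
```

Both sample elements resolve to the same pair, which is one argument for treating the two notations as equivalent on the client side.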
     270
     271
     272===== Closed vocabularies =====
     273
     274Closed vocabularies will be 'imported' into the component at design time, resulting in an internalized 'snapshot copy' of the vocabulary as it was at the time of creation. The ComponentRegistry will be extended with functionality to allow this. The vocabulary URI will be stored in the component specification and transferred to the schema so that editors can query the API for additional information. Doing so is optional, as all information, including the item URIs, will be available from the schema.
     275
     276We will add the vocabulary URI as an attribute to the element and re-use the existing !ConceptLink attribute on the enumeration items to store the identifier of individual vocabulary items.
     277
     278{{{
     279#!xml
     280<CMD_Element
     281    name="Language"
     282    CardinalityMax="1"
     283    CardinalityMin="1"
     284    Vocabulary="http://openskos.org/api/languages?label=iso-639-3">
     285    <ValueScheme>
     286      <enumeration>
     287         <item ConceptLink="http://cdb.iso.org/lg/CDB-00138580-001">Dutch</item>
     288         <item ConceptLink="http://cdb.iso.org/lg/CDB-00138512-001">French</item>
     289      </enumeration>
     290    </ValueScheme>
     291</CMD_Element>
     292}}}
     293
     294Text content comes from the selected label. ConceptLink has the URI for each item in the vocabulary. There probably is no need for AppInfo (separate display label). Notice that there currently is no way to represent multilingual vocabularies, so the language will have to be specified in the vocabulary URI with a fallback to the default language of the vocabulary.
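
The design-time import could produce such an element roughly as follows. This is a sketch under stated assumptions, not ComponentRegistry code: the function name and the list of (label, concept URI) pairs as input are hypothetical, and fetching the items from the API is left out.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_closed_vocabulary_element(
    name: str,
    vocabulary_uri: str,
    items: Iterable[Tuple[str, str]],
) -> ET.Element:
    """Build a CMD_Element holding a snapshot of (label, concept URI) pairs."""
    element = ET.Element("CMD_Element", {
        "name": name,
        "CardinalityMax": "1",
        "CardinalityMin": "1",
        "Vocabulary": vocabulary_uri,
    })
    value_scheme = ET.SubElement(element, "ValueScheme")
    enumeration = ET.SubElement(value_scheme, "enumeration")
    for label, concept_uri in items:
        # The selected label becomes the text content; the URI goes into ConceptLink.
        item = ET.SubElement(enumeration, "item", {"ConceptLink": concept_uri})
        item.text = label
    return element


language = build_closed_vocabulary_element(
    "Language",
    "http://openskos.org/api/languages?label=iso-639-3",
    [
        ("Dutch", "http://cdb.iso.org/lg/CDB-00138580-001"),
        ("French", "http://cdb.iso.org/lg/CDB-00138512-001"),
    ],
)
print(ET.tostring(language, encoding="unicode"))
```

Because the result is a snapshot, later changes to the vocabulary do not affect the component until the import is repeated.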
     295
     296==== Specifying vocabularies in CMDI profile XSD's ====
     297
     298The values of the vocabulary-related attributes could go straight into the generated profile XSD, much like the "datcat" attributes, and client applications could read them from the schema in the same way.
     299
     300===== Open vocabularies =====
     301
     302Example, assuming the solution with separate attributes for vocabulary id and label specifiers:
     303
     304{{{
     305#!xml
     306<xs:element
     307  name="Institute" 
     308  ann:displaypriority="1"
     309  dcr:datcat="http://www.isocat.org/datcat/DC-3785"
     310  cmd:Vocabulary="http://openskos.org/institutions"
     311  cmd:VocabValueProperty="name">
     312  <xs:complexType>
     313    <xs:simpleContent>
     314      <xs:extension base="xs:string">
     315        <xs:attribute ref="xml:lang"/>
     316      </xs:extension>
     317    </xs:simpleContent>
     318  </xs:complexType>
     319</xs:element>
     320}}}
     321
     322===== Closed vocabularies =====
     323
     324Example:
     325
     326{{{
     327#!xml
     328<xs:simpleType
     329  name="simpletype-iso-639-3-code-clarin.eu.cr1.c_123456789"
     330  cmd:Vocabulary="http://openskos.org/api/languages?label=iso-639-3">
     331  <xs:restriction base="xs:string">
     332    <xs:enumeration value="Dutch" dcr:datcat="http://cdb.iso.org/lg/CDB-00138580-001" />
     333    <xs:enumeration value="French" dcr:datcat="http://cdb.iso.org/lg/CDB-00138512-001" />
     334  </xs:restriction>
     335</xs:simpleType>
     336}}}
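
A client application could read the vocabulary URI and the item identifiers back from such a schema fragment as sketched below. The namespace URIs bound to the `cmd` and `dcr` prefixes are assumptions of this sketch (the fragment above does not declare them), as is the function name.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
CMD = "http://www.clarin.eu/cmd/"     # assumed binding for the cmd prefix
DCR = "http://www.isocat.org/ns/dcr"  # assumed binding for the dcr prefix

XSD_FRAGMENT = f"""
<xs:simpleType xmlns:xs="{XS}" xmlns:cmd="{CMD}" xmlns:dcr="{DCR}"
  name="simpletype-iso-639-3-code-clarin.eu.cr1.c_123456789"
  cmd:Vocabulary="http://openskos.org/api/languages?label=iso-639-3">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Dutch" dcr:datcat="http://cdb.iso.org/lg/CDB-00138580-001" />
    <xs:enumeration value="French" dcr:datcat="http://cdb.iso.org/lg/CDB-00138512-001" />
  </xs:restriction>
</xs:simpleType>
"""


def read_closed_vocabulary(xsd_fragment: str):
    """Return (vocabulary URI, {value: concept URI}) for an xs:simpleType."""
    simple_type = ET.fromstring(xsd_fragment)
    # ElementTree addresses namespaced attributes as "{namespace-uri}local-name".
    vocabulary = simple_type.get(f"{{{CMD}}}Vocabulary")
    items = {
        enum.get("value"): enum.get(f"{{{DCR}}}datcat")
        for enum in simple_type.iterfind(f".//{{{XS}}}enumeration")
    }
    return vocabulary, items


vocabulary_uri, concept_links = read_closed_vocabulary(XSD_FRAGMENT)
```

Since everything needed is in the schema, this lookup works without contacting the vocabulary service.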
     337
     338=== CLAVAS vocabulary sources ===
     339
     340Vocabularies from ISOCat will be provided to OpenSKOS (the exact details of how this will be done still have to be worked out); Arbil and ComponentRegistry will query these vocabularies through OpenSKOS.
     341
     342== Related tickets ==
     343[[TicketQuery(keywords~=clavas,format=table,col=component|summary|milestone)]]