wiki:CmdiVirtualLanguageObservatory/ManualMapping/UDS
These are the xPaths of our CMDI elements and the facets they are mapped to. Where no xPath is given, we briefly explain why. Comments are preceded by #.

The facets were taken after checking the automatic mapping of our CMDI profile with Menzo's tool:

http://lux13.mpi.nl/isocat/clarin/vlo/mapping/check?prof=clarin.eu:cr1:p_1288172614026

Facet: id
/CMD/Header/MdSelfLink

Facet: collection
/CMD/Header/MdCollectionDisplayName

Facet: projectName
# We annotated this as creator http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-creator
# We also use the creator element for the institution where the resource was produced. As it stands, disambiguation has to be done manually to provide a correct mapping.

Facet: name
/CMD/Components/OLAC-DcmiTerms/title

Facet: year
/CMD/Components/OLAC-DcmiTerms/date
# Sometimes we provide a range instead of an exact date.
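# A minimal sketch of how a year could be derived when the date element holds a range. The date formats handled here and the helper names (years_from_date, facet_year) are assumptions for illustration, not a description of what the VLO importer does.

```python
import re

# Matches a standalone four-digit year, e.g. in "2011-05-03" or "1990-1995".
YEAR_PATTERN = re.compile(r"\b(\d{4})\b")


def years_from_date(value):
    """Return all four-digit years found in a date string, in order."""
    return [int(year) for year in YEAR_PATTERN.findall(value)]


def facet_year(value):
    """Pick the first year of a date or range, or None if no year is found."""
    years = years_from_date(value)
    return years[0] if years else None
```

# Taking the first year of a range is only one possible choice; the last year or the whole range might suit the facet better.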

Facet: continent
# We don't have this information annotated in our metadata.
# Maybe it can be inferred from the coverage information.

Facet: country
# We use coverage http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-coverage
# However, under coverage we currently annotate both geographical and temporal information. Therefore, disambiguation has to be done by postprocessing (matching TGN entries).
# See also: https://trac.clarin.eu/ticket/388
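# A rough sketch of the kind of postprocessing meant here: separating temporal from geographical coverage values. The KNOWN_PLACES set is a hypothetical stand-in for a real TGN lookup, and the year/range pattern is an assumption about how temporal values are written.

```python
import re

# Hypothetical stand-in for a TGN lookup; a real mapping would match TGN entries.
KNOWN_PLACES = {"Germany", "Saarland", "France"}

# A single year or a range of two years, e.g. "2001" or "1990-1995".
TEMPORAL = re.compile(r"^\d{4}(\s*[-/]\s*\d{4})?$")


def classify_coverage(value):
    """Label a coverage value as 'temporal', 'spatial' or 'unknown'."""
    text = value.strip()
    if TEMPORAL.match(text):
        return "temporal"
    if text in KNOWN_PLACES:
        return "spatial"
    return "unknown"
```

# Only values classified as 'spatial' would be candidates for the country facet; 'unknown' values would still need manual checking.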

Facet: language
/CMD/Components/OLAC-DcmiTerms/language

Facet: languages
/CMD/Components/OLAC-DcmiTerms/language
# How does this differ from "language"?

Facet: organisation
/CMD/Components/OLAC-DcmiTerms/publisher
# http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=terms#terms-publisher
# According to the ISOcat categories, organisation can be "The organization that was leading the creation project or that is responsible for accessing the resource and the contact person is affiliated with.". Do we need some kind of distinction here, or is it OK as it is?

Facet: genre
# We don't have this information in our CMDI metadata. What happens if a corpus includes more than one genre/register? Can we have more than one genre element in the metadata? This would be the case for the CroCo corpus. Genre is a somewhat tricky notion. Wouldn't it be better to split it into different aspects describing the communicative situation in which the text was produced, such as modality, channel...?

Facet: subject
/CMD/Components/OLAC-DcmiTerms/subject
# We have several subject elements, but the facets show only one per record.

Facet: description
/CMD/Components/OLAC-DcmiTerms/description
# We have descriptions in both English and German. Should we always provide descriptions in English? Can the VLO show only one version?
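# A sketch of how one description could be chosen per record if the two versions were distinguished by an xml:lang attribute. Both the attribute and the sample record are assumptions; the namespace is the CMDI 1.1 one, and pick_description is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"cmd": "http://www.clarin.eu/cmd/"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record fragment; texts and xml:lang values are invented.
SAMPLE = """<CMD xmlns="http://www.clarin.eu/cmd/">
  <Components>
    <OLAC-DcmiTerms>
      <description xml:lang="de">Ein Beispielkorpus.</description>
      <description xml:lang="en">An example corpus.</description>
    </OLAC-DcmiTerms>
  </Components>
</CMD>"""


def pick_description(root, preferred="en"):
    """Return the description in the preferred language, else the first one."""
    path = "cmd:Components/cmd:OLAC-DcmiTerms/cmd:description"
    descriptions = root.findall(path, NAMESPACES)
    for element in descriptions:
        if element.get(XML_LANG) == preferred:
            return element.text
    return descriptions[0].text if descriptions else None


root = ET.fromstring(SAMPLE)
print(pick_description(root))
```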

Facet: resourceType
/CMD/Components/OLAC-DcmiTerms/type
# We have several instances of type for a record, such as "collection" and "corpus", although only one is shown.

Facet: nationalProject
# We don't have this explicitly annotated, but the information already appears properly mapped.

Facet: text
# What does this facet mean?

Facet: _componentProfile
/CMD/Header/MdProfile

Facet: tag
# We don't use it.
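
# To check the xPaths above against a record, something like the following sketch could be used. It covers only some of the facets, the sample record is invented (apart from the profile id, which comes from the mapping URL above), and the namespace is assumed to be the CMDI 1.1 one. ElementTree does not accept absolute paths on an element, so the paths are rewritten relative to the CMD root.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"cmd": "http://www.clarin.eu/cmd/"}

# Facet -> xPath, copied from the list above (subset only).
FACET_XPATHS = {
    "id": "/CMD/Header/MdSelfLink",
    "collection": "/CMD/Header/MdCollectionDisplayName",
    "name": "/CMD/Components/OLAC-DcmiTerms/title",
    "subject": "/CMD/Components/OLAC-DcmiTerms/subject",
    "resourceType": "/CMD/Components/OLAC-DcmiTerms/type",
    "_componentProfile": "/CMD/Header/MdProfile",
}

# Hypothetical record used only for this example.
SAMPLE = """<CMD xmlns="http://www.clarin.eu/cmd/">
  <Header>
    <MdSelfLink>http://example.org/record-1</MdSelfLink>
    <MdProfile>clarin.eu:cr1:p_1288172614026</MdProfile>
    <MdCollectionDisplayName>Example collection</MdCollectionDisplayName>
  </Header>
  <Components>
    <OLAC-DcmiTerms>
      <title>Example corpus</title>
      <subject>linguistics</subject>
      <subject>translation</subject>
      <type>collection</type>
      <type>corpus</type>
    </OLAC-DcmiTerms>
  </Components>
</CMD>"""


def to_relative_path(xpath):
    """Turn '/CMD/Header/X' into 'cmd:Header/cmd:X', relative to the CMD root."""
    steps = xpath.strip("/").split("/")[1:]  # drop the leading 'CMD' step
    return "/".join("cmd:" + step for step in steps)


def extract_facets(root):
    """Collect every value found for each facet; a facet may have several."""
    facets = {}
    for facet, xpath in FACET_XPATHS.items():
        elements = root.findall(to_relative_path(xpath), NAMESPACES)
        facets[facet] = [element.text for element in elements]
    return facets


facets = extract_facets(ET.fromstring(SAMPLE))
for facet, values in facets.items():
    print(facet, values)
```

# The lists for subject and resourceType hold two values each, which is the situation described in the comments on those facets.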
Last modified on 11/11/13 16:48:28