This page is a subpage of CMDI 1.2
Namespace per profile
The issue
In OAI-PMH, Section 3.4 (metadataPrefix and Metadata Schema), it is made clear that, to be compliant, a metadata record should have a schema location that matches the URL of the schema registered for the metadataPrefix. Due to the flexible nature of CMDI, there can currently be many schemata associated with a single metadataPrefix, i.e., one per CMD profile.
The solutions for this issue can either be based on agreements within the CLARIN community on using OAI-PMH for CMDI (solutions 1 to 3), or involve changes to CMDI with regard to namespaces (the other solutions).
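The mismatch can be made concrete with a small check. The sketch below is only an illustration, not part of any CLARIN tool: it reads the xsi:schemaLocation of a record and tests whether the schema registered for the metadataPrefix is among the listed locations. The function names, the namespace URI and both profile XSD URLs in the sample are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(record_xml):
    """Return {namespace: schema URL} from the root element's xsi:schemaLocation."""
    root = ET.fromstring(record_xml)
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    # The attribute value is a whitespace-separated list of namespace/location pairs.
    return dict(zip(tokens[0::2], tokens[1::2]))


def matches_registered_schema(record_xml, registered_schema):
    """True if the record points at the schema registered for its metadataPrefix."""
    return registered_schema in schema_locations(record_xml).values()


# Hypothetical sample: the namespace URI and both XSD URLs are placeholders.
REGISTERED = "http://example.org/profiles/profile-a.xsd"
RECORD = (
    '<CMD xmlns="http://example.org/cmd" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://example.org/cmd '
    'http://example.org/profiles/profile-b.xsd"/>'
)

if __name__ == "__main__":
    # A profile-B record offered under a prefix registered with profile A's schema.
    print(matches_registered_schema(RECORD, REGISTERED))
```

With one metadataPrefix and many profiles, this check can succeed for at most one profile per provider, which is the non-compliance described above.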
Proposed solutions
First solution: be pragmatic
One can be pragmatic and conclude that we have been using OAI-PMH for harvesting CMDI for several years now, so this non-compliance can be ignored.
Pros
Everything stays as it is
Cons
Non-compliance does indicate a problem, and will puzzle implementers
Centre impact
None, as nothing changes
Implementation examples
None, as nothing changes
Discussion
Oliver (IDS): NACK: CLARIN is about standards, interfaces and sustainability; this solution utilizes OAI-PMH in non-obvious ways and therefore violates CLARIN's principles. We should not do this.
Second solution: profile specific metadataPrefixes
A metadataPrefix per profile, e.g., cmdi0554, cmdi0571, cmdi2312. Each of these metadataPrefixes is linked to a different schema.
A first version of this has been implemented: the harvester can list multiple metadataPrefixes per provider endpoint. When a provider adds a new metadataPrefix, the harvester configuration currently still has to be updated before the records offered for that prefix are requested. The CLARIN community could agree on a pattern, e.g., harvest every metadataPrefix starting with 'cmdi'. In that case the harvester needs no additional configuration and can infer by itself which metadataPrefixes to use.
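The 'cmdi' pattern could be applied to a ListMetadataFormats response roughly as follows. This is a minimal sketch and not the code of the existing harvester; the function name cmdi_prefixes and the trimmed sample response (the schema and metadataNamespace elements are left out) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

OAI = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def cmdi_prefixes(response_xml, stem="cmdi"):
    """Return the metadataPrefixes in a ListMetadataFormats response that start with `stem`."""
    root = ET.fromstring(response_xml)
    prefixes = [
        (el.text or "").strip()
        for el in root.iterfind(".//oai:metadataPrefix", OAI)
    ]
    return [p for p in prefixes if p.startswith(stem)]


# Trimmed, hypothetical response using the example prefixes from this section.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>cmdi0554</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>cmdi0571</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

if __name__ == "__main__":
    for prefix in cmdi_prefixes(SAMPLE):
        # A harvester would issue one ListRecords request per selected prefix.
        print("harvest with metadataPrefix=" + prefix)
```

The selection rule itself remains a CLARIN-specific convention layered on top of OAI-PMH, which is the point raised under Cons.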
Pros
Compliance, and partially implemented
Cons
Needs additional configuration per provider or CLARIN specific agreements on the use of OAI-PMH
Centre impact
The centers that currently use multiple CMD profiles but use only one cmdi metadataPrefix need to implement the metadataPrefix per profile approach
Implementation examples
None
Discussion
Oliver (IDS): NACK: This is a crude hack rather than a solution, because it (again) utilizes OAI-PMH in non-obvious ways. We should not do this.
Third solution: up to the centers
Leave it up to the centers to choose between the first or second solution.
Pros
If you don't care about compliance you can leave everything as it is
Cons
Mixed compliance within the CLARIN community. Still needs some additional configuration or CLARIN specific agreements on the use of OAI-PMH
Centre impact
Depends on the wanted compliance level
Implementation examples
None
Discussion
Oliver (IDS): NACK: Mixed compliance within CLARIN is a recipe for disaster in a (near|distant) future. We should definitely not do this.
Fourth solution: CMD envelope and payload specific schemas and namespaces
The envelope of a CMD record is fixed and described by the minimal CMD schema (TODO: needs to be synced with the latest version of the envelope generated by the CMDI XSD XSLT). We can bind this schema to the metadataPrefix and also use it in the instance. The profile specific schema would then only describe the profile specific part of the CMD record. However, the namespace-to-schema binding in xsi:schemaLocation only allows us to use a namespace once, which means we need two namespaces, one for the envelope and one for the payload:
- http://www.clarin.eu/cmd/envelope namespace URI associated with http://infra.clarin.eu/cmd/xsd/minimal-cmdi.xsd
- http://www.clarin.eu/cmd/payload namespace URI associated with the profile specific XSD
Pros
Compliance with OAI-PMH
Cons
Namespace changes for all CMD records
Centre impact
- All tools that work with CMD records need to be changed
- All CMD records need to be changed
Implementation examples
OAIHandler?verb=ListMetadataFormats
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
  <responseDate>2013-12-02T17:28:30Z</responseDate>
  <request verb="ListMetadataFormats">http://oai.clarin-beta.dans.knaw.nl/oaicat/OAIHandler</request>
  <ListMetadataFormats>
    <metadataFormat>
      <metadataPrefix>oai_dc</metadataPrefix>
      <schema>http://www.openarchives.org/OAI/2.0/oai_dc.xsd</schema>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </metadataFormat>
    <metadataFormat>
      <metadataPrefix>cmdi</metadataPrefix>
      <schema>http://infra.clarin.eu/cmd/xsd/minimal-cmdi.xsd</schema>
      <metadataNamespace>http://www.clarin.eu/cmd/envelope</metadataNamespace>
    </metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>
Minimal CMDI XSD
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cmd="http://www.clarin.eu/cmd/envelope"
           xmlns:dcr="http://www.isocat.org"
           targetNamespace="http://www.clarin.eu/cmd/envelope"
           attributeFormDefault="unqualified"
           elementFormDefault="qualified">
  ...
</xs:schema>
Profile schema
<xs:schema xmlns:cmd="http://www.clarin.eu/cmd/payload"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dcr="http://www.isocat.org/ns/dcr"
           xmlns:ann="http://www.clarin.eu"
           targetNamespace="http://www.clarin.eu/cmd/payload"
           elementFormDefault="qualified">
  ...
  <xs:element name="ToolService">
    <xs:complexType>
      <xs:sequence>
        ...
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  ...
</xs:schema>
CMD record
<cmd-e:CMD xmlns:cmd-e="http://www.clarin.eu/cmd/envelope"
           xmlns:cmd-p="http://www.clarin.eu/cmd/payload"
           xsi:schemaLocation="http://www.clarin.eu/cmd/envelope http://infra.clarin.eu/cmd/xsd/minimal-cmdi.xsd
                               http://www.clarin.eu/cmd/payload http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1311927752306/xsd"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           CMDVersion="1.1">
  <cmd-e:Header>
    ...
  </cmd-e:Header>
  ...
  <cmd-e:Components>
    <cmd-p:ToolService>
      ...
    </cmd-p:ToolService>
  </cmd-e:Components>
</cmd-e:CMD>
Discussion
Oliver (IDS): This is better than solutions 1-3, but still has the issue of using XML namespaces in non-obvious ways. The XML namespace specification, section 3 (Declaring Namespaces), states the following on uniqueness:
The namespace name, to serve its intended purpose, SHOULD have the characteristics of uniqueness and persistence.
If we use the same namespace name (= URI) for different schemas, we are violating the XML namespace specification. We should not do this.
Fifth solution: profile specific payload namespaces
Same as the fourth solution, but instead of one fixed payload namespace shared by all profiles, each profile payload gets its own namespace.
Pros
- Compliance with OAI-PMH.
- Unique namespaces per profile payload, which enables better default XML handling:
  - schema-based object mappings often assume that the combination of namespace and element name is unique
  - validators may cache schemas by namespace; when a namespace is reused for a different profile, the cache might have to be flushed explicitly
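The caching problem can be illustrated with a toy cache. The sketch below does not reproduce any particular validator; NamespaceKeyedCache is a hypothetical class that keeps one schema location per namespace URI, and PROFILE_B is an invented profile ID used only for contrast.

```python
class NamespaceKeyedCache:
    """Toy schema cache that remembers one schema location per namespace URI."""

    def __init__(self):
        self._schemas = {}

    def schema_for(self, namespace, location):
        # The first location seen for a namespace wins; later ones are ignored.
        return self._schemas.setdefault(namespace, location)


BASE = "http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/"
PAYLOAD_NS = "http://www.clarin.eu/cmd/payload"
PROFILE_A = "clarin.eu:cr1:p_1311927752306"  # profile ID used in the examples on this page
PROFILE_B = "clarin.eu:cr1:p_0000000000000"  # hypothetical second profile


def xsd(profile):
    """Build the profile XSD URL following the pattern used in the examples."""
    return BASE + profile + "/xsd"


# Fourth solution: one payload namespace shared by all profiles.
shared = NamespaceKeyedCache()
shared.schema_for(PAYLOAD_NS, xsd(PROFILE_A))
stale = shared.schema_for(PAYLOAD_NS, xsd(PROFILE_B))  # still profile A's schema

# Fifth solution: one payload namespace per profile.
per_profile = NamespaceKeyedCache()
per_profile.schema_for(PAYLOAD_NS + "/" + PROFILE_A, xsd(PROFILE_A))
fresh = per_profile.schema_for(PAYLOAD_NS + "/" + PROFILE_B, xsd(PROFILE_B))

if __name__ == "__main__":
    print("shared namespace     :", stale)
    print("per-profile namespace:", fresh)
```

With a shared payload namespace the second record is checked against the first profile's schema unless the cache is flushed; with per-profile namespaces the same cache returns the right schema without special handling.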
Cons
- Namespace changes for all CMD records
- Generic tools need to be able to handle the diversity of namespaces, e.g., by ignoring or skipping them:
  - XPath 1.0: *[local-name()='ToolService']
  - XPath 2.0: *:ToolService
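Tools that do not use XPath can apply the same idea by comparing local names only. The following is a minimal sketch using Python's standard ElementTree module; the helper names and the second payload namespace (ending in p_0000000000000) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of element names."""
    return tag.rsplit("}", 1)[-1]


def find_by_local_name(root, name):
    """Return all elements called `name`, whatever namespace they are in."""
    return [
        el for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == name
    ]


# Two trimmed records whose payloads live in different profile namespaces.
TEMPLATE = (
    '<cmd-e:CMD xmlns:cmd-e="http://www.clarin.eu/cmd/envelope" '
    'xmlns:cmd-p="http://www.clarin.eu/cmd/payload/{profile}">'
    "<cmd-e:Components><cmd-p:ToolService/></cmd-e:Components>"
    "</cmd-e:CMD>"
)
RECORD_A = TEMPLATE.format(profile="clarin.eu:cr1:p_1311927752306")
RECORD_B = TEMPLATE.format(profile="clarin.eu:cr1:p_0000000000000")  # hypothetical

if __name__ == "__main__":
    for record in (RECORD_A, RECORD_B):
        matches = find_by_local_name(ET.fromstring(record), "ToolService")
        print(len(matches), matches[0].tag)
```

The trade-off is the one already implied by the XPath expressions: matching on local names alone also accepts a ToolService element from an unrelated namespace.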
Centre impact
- All tools that work with CMD records need to be changed
- All CMD records need to be changed
Implementation examples
Profile schema
<xs:schema xmlns:cmd="http://www.clarin.eu/cmd/payload/clarin.eu:cr1:p_1311927752306"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dcr="http://www.isocat.org/ns/dcr"
           xmlns:ann="http://www.clarin.eu"
           targetNamespace="http://www.clarin.eu/cmd/payload/clarin.eu:cr1:p_1311927752306"
           elementFormDefault="qualified">
  ...
  <xs:element name="ToolService">
    <xs:complexType>
      <xs:sequence>
        ...
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  ...
</xs:schema>
CMD record
<cmd-e:CMD xmlns:cmd-e="http://www.clarin.eu/cmd/envelope"
           xmlns:cmd-p="http://www.clarin.eu/cmd/payload/clarin.eu:cr1:p_1311927752306"
           xsi:schemaLocation="http://www.clarin.eu/cmd/envelope http://infra.clarin.eu/cmd/xsd/minimal-cmdi.xsd
                               http://www.clarin.eu/cmd/payload/clarin.eu:cr1:p_1311927752306 http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1311927752306/xsd"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           CMDVersion="1.1">
  <cmd-e:Header>
    ...
  </cmd-e:Header>
  ...
  <cmd-e:Components>
    <cmd-p:ToolService>
      ...
    </cmd-p:ToolService>
  </cmd-e:Components>
</cmd-e:CMD>
Discussion
Oliver (IDS): Even though this solution has the largest impact on centres, it is (IMHO) the best solution, because it is the most standards-compliant and allows (if the multiple ID issue is solved properly) smooth integration with OAI-PMH. The longer we postpone this solution, the larger the pain for the centres will become, so we had better make that decision now and be done with it.
Tickets
Tickets in the CMDI 1.2 milestone with the keyword keyword:
Discussion
Discuss the topic in general below this point
Attachments (4)
- flattenCMDnamespace.xsl (1.6 KB): XSLT to let a CMDI file use only one or no namespace. The code works in XSL 1.0 and 2.0, just change the version.
- flattenCMDnamespace.2.xsl (2.6 KB): New version of the flattener XSL; uses a namespace prefix if a namespace is wanted.
- cmdi-xml-validator-20140325.zip (14.7 KB): CMD record validator using faulty caching
- eXist-db-failed-import.png (343.5 KB): eXist-db can't import a valid CMD record due to XSD caching