Note: this issue has now been addressed with a Schematron check (= solution 1 plus a warning, still to be incorporated in the Component Registry).
problem
cmdi-price component:
<CMD_ComponentSpec isProfile="false">
  <Header>
    <ID>clarin.eu:cr1:c_1271859438115</ID>
    <Name>Price</Name>
    <Description>Component Price contains information about the (different) price(s) for a resource</Description>
  </Header>
  <CMD_Component name="Price">
    <CMD_Element CardinalityMax="unbounded" CardinalityMin="1" ValueScheme="string" ConceptLink="http://www.isocat.org/datcat/DC-2460" name="Price"/>
  </CMD_Component>
</CMD_ComponentSpec>
This results in instances such as:
<Price> <Price></Price> </Price>
I know that this is not a problem in XML, but one of my colleagues here says it is bad style to have the same element within the same element with a different content model. Do you have any recommendations for us? Usually I would say that it should be <Price><Amount>...<Currency>...<Description>...</Price> etc., avoiding the "problem". He is sort of right, and if he wants to use regular expressions for processing, distinct names are certainly nicer.
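To illustrate the regex concern, here is a minimal sketch in Python. The helper `inner_texts` and the sample value `10` are made up for this illustration; the point is only that a naive, non-greedy pattern cannot tell the outer <Price> from the inner one, whereas a distinct inner name such as <Amount> is matched cleanly.

```python
import re

# Instance with the same name at both levels (the case described above).
nested_same_name = "<Price> <Price>10</Price> </Price>"
# Instance with a distinct inner name, as in the <Amount> suggestion.
nested_distinct = "<Price> <Amount>10</Amount> </Price>"


def inner_texts(tag, xml_text):
    """Naive regex extraction of the text between <tag> and </tag> (hypothetical helper)."""
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL)
    return pattern.findall(xml_text)


# The match starts at the outer <Price> and stops at the first </Price>.
print(inner_texts("Price", nested_same_name))   # [' <Price>10']
# With a distinct inner name the value comes out as expected.
print(inner_texts("Amount", nested_distinct))   # ['10']
```

A real XML parser does not have this problem, which is why the nesting is valid XML but awkward for text-based processing.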
possible solutions
1. leave it as it is
pro:
- less work
con:
- not really stylish
- potentially confusing
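If the names are left as they are, a warning can at least make the clash visible. The following is a rough Python sketch of such a check, not the actual Schematron rule: the function name `find_name_clashes` and the trimmed-down sample spec are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the Price component spec (Header omitted).
SPEC = """<CMD_ComponentSpec isProfile="false">
  <CMD_Component name="Price">
    <CMD_Element name="Price" ValueScheme="string" CardinalityMin="1" CardinalityMax="unbounded"/>
  </CMD_Component>
</CMD_ComponentSpec>"""


def find_name_clashes(spec_xml):
    """Return the names of components that directly contain a child with the same name."""
    root = ET.fromstring(spec_xml)
    clashes = []
    for component in root.iter("CMD_Component"):
        name = component.get("name")
        for child in component:
            # Only direct children are compared; deeper nesting is not considered a clash here.
            if child.tag in ("CMD_Element", "CMD_Component") and child.get("name") == name:
                clashes.append(name)
                break
    return clashes


for name in find_name_clashes(SPEC):
    print("warning: component '{0}' contains a child with the same name".format(name))
```

The check only warns; it does not change the spec or the generated XSD.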
2. styleguide: use a prefix to indicate components
(non-binding) advice for the creators of a component to use a certain prefix, e.g. "c_"
(or similar: use the plural for the components, singular for embedded elements)
<CMD_Component name="c_Price">
  <CMD_Element CardinalityMax="unbounded" CardinalityMin="1" ValueScheme="string" ConceptLink="http://www.isocat.org/datcat/DC-2460" name="Price"/>
</CMD_Component>
pro:
- less confusion
- still leaves the possibility to freely choose the instance tags (e.g. for interoperability with TEI headers)
con:
- non-binding, so confusion is still possible if users do not adhere to the advice
3. automatically prepend a prefix when generating XSD
<CMD_Component name="Price">
  <CMD_Element CardinalityMax="unbounded" CardinalityMin="1" ValueScheme="string" ConceptLink="http://www.isocat.org/datcat/DC-2460" name="Price"/>
</CMD_Component>
results in:
<xs:element name="c_Price">
  <xs:complexType>
    <xs:sequence>
      <xs:element maxOccurs="unbounded" minOccurs="1" type="xs:string" dcr:datcat="http://www.isocat.org/datcat/DC-2460" name="Price"/>
    </xs:sequence>
    <xs:attribute name="ref" type="xs:IDREF"/>
  </xs:complexType>
</xs:element>
and thus in:
<c_Price> <Price></Price> </c_Price>
pro:
- binding, so no confusion anymore
con:
- not transparent, because the instance tags (c_Price) look different from the component names (Price), although this might be masked by the software (e.g. by hiding the c_ prefix)
- impossible to "emulate" other schemas such as TEI literally (e.g. <c_textDesc> tags instead of <textDesc>)
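The effect of automatic prefixing on instance tags can be sketched as follows. This is an illustration in Python, not the actual XSD-generation XSLT; `build_instance_skeleton`, `display_name` and the constant `COMPONENT_PREFIX` are hypothetical names.

```python
import xml.etree.ElementTree as ET

COMPONENT_PREFIX = "c_"  # assumption: the prefix discussed above


def build_instance_skeleton(spec_node, prefix=COMPONENT_PREFIX):
    """Build an empty instance tree; components get the prefix, elements keep their name."""
    if spec_node.tag == "CMD_Component":
        node = ET.Element(prefix + spec_node.get("name"))
        for child in spec_node:
            if child.tag in ("CMD_Component", "CMD_Element"):
                node.append(build_instance_skeleton(child, prefix))
        return node
    return ET.Element(spec_node.get("name"))


def display_name(tag, prefix=COMPONENT_PREFIX):
    """Hide the prefix again, as an editor might do when showing tags to users."""
    return tag[len(prefix):] if tag.startswith(prefix) else tag


spec = ET.fromstring(
    '<CMD_Component name="Price"><CMD_Element name="Price" ValueScheme="string"/></CMD_Component>'
)
skeleton = build_instance_skeleton(spec)
print(skeleton.tag, [child.tag for child in skeleton])  # c_Price ['Price']
print(display_name(skeleton.tag))                       # Price
```

Note that `display_name` would also strip the prefix from any element whose own name happens to start with "c_", which is one way the masking could itself become confusing.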
4. remove outer components in XSLT
change the XSD-generation XSLT so that it removes the outer component, so you would get
... <Price></Price> ...
instead of
<Price> <Price></Price> </Price>
pro:
- more intuitive (?)
- easier to use in Arbil (?)
con:
- does not work when the component has other elements (e.g. a <price> component containing both <description> and <price>)
- (probably) complex changes to the XSLT
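The first con suggests that any such change would need a guard that decides when the outer component may be dropped. A minimal Python sketch of that condition follows, assuming the rule is "exactly one child, and it has the same name as the component"; the same-name requirement and the function name `can_drop_outer_component` are assumptions, not taken from the existing XSLT.

```python
import xml.etree.ElementTree as ET


def can_drop_outer_component(component):
    """True if the component only wraps a single child that carries its own name."""
    children = [c for c in component if c.tag in ("CMD_Element", "CMD_Component")]
    return len(children) == 1 and children[0].get("name") == component.get("name")


# Only one child with the same name: the wrapper adds nothing.
simple = ET.fromstring(
    '<CMD_Component name="Price"><CMD_Element name="Price"/></CMD_Component>'
)
# A second child is present: the wrapper is needed to group them.
mixed = ET.fromstring(
    '<CMD_Component name="price">'
    '<CMD_Element name="description"/><CMD_Element name="price"/>'
    '</CMD_Component>'
)

print(can_drop_outer_component(simple))  # True
print(can_drop_outer_component(mixed))   # False
```

Even with such a guard, instances would be flattened for some components and not for others, which may be no less confusing than the original nesting.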