Version 9 (modified by oddrun.ohren@nb.no, 10 years ago)

This page is a subpage of CMDI 1.2

Specification of resource relations

The issue

CMDI 1.1 has an optional element /CMD/Resources/ResourceRelationList that can look like the following:

<Resources>
  <!-- other children of Resources omitted -->
  <ResourceRelationList>
    <ResourceRelation>
      <RelationType>describes</RelationType>
      <Res1 ref="a_text"/>
      <Res2 ref="a_photo"/>
    </ResourceRelation>
  </ResourceRelationList>
</Resources>

This is a relatively little-used feature, and it has even been argued that it could be removed. However, it is being used in practice and has sensible use cases (at least theoretically). The problems with this implementation are the lack of clear semantics, a forced but implicit relational direction (e.g. Res1 describes Res2), and inelegant naming (Res1, Res2).

Proposed solutions

First solution

In an e-mail discussion (17 January 2012), Dieter proposed the following:

- about the <ResourceRelationList>, with the input from Torsten and
Menzo I would like to propose a structure like:

<ResourceRelationList>
  <ResourceRelation>
    <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-4009">
     annotates
    </RelationType>
    <Source ref="rp1"/>
    <Target ref="rp2"/>
  </ResourceRelation>
</ResourceRelationList>

This has:

-- a machine-readable relationtype (a datcat) while maintaining a prose
text description possibility

-- a clearly directed graph nature for the relation (source/target)

For symmetric relations that means that if there is a bidirectional
relation 2 ResourceRelations need to be specified (A -> B and B -> A)

We could make this change as no one is currently using ResourceRelationList.
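
As a sketch of the symmetric case mentioned in the e-mail, a bidirectional relation between two resource proxies would be written as two directed relations. The relation name isAlignedWith is a hypothetical example and not taken from the proposal; the dcr:datcat attribute is left out because no suitable data category is identified here.

<ResourceRelationList>
  <!-- A -> B -->
  <ResourceRelation>
    <RelationType>isAlignedWith</RelationType>
    <Source ref="rp1"/>
    <Target ref="rp2"/>
  </ResourceRelation>
  <!-- B -> A: the same relation type with source and target swapped -->
  <ResourceRelation>
    <RelationType>isAlignedWith</RelationType>
    <Source ref="rp2"/>
    <Target ref="rp1"/>
  </ResourceRelation>
</ResourceRelationList>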

Pros

  • It adds semantic grounding of the relation

Cons

  • Forces direction on the relation
  • Limits the number of resources taking part in the relation to two.

Centre impact

Tools that generate/process resource relation lists will need to be adapted

Discussion

Discuss this solution proposal in this section

Second solution

A more flexible solution was discussed recently. It would have for example:

<ResourceRelationList>
  <ResourceRelation>
    <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-2318">annotates</RelationType>
    <Resource ref="rp1">
      <Role dcr:datcat="http://www.isocat.org/datcat/DC-4009">annotation</Role> 
    </Resource>
    <Resource ref="rp2">
      <Role dcr:datcat="http://www.isocat.org/datcat/DC-2656">annotated</Role> 
    </Resource>
  </ResourceRelation>
</ResourceRelationList>

The dcr:datcat attributes should probably be optional.

The maximal number of <Resource> elements should be unbounded, allowing for relations between any number of resources. (EXAMPLE? USE CASE?)
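
Purely as an illustration of what a relation with more than two resources could look like in this syntax (not a confirmed use case), the following sketch relates three resource proxies. The relation name, the role names and the id rp3 are hypothetical, and the dcr:datcat attributes are omitted on the assumption that they are optional.

<ResourceRelationList>
  <ResourceRelation>
    <!-- hypothetical ternary relation: an alignment file links a recording and its transcription -->
    <RelationType>aligns</RelationType>
    <Resource ref="rp1">
      <Role>recording</Role>
    </Resource>
    <Resource ref="rp2">
      <Role>transcription</Role>
    </Resource>
    <Resource ref="rp3">
      <Role>alignment</Role>
    </Resource>
  </ResourceRelation>
</ResourceRelationList>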

Pros

  • Semantic marking of both relation type and roles of resources
  • Option to have more than two resources involved in a relation
  • No forced relational direction

Cons

  • More verbose
  • More processing of data categories
  • No default direction (while most cases will be covered by subject-object in that order)

Centre impact

Tools that generate/process resource relation lists will need to be adapted

Discussion

Discuss this solution proposal in this section

Tickets

Tickets in the CMDI 1.2 milestone with the keyword resourcerelationlist:

Ticket Summary Owner Component Priority Status
No tickets found

Discussion

Florian (BAS): The first proposal makes a lot of sense. But if such a general relation mechanism is implemented, we should also consider removing the special relation 'isPartOf' (see separate issue) and treating it like any other relation. The second proposal is not very clear to me. Can anybody add a practical use case where it is necessary?

Oliver (IDS): The second version is a generalization of the first one. For some relations, it might lead to a more compact representation of 1:N relations, e.g. when you have 5 different annotations of a text file (say, POS annotations created by different tools). With the first version you need 5 ResourceRelation elements, with the second only one. However, I think the proposed XML serialization for solution 2 could be made better, e.g.:

<ResourceRelationList>
  <ResourceRelation>
    <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-2318">annotates</RelationType>
    <Resource role="source" dcr:datcat="http://www.isocat.org/datcat/DC-4009" ref="rp1"/>
    <Resource role="target" dcr:datcat="http://www.isocat.org/datcat/DC-2656" ref="rp2"/>
  </ResourceRelation>
</ResourceRelationList>

where the mandatory @ref references a ResourceProxy and @dcr:datcat is optional. We need to discuss whether to make @role mandatory; I don't have a strong feeling in either direction. However, proper XSD magic (XSD 1.1 assertions) can be used to ensure that only one source role and at least one target role exist. Theoretically, one could also model N:M relations with this mechanism (by providing more source roles), and we need to discuss whether we want to allow this. If we decide on this general relation mechanism, I agree with Florian that we should get rid of the IsPartOfList.
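
The role constraint could be expressed roughly as follows. This is only a minimal sketch under several assumptions: it needs an XSD 1.1 processor, it uses no target namespace (a real CMDI schema would need namespace handling in the XPath), it types @ref as xs:IDREF on the assumption that it points to a ResourceProxy id, and it leaves out the dcr:datcat attributes.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ResourceRelation">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="RelationType" type="xs:string"/>
        <!-- at least two participants, no upper bound -->
        <xs:element name="Resource" minOccurs="2" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="ref" type="xs:IDREF" use="required"/>
            <!-- optional here; whether @role should be mandatory is still open -->
            <xs:attribute name="role" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- XSD 1.1 assertion: exactly one source role and at least one target role -->
      <xs:assert test="count(Resource[@role = 'source']) eq 1 and count(Resource[@role = 'target']) ge 1"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Allowing N:M relations would mean relaxing the first half of the assertion from "eq 1" to "ge 1".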

Twan: Thanks Oliver, I like your improved representation and constraint proposals. Especially if we want to broaden the use of 'resource relations', I think we must build in this kind of flexibility (including N:M relations; what would be the downside?). On that note, to use this to represent IsPartOf relations we would probably want to
(1) Rename ResourceRelationList to RelationList and move it out of Resources
(2) Provide ways of referring to the document itself AND other documents that are not resources in the document (i.e. a way to express "this is part of collection Y"). For example:

<RelationList>
  <ResourceRelation>
    <!-- omitted details -->
  </ResourceRelation>
  <MetadataRelation>
    <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-1234">partOf</RelationType>
    <MetadataDocument role="part" dcr:datcat="http://www.isocat.org/datcat/DC-2345"/> <!-- No ref could denote 'this document' -->
    <MetadataDocument role="container" dcr:datcat="http://www.isocat.org/datcat/DC-3456" ref="../mycollection.cmdi"/>
  </MetadataRelation>
</RelationList>

which adds a lot of power to the (resource) relation list, but of course also complexity and another level of indirection. Is that roughly what you had in mind, Florian and Oliver? If so, the question is: is it worth the additional hassle, or should the 'part of' relation for metadata documents keep a special status?

Oddrun:

Comment on Oliver's and Twan's suggestions: I agree with the improved generalized version of ResourceRelation. However, I tend to think that the IsPartOfList should keep a special status. After all, the ResourceProxyList is in effect a PARTS list, giving the downlinks in the hierarchy a special status. So why not also the uplinks? That way, the hierarchical (or DAG-like) resource structure can be clearly and explicitly expressed, separately from other relationships the resource as a whole or its individual parts may engage in.

Twan, I am not sure of the necessity of having a separate MetadataRelation, unless you want to distinguish between

  • relations between metadata files as resources in their own right, and
  • relations between the resources represented by the metadata files.

In your example, my feeling is that the relation expressed is meant to hold between the resources, not the metadata.

Now the PARTS list (ResourceProxyList), the IsPartOfList and the ResourceRelationList combined provide all the structural information about the described resource that the owner wishes to express, and should perhaps be wrapped together. If Resources doesn't suit, we might rename it to ResourceSpec, like this:

<ResourceSpec>
    <ResourceProxyList>
        <!-- this is in effect a PARTS list, i.e. the downlinks in the hierarchical structure --> 
        <ResourceProxy id="rp1"/> 
        <ResourceProxy id="rp2"/>
        <ResourceProxy id="rp3"/>
    </ResourceProxyList>
    <IsPartOfList>
        <!-- the uplinks in the hierarchical structure, from THIS resource as a whole -->
        <IsPartOf>http://infra.clarin.eu/example/mycollection1.cmdi</IsPartOf>
        <IsPartOf>http://infra.clarin.eu/example/mycollection2.cmdi</IsPartOf>
    </IsPartOfList>
    <ResourceRelationList>
        <!-- internal relations  between resources listed in ResourceProxyList -->
        <!-- relations between resources listed in ResourceProxyList and other resources -->
        <!-- relations (excluding isPartOf as expressed by the isPartOfList) between THIS resource as a whole and other resources -->
        <ResourceRelation>
            <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-2318">annotates</RelationType>
            <Resource role="source" dcr:datcat="http://www.isocat.org/datcat/DC-4009" ref="rp1"/>
            <Resource role="target" dcr:datcat="http://www.isocat.org/datcat/DC-2656" ref="rp2"/>
        </ResourceRelation>
        <ResourceRelation>
            <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-xxx1">partOf</RelationType>
            <Resource role="part" dcr:datcat="http://www.isocat.org/datcat/DC-yyy1" ref="rp3"/>
            <Resource role="container" dcr:datcat="http://www.isocat.org/datcat/DC-zzz1" ref="../anotherCollection.cmdi"/>
        </ResourceRelation>
        <ResourceRelation>
            <RelationType dcr:datcat="http://www.isocat.org/datcat/DC-xxx2">toolsUsed</RelationType>
            <Resource role="part" dcr:datcat="http://www.isocat.org/datcat/DC-yyy2"/> <!-- no ref denotes the resource described by THIS document -->
            <Resource role="container" dcr:datcat="http://www.isocat.org/datcat/DC-zzz2" ref="../someAnnotatorTool.cmdi"/>
        </ResourceRelation>
    </ResourceRelationList>
</ResourceSpec>

I realise this is very much like before, but with your improved relationship version. However, with clear semantics, I think it is a good format.

More comments on relationships: We need to be clear about when we are talking about an n-ary relation with n>2 as opposed to a set of several binary relations. We also need to be clear on the semantics of the ResourceRelation element: does one ResourceRelation element express one relationship only, or may it sometimes express several relationships, as suggested by Oliver?

  • If we constrain ResourceRelation to represent one relationship, and go for solution 2, it is possible to express relationships of higher arity than 2. That is, each resource listed in the ResourceRelation participates in the same relation. For example, any ResourceRelation with 3 resources represents a ternary relation.
  • If we allow one ResourceRelation to represent more than one relationship, I think in effect we limit the expressive power to binary relations. Oliver's example with 5 annotations of the same resource expressed as one ResourceRelation would then represent 5 binary relations.

I think the first bullet (one ResourceRelation = one relation) gives the most generic and extensible solution. Then we may or may not limit ourselves to binary relations, and it is easy to extend to higher arities later, if appropriate.
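
The two readings can be contrasted in a small sketch that uses Oliver's attribute syntax. The ids are hypothetical (rp1 and rp2 stand for two annotations of the text rp3), and the dcr:datcat attributes are left out for brevity.

<!-- Reading 1: one ResourceRelation = one relationship, so two binary relations need two elements -->
<ResourceRelationList>
  <ResourceRelation>
    <RelationType>annotates</RelationType>
    <Resource role="source" ref="rp1"/>
    <Resource role="target" ref="rp3"/>
  </ResourceRelation>
  <ResourceRelation>
    <RelationType>annotates</RelationType>
    <Resource role="source" ref="rp2"/>
    <Resource role="target" ref="rp3"/>
  </ResourceRelation>
</ResourceRelationList>

<!-- Reading 2: one ResourceRelation bundles several binary relationships -->
<ResourceRelationList>
  <ResourceRelation>
    <RelationType>annotates</RelationType>
    <Resource role="source" ref="rp1"/>
    <Resource role="source" ref="rp2"/>
    <Resource role="target" ref="rp3"/>
  </ResourceRelation>
</ResourceRelationList>

Under reading 1, the second fragment would instead denote a single ternary relation among rp1, rp2 and rp3, which is why the semantics of the element needs to be fixed.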

Using datcats for relationships and roles sounds like a good idea, but we should take care how we use them. The examples in the original text above show the difficulties, for instance:

  • DC-4009 is used to represent the relationship annotates, but is defined in IsoCat as "The application of a scheme to texts...", that is, an operation/action, not a relation.

How strict should we be in applying datcats to relationships? Is it sufficient to select datcats conveying the general idea of the relation, or must the datcat be explicitly defined as a relation (as in the other example using DC-2318)?