Changes between Version 38 and Version 39 of FCS-spec


Timestamp: 08/29/11 08:51:40
Author: vronk
Comment: updated after 2011-08-24 talk in Nijmegen: dropped flat-version of searchResult; dropped ccs:Metadata

  • FCS-spec

    v38 v39  
    6565{{{
    6666 <ccs:Resource pid="{pid of the resource}"> 
    67     <ccs:Metadata cmd-link="{PID of the CMD-record}">  /* cmd-link is optional */
     67    <ccs:DataView type="metadata" pid="{PID of the CMD-record}" schema="">  /* pid is optional */
    6868        /* this is for metadata provided directly by the content provider
    69          * NOT the CMD  metadata.  */
     69         * CMD-format is preferred if available.  */
    7070      <ccs:f key="{any metadata-key}">{any metadata value}</ccs:f> /* is this any useful? */
    7171        /* or rather direct metadata-fields like: */
     
    7373       <dc:author></dc:author>
    7474       ....
    75     </ccs:Metadata>
     75    </ccs:DataView>
    7676    <ccs:ResourceFragment pid="{offset relative to the parent resource PID}">           
    77        <ccs:Metadata>   /* metadata pertaining to the specific (matching) fragment, like metadata on the "current" speaker */
    78        </ccs:Metadata>
    79 
    80       <ccs:DataView type="kwic">Some text with <kw>keyword</kw> highlighted</ccs:DataView>             
     77       <ccs:DataView type="metadata" schema="">   
     78            /* metadata pertaining to the specific (matching) fragment, like metadata on the "current" speaker */
     79       </ccs:DataView>
     80
     81      <ccs:DataView type="kwic"><c type="left" >Some text with </c><kw>keyword</kw><c type="right" >highlighted</c></ccs:DataView>             
    8182   
    82       <ccs:DataView type="text/xml"><meertens:any/>     
     83      <ccs:DataView type="text/xml" schema="{meertens/schema}"><meertens:any/>     
    8384      </ccs:DataView>           
    8485
    85       <ccs:DataView type="image/jpeg" href="{optional link  to the data}" >
     86      <ccs:DataView type="image/jpeg" ref="{optional link  to the data}" >
    8687      </ccs:DataView> 
    8788    </ccs:ResourceFragment>
     
    9495  So in particular it may also be collections, aggregating other Resources.
    9596  Allowed children are: `Resource`, `ResourceFragment`, `Metadata` and `DataView`
    96  ResourceFragment::
     97 !ResourceFragment::
    9798  A part of a resource, without own PID, i.e. something addressable with: PID of the Resource +  Fragment Identifier.
    9899  Fragment Identifier to be used depends on the resource type, it may be: XPointer, timecode, sequence-offset, etc.
    99100  Allowed children are:  `Metadata` and `DataView`
    100  DataView::
     101 !DataView::
    101102  the element carrying the typed data
    102103  Content can be anything that is in other namespace.
    103104  It must be possible to provide the content inline or by reference. This is important for images and AV files.
    104  Metadata::
    105   optional element carrying metadata about the `Resource` or `ResourceFragment`.
    106   It can carry an optional parameter `cmd-link` with the PID of a CMD-record. (This only makes sense for `Resource/Metadata`)
     105 
     106  !DataView type='metadata' (obsoletes Metadata) ::
     107    optional (but strongly encouraged) element carrying metadata about the `Resource` or `ResourceFragment`.
     108    The metadata can be inline or referenced via attributes `@pid` or `@ref`.
     109    It can be basically any XML, but it shall be described by a schema referenced in the @schema-parameter.
     110    The order of preference is CMD-records, then other recognized standards (`dublincore`, `OLAC`).
    107111
    108112  Although the original idea was to "serialize" all such metadata-fields in a `<f key="{field-name}">`-element, I now prefer reusing existing namespaces.
     
    110114  `<dc:title>` seems preferable to `<f key="dc:title">`, right?
    111115
    112 However this nested approach seems not directly compatible with the established SRU-based systems, that rather work on flat fields.
    113 And while this can be overcome by providing converter XSL-stylesheets, the information we need seems expressable in a flat structure as well, that makes the more complex (nested) approach questionable:
     116(An alternative would be to flatten  the structure, but for now, we go with the nested one. Example of a flat structure: )
    114117
    115118{{{
     
    140143
    141144 kwic:: Keyword in context
     145    both keyword and (left/right) context are wrapped into elements, to avoid mixed-content
    142146 {{{
    143147 <ccs:DataView type="kwic">
    144     Junker Frauenlob , purre knix plautz - Ihr seid ein komischer Kauz - Habt ein Bärtlein von Haaren schwarz ,
    145     Ziehet es aus mit einem Tropfen Harz Prrrr - ho wird das lang , Kling klang - g - a - d -
    146     <kw>e</kw> , Scheiden thut weh - der Daus
    147  </ccs:DataView>
    148 }}}
    149  Alternatively - to avoid mixed content - the context could be enclosed in separate element as well:
    150  {{{
    151   <ccs:DataView type="kwic"><c>Some text with </c><kw>keyword</kw> <c>highlighted</c></ccs:DataView>           
    152 }}}
    153  Or in the extreme form, every token is wrapped in an element:
     148   <c type="left">Some text with </c><kw>keyword</kw> <c type="right">highlighted</c>
     149 </ccs:DataView>               
     150 }}}
     151  Alternatively, every token could be wrapped as an element:
    154152 {{{
    155153  <ccs:DataView type="kwic">
     
    161159  </ccs:DataView>               
    162160}}}
    163   This comes close to the way the text is encoded in '''TCF''' and would accordingly allow to add (stand-off) annotation layers (lemma, POS, but also syntactic annotations).
    164 
    165   If there is some associated metadata (like bibliographic information about the source of the hit, this is to be encoded in a separate element `<ccs:Metadata>`.
     161  This comes close to the way the text is encoded in '''TCF''' and would accordingly allow to add (stand-off) annotation layers (lemma, POS, but also syntactic annotations). This shall be a separate DataView-type.
     162
     163  If there is some associated metadata (like bibliographic information about the source of the hit), this is to be encoded in a separate element `<ccs:DataView type="metadata">`.
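
To illustrate why wrapping the left and right context in `<c>` elements helps a consumer, here is a minimal Python sketch that reads a `kwic` DataView of the v39 shape. It is only an illustration under stated assumptions: the namespace URI `http://example.org/ccs` and the function name `parse_kwic` are hypothetical (the spec excerpt does not give the real `ccs` namespace), and `<c>` and `<kw>` are assumed to be in no namespace, as they appear in the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the spec excerpt does not state the real one.
CCS_NS = "http://example.org/ccs"

# Sample hit modelled on the v39 kwic example (context wrapped in <c> elements).
SAMPLE = f"""<ccs:DataView xmlns:ccs="{CCS_NS}" type="kwic"><c type="left">Some text with </c><kw>keyword</kw> <c type="right">highlighted</c></ccs:DataView>"""


def parse_kwic(xml_text):
    """Split a kwic DataView into left context, keyword and right context."""
    view = ET.fromstring(xml_text)
    if view.tag != f"{{{CCS_NS}}}DataView" or view.get("type") != "kwic":
        raise ValueError("not a kwic DataView")
    # Because there is no mixed content, each part is a plain element lookup;
    # whitespace between the elements is simply ignored.
    left = "".join(c.text or "" for c in view.findall("c[@type='left']"))
    right = "".join(c.text or "" for c in view.findall("c[@type='right']"))
    keyword = "".join(kw.text or "" for kw in view.findall("kw"))
    return {"left": left, "keyword": keyword, "right": right}


if __name__ == "__main__":
    print(parse_kwic(SAMPLE))
```

With the v38 mixed-content form, the same consumer would have to reassemble the context from the element's text and the keyword's tail, which is the complication the wrapped form avoids.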
    166164
    167165 Geographic data :: A geographic location, either as coordinates or some location (street, city, place).