Changes between Version 28 and Version 29 of FCS-Specification-ScrapBook


Timestamp:
02/11/14 14:32:01 (10 years ago)
Author:
oschonef
Comment:

--

  • FCS-Specification-ScrapBook

    v28 v29  
    6666
    6767 Data View::
    68     A data views is a mechanism to support different representations of search results, e.g. Keyword-In-Context view, an Image or a KML encoded Geo-location.
     68    A Data View is a mechanism to support different representations of search results, e.g. a "hits with highlights" view, an image or a geolocation.
     69
     70 Data View Payload, Payload::
     71    The actual content encoded within a Data View, e.g. a CMDI metadata record or a KML encoded geolocation.
    6972
    7073 PID::
     
    221224Endpoints `MUST` use the identifier `http://clarin.eu/fcs/1.0` for the ''responseItemType'' (= content for the `<sru:recordSchema>` element) in SRU responses.
    222225
    223 Endpoints `MAY` serialize hits as multiple Data Views, however they `MUST` provide the Generic Hits (HITS) Data View either encoded as a  Resource Fragment (if applicable), or otherwise within the Resource (if there is no reasonable resource fragment). Other Data Views `SHOULD` be put in a place that is logical for their content (as is to be determined by the Endpoint), e.g. a metadata data view would most likely be put directly below Resource and a Data View representing some annotation layers directly around the hit is more likely to belong within a Resource Fragment.
     226Endpoints `MAY` serialize hits as multiple Data Views, however they `MUST` provide the Generic Hits (HITS) Data View either encoded as a Resource Fragment (if applicable), or otherwise within the Resource (if there is no reasonable resource fragment). Other Data Views `SHOULD` be put in a place that is logical for their content (as is to be determined by the Endpoint), e.g. a metadata data view would most likely be put directly below Resource and a Data View representing some annotation layers directly around the hit is more likely to belong within a Resource Fragment.
    224227
    225228[=#REF_Example_1]Example 1:
     
    227230<fcs:Resource xmlns:fcs="http://clarin.eu/fcs/1.0" pid="http://hdl.handle.net/4711/00-15">
    228231  <fcs:DataView type="application/x-clarin-fcs-hits+xml">
    229       <!-- data view content omitted -->
     232      <!-- data view payload omitted -->
    230233  </fcs:DataView>
    231234</fcs:Resource>
     
    238241  <fcs:ResourceFragment>
    239242    <fcs:DataView type="application/x-clarin-fcs-hits+xml">
    240       <!-- data view content omitted -->
     243      <!-- data view payload omitted -->
    241244    </fcs:DataView>
    242245  </fcs:ResourceFragment>
     
    251254  <fcs:DataView type="application/x-cmdi+xml"
    252255                pid="http://hdl.handle.net/4711/08-15-1" ref="http://repos.example.org/file/08_15_1.cmdi">
    253       <!-- data view content omitted -->
     256      <!-- data view payload omitted -->
    254257  </fcs:DataView>
    255258  <fcs:ResourceFragment pid="http://hdl.handle.net/4711/08-15-2" ref="http://repos.example.org/file/text_08_15.html#sentence2">
    256259    <fcs:DataView type="application/x-clarin-fcs-hits+xml">
    257       <!-- data view content omitted -->
     260      <!-- data view payload omitted -->
    258261    </fcs:DataView>
    259262  </fcs:ResourceFragment>
     
    267270
    268271==== Data View ====
    269 A ''Data View'' serves as a container for representing search results within CLARIN-FCS. Data Views are designed to allow for different representations of results, i.e. they are deliberately kept open to allow further extensions with more supported data view formats. The content of a Data View is called `payload`. Each Data View is identified by a MIME type ([#REF_RFC_6838 RFC6838], [#REF_RFC_3023 RFC3023]). If no existing MIME type can be used, implementors `SHOULD` define a properer private mime type. The type if the Data View is recorded in the `@type` attribute if the `<fcs:DataView>` element.
    270 
    271 The content of a Data View can either be deposited ''inline'' or by ''reference''. In the case of ''inline'', the payload of the Data View `MUST` serialized as an XML fragment below the `<fcs:DataView>` element. This method is the preferred methods payloads that can easily serialized in XML. In the case of by ''reference'', the content cannot easily deposited inline, i.e. it is binary content. In this case, the Data View `MUST` include a `@ref` or `@pid` attribute that links location for Clients to download the payload. This location `SHOULD` be ''openly accessible'', i.e. data can be downloaded freely without any need to perform a login.
    272 
    273 For the ''basic'' profile, the Data Views ''Generic Hits'', ''Component Metadata'', ''Image'' and ''Geolocation'' are defined in this specification. Endpoints `MAY` define custom Data Views, but Clients conforming to the ''basic'' profile `MAY` chose to ignore these.
    274 
    275 The examples in the following sections ''show only''
     272A ''Data View'' serves as a container for representing search results within CLARIN-FCS. Data Views are designed to allow for different representations of results, i.e. they are deliberately kept open to allow further extensions with more supported data view formats. The content of a Data View is called ''Payload''. Each Payload is typed and the type of the Payload is recorded in the `@type` attribute of the `<fcs:DataView>` element. The Payload type is identified by a MIME type ([#REF_RFC_6838 RFC6838], [#REF_RFC_3023 RFC3023]). If no existing MIME type can be used, implementors `SHOULD` define a proper private MIME type.
     273
     274The Payload of a Data View can either be deposited ''inline'' or by ''reference''. In the case of ''inline'', it `MUST` be serialized as an XML fragment below the `<fcs:DataView>` element. This is the preferred method for payloads that can easily be serialized in XML. The ''reference'' method is meant for content that cannot easily be deposited inline, i.e. it is binary content. In this case, the Data View `MUST` include a `@ref` or `@pid` attribute that links to a location from which Clients can download the payload. This location `SHOULD` be ''openly accessible'', i.e. data can be downloaded freely without any need to perform a login.
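As a rough illustration of the two dispositions described above, a Client might decide how to obtain a payload along the following lines. This is only a sketch: the helper name `payload_disposition` is hypothetical and not part of the specification, and it assumes that an inline payload takes precedence whenever the `<fcs:DataView>` element has a child element, even if `@ref` or `@pid` are also present.

```python
import xml.etree.ElementTree as ET


def payload_disposition(data_view):
    """Classify a parsed <fcs:DataView> element (hypothetical helper).

    Returns ("inline", payload_element) if the element has a child element,
    otherwise ("reference", location) taken from @ref or, failing that, @pid.
    """
    children = list(data_view)
    if children:
        # Assumption: an inline payload wins even if @ref/@pid are present.
        return "inline", children[0]
    location = data_view.get("ref") or data_view.get("pid")
    if location is None:
        raise ValueError("DataView has neither an inline payload nor @ref/@pid")
    return "reference", location


inline_xml = """
<fcs:DataView xmlns:fcs="http://clarin.eu/fcs/1.0" type="application/x-cmdi+xml">
  <CMD xmlns="http://www.clarin.eu/cmd/" CMDVersion="1.1"/>
</fcs:DataView>
"""
referenced_xml = """
<fcs:DataView xmlns:fcs="http://clarin.eu/fcs/1.0" type="image/png"
              ref="http://repos.example.org/resources/4711/0815.png"/>
"""

inline_kind, inline_payload = payload_disposition(ET.fromstring(inline_xml))
ref_kind, ref_location = payload_disposition(ET.fromstring(referenced_xml))
```

A Client would then process `inline_payload` directly, or download the content from `ref_location` in the referenced case.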
     275
     276For the ''basic'' profile, the Data Views ''Generic Hits'', ''Component Metadata'', ''Image'' and ''Geolocation'' are defined in this specification. Endpoints `MAY` define custom Data Views, but Clients conforming to the ''basic'' profile `MAY` choose to ignore them. The ''Generic Hits'' Data View is mandatory, thus all Endpoints `MUST` provide hits represented in the ''Generic Hits'' Data View.
     277
     278'''NOTE''': The examples in the following sections ''show only'' the payload with the enclosing `<fcs:DataView>` element of a Data View. Of course, the Data View must be embedded either in a `<fcs:Resource>` or a `<fcs:ResourceFragment>` element. The  `@pid` and `@ref` attributes have been omitted for all ''inline'' payload types.
     279
    276280===== Generic Hits (HITS) =====
    277281||=Description         =|| The representation of the hit ||
    278282||=MIME type           =|| `application/x-clarin-fcs-hits+xml` ||
    279283||=Payload Disposition =|| ''inline'' ||
    280 The ''Generic Hits'' Data View contains the serialization of a search result hit.
    281 Example (single hit marker):
    282 {{{#!xml
    283 <Result xmlns="http://clarin.eu/fcs/1.0/hits">
    284     The quick brown <Hit>fox</Hit> jumps over the lazy dog.
    285 </Result>
    286 }}}
    287 Example (multiple hit markers; using XML namespace prefix instead):
    288 {{{#!xml
    289 <hits:Result xmlns:hits="http://clarin.eu/fcs/1.0/hits">
    290     The quick brown <Hit>fox</Hit> jumps over the lazy dog.
    291 </hits:Result>
    292 }}}
     284The ''Generic Hits'' Data View contains the serialization of a search result hit. It supports multiple markers for supplying highlighting for the hit. Each hit `SHOULD` be presented within the context of a complete sentence. If that is not possible due to the nature of the type of the resource, the Endpoint `SHALL` provide an equivalent reasonable unit of context (e.g. within a phrase of an orthographic transcription of an utterance). All Endpoints `MUST` provide hits represented in this Data View. The XML fragment of the Generic Hits payload `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/DataView-Hits.xsd DataView-Hits.xsd]" ([source:FederatedSearch/schema/DataView-Hits.xsd?format=txt download]).
     285 * Example (single hit marker):
     286{{{#!xml
     287<!-- potential @pid and @ref attributes omitted -->
     288<fcs:DataView type="application/x-clarin-fcs-hits+xml">
     289  <hits:Result xmlns:hits="http://clarin.eu/fcs/1.0/hits">
     290    The quick brown <hits:Hit>fox</hits:Hit> jumps over the lazy dog.
     291  </hits:Result>
     292</fcs:DataView>
     293}}}
     294 * Example (multiple hit markers):
     295{{{#!xml
     296<!-- potential @pid and @ref attributes omitted -->
     297<fcs:DataView type="application/x-clarin-fcs-hits+xml">
     298  <hits:Result xmlns:hits="http://clarin.eu/fcs/1.0/hits">
     299    The quick brown <hits:Hit>fox</hits:Hit> jumps over the lazy <hits:Hit>dog</hits:Hit>.
     300  </hits:Result>
     301</fcs:DataView>
     302}}}
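On the Client side, the hit markers in a Generic Hits payload like the ones above could be read roughly as follows. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `extract_hits` and the whitespace normalization are illustrative assumptions, not something the specification prescribes.

```python
import xml.etree.ElementTree as ET

HITS_NS = "http://clarin.eu/fcs/1.0/hits"
RESULT_TAG = "{" + HITS_NS + "}Result"
HIT_TAG = "{" + HITS_NS + "}Hit"


def extract_hits(data_view):
    """Return (context_text, [hit, ...]) for a Generic Hits Data View.

    Hypothetical helper: context_text is the full text of <hits:Result>
    with runs of whitespace collapsed; the list holds the text of each
    <hits:Hit> marker in document order.
    """
    result = data_view.find(RESULT_TAG)
    if result is None:
        raise ValueError("no hits:Result payload found")
    context_text = " ".join("".join(result.itertext()).split())
    hits = [hit.text or "" for hit in result.iter(HIT_TAG)]
    return context_text, hits


sample = """
<fcs:DataView xmlns:fcs="http://clarin.eu/fcs/1.0" type="application/x-clarin-fcs-hits+xml">
  <hits:Result xmlns:hits="http://clarin.eu/fcs/1.0/hits">
    The quick brown <hits:Hit>fox</hits:Hit> jumps over the lazy <hits:Hit>dog</hits:Hit>.
  </hits:Result>
</fcs:DataView>
"""

text, hits = extract_hits(ET.fromstring(sample))
```

For the sample above, `text` holds the whole sentence and `hits` holds the two highlighted words in order.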
     303
    293304
    294305===== Component Metadata (CMDI) =====
     
    297308||=Payload Disposition =|| ''inline'' or ''reference'' ||
    298309The ''Component Metadata'' Data View allows Endpoints to embed a CMDI metadata record that is ''applicable'' to the specific context in the Endpoint response, e.g. metadata about the resource in which the hit was produced. If this CMDI record is applicable to the entire Resource, it `SHOULD` be put in a `<fcs:DataView>` element below the `<fcs:Resource>` element. If it is applicable to the Resource Fragment, i.e. it contains more specialized metadata than the metadata for the encompassing resource, it `SHOULD` be put in a `<fcs:DataView>` element below the `<fcs:ResourceFragment>` element. Endpoints `SHOULD` provide the payload ''inline'', but Endpoints `MAY` also use the ''reference'' method. If an Endpoint uses the ''reference'' method, the CMDI metadata record `MUST` be downloadable without any restrictions.
     310 * Example (inline):
     311{{{#!xml
     312<!-- potential @pid and @ref attributes omitted -->
     313<fcs:DataView type="application/x-cmdi+xml">
     314  <CMD xmlns="http://www.clarin.eu/cmd/" CMDVersion="1.1">
     315    <!-- content omitted -->
     316  </CMD>
     317</fcs:DataView>
     318}}}
     319
     320 * Example (referenced):
     321{{{#!xml
     322<!-- potential @pid attribute omitted -->
     323<fcs:DataView type="application/x-cmdi+xml" ref="http://repos.example.org/resources/4711/0815.cmdi" />
     324}}}
     325
    299326
    300327===== Images (IMG) =====
     
    303330||=Payload Disposition =|| ''reference'' ||
    304331
    305 The ''Image'' Data View
     332The ''Image'' Data View allows an Endpoint to provide an image that is relevant to the hit, e.g. a facsimile of the source of a transcription.
     333
     334 * Example:
     335{{{#!xml
     336<!-- potential @pid attribute omitted -->
     337<fcs:DataView type="image/png" ref="http://repos.example.org/resources/4711/0815.png" />
     338}}}
     339
     340
    306341===== Geolocation (GEO) =====
    307342||=Description         =|| A geographic location related to the hit ||
    308343||=MIME type           =|| `application/vnd.google-earth.kml+xml` ||
    309344||=Payload Disposition =|| ''inline'' ||
    310 The ''Geolocation'' Data View allows to geolocalize the hit. If `MUST` be encoded using the XML representation of the Keyhole Markup Language (KML)]. The KML fragment `MUST` comply with the [=#REF_KML_Spec KML specification].
    311 
    312 Example:
    313 <kml xmlns="http://www.opengis.net/kml/2.2">
    314   <Placemark>
    315     <name>Simple placemark</name>
    316     <description>Attached to the ground. Intelligently places itself
    317        at the height of the underlying terrain.</description>
    318     <Point>
    319       <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
    320     </Point>
    321   </Placemark>
    322 </kml>
    323 
    324 
    325 *WIP*
    326 The type of each data view is identified by the {{{type}}} attribute of the {{{<fcs:DataView>}}} element. The value if defined to be a [http://en.wikipedia.org/wiki/MIME_Type MIME type].  The following formats are currently being considered:
    327  Keyword-In-Context (KWIC)::
    328    Description: a keyword-in-context view, where each hit should be presented within the context of a complete sentence (if possible) or any other reasonable unit of context (e.g. if sentences cannot be determined by the endpoint). The keyword-in-context data view is '''mandatory''' for all endpoints. The appropriate XML schema can be found at [source:FederatedSearch/Resource-KWIC.xsd Resource-KWIC.xsd] ([source:FederatedSearch/Resource-KWIC.xsd?format=txt download]). \\
    329    Type: {{{application/x-clarin-fcs-kwic+xml}}} \\
    330    Example for a keyword-in-context data view (formatted for brevity):
    331 {{{#!xml
    332 <fcs:DataView type="application/x-clarin-fcs-kwic+xml">
    333   <kwic:kwic xmlns:kwic="http://clarin.eu/fcs/1.0/kwic">
    334     <kwic:c type="left">Some text with the </kwic:c>
    335     <kwic:kw>keyword</kwic:kw>
    336     <kwic:c type="right">highlighted.</kwic:c>
    337   </kwic:kwic>
     345The ''Geolocation'' Data View allows an Endpoint to geolocate a hit. It `MUST` be encoded using the XML representation of the Keyhole Markup Language (KML). The KML fragment `MUST` comply with the [#REF_KML_Spec KML specification].
     346
     347 * Example:
     348{{{#!xml
     349<!-- potential @pid and @ref attributes omitted -->
     350<fcs:DataView type="application/vnd.google-earth.kml+xml">
     351  <kml xmlns="http://www.opengis.net/kml/2.2">
     352    <Placemark>
     353      <name>IDS Mannheim</name>
     354      <description>Institut für Deutsche Sprache, R5 6-13, 68161 Mannheim, Germany</description>
     355      <Point>
     356        <coordinates>8.4719510,49.4883700,0</coordinates>
     357      </Point>
     358    </Placemark>
     359  </kml>
    338360</fcs:DataView>
    339361}}}
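A Client that wants to plot such a Geolocation payload might pull the placemark coordinates out of the KML along these lines. This is a sketch under stated assumptions: the helper `placemark_points` is hypothetical, only `<Point>` placemarks are handled, and the sample omits the `<description>` element for brevity. KML lists coordinates as longitude, latitude and an optional altitude, so the helper swaps them into the more common latitude/longitude order.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}


def placemark_points(data_view):
    """Return [(name, latitude, longitude), ...] for Point placemarks.

    Hypothetical helper; placemarks without a <Point> are skipped.
    """
    points = []
    for placemark in data_view.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.findtext("kml:name", default="", namespaces=KML_NS)
        coords = placemark.findtext("kml:Point/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue
        # KML order is longitude,latitude[,altitude]
        lon, lat, *_ = (float(value) for value in coords.strip().split(","))
        points.append((name, lat, lon))
    return points


sample = """
<fcs:DataView xmlns:fcs="http://clarin.eu/fcs/1.0" type="application/vnd.google-earth.kml+xml">
  <kml xmlns="http://www.opengis.net/kml/2.2">
    <Placemark>
      <name>IDS Mannheim</name>
      <Point>
        <coordinates>8.4719510,49.4883700,0</coordinates>
      </Point>
    </Placemark>
  </kml>
</fcs:DataView>
"""

points = placemark_points(ET.fromstring(sample))
```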