Changes between Version 22 and Version 23 of Taskforces/FCS/FCS-Specification-Draft


Ignore:
Timestamp:
10/30/15 11:37:19 (9 years ago)
Author:
Oliver Schonefeld
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • Taskforces/FCS/FCS-Specification-Draft

    v22 v23  
    11{{{
    22#!div class="system-message"
    3 '''NOTE''': This page is work-in-progress. Final draft is scheduled to be delivered by 2015-10-31.
     3'''NOTE''': This page is work-in-progress. Final draft is scheduled to be delivered by 2015-10-31. (We will miss the deadline ...)
    44}}}
    55[[PageOutline(1-6)]]
     
    227227Add stuff required for advanced capability.
    228228}}}
    229 Endpoints need to provide information about their capabilities to support auto-configuration of Clients. The ''Endpoint Description'' mechanism provides the necessary facility to provide this information to the Clients. Endpoints `MUST` encode their capabilities using an XML format and embed this information into the SRU/CQL protocol as described in section [#explain Operation ''explain'']. The XML fragment generated by the Endpoint for the Endpoint Description `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/Endpoint-Description.xsd Endpoint-Description.xsd]" ([source:FederatedSearch/schema/Endpoint-Description.xsd?format=txt download]).
     229Endpoints need to provide information about their capabilities to support auto-configuration of Clients. The ''Endpoint Description'' mechanism provides the necessary facility to provide this information to the Clients. Endpoints `MUST` encode their capabilities using an XML format and embed this information into the SRU/CQL protocol as described in section [#explain Operation ''explain'']. The XML fragment generated by the Endpoint for the Endpoint Description `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/Core_2/Endpoint-Description.xsd Endpoint-Description.xsd]" ([source:FederatedSearch/schema/Core_2/Endpoint-Description.xsd?format=txt download]).
    230230
    231231The XML fragment for ''Endpoint Description'' is encoded as an `<ed:EndpointDescription>` element, that contains the following attributes and children:
     
    388388</ed:EndpointDescription>
    389389}}}
     390{{{
     391#!div style="border: 1px solid #000000; font-size: 75%"
     392TODO: describe the above example
     393}}}
    390394
    391395== Searching
     
    451455If the Endpoint can provide both, a persistent identifier as well as a URI, for either Resource or Resource Fragment, they `SHOULD` provide both. When working with results, Clients `SHOULD` prefer persistent identifiers over regular URIs.
    452456
    453 Resource and Resource Fragment are serialized in XML and Endpoints `MUST` generate responses that are valid according to the XML schema "[source:FederatedSearch/schema/Resource.xsd Resource.xsd]" ([source:FederatedSearch/schema/Resource.xsd?format=txt download]). A Resource is encoded in the form of a `<fcs:Resource>` element, a ''Resource Fragment'' in the form of a `<fcs:ResourceFragment>` element. The content of a Data View is wrapped in a `<fcs:DataView>` element. `<fcs:Resource>` is the top-level element and `MAY` contain zero or more `<fcs:DataView>` elements and `MAY` contain zero or more `<fcs:ResourceFragment>` elements. A `<fcs:ResourceFragment>` element `MUST` contain one or more `<fcs:DataView>` elements.
     457Resource and Resource Fragment are serialized in XML and Endpoints `MUST` generate responses that are valid according to the XML schema "[source:FederatedSearch/schema/Core_2/Resource.xsd Resource.xsd]" ([source:FederatedSearch/schema/Core_2/Resource.xsd?format=txt download]). A Resource is encoded in the form of a `<fcs:Resource>` element, a ''Resource Fragment'' in the form of a `<fcs:ResourceFragment>` element. The content of a Data View is wrapped in a `<fcs:DataView>` element. `<fcs:Resource>` is the top-level element and `MAY` contain zero or more `<fcs:DataView>` elements and `MAY` contain zero or more `<fcs:ResourceFragment>` elements. A `<fcs:ResourceFragment>` element `MUST` contain one or more `<fcs:DataView>` elements.
    454458
    455459The elements `<fcs:Resource>`, `<fcs:ResourceFragment>` and `<fcs:DataView>` `MAY` carry a `@pid` and/or a `@ref` attribute, which allows linking to the original data represented by the Resource, Resource Fragment, or Data View. A `@pid` attribute `MUST` contain a valid persistent identifier, a `@ref` `MUST` contain valid URI, i.e. a "plain" URI without the additional semantics of being a persistent reference. If the Endpoint cannot provide a `@pid` attribute for a `<fcs:Resource>`, they `SHOULD` provide a `@ref` attribute. Endpoint `SHOULD` add either a `@pid` or `@ref` attribute to either the `<fcs:Resource>` or the `<fcs:ResourceFragment>` element, if possible to both elements. Endpoints are `RECOMMENDED` to give `@pid` attributes, if they can provide them.
     
    520524||=Payload Delivery             =|| ''send-by-default'' (`REQUIRED`) ||
    521525||=Recommended Short Identifier =|| `hits` (`RECOMMENDED`) ||
    522 The ''Generic Hits'' Data View serves as the ''most basic'' agreement in CLARIN-FCS for serialization of search results and `MUST` be implemented by all Endpoints. In many cases, this Data View can only serve as an (lossy) approximation, because resources at Endpoints are very heterogeneous. For instance, the Generic Hits Data View is probably not the best representation for a hit result in a corpus of spoken language, but an architecture like CLARIN-FCS requires one common representation to be implemented by all Endpoints, therefore this Data View was defined. The Generic Hits Data View supports multiple markers for supplying highlighting for an individual hit, e.g. if a query contains a (boolean) conjunction, the Endpoint can use multiple markers to provide individual highlights for the matching terms. An Endpoint `MUST NOT` use this Data View to aggregate several hits within one resource. Each hit `SHOULD` be presented within the context of a complete sentence. If that is not possible due to the nature of the type of the resource, the Endpoint `MUST` provide an equivalent reasonable unit of context (e.g. within a phrase of an orthographic transcription of an utterance). The `<hits:Hit>` element within the `<hits:Result>` element is not enforced by the XML schema, but Endpoints are `RECOMMENDED` to use it. The XML fragment of the Generic Hits payload `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/DataView-Hits.xsd DataView-Hits.xsd]" ([source:FederatedSearch/schema/DataView-Hits.xsd?format=txt download]).
     526||=XML Schema                   =|| [source:FederatedSearch/schema/Core_2/DataView-Hits.xsd DataView-Hits.xsd] ([source:FederatedSearch/schema/Core_2/DataView-Hits.xsd?format=txt download]) ||
     527The ''Generic Hits'' Data View serves as the ''most basic'' agreement in CLARIN-FCS for serialization of search results and `MUST` be implemented by all Endpoints. In many cases, this Data View can only serve as an (lossy) approximation, because resources at Endpoints are very heterogeneous. For instance, the Generic Hits Data View is probably not the best representation for a hit result in a corpus of spoken language, but an architecture like CLARIN-FCS requires one common representation to be implemented by all Endpoints, therefore this Data View was defined. The Generic Hits Data View supports multiple markers for supplying highlighting for an individual hit, e.g. if a query contains a (boolean) conjunction, the Endpoint can use multiple markers to provide individual highlights for the matching terms. An Endpoint `MUST NOT` use this Data View to aggregate several hits within one resource. Each hit `SHOULD` be presented within the context of a complete sentence. If that is not possible due to the nature of the type of the resource, the Endpoint `MUST` provide an equivalent reasonable unit of context (e.g. within a phrase of an orthographic transcription of an utterance). The `<hits:Hit>` element within the `<hits:Result>` element is not enforced by the XML schema, but Endpoints are `RECOMMENDED` to use it. The XML fragment of the Generic Hits payload `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/Core_2/DataView-Hits.xsd DataView-Hits.xsd]" ([source:FederatedSearch/schema/Core_2/DataView-Hits.xsd?format=txt download]).
    523528 * Example (single hit marker):
    524529{{{#!xml
     
    546551||=Payload Delivery             =|| ''send-by-default'' (`REQUIRED`) ||
    547552||=Recommended Short Identifier =|| `adv` (`RECOMMENDED`) ||
     553||=XML Schema                   =|| [source:FederatedSearch/schema/Core_2/DataView-Advanced.xsd DataView-Advanced.xsd] ([source:FederatedSearch/schema/Core_2/DataView-Advanced.xsd?format=txt download]) ||
    548554
    549555{{{