Changes between Version 61 and Version 62 of Taskforces/FCS/FCS-Specification-Draft


Timestamp: 06/13/17 12:27:49
Author: Leif-Jöran
Comment: Adding operation=explain 2.0 example

  • Taskforces/FCS/FCS-Specification-Draft

    v61 v62  
    582582||=XML Schema                   =|| [source:FederatedSearch/schema/Core_2/DataView-Advanced.xsd DataView-Advanced.xsd] ([source:FederatedSearch/schema/Core_2/DataView-Advanced.xsd?format=txt download]) ||
    583583
    584 The ''Advanced (ADV)'' Data View serves as the natual serialization of search results for ''Advanced Search'' queries. The ADV Data View supports structured information in one or more annotation layers. The annotations are streams (ranges) over the signal in a stand-off like format with start and end offsets. The list of `Segment` elements building a stream can be of type `item` for character-based streams or `timestamp` for audio streams (granularity up to 0.001s). The Endpoint is responsible for choosing the proper offsets for the segments. The segments `MUST` be possible to align over all annotation layers. For character streams the recommendation is Unicode Normalization Form ''KC''. Segments `MAY` also have an endpoint specific reference indicated by an URI that could be shown in the Aggregator,e.g. to open an audio player or other viewer with contents from the Search Engine. The list of `Layer` elements contains `Span` elements making references to the segments. A Span inherits the start and end offsets from its segments and contains the actual annotation as its content. It `MAY` also carry information about the original annotation value in an `@alt-value` attribute. The document order of the `Layer` elements define the view order in the Aggregator. Each Layer has a ''Layer Type Identifier'' and a ''Layer Identifier''.  The Endpoint `SHOULD` at least resturn all layers that were referenced in the query. It `MAY` return more layers. The attribute `@highlight` is used to mark Spans as hits. Multiple hit markers are supported and the Aggregator `MAY` display them visually distinct. It is up to the Endpoint to decide what should be marked as a hit, but the recommendation is to mark everything referenced in the query.
     584The ''Advanced (ADV)'' Data View serves as the natural serialization of search results for ''Advanced Search'' queries. The ADV Data View supports structured information in one or more annotation layers. The annotations are streams (ranges) over the signal in a stand-off-like format with start and end offsets. The list of `Segment` elements building a stream can be of type `item` for character-based streams or `timestamp` for audio streams (granularity up to 0.001s). The Endpoint is responsible for choosing the proper offsets for the segments. It `MUST` be possible to align the segments across all annotation layers. For character streams the recommendation is Unicode Normalization Form ''KC''. Segments `MAY` also have an endpoint-specific reference, indicated by a URI, that could be shown in the Aggregator, e.g. to open an audio player or other viewer with contents from the Search Engine. The list of `Layer` elements contains `Span` elements making references to the segments. A `Span` inherits the start and end offsets from its segments and contains the actual annotation as its content. It `MAY` also carry information about the original annotation value in an `@alt-value` attribute. The document order of the `Layer` elements defines the view order in the Aggregator. Each `Layer` has a ''Layer type identifier'' and a ''Layer identifier''. The Endpoint `SHOULD` return at least all layers that were referenced in the Advanced Search query. It `MAY` return more layers. The attribute `@highlight` is used to mark Spans as hits. Multiple hit markers are supported and the Aggregator `MAY` display them as visually distinct. It is up to the Endpoint to decide what should be marked as a hit, but the recommendation is to mark everything referenced in the Advanced Search query.
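
The stand-off model described above (Spans inherit offsets from the Segments they reference, so layers stay aligned) could be sketched as follows. This is only an illustration: the `Segment` and `Span` classes, the `resolve_layer` helper, and the offsets and annotation values are hypothetical and not taken from the schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """One range over the signal (hypothetical in-memory model)."""
    id: str
    start: int
    end: int


@dataclass(frozen=True)
class Span:
    """One annotation that points at a Segment by its id."""
    ref: str
    value: str
    highlight: Optional[str] = None  # hit marker, e.g. "h1" (illustrative)


def resolve_layer(
    segments: List[Segment], spans: List[Span]
) -> List[Tuple[int, int, str, Optional[str]]]:
    """Return (start, end, value, highlight) for each span, in span order."""
    by_id: Dict[str, Segment] = {seg.id: seg for seg in segments}
    resolved = []
    for span in spans:
        seg = by_id[span.ref]  # a KeyError here means a dangling reference
        resolved.append((seg.start, seg.end, span.value, span.highlight))
    return resolved


# Made-up sample data: two segments shared by a word layer and a pos layer.
segments = [Segment("s1", 1, 3), Segment("s2", 5, 9)]
word_layer = [Span("s1", "The"), Span("s2", "house", highlight="h1")]
pos_layer = [Span("s1", "DET"), Span("s2", "NOUN", highlight="h1")]

words = resolve_layer(segments, word_layer)
tags = resolve_layer(segments, pos_layer)

# Both layers reference the same segments, so their offsets line up pairwise.
aligned = [(w[2], t[2]) for w, t in zip(words, tags) if w[:2] == t[:2]]
print(aligned)
```

Because the offsets live only on the segments, adding another layer never requires recomputing positions; each new Span just references an existing segment.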
    585585
    586586{{{#!comment
     
    699699=== Versioning and Extensions
    700700==== Backwards Compatibility #backwardsCompatibility
    701 {{{
    702 #!div style="border: 1px solid #000000; font-size: 75%"
    703 TODO: check and proof-read
    704 }}}
    705 
    706 Clients `MUST` be compatible to CLARIN-FCS 1.0, thus must implement SRU 1.2. If a Client uses CLARIN-FCS 1.0 to talk to an Endpoint, it `MUST NOT` use features beyond the Basic Search capability. Clients `MUST` implement a heuristic to automatically determine which CLARIN-FCS protocol version, i.e. which version of the SRU protocol, can be used talk an Endpoint.
     701Clients `MUST` be compatible with CLARIN-FCS 1.0 and thus `MUST` implement SRU 1.2. If a Client uses CLARIN-FCS 1.0 to talk to an Endpoint, it `MUST NOT` use features beyond the Basic Search capability. Clients `MUST` implement a heuristic to automatically determine which CLARIN-FCS protocol version, i.e. which version of the SRU protocol, can be used to talk to an Endpoint.
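
One possible shape for such a heuristic, given purely as a sketch, is to request an explain response and inspect the XML namespace of its root element. The text above does not prescribe this approach. The `detect_fcs_version` function and the injected `fetch_explain` callable are hypothetical names, and the mapping assumes that an SRU 2.0 response implies CLARIN-FCS 2.0 while an SRU 1.2 response implies CLARIN-FCS 1.0.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Assumed mapping from SRU response namespace to CLARIN-FCS version.
SRU_NAMESPACE_TO_FCS = {
    "http://docs.oasis-open.org/ns/search-ws/sruResponse": "2.0",  # SRU 2.0
    "http://www.loc.gov/zing/srw/": "1.0",                         # SRU 1.2
}


def detect_fcs_version(fetch_explain: Callable[[], str]) -> str:
    """Guess the CLARIN-FCS version from an explain response body.

    `fetch_explain` is a hypothetical callable that performs the HTTP
    request and returns the response body as text.
    """
    root = ET.fromstring(fetch_explain())
    # ElementTree encodes namespaced tags as "{namespace}localname".
    namespace = root.tag.partition("}")[0].lstrip("{")
    try:
        return SRU_NAMESPACE_TO_FCS[namespace]
    except KeyError:
        raise ValueError("unrecognized explain response namespace: %r" % namespace)


# Minimal stand-ins for real explain responses.
SRU20_SAMPLE = (
    '<sruResponse:explainResponse '
    'xmlns:sruResponse="http://docs.oasis-open.org/ns/search-ws/sruResponse">'
    '<sruResponse:version>2.0</sruResponse:version>'
    '</sruResponse:explainResponse>'
)
SRU12_SAMPLE = (
    '<srw:explainResponse xmlns:srw="http://www.loc.gov/zing/srw/">'
    '<srw:version>1.2</srw:version>'
    '</srw:explainResponse>'
)

print(detect_fcs_version(lambda: SRU20_SAMPLE))
print(detect_fcs_version(lambda: SRU12_SAMPLE))
```

Injecting the fetch step keeps the sketch free of network code. A Client that detects version 1.0 this way would then restrict itself to the Basic Search capability.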
    707702
    708703Clients `MUST` be able to process the legacy XML namespaces:
     
    846841}}}
    847842
     843{{{#!xml
     844<sruResponse:explainResponse>
     845  <sruResponse:version>2.0</sruResponse:version>
     846  <sruResponse:record>
     847    <sruResponse:recordSchema>http://explain.z3950.org/dtd/2.0/</sruResponse:recordSchema>
     848    <sruResponse:recordXMLEscaping>xml</sruResponse:recordXMLEscaping>
     849    <sruResponse:recordData>
     850      <zr:explain>
     851        <zr:serverInfo protocol="SRU" version="2.0" transport="http">
     852          <zr:host>127.0.0.1</zr:host>
     853          <zr:port>8080</zr:port>
     854          <zr:database>korp-endpoint</zr:database>
     855        </zr:serverInfo>
     856        <zr:databaseInfo>
     857          <zr:title lang="sv">Språkbankens korpusar</zr:title>
     858          <zr:title lang="en" primary="true">The Språkbanken corpora</zr:title>
     859          <zr:description lang="sv">Sök i Språkbankens korpusar.</zr:description>
     860          <zr:description lang="en" primary="true">Search in the Språkbanken corpora.</zr:description>
     861          <zr:author lang="en">Språkbanken (The Swedish Language Bank)</zr:author>
     862          <zr:author lang="sv" primary="true">Språkbanken</zr:author>
     863        </zr:databaseInfo>
     864        <zr:indexInfo>
     865          <zr:set identifier="http://clarin.eu/fcs/resource" name="fcs">
     866            <zr:title lang="sv">Clarins innehållssökning</zr:title>
     867            <zr:title lang="en" primary="true">CLARIN Content Search</zr:title>
     868          </zr:set>
     869          <zr:index search="true" scan="false" sort="false">
     870            <zr:title lang="en" primary="true">Words</zr:title>
     871            <zr:map primary="true">
     872              <zr:name set="fcs">words</zr:name>
     873            </zr:map>
     874          </zr:index>
     875        </zr:indexInfo>
     876        <zr:schemaInfo>
     877          <zr:schema identifier="http://clarin.eu/fcs/resource" name="fcs">
     878            <zr:title lang="en" primary="true">CLARIN Content Search</zr:title>
     879          </zr:schema>
     880        </zr:schemaInfo>
     881        <zr:configInfo>
     882          <zr:default type="numberOfRecords">250</zr:default>
     883          <zr:setting type="maximumRecords">1000</zr:setting>
     884        </zr:configInfo>
     885      </zr:explain>
     886    </sruResponse:recordData>
     887  </sruResponse:record>
     888  <sruResponse:echoedExplainRequest>
     889    <sruResponse:version>2.0</sruResponse:version>
     890  </sruResponse:echoedExplainRequest>
     891  <sruResponse:extraResponseData>
     892    <ed:EndpointDescription version="2">
     893      <ed:Capabilities>
     894        <ed:Capability>http://clarin.eu/fcs/capability/basic-search</ed:Capability>
     895        <ed:Capability>http://clarin.eu/fcs/capability/advanced-search</ed:Capability>
     896      </ed:Capabilities>
     897      <ed:SupportedDataViews>
     898        <ed:SupportedDataView id="hits" delivery-policy="send-by-default">application/x-clarin-fcs-hits+xml</ed:SupportedDataView>
     899        <ed:SupportedDataView id="adv" delivery-policy="send-by-default">application/x-clarin-fcs-adv+xml</ed:SupportedDataView>
     900        <ed:SupportedDataView id="cmdi" delivery-policy="need-to-request">application/x-cmdi+xml</ed:SupportedDataView>
     901      </ed:SupportedDataViews>
     902      <ed:SupportedLayers>
     903        <ed:SupportedLayer id="word" result-id="http://spraakbanken.gu.se/ns/fcs/layer/word">text</ed:SupportedLayer>
     904        <ed:SupportedLayer id="lemma" result-id="http://spraakbanken.gu.se/ns/fcs/layer/lemma">lemma</ed:SupportedLayer>
     905        <ed:SupportedLayer id="pos" result-id="http://spraakbanken.gu.se/ns/fcs/layer/pos">pos</ed:SupportedLayer>
     906      </ed:SupportedLayers>
     907      <ed:Resources>
     908        <ed:Resource pid="hdl:10794/suc">
     909          <ed:Title xml:lang="sv">SUC-korpusen</ed:Title>
     910          <ed:Title xml:lang="en">The SUC corpus</ed:Title>
     911          <ed:Description xml:lang="sv">Stockholm-Umeå-korpusen hos Språkbanken.</ed:Description>
     912          <ed:Description xml:lang="en">The Stockholm-Umeå corpus at Språkbanken.</ed:Description>
     913          <ed:LandingPageURI>https://spraakbanken.gu.se/resurser/suc</ed:LandingPageURI>
     914          <ed:Languages>
     915            <ed:Language>swe</ed:Language>
     916          </ed:Languages>
     917          <ed:AvailableDataViews ref="hits"/>
     918          <ed:AvailableLayers ref="word"/>
     919        </ed:Resource>
     920      </ed:Resources>
     921    </ed:EndpointDescription>
     922  </sruResponse:extraResponseData>
     923</sruResponse:explainResponse>
     924}}}
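
A Client might read the Endpoint Description from such an explain response roughly as sketched below. The `summarize_endpoint` helper is hypothetical, and the namespace URI bound to the `ed` prefix is an assumption, because the example above omits the namespace declarations. The sample input is a trimmed copy of the `ed:EndpointDescription` element with that declaration added.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the "ed" prefix; not declared in the example above.
ED_URI = "http://clarin.eu/fcs/endpoint-description"
ADVANCED_SEARCH = "http://clarin.eu/fcs/capability/advanced-search"


def summarize_endpoint(xml_text: str) -> dict:
    """Collect capabilities, data views and layers from an Endpoint Description."""
    root = ET.fromstring(xml_text)

    def q(name: str) -> str:
        # Build the "{namespace}localname" form that ElementTree expects.
        return "{%s}%s" % (ED_URI, name)

    capabilities = [el.text.strip() for el in root.iter(q("Capability")) if el.text]
    data_views = {
        el.get("id"): el.get("delivery-policy")
        for el in root.iter(q("SupportedDataView"))
    }
    layers = {
        el.get("id"): el.text.strip()
        for el in root.iter(q("SupportedLayer"))
        if el.text
    }
    return {
        "capabilities": capabilities,
        "data_views": data_views,
        "layers": layers,
        "advanced_search": ADVANCED_SEARCH in capabilities,
    }


SAMPLE = """
<ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/endpoint-description" version="2">
  <ed:Capabilities>
    <ed:Capability>http://clarin.eu/fcs/capability/basic-search</ed:Capability>
    <ed:Capability>http://clarin.eu/fcs/capability/advanced-search</ed:Capability>
  </ed:Capabilities>
  <ed:SupportedDataViews>
    <ed:SupportedDataView id="hits" delivery-policy="send-by-default">application/x-clarin-fcs-hits+xml</ed:SupportedDataView>
    <ed:SupportedDataView id="cmdi" delivery-policy="need-to-request">application/x-cmdi+xml</ed:SupportedDataView>
  </ed:SupportedDataViews>
  <ed:SupportedLayers>
    <ed:SupportedLayer id="word" result-id="http://spraakbanken.gu.se/ns/fcs/layer/word">text</ed:SupportedLayer>
    <ed:SupportedLayer id="pos" result-id="http://spraakbanken.gu.se/ns/fcs/layer/pos">pos</ed:SupportedLayer>
  </ed:SupportedLayers>
</ed:EndpointDescription>
"""

summary = summarize_endpoint(SAMPLE)
print(summary["advanced_search"], sorted(summary["layers"]))
```

Using `iter()` with fully qualified tag names lets the same helper accept either the bare `ed:EndpointDescription` element or a complete explain response that wraps it.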
     925
    848926== Operation ''scan'' #scan
    849927The ''scan'' operation of the SRU protocol is currently not used in the ''Basic Search'' or ''Advanced Search'' capability of CLARIN-FCS. Future capabilities may use this operation, therefore it is `NOT RECOMMENDED` for Endpoints to define custom extensions that use this operation.