Changes between Version 5 and Version 6 of Taskforces/FCS/FCS-Specification-Draft


Timestamp:
10/21/15 09:29:10
Author:
Oliver Schonefeld
Comment:

--

  • Taskforces/FCS/FCS-Specification-Draft

    v5 v6  
    55[[PageOutline(1-6)]]
    66= CLARIN Federated Content Search (CLARIN-FCS) - Core 2.0
     7
     8= Introduction
     9{{{
     10#!div style="border: 1px solid #000000; font-size: 75%"
     11TODO: Proof-read/Check sub-sections.
     12}}}
    713The goal of the ''CLARIN Federated Content Search (CLARIN-FCS) - Core'' specification is to introduce an ''interface specification'' that decouples the ''search engine'' functionality from its ''exploitation'', i.e. user-interfaces, third-party applications, and to allow services to access heterogeneous search engines in a uniform way.
    814
    9 = Introduction
    10 {{{
    11 #!div style="border: 1px solid #000000; font-size: 75%"
    12 All following sub-sections to be updated as required.
    13 }}}
    1415== Terminology
    1516The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [#REF_RFC_2119 RFC2119].
     
    167168The CLARIN-FCS Interface Specification defines a set of capabilities, an extensible result format and a set of required operations. CLARIN-FCS is built on the SRU/CQL standard and additional functionality required for CLARIN-FCS is added through SRU/CQL's extension mechanisms.
    168169
    169 Specifically, the CLARIN-FCS Interface Specification consists of two major parts, a set of formats, and a transport protocol. The ''Endpoint'' component is a software component that acts as a bridge between a ''Client'' and a ''Search Engine'' and passes the requests sent by the ''Client'' to the ''Search Engine''. The ''Search Engine'' is a custom software component that allows the search of language resources in a Repository. The ''Endpoint'' implements the ''Transport Protocol'' and acts as a mediator between the CLARIN-FCS specific formats and the idiosyncrasies of ''Search Engines'' of the individual Repositories. The following figure illustrates the overall architecture:
     170Specifically, the CLARIN-FCS Interface Specification consists of two parts: a set of formats and a transport protocol. The ''Endpoint'' component is a software component that acts as a bridge between a ''Client'' and a ''Search Engine'' and passes the requests sent by the ''Client'' to the ''Search Engine''. The ''Search Engine'' is a custom software component that allows the search of language resources in a Repository. The ''Endpoint'' implements the ''Transport Protocol'' and acts as a mediator between the CLARIN-FCS specific formats and the idiosyncrasies of ''Search Engines'' of the individual Repositories. The following figure illustrates the overall architecture:
    170171{{{
    171172                   +---------+
     
    198199In general, the work flow in CLARIN-FCS is as follows: a Client submits a query to an Endpoint. The Endpoint translates the query from CQL or FCS-QL to the query dialect used by the Search Engine and submits the translated query to the Search Engine. The Search Engine processes the query and generates a result set, i.e. it compiles a set of hits that match the search criterion. The Endpoint then translates the results from the Search Engine-specific result set format to the CLARIN-FCS result format and sends them to the Client.
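
The work flow described above can be illustrated with a minimal sketch. It is not part of the specification: the function `handle_request` and the three callables it takes are hypothetical names chosen for illustration, and the toy in-memory "search engine" merely stands in for a real one. The sketch only shows the order of the steps (translate the query, run it, convert the hits).

```python
from typing import Any, Callable, Iterable, List


def handle_request(
    query: str,
    translate_query: Callable[[str], Any],
    search: Callable[[Any], Iterable[Any]],
    to_fcs_record: Callable[[Any], Any],
) -> List[Any]:
    """Hypothetical Endpoint logic: mediate between a Client and a Search Engine."""
    # Step 1: translate the CQL or FCS-QL query into the engine's own dialect.
    native_query = translate_query(query)
    # Step 2: the Search Engine processes the query and produces its hits.
    hits = search(native_query)
    # Step 3: convert each engine-specific hit into the common result format.
    return [to_fcs_record(hit) for hit in hits]


# Toy stand-ins (assumptions, for illustration only).
DOCS = ["the cat sat", "a dog barked", "cat and dog"]


def toy_translate(query: str) -> str:
    # Pretend the engine wants a bare lower-case term.
    return query.strip('"').lower()


def toy_search(term: str) -> List[str]:
    # Naive substring match over the in-memory documents.
    return [doc for doc in DOCS if term in doc]


def toy_record(hit: str) -> dict:
    # Placeholder for a conversion to the CLARIN-FCS result format.
    return {"hit": hit}
```

Passing the translation and conversion steps in as callables keeps the Endpoint logic independent of any particular Search Engine, which mirrors the mediator role described above.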
    199200
    200 == "Discovery Phase"
     201== Discovery #Discovery
     202The ''Discovery'' step allows a Client to gather information about an Endpoint, in particular which capabilities are supported or which resources are available for searching.
     203
    201204=== Capabilities
    202 {{{
    203 #!div style="border: 1px solid #000000; font-size: 75%"
    204 Add and describe advanced capability.
    205 }}}
    206 === Endpoint Description
     205A ''Capability'' defines a certain feature set that is part of CLARIN-FCS, e.g. what kind of queries are supported. Each Endpoint implements some (or all) of these Capabilities. The Endpoint will announce the Capabilities it provides to allow a Client to auto-tune itself (see section [#endpointDescription Endpoint Description]). Each Capability is identified by a ''Capability Identifier'', which uses the URI syntax. The following Capabilities are defined in CLARIN-FCS:
     206||= Name              =||= Capability Identifier                           =||= Summary                                     =||
     207|| ''Basic Search''    || `http://clarin.eu/fcs/capability/basic-search`    || Simple full-text searching                    ||
     208|| ''Advanced Search'' || `http://clarin.eu/fcs/capability/advanced-search` || Searching in structured and/or annotated data ||
     209
     210Endpoints `MUST` implement the ''Basic Search'' Capability. Endpoints `MUST NOT` invent custom Capability Identifiers and `MUST` only use the values defined above.
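
As an illustration of the two rules above, a Client or test tool could check the set of Capability Identifiers announced by an Endpoint roughly as follows. This is a sketch, not a normative procedure: the function name `check_capabilities` and the shape of its return value are assumptions; only the two identifiers are taken from the table above.

```python
from typing import Iterable, List

# Capability Identifiers as listed in the table above.
BASIC_SEARCH = "http://clarin.eu/fcs/capability/basic-search"
ADVANCED_SEARCH = "http://clarin.eu/fcs/capability/advanced-search"
DEFINED_CAPABILITIES = frozenset({BASIC_SEARCH, ADVANCED_SEARCH})


def check_capabilities(announced: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means both rules are met.

    Hypothetical helper, not defined by the specification.
    """
    announced_set = set(announced)
    problems = []
    # Rule 1: Endpoints MUST implement the Basic Search Capability.
    if BASIC_SEARCH not in announced_set:
        problems.append("missing mandatory capability: " + BASIC_SEARCH)
    # Rule 2: Endpoints MUST NOT invent custom Capability Identifiers.
    for uri in sorted(announced_set - DEFINED_CAPABILITIES):
        problems.append("unknown capability identifier: " + uri)
    return problems
```

Returning a list of problems rather than raising on the first one lets a validator report every violation at once.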
     211
     212
     213=== Endpoint Description #endpointDescription
    207214{{{
    208215#!div style="border: 1px solid #000000; font-size: 75%"
    209216Add stuff required for advanced capability.
    210217}}}
    211 == "Search Phase"
    212 === "FCS-QL"
    213 {{{
    214 #!div style="border: 1px solid #000000; font-size: 75%"
    215 New Section. \\
    216 More subsections for this section?
    217 }}}
     218
     219== Searching
     220In the ''Searching'' step, the Client sends the actual search request to a previously [#Discovery discovered] Endpoint.
     221
     222=== Basic Search
     223About basic search
     224
     225=== Advanced Search
     226About advanced search
     227==== Layers
     228About available layers
     229==== FCS-QL
     230About available layers
     231
    218232=== Result Format
    219233==== Resource and !ResourceFragment