Changes between Version 5 and Version 6 of Taskforces/FCS/FCS-Specification-Draft
- Timestamp: 10/21/15 09:29:10
Taskforces/FCS/FCS-Specification-Draft
     v5    v6
      5     5   [[PageOutline(1-6)]]
      6     6   = CLARIN Federated Content Search (CLARIN-FCS) - Core 2.0
+           7
+           8   = Introduction
+           9   {{{
+          10   #!div style="border: 1px solid #000000; font-size: 75%"
+          11   TODO: Proof-read/Check sub-sections.
+          12   }}}
      7    13   The goal of the ''CLARIN Federated Content Search (CLARIN-FCS) - Core'' specification is to introduce an ''interface specification'' that decouples the ''search engine'' functionality from its ''exploitation'', i.e. user-interfaces, third-party applications, and to allow services to access heterogeneous search engines in a uniform way.
      8    14
-     9         = Introduction
-    10         {{{
-    11         #!div style="border: 1px solid #000000; font-size: 75%"
-    12         All following sub-sections to be updated as required.
-    13         }}}
     14    15   == Terminology
     15    16   The key words `MUST`, `MUST NOT`, `REQUIRED`, `SHALL`, `SHALL NOT`, `SHOULD`, `SHOULD NOT`, `RECOMMENDED`, `MAY`, and `OPTIONAL` in this document are to be interpreted as described in [#REF_RFC_2119 RFC2119].
      …     …
    167   168   The CLARIN-FCS Interface Specification defines a set of capabilities, an extensible result format and a set of required operations. CLARIN-FCS is built on the SRU/CQL standard and additional functionality required for CLARIN-FCS is added through SRU/CQL's extension mechanisms.
    168   169
-   169         Specifically, the CLARIN-FCS Interface Specification consists of two major parts, a set of formats, and a transport protocol. The ''Endpoint'' component is a software component that acts as a bridge between a ''Client'' and a ''Search Engine'' and passes the requests sent by the ''Client'' to the ''Search Engine''. The ''Search Engine'' is a custom software component that allows the search of language resources in a Repository. The ''Endpoint'' implements the ''Transport Protocol'' and acts as a mediator between the CLARIN-FCS specific formats and the idiosyncrasies of ''Search Engines'' of the individual Repositories. The following figure illustrates the overall architecture:
+         170   Specifically, the CLARIN-FCS Interface Specification consists of two parts, a set of formats, and a transport protocol. The ''Endpoint'' component is a software component that acts as a bridge between a ''Client'' and a ''Search Engine'' and passes the requests sent by the ''Client'' to the ''Search Engine''. The ''Search Engine'' is a custom software component that allows the search of language resources in a Repository. The ''Endpoint'' implements the ''Transport Protocol'' and acts as a mediator between the CLARIN-FCS specific formats and the idiosyncrasies of ''Search Engines'' of the individual Repositories. The following figure illustrates the overall architecture:
    170   171   {{{
    171   172   +---------+
      …     …
    198   199   In general, the work flow in CLARIN-FCS is as follows: a Client submits a query to an Endpoint. The Endpoint translates the query from CQL or FCS-QL to the query dialect used by the Search Engine and submits the translated query to the Search Engine. The Search Engine processes the query and generates a result set, i.e. it compiles a set of hits that match the search criterion. The Endpoint then translates the results from the Search Engine-specific result set format to the CLARIN-FCS result format and sends them to the Client.
    199   200
-   200         == "Discovery Phase"
+         201   == Discovery #Discovery
+         202   The ''Discovery'' step allows a Client to gather information about an Endpoint, in particular which capabilities are supported or which resources are available for searching.
+         203
    201   204   === Capabilities
-   202         {{{
-   203         #!div style="border: 1px solid #000000; font-size: 75%"
-   204         Add and describe advanced capability.
-   205         }}}
-   206         === Endpoint Description
+         205   A ''Capability'' defines a certain feature set that is part of CLARIN-FCS, e.g. what kind of queries are supported. Each Endpoint implements some (or all) of these Capabilities. The Endpoint will announce the capabilities it provides to allow a Client to auto-tune itself (see section [#endpointDescription Endpoint Description]). Each Capability is identified by a ''Capability Identifier'', which uses the URI syntax. The following Capabilities are defined in CLARIN-FCS:
+         206   ||= Name =||= Capability Identifier =||= Summary =||
+         207   || ''Basic Search'' || `http://clarin.eu/fcs/capability/basic-search` || Simple full-text searching ||
+         208   || ''Advanced Search'' || `http://clarin.eu/fcs/capability/advanced-search` || Searching in structured and/or annotated data ||
+         209
+         210   Endpoints `MUST` implement the ''Basic Search'' Capability. Endpoints `MUST NOT` invent custom Capability Identifiers and `MUST` only use the values defined above.
+         211
+         212
+         213   === Endpoint Description #endpointDescription
    207   214   {{{
    208   215   #!div style="border: 1px solid #000000; font-size: 75%"
    209   216   Add stuff required for advanced capability.
    210   217   }}}
-   211         == "Search Phase"
-   212         === "FCS-QL"
-   213         {{{
-   214         #!div style="border: 1px solid #000000; font-size: 75%"
-   215         New Section. \\
-   216         More subsections for this section?
-   217         }}}
+         218
+         219   == Searching
+         220   In the ''Searching'' step the Client performs the actual search request to a previously [#Discovery discovered] Endpoint.
+         221
+         222   === Basic Search
+         223   About basic search
+         224
+         225   === Advanced Search
+         226   About advanced search
+         227   ==== Layers
+         228   About available layers
+         229   ==== FCS-QL
+         230   About available layers
+         231
    218   232   === Result Format
    219   233   ==== Resource and !ResourceFragment