Changes between Version 8 and Version 9 of Taskforces/FCS/FCS-Specification-Draft


Timestamp:
10/21/15 13:11:30 (9 years ago)
Author:
Oliver Schonefeld
Comment:

--

  • Taskforces/FCS/FCS-Specification-Draft

    v8 v9  
    211211=== Capabilities
    212212A ''Capability'' defines a certain feature set that is part of CLARIN-FCS, e.g. what kind of queries are supported. Each Endpoint implements some (or all) of these Capabilities. The Endpoint will announce the capabilities it provides to allow a Client to auto-tune itself (see section [#endpointDescription Endpoint Description]). Each Capability is identified by a ''Capability Identifier'', which uses the URI syntax. The following Capabilities are defined in CLARIN-FCS:
    213 ||= Name              =||= Capability Identifier                           =||= Summary                                     =||
     213||=Name               =||=Capability Identifier                            =||=Summary                                      =||
    214214|| ''Basic Search''    || `http://clarin.eu/fcs/capability/basic-search`    || Simple full-text searching                    ||
    215215|| ''Advanced Search'' || `http://clarin.eu/fcs/capability/advanced-search` || Searching in structured and/or annotated data ||
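
A minimal sketch of how a Client might auto-tune itself from the Capability Identifiers an Endpoint announces. The two URIs are taken from the table above; the function names and the idea of receiving the identifiers as a plain list of strings are assumptions made for illustration, not part of the specification.

```python
# Capability Identifiers as listed in the Capabilities table.
BASIC_SEARCH = "http://clarin.eu/fcs/capability/basic-search"
ADVANCED_SEARCH = "http://clarin.eu/fcs/capability/advanced-search"

# Known identifiers mapped to their names, in table order.
KNOWN_CAPABILITIES = {
    BASIC_SEARCH: "Basic Search",
    ADVANCED_SEARCH: "Advanced Search",
}


def supported_capabilities(announced):
    """Return the names of known capabilities found in `announced`.

    `announced` is a hypothetical iterable of identifier strings, e.g. as
    parsed from an Endpoint Description. Unknown identifiers are ignored.
    """
    announced = set(announced)
    return [name for uri, name in KNOWN_CAPABILITIES.items() if uri in announced]


def can_use_advanced_search(announced):
    """True if the Endpoint announced the Advanced Search capability."""
    return ADVANCED_SEARCH in set(announced)
```

A Client could call `can_use_advanced_search` before offering layer-based queries and fall back to simple full-text search otherwise.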
     
    247247
    248248=== Advanced Search
    249 About advanced search
     249The ''Advanced Search'' capability allows searching in annotated data. Queries can span several annotation layers, e.g. the token and part-of-speech layers. CLARIN-FCS defines a set of searchable annotation layers with specified semantics and syntax. Endpoints `SHOULD` support as many annotation layers as possible, depending on the resource type.
     250
    250251==== Layers
    251 ||= Identifier =||= Annotation Tier Description                                           =||= Syntax =||= Examples (without quotes) =||
    252 || `token`      || Appropriate tokenisation of resource, i.e. words                        || String       || "Dog", "cat", "walked" ||
    253 || `lemma`      || Lemmatisation of tokens                                                 || String   || "good", "walking", "dog" ||
    254 || `pos`        || Part-of-Speech annotations                                              || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ" ||
    255 || `orth`       || Orthographic transcription of (mostly) spoken resources                 || String || "dug", "cat", "wolking" ||
    256 || `norm`       || Orthographic normalization of (mostly) spoken resources                 || String || "dog", "cat", "walking" ||
    257 || `phonetic`   || Phonetic transcription                              || [#REF_SAMPA Speech Assessment Methods Phonetic Alphabet (SAM-PA)] || "'du:", "'vi:-d6 'ha:-b@n" ||
    258 || `ne`         || Named entities  || String || "Utrecht", "Poland", "Felix the Cat" ||
    259 || `text`       || Annotation tier that is used in [#basicSearch Basic Search]             || String || "Dog", "cat", "walked"               ||
     252||=Identifier =||=Annotation Tier Description                                =||=Syntax                          =||=Examples (without quotes)           =||
     253|| `token`     || Appropriate tokenisation of resource, i.e. words            || ''String''                       || "Dog", "cat", "walked"               ||
     254|| `lemma`     || Lemmatisation of tokens                                     || ''String''                       || "good", "walking", "dog"             ||
     255|| `pos`       || Part-of-Speech annotations                                  || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ"                ||
     256|| `orth`      || Orthographic transcription of (mostly) spoken resources     || ''String''                       || "dug", "cat", "wolking"              ||
     257|| `norm`      || Orthographic normalization of (mostly) spoken resources     || ''String''                       || "dog", "cat", "walking"              ||
     258|| `phonetic`  || Phonetic transcription                                      || [#REF_SAMPA SAMPA]               || "'du:", "'vi:-d6 'ha:-b@n"           ||
     259|| `names`     || Named entities                                              || ''String''                       || "Utrecht", "Poland", "Felix the Cat" ||
     260|| `text`      || Annotation tier that is used in [#basicSearch Basic Search] || ''String''                       || "Dog", "cat", "walked"               ||
     261
     262The ''Syntax'' column describes the inventory of symbols that a Client `MUST` use with the corresponding annotation layer; the value ''String'' denotes arbitrary Unicode strings, i.e. no fixed inventory of symbols is defined.
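
The relationship between a layer and its symbol inventory can be sketched as a small lookup. This is an illustration under stated assumptions, not part of the specification: the layer identifiers follow the v9 table above, the part-of-speech inventory is assumed to be the 17 tags of Universal Dependencies v2 (the tag set behind [#REF_UD_POS Universal POS tags] may differ by version), SAMPA strings are not checked at all, and `is_valid_symbol` is a hypothetical helper name.

```python
# Assumption: the 17 Universal POS tags of Universal Dependencies v2.
UNIVERSAL_POS_TAGS = frozenset({
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
})

# Layer identifier -> fixed symbol inventory, or None when any Unicode
# string is accepted ("String" in the Syntax column).
LAYER_INVENTORY = {
    "token": None,
    "lemma": None,
    "pos": UNIVERSAL_POS_TAGS,
    "orth": None,
    "norm": None,
    "phonetic": None,  # SAMPA in the table; not validated in this sketch
    "names": None,
    "text": None,
}


def is_valid_symbol(layer, value):
    """Check `value` against the inventory of `layer`.

    Raises KeyError for a layer identifier that is not in the table.
    """
    if layer not in LAYER_INVENTORY:
        raise KeyError(f"unknown layer: {layer}")
    inventory = LAYER_INVENTORY[layer]
    return inventory is None or value in inventory
```

For example, `is_valid_symbol("pos", "NOUN")` holds while `is_valid_symbol("pos", "noun")` does not, because the check is case-sensitive; any string passes for a ''String'' layer such as `token`.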
    260263
    261264==== FCS-QL