FCS-Specification-Draft

Timestamp:: 10/21/15 13:11:30 (9 years ago)
Author:: Oliver Schonefeld
Comment:: --

Taskforces/FCS/FCS-Specification-Draft

-                      v8
+                      v9
 === Capabilities
 A ''Capability'' defines a certain feature set that is part of CLARIN-FCS, e.g. what kind of queries are supported. Each Endpoint implements some (or all) of these Capabilities. The Endpoint will announce the Capabilities it provides to allow a Client to auto-tune itself (see section [#endpointDescription Endpoint Description]). Each Capability is identified by a ''Capability Identifier'', which uses the URI syntax. The following Capabilities are defined in CLARIN-FCS:
 ||= Name              =||= Capability Identifier                           =||= Summary                                     =||
+||=Name               =||=Capability Identifier                            =||=Summary                                      =||
 || ''Basic Search''    || `http://clarin.eu/fcs/capability/basic-search`    || Simple full-text searching                    ||
 || ''Advanced Search'' || `http://clarin.eu/fcs/capability/advanced-search` || Searching in structured and/or annotated data ||
 …
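
As an illustration of the auto-tuning described above, the following is a minimal sketch of how a Client might pick a query mode from the Capability Identifiers an Endpoint announces. The two identifiers are taken from the table; the function name `select_query_mode` and the returned mode labels are hypothetical, and parsing of the Endpoint Description itself is not shown.

```python
# Capability Identifiers as listed in the table above.
BASIC_SEARCH = "http://clarin.eu/fcs/capability/basic-search"
ADVANCED_SEARCH = "http://clarin.eu/fcs/capability/advanced-search"


def select_query_mode(announced):
    """Hypothetical helper: choose the richest mode an Endpoint announces.

    `announced` is any iterable of Capability Identifier strings, assumed
    to have been extracted from the Endpoint Description beforehand.
    """
    announced = set(announced)
    if ADVANCED_SEARCH in announced:
        return "advanced"
    if BASIC_SEARCH in announced:
        return "basic"
    raise ValueError("Endpoint announces no Capability known to this sketch")


if __name__ == "__main__":
    print(select_query_mode([BASIC_SEARCH, ADVANCED_SEARCH]))
```

Unknown identifiers are ignored rather than rejected, so this sketch keeps working if an Endpoint announces Capabilities that are not covered here.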
 === Advanced Search
+About advanced search
+The ''Advanced Search'' capability allows searching in annotated data. Queries can span several annotation layers, e.g. the token and part-of-speech layers. CLARIN-FCS defines a set of searchable annotation layers with certain semantics and syntax. Endpoints `SHOULD` support as many different annotation layers as possible, depending on the resource type.
 ==== Layers
+||= Identifier =||= Annotation Tier Description                                           =||= Syntax =||= Examples (without quotes) =||
+|| `token`      || Appropriate tokenisation of resource, i.e. words                        || String       || "Dog", "cat", "walked" ||
+|| `lemma`      || Lemmatisation of tokens                                                 || String   || "good", "walking", "dog" ||
+|| `pos`        || Part-of-Speech annotations                                              || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ" ||
+|| `orth`       || Orthographic transcription of (mostly) spoken resources                 || String || "dug", "cat", "wolking" ||
+|| `norm`       || Orthographic normalization of (mostly) spoken resources                 || String || "dog", "cat", "walking" ||
+|| `phonetic`   || Phonetic transcription                              || [#REF_SAMPA Speech Assessment Methods Phonetic Alphabet (SAM-PA)] || "'du:", "'vi:-d6 'ha:-b@n" ||
+|| `ne`         || Named entities  || String || "Utrecht", "Poland", "Felix the Cat" ||
+|| `text`       || Annotation tier that is used in [#basicSearch Basic Search]             || String || "Dog", "cat", "walked"               ||
+||=Identifier =||=Annotation Tier Description                                =||=Syntax                          =||=Examples (without quotes)           =||
+|| `token`     || Appropriate tokenisation of resource, i.e. words            || ''String''                       || "Dog", "cat", "walked"               ||
+|| `lemma`     || Lemmatisation of tokens                                     || ''String''                       || "good", "walking", "dog"             ||
+|| `pos`       || Part-of-Speech annotations                                  || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ"                ||
+|| `orth`      || Orthographic transcription of (mostly) spoken resources     || ''String''                       || "dug", "cat", "wolking"              ||
+|| `norm`      || Orthographic normalization of (mostly) spoken resources     || ''String''                       || "dog", "cat", "walking"              ||
+|| `phonetic`  || Phonetic transcription                                      || [#REF_SAMPA SAMPA]               || "'du:", "'vi:-d6 'ha:-b@n"           ||
+|| `names`     || Named entities                                              || ''String''                       || "Utrecht", "Poland", "Felix the Cat" ||
+|| `text`      || Annotation tier that is used in [#basicSearch Basic Search] || ''String''                       || "Dog", "cat", "walked"               ||
+The ''Syntax'' column describes the inventory of symbols that a Client `MUST` use with the corresponding annotation layer; the value ''String'' denotes that symbols are arbitrary Unicode strings, i.e. no fixed inventory of symbols is defined.
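
The relationship between layers and symbol inventories can be sketched as a small lookup table. This is only an illustration under stated assumptions: the layer identifiers follow the v9 table, while the inventory keys (`"ud-pos"`, `"sampa"`), the function `check_symbol`, and the sample tag set are hypothetical. The sample set contains only the three example tags from the table, not the full Universal POS tag inventory.

```python
# Layer identifiers from the v9 table. None stands for ''String'',
# i.e. arbitrary Unicode strings with no fixed inventory.
LAYER_INVENTORY = {
    "token": None,
    "lemma": None,
    "pos": "ud-pos",      # hypothetical key for the Universal POS tags
    "orth": None,
    "norm": None,
    "phonetic": "sampa",  # hypothetical key for SAMPA
    "names": None,
    "text": None,
}


def check_symbol(layer, symbol, inventories):
    """Hypothetical helper: may `symbol` be used with `layer`?

    `inventories` maps an inventory key to a set of allowed symbols.
    If an inventory is not supplied, the symbol is accepted unchecked.
    """
    if layer not in LAYER_INVENTORY:
        raise KeyError(f"unknown layer: {layer}")
    key = LAYER_INVENTORY[layer]
    if key is None:
        return isinstance(symbol, str)
    allowed = inventories.get(key)
    if allowed is None:
        return True  # inventory not loaded, so nothing to check against
    return symbol in allowed


if __name__ == "__main__":
    # Incomplete sample: only the example tags shown in the table.
    sample = {"ud-pos": {"NOUN", "VERB", "ADJ"}}
    print(check_symbol("pos", "NOUN", sample))
    print(check_symbol("pos", "noun", sample))
```

Accepting symbols when an inventory is missing is a design choice of this sketch; a stricter Client could reject them instead.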
 ==== FCS-QL