Changes between Version 43 and Version 44 of Taskforces/FCS/FCS-Specification-Draft


Timestamp:
11/10/15 14:16:20 (9 years ago)
Author:
Oliver Schonefeld
Comment:
  • changed 'text' layer description, some notes about segmentation

  • Taskforces/FCS/FCS-Specification-Draft

    v43 v44  
    422422
    423423=== Advanced Search
    424 The ''Advanced Search'' capability allows searching in annotated data. Queries can be across annotation layer, e.g. token and part-of-speech layer. CLARIN-FCS defined a set of searchable annotation layers with certain semantics and syntax. Endpoints `SHOULD` support as many different, of course depending on the resource type, annotation layers as possible.
     424The ''Advanced Search'' capability allows searching in annotated data that is represented in annotation layers. An annotation ''layer'' contains annotations of a specific type, e.g. lemmas or part-of-speech tags. Queries can span multiple annotation layers.
     425
     426CLARIN-FCS defines a set of searchable annotation layers with fixed semantics and syntax. Endpoints `SHOULD` support as many of these annotation layers as the resource type allows.
    425427
    426428==== Layers #layers
     429Each Layer is assumed to be ''segmented'', e.g. to allow searching for a single lemma. However, CLARIN-FCS does not endorse a specific segmentation, i.e. the segmentation of Layers is in the domain of the Endpoint and ''opaque'' to CLARIN-FCS. CLARIN-FCS '''does not''' endorse or assume a ''formal linguistic relation'' or ''formal linguistic hierarchy'' between two items on two different layers.
     430
    427431||=Layer Type Identifier =||=Annotation Layer Description                                =||=Syntax                          =||=Examples (without quotes)           =||
    428 || `lemma`                || Lemmatisation of tokens                                     || ''String''                       || "good", "walk", "dog"             ||
     432|| `text`                 || Textual representation of resource, also the layer that is used in [#basicSearch Basic Search] || ''String''                       || "Dog", "cat", "walking", "better"               ||
     433|| `lemma`                || Lemmatisation                                               || ''String''                       || "good", "walk", "dog"             ||
    429434|| `pos`                  || Part-of-Speech annotations                                  || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ"                ||
    430435|| `orth`                 || Orthographic transcription of (mostly) spoken resources     || ''String''                       || "dug", "cat", "wolking"              ||
    431436|| `norm`                 || Orthographic normalization of (mostly) spoken resources     || ''String''                       || "dog", "cat", "walking", "best"              ||
    432437|| `phonetic`             || Phonetic transcription                                      || [#REF_SAMPA SAMPA]               || "'du:", "'vi:-d6 'ha:-b@n"           ||
    433 || `text`                 || Annotation layer that is used in [#basicSearch Basic Search] || ''String''                       || "Dog", "cat" "walking", "better"                ||
    434438
    435439The column ''Layer Type Identifier'' denotes the identifier for a layer. It is used in [#fcsQL FCS-QL] queries and the XML serialization for the [#advancedDataView Advanced Data View]. All valid identifiers are defined in the table above; all other identifiers are reserved and `MUST NOT` be used. Clients and Endpoints `MAY` create custom Layer Type Identifiers, e.g. for testing purposes. If they do so, the custom Layer Type Identifiers `MUST` start with the string `x-`, e.g. `x-customLayer`.
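
The identifier rules above could be checked along the following lines. This is only a minimal sketch and not part of the specification: the names `STANDARD_LAYERS`, `classify_layer_identifier` and `is_usable_layer_identifier` are hypothetical. The sketch also assumes that identifiers are compared case-sensitively and that a bare `x-` with nothing after it is not a usable custom identifier; the text above states neither.

```python
# Layer Type Identifiers taken from the v44 table above.
STANDARD_LAYERS = frozenset({"text", "lemma", "pos", "orth", "norm", "phonetic"})

# Prefix that custom Layer Type Identifiers MUST start with.
CUSTOM_PREFIX = "x-"


def classify_layer_identifier(identifier: str) -> str:
    """Return "standard", "custom" or "reserved" for a Layer Type Identifier."""
    # Assumption: matching is case-sensitive, so "POS" is not the same as "pos".
    if identifier in STANDARD_LAYERS:
        return "standard"
    # Assumption: something must follow the "x-" prefix.
    if identifier.startswith(CUSTOM_PREFIX) and len(identifier) > len(CUSTOM_PREFIX):
        return "custom"
    # Everything else is reserved and MUST NOT be used.
    return "reserved"


def is_usable_layer_identifier(identifier: str) -> bool:
    """True if the identifier is defined in the table or is a valid custom one."""
    return classify_layer_identifier(identifier) != "reserved"
```

An Endpoint might, for example, run such a check on the layer identifiers of an incoming FCS-QL query and answer with a diagnostic for any identifier classified as reserved.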