Changes between Version 43 and Version 44 of Taskforces/FCS/FCS-Specification-Draft
Timestamp: 11/10/15 14:16:20
Taskforces/FCS/FCS-Specification-Draft
--- v43
+++ v44
@@ -422,14 +422,18 @@

  === Advanced Search
- The ''Advanced Search'' capability allows searching in annotated data. Queries can be across annotation layer, e.g. token and part-of-speech layer. CLARIN-FCS defined a set of searchable annotation layers with certain semantics and syntax. Endpoints `SHOULD` support as many different, of course depending on the resource type, annotation layers as possible.
+ The ''Advanced Search'' capability allows searching in annotated data that is represented in annotation layers. An annotation ''layer'' contains annotations of a specific type, e.g. a lemma or part-of-speech layer. Queries can span several annotation layers.
+
+ CLARIN-FCS defines a set of searchable annotation layers with certain semantics and syntax. Endpoints `SHOULD` support as many different annotation layers as possible, depending on the resource type.

  ==== Layers #layers
+ Each Layer is assumed to be ''segmented'', e.g. to allow searching for a single lemma. However, CLARIN-FCS does not endorse a specific segmentation, i.e. the segmentation of Layers is in the domain of the Endpoint and ''opaque'' to CLARIN-FCS. CLARIN-FCS '''does not''' endorse or assume a ''formal linguistic relation'' or ''formal linguistic hierarchy'' between two items on two different layers.
+
  ||=Layer Type Identifier =||=Annotation Layer Description =||=Syntax =||=Examples (without quotes) =||
- || `lemma` || Lemmatisation of tokens || ''String'' || "good", "walk", "dog" ||
+ || `text` || Textual representation of resource, also the layer that is used in [#basicSearch Basic Search] || ''String'' || "Dog", "cat", "walking", "better" ||
+ || `lemma` || Lemmatisation || ''String'' || "good", "walk", "dog" ||
  || `pos` || Part-of-Speech annotations || [#REF_UD_POS Universal POS tags] || "NOUN", "VERB", "ADJ" ||
  || `orth` || Orthographic transcription of (mostly) spoken resources || ''String'' || "dug", "cat", "wolking" ||
  || `norm` || Orthographic normalization of (mostly) spoken resources || ''String'' || "dog", "cat", "walking", "best" ||
  || `phonetic` || Phonetic transcription || [#REF_SAMPA SAMPA] || "'du:", "'vi:-d6 'ha:-b@n" ||
- || `text` || Annotation layer that is used in [#basicSearch Basic Search] || ''String'' || "Dog", "cat" "walking", "better" ||

  The column ''Layer Type Identifier'' denotes the identifier for a layer. It is used in [#fcsQL FCS-QL] queries and the XML serialization for the [#advancedDataView Advanced Data View]. All valid identifiers are defined in the table above; all other identifiers are reserved and `MUST NOT` be used. Clients and Endpoints `MAY` create custom Layer Type Identifiers, e.g. for testing purposes. If they do so, the custom Layer Type Identifiers `MUST` start with the string `x-`, e.g. `x-customLayer`.
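
The identifier rule in the last paragraph can be illustrated with a small check. The following is a minimal sketch, not part of the specification: the function names `classify_layer_identifier` and `is_usable` are hypothetical, and two behaviours are assumptions the draft does not state, namely that identifiers are compared case-sensitively and that a bare `x-` with nothing after the prefix is rejected.

```python
# Layer Type Identifiers listed in the v44 table of the draft.
STANDARD_LAYERS = frozenset({"text", "lemma", "pos", "orth", "norm", "phonetic"})

# Prefix the draft requires for custom Layer Type Identifiers.
CUSTOM_PREFIX = "x-"


def classify_layer_identifier(identifier: str) -> str:
    """Return 'standard', 'custom', or 'reserved' for a Layer Type Identifier.

    Assumptions (not stated in the draft): matching is case-sensitive,
    and the prefix alone ("x-") does not count as a custom identifier.
    """
    if identifier in STANDARD_LAYERS:
        return "standard"
    if identifier.startswith(CUSTOM_PREFIX) and len(identifier) > len(CUSTOM_PREFIX):
        return "custom"
    # Everything else is reserved and MUST NOT be used.
    return "reserved"


def is_usable(identifier: str) -> bool:
    """True if the identifier may appear in a query or a data view."""
    return classify_layer_identifier(identifier) != "reserved"
```

Under these assumptions, `is_usable("pos")` and `is_usable("x-customLayer")` hold, while `is_usable("customLayer")` does not, because an identifier that is neither in the table nor prefixed with `x-` is reserved.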