| 229 | Endpoints need to provide information about their capabilities to support auto-configuration of Clients. The ''Endpoint Description'' mechanism conveys this information to Clients. Endpoints `MUST` encode their capabilities using an XML format and embed this information into the SRU/CQL protocol as described in section [#explain Operation ''explain'']. The XML fragment generated by the Endpoint for the Endpoint Description `MUST` be valid according to the XML schema "[source:FederatedSearch/schema/Endpoint-Description.xsd Endpoint-Description.xsd]" ([source:FederatedSearch/schema/Endpoint-Description.xsd?format=txt download]). |
| 230 | |
| 231 | The XML fragment for ''Endpoint Description'' is encoded as an `<ed:EndpointDescription>` element that contains the following attributes and children: |
| 232 | * one `@version` attribute (`REQUIRED`) on the `<ed:EndpointDescription>` element. The value of the `@version` attribute `MUST` be `2`. |
| 233 | * one `<ed:Capabilities>` element (`REQUIRED`) that contains one or more `<ed:Capability>` elements \\ |
| 234 | The content of the `<ed:Capability>` element is a Capability Identifier that indicates a capability supported by the Endpoint. For valid values for the Capability Identifier, see section [#capabilities Capabilities]. This list `MUST NOT` include duplicate values. |
| 235 | * one `<ed:SupportedDataViews>` element (`REQUIRED`) \\ |
| 236 | A list of Data Views that are supported by this Endpoint. This list is composed of one or more `<ed:SupportedDataView>` elements. The content of a `<ed:SupportedDataView>` `MUST` be the MIME type of a supported Data View, e.g. `application/x-clarin-fcs-hits+xml`. Each `<ed:SupportedDataView>` element `MUST` carry an `@id` and a `@delivery-policy` attribute. The value of the `@id` attribute is later used in the `<ed:Resource>` element to indicate which Data View is supported by a resource (see below). Endpoints `SHOULD` use the recommended short identifier for the Data View. The `@delivery-policy` attribute indicates the Endpoint's delivery policy for that Data View. Valid values are `send-by-default` for the ''send-by-default'' and `need-to-request` for the ''need-to-request'' delivery policy. \\ |
| 237 | This list `MUST NOT` include duplicate entries, i.e. no MIME type may appear more than once. \\ |
| 238 | The value of the `@id` attribute `MUST NOT` contain the characters `,` (comma) or `;` (semicolon). |
| 239 | * one `<ed:SupportedLayers>` element (`REQUIRED` if Endpoint supports ''Advanced Search'' capability) \\ |
| 240 | A list of Layers that are generally supported by this Endpoint. This list is composed of one or more `<ed:SupportedLayer>` elements. The content of a `<ed:SupportedLayer>` `MUST` be the identifier of a Layer (see [#layers section "Layers"]), e.g. `orth`. Each `<ed:SupportedLayer>` element `MUST` carry an `@id` and a `@result-id` attribute. The value of the `@id` attribute is later used in the `<ed:Resource>` element to indicate which Layer is supported by a resource (see below). The `@result-id` attribute is used in the Advanced Data View (see [#advancedDataView section "Advanced Data View"]). Each `<ed:SupportedLayer>` element `MAY` carry an optional `@qualifier` attribute, which is used as a qualifier in an FCS-QL search term to address this specific Layer. \\ |
| 241 | This list `MUST NOT` include duplicate entries, i.e. no two Layers may share the same `@result-id`. \\ |
| 242 | The value of the `@id` or `@result-id` attribute `MUST NOT` contain the characters `,` (comma) or `;` (semicolon). \\ |
| 243 | The value of the `@qualifier` attribute `MUST NOT` contain characters other than `a`-`z`, `A`-`Z`, `0`-`9` and `-` (hyphen). \\ |
| 244 | The `<ed:SupportedLayer>` element `MAY` carry an `@alt-value-info` and an `@alt-value-info-uri` attribute; `@alt-value-info` `SHOULD` contain a short description of the Layer, e.g. the original tag set used; `@alt-value-info-uri` `MUST` contain a well-formed URI and `SHOULD` point to a web site with further information, e.g. about the original tag set and how the translation to FCS is done. Clients, e.g. the Aggregator, can display this information together with the search result. |
| 245 | * one `<ed:Resources>` element (`REQUIRED`) \\ |
| 246 | A list of (top-level) resources that are available, i.e. searchable, at the Endpoint. The `<ed:Resources>` element contains one or more `<ed:Resource>` elements (see below). The Endpoint `MUST` declare at least one (top-level) resource. |
| 247 | |
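The top-level rules above lend themselves to a mechanical check. The following is a minimal sketch in Python, not a replacement for validation against the XML schema: the function names `duplicates` and `check_top_level` and the wording of the messages are hypothetical, and the sketch assumes that all elements are in the `http://clarin.eu/fcs/endpoint-description` namespace and that the ''Advanced Search'' capability is identified by the URI used in the examples of this section.

```python
import re
import xml.etree.ElementTree as ET

NS = {"ed": "http://clarin.eu/fcs/endpoint-description"}
POLICIES = {"send-by-default", "need-to-request"}
ADVANCED = "http://clarin.eu/fcs/capability/advanced-search"
QUALIFIER_RE = re.compile(r"[A-Za-z0-9-]+")


def duplicates(values):
    """Return the values that occur more than once, sorted."""
    seen, dupes = set(), set()
    for value in values:
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


def check_top_level(xml_text):
    """Return a list of violated top-level rules (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    problems = []
    if root.get("version") != "2":
        problems.append("@version must be 2")

    caps = [(c.text or "").strip()
            for c in root.findall("ed:Capabilities/ed:Capability", NS)]
    if not caps:
        problems.append("no <ed:Capability> declared")
    problems += [f"duplicate capability {c}" for c in duplicates(caps)]

    views = root.findall("ed:SupportedDataViews/ed:SupportedDataView", NS)
    if not views:
        problems.append("no <ed:SupportedDataView> declared")
    mimes = [(v.text or "").strip() for v in views]
    problems += [f"duplicate MIME type {m}" for m in duplicates(mimes)]
    for view in views:
        view_id = view.get("id", "")
        if not view_id or "," in view_id or ";" in view_id:
            problems.append(f"invalid Data View @id {view_id!r}")
        if view.get("delivery-policy") not in POLICIES:
            problems.append(f"invalid @delivery-policy on Data View {view_id!r}")

    layers = root.findall("ed:SupportedLayers/ed:SupportedLayer", NS)
    if ADVANCED in caps and not layers:
        problems.append("Advanced Search declared without <ed:SupportedLayer>")
    for layer in layers:
        # Only the character restrictions are checked here.
        for name in ("id", "result-id"):
            value = layer.get(name)
            if value is not None and ("," in value or ";" in value):
                problems.append(f"invalid layer @{name} {value!r}")
        qualifier = layer.get("qualifier")
        if qualifier is not None and not QUALIFIER_RE.fullmatch(qualifier):
            problems.append(f"invalid layer @qualifier {qualifier!r}")

    if not root.findall("ed:Resources/ed:Resource", NS):
        problems.append("no top-level <ed:Resource> declared")
    return problems
```

An empty list means that none of the rules covered by the sketch is violated; it does not imply that the fragment is schema-valid.
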
| 248 | The `<ed:Resource>` element contains a basic description of a resource that is available at the Endpoint. A resource is a searchable entity, e.g. a single corpus. The `<ed:Resource>` element has a mandatory `@pid` attribute that contains the persistent identifier of the resource. This value `MUST` be the same as the ''!MdSelfLink'' of the CMDI record describing the resource. The `<ed:Resource>` element contains the following children: |
| 249 | * one or more `<ed:Title>` elements (`REQUIRED`) \\ |
| 250 | A human readable title for the resource. A `REQUIRED` `@xml:lang` attribute indicates the language of the title. An English version of the title is `REQUIRED`. The list of titles `MUST NOT` contain duplicate entries for the same language. |
| 251 | * zero or more `<ed:Description>` elements (`OPTIONAL`) \\ |
| 252 | An optional human-readable description of the resource. It `SHOULD` be at most one sentence. A `REQUIRED` `@xml:lang` attribute indicates the language of the description. If supplied, an English version of the description is `REQUIRED`. The list of descriptions `MUST NOT` contain duplicate entries for the same language. |
| 253 | * zero or one `<ed:LandingPageURI>` element (`OPTIONAL`) \\ |
| 254 | A link to a website for the resource, e.g. a landing page that describes the corpus. |
| 255 | * one `<ed:Languages>` element (`REQUIRED`) \\ |
| 256 | The (relevant) languages available within the resource. The `<ed:Languages>` element contains one or more `<ed:Language>` elements. The content of a `<ed:Language>` element `MUST` be an ISO 639-3 three-letter language code. This element should be repeated for all (relevant) languages available ''within'' the resource; however, this list `MUST NOT` contain duplicate entries. |
| 257 | * one `<ed:AvailableDataViews>` element (`REQUIRED`) \\ |
| 258 | The Data Views that are available for the resource. The `<ed:AvailableDataViews>` element `MUST` carry a `@ref` attribute that contains a whitespace-separated list of id values, each of which corresponds to the value of the `@id` attribute of the referenced `<ed:SupportedDataView>` element. \\ |
| 259 | In case of sub-resources, each Resource `SHOULD` support all Data Views that are supported by the parent resource. However, every resource `MUST` declare all available Data Views independently, i.e. there is no implicit inheritance semantic. |
| 260 | * one `<ed:AvailableLayers>` element (`REQUIRED` if Endpoint supports ''Advanced Search'' capability). The `<ed:AvailableLayers>` element `MUST` carry a `@ref` attribute that contains a whitespace-separated list of id values, each of which corresponds to the value of the `@id` attribute of the referenced `<ed:SupportedLayer>` element. \\ |
| 261 | In case of sub-resources, each Resource `SHOULD` support all Layers that are supported by the parent resource. However, every resource `MUST` declare all available Layers independently, i.e. there is no implicit inheritance semantic. |
| 262 | * zero or one `<ed:Resources>` element (`OPTIONAL`) \\ |
| 263 | If a resource has searchable sub-resources, the Endpoint `MUST` supply additional finer grained resource elements, which are wrapped in a `<ed:Resources>` element. A sub-resource is a searchable entity within a resource, e.g. a sub-corpus. |
| 264 | |
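The per-resource rules can be checked in the same spirit. The sketch below is an illustration under assumptions, not a normative algorithm: `langs`, `check_resource` and `check_resources` are hypothetical names, an "English version" is taken to mean `xml:lang="en"` exactly, and only the `<ed:AvailableDataViews>` references are resolved (Layers would be handled analogously). Sub-resources are checked recursively and on their own, since there is no implicit inheritance.

```python
import xml.etree.ElementTree as ET

NS = {"ed": "http://clarin.eu/fcs/endpoint-description"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def langs(resource, tag):
    """Collect the xml:lang values of all <ed:tag> children of a resource."""
    return [e.get(XML_LANG) for e in resource.findall(f"ed:{tag}", NS)]


def check_resource(resource, view_ids, problems):
    """Check one <ed:Resource> and, recursively, its sub-resources."""
    pid = resource.get("pid") or "(missing @pid)"
    if resource.get("pid") is None:
        problems.append("resource without @pid")

    titles = langs(resource, "Title")
    if "en" not in titles:  # assumption: English is tagged exactly as "en"
        problems.append(f"{pid}: English title missing")
    if len(titles) != len(set(titles)):
        problems.append(f"{pid}: duplicate title language")

    descriptions = langs(resource, "Description")
    if descriptions and "en" not in descriptions:
        problems.append(f"{pid}: English description missing")
    if len(descriptions) != len(set(descriptions)):
        problems.append(f"{pid}: duplicate description language")

    codes = [(e.text or "").strip()
             for e in resource.findall("ed:Languages/ed:Language", NS)]
    if not codes or len(codes) != len(set(codes)):
        problems.append(f"{pid}: languages missing or duplicated")

    views = resource.find("ed:AvailableDataViews", NS)
    refs = views.get("ref", "").split() if views is not None else []
    if not refs:
        problems.append(f"{pid}: no Data Views declared")
    for ref in refs:
        if ref not in view_ids:
            problems.append(f"{pid}: unknown Data View ref {ref!r}")

    # No implicit inheritance: every sub-resource must pass on its own.
    for child in resource.findall("ed:Resources/ed:Resource", NS):
        check_resource(child, view_ids, problems)


def check_resources(xml_text):
    """Return a list of violated resource rules (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    view_ids = {v.get("id") for v in
                root.findall("ed:SupportedDataViews/ed:SupportedDataView", NS)}
    problems = []
    for resource in root.findall("ed:Resources/ed:Resource", NS):
        check_resource(resource, view_ids, problems)
    return problems
```
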
| 265 | [=#REF_Example_4]Example 4: |
| 266 | {{{#!xml |
| 267 | <ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/endpoint-description" version="2"> |
| 268 | <ed:Capabilities> |
| 269 | <ed:Capability>http://clarin.eu/fcs/capability/basic-search</ed:Capability> |
| 270 | </ed:Capabilities> |
| 271 | <ed:SupportedDataViews> |
| 272 | <ed:SupportedDataView id="hits" delivery-policy="send-by-default">application/x-clarin-fcs-hits+xml</ed:SupportedDataView> |
| 273 | </ed:SupportedDataViews> |
| 274 | <ed:Resources> |
| 275 | <!-- just one top-level resource at the Endpoint --> |
| 276 | <ed:Resource pid="http://hdl.handle.net/4711/0815"> |
| 277 | <ed:Title xml:lang="de">Goethe Korpus</ed:Title> |
| 278 | <ed:Title xml:lang="en">Goethe Corpus</ed:Title> |
| 279 | <ed:Description xml:lang="de">Der Goethe Korpus des IDS Mannheim.</ed:Description> |
| 280 | <ed:Description xml:lang="en">The Goethe corpus of IDS Mannheim.</ed:Description> |
| 281 | <ed:LandingPageURI>http://repos.example.org/corpus1.html</ed:LandingPageURI> |
| 282 | <ed:Languages> |
| 283 | <ed:Language>deu</ed:Language> |
| 284 | </ed:Languages> |
| 285 | <ed:AvailableDataViews ref="hits" /> |
| 286 | </ed:Resource> |
| 287 | </ed:Resources> |
| 288 | </ed:EndpointDescription> |
| 289 | }}} |
| 290 | [#REF_Example_4 Example 4] shows a simple Endpoint Description for an Endpoint that only supports the ''Basic Search'' Capability and only provides the Generic Hits Data View, which is indicated by a `<ed:SupportedDataView>` element. This element carries an `@id` attribute with a value of `hits`, the recommended value for the short identifier, and indicates a delivery policy of ''send-by-default'' by the `@delivery-policy` attribute. The Endpoint only provides one top-level resource, identified by the persistent identifier `http://hdl.handle.net/4711/0815`. The resource has a title as well as a description in German and English. A landing page is located at `http://repos.example.org/corpus1.html`. The predominant language in the resource contents is German. Only the Generic Hits Data View is supported for this resource, because the `<ed:AvailableDataViews>` element only references the `<ed:SupportedDataView>` element whose `@id` has the value `hits`. |
| 291 | |
| 292 | [=#REF_Example_5]Example 5: |
| 293 | {{{#!xml |
| 294 | <ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/endpoint-description" version="2"> |
| 295 | <ed:Capabilities> |
| 296 | <ed:Capability>http://clarin.eu/fcs/capability/basic-search</ed:Capability> |
| 297 | </ed:Capabilities> |
| 298 | <ed:SupportedDataViews> |
| 299 | <ed:SupportedDataView id="hits" delivery-policy="send-by-default">application/x-clarin-fcs-hits+xml</ed:SupportedDataView> |
| 300 | <ed:SupportedDataView id="cmdi" delivery-policy="need-to-request">application/x-cmdi+xml</ed:SupportedDataView> |
| 301 | </ed:SupportedDataViews> |
| 302 | <ed:Resources> |
| 303 | <!-- top-level resource 1 --> |
| 304 | <ed:Resource pid="http://hdl.handle.net/4711/0815"> |
| 305 | <ed:Title xml:lang="de">Goethe Korpus</ed:Title> |
| 306 | <ed:Title xml:lang="en">Goethe Corpus</ed:Title> |
| 307 | <ed:Description xml:lang="de">Der Goethe Korpus des IDS Mannheim.</ed:Description> |
| 308 | <ed:Description xml:lang="en">The Goethe corpus of IDS Mannheim.</ed:Description> |
| 309 | <ed:LandingPageURI>http://repos.example.org/corpus1.html</ed:LandingPageURI> |
| 310 | <ed:Languages> |
| 311 | <ed:Language>deu</ed:Language> |
| 312 | </ed:Languages> |
| 313 | <ed:AvailableDataViews ref="hits" /> |
| 314 | </ed:Resource> |
| 315 | <!-- top-level resource 2 --> |
| 316 | <ed:Resource pid="http://hdl.handle.net/4711/0816"> |
| 317 | <ed:Title xml:lang="de">Zeitungskorpus des Mannheimer Morgen</ed:Title> |
| 318 | <ed:Title xml:lang="en">Mannheimer Morgen newspaper Corpus</ed:Title> |
| 319 | <ed:LandingPageURI>http://repos.example.org/corpus2.html</ed:LandingPageURI> |
| 320 | <ed:Languages> |
| 321 | <ed:Language>deu</ed:Language> |
| 322 | </ed:Languages> |
| 323 | <ed:AvailableDataViews ref="hits cmdi" /> |
| 324 | <ed:Resources> |
| 325 | <!-- sub-resource 1 of top-level resource 2 --> |
| 326 | <ed:Resource pid="http://hdl.handle.net/4711/0816-1"> |
| 327 | <ed:Title xml:lang="de">Zeitungskorpus des Mannheimer Morgen (vor 1990)</ed:Title> |
| 328 | <ed:Title xml:lang="en">Mannheimer Morgen newspaper Corpus (before 1990)</ed:Title> |
| 329 | <ed:LandingPageURI>http://repos.example.org/corpus2.html#sub1</ed:LandingPageURI> |
| 330 | <ed:Languages> |
| 331 | <ed:Language>deu</ed:Language> |
| 332 | </ed:Languages> |
| 333 | <ed:AvailableDataViews ref="hits cmdi" /> |
| 334 | </ed:Resource> |
| 335 | <!-- sub-resource 2 of top-level resource 2 --> |
| 336 | <ed:Resource pid="http://hdl.handle.net/4711/0816-2"> |
| 337 | <ed:Title xml:lang="de">Zeitungskorpus des Mannheimer Morgen (nach 1990)</ed:Title> |
| 338 | <ed:Title xml:lang="en">Mannheimer Morgen newspaper Corpus (after 1990)</ed:Title> |
| 339 | <ed:LandingPageURI>http://repos.example.org/corpus2.html#sub2</ed:LandingPageURI> |
| 340 | <ed:Languages> |
| 341 | <ed:Language>deu</ed:Language> |
| 342 | </ed:Languages> |
| 343 | <ed:AvailableDataViews ref="hits cmdi" /> |
| 344 | </ed:Resource> |
| 345 | </ed:Resources> |
| 346 | </ed:Resource> |
| 347 | </ed:Resources> |
| 348 | </ed:EndpointDescription> |
| 349 | }}} |
| 350 | The more complex [#REF_Example_5 Example 5] shows an Endpoint Description for an Endpoint that, similar to [#REF_Example_4 Example 4], supports the ''Basic Search'' capability. In addition to the Generic Hits Data View, it also supports the CMDI Data View. The delivery policies are ''send-by-default'' for the Generic Hits Data View and ''need-to-request'' for the CMDI Data View. The Endpoint has two top-level resources, identified by the persistent identifiers `http://hdl.handle.net/4711/0815` and `http://hdl.handle.net/4711/0816`. The second top-level resource has two independently searchable sub-resources, identified by the persistent identifiers `http://hdl.handle.net/4711/0816-1` and `http://hdl.handle.net/4711/0816-2`. All resources are described using several properties, such as title and description. The first top-level resource provides only the Generic Hits Data View, while the second top-level resource and its sub-resources provide both the Generic Hits and the CMDI Data Views. |
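To illustrate how a Client might read a nested description such as Example 5, the following sketch walks the resource tree and resolves each `@ref` list against the declared `<ed:SupportedDataView>` elements. It is only one possible approach; the function name `list_resources` and the shape of the returned tuples are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"ed": "http://clarin.eu/fcs/endpoint-description"}


def list_resources(xml_text):
    """Return (depth, pid, [MIME types]) for every resource, parents first."""
    root = ET.fromstring(xml_text)
    # Map the short @id of each declared Data View to its MIME type.
    mime_by_id = {
        view.get("id"): (view.text or "").strip()
        for view in root.findall("ed:SupportedDataViews/ed:SupportedDataView", NS)
    }
    rows = []

    def walk(resource, depth):
        views = resource.find("ed:AvailableDataViews", NS)
        refs = views.get("ref", "").split() if views is not None else []
        # Each resource declares its own Data Views; nothing is inherited.
        rows.append((depth, resource.get("pid"),
                     [mime_by_id.get(ref, ref) for ref in refs]))
        for child in resource.findall("ed:Resources/ed:Resource", NS):
            walk(child, depth + 1)

    for top in root.findall("ed:Resources/ed:Resource", NS):
        walk(top, 0)
    return rows
```

Applied to Example 5, such a walk would yield one entry per `<ed:Resource>`, with the two sub-resources of the second top-level resource at depth 1. References that do not match a declared `@id` are passed through unchanged rather than rejected.
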
| 351 | |
| 352 | [=#REF_Example_6]Example 6: |
| 353 | {{{#!xml |
| 354 | <ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/endpoint-description" version="2"> |
| 355 | <ed:Capabilities> |
| 356 | <ed:Capability>http://clarin.eu/fcs/capability/basic-search</ed:Capability> |
| 357 | <ed:Capability>http://clarin.eu/fcs/capability/advanced-search</ed:Capability> |
| 358 | </ed:Capabilities> |
| 359 | <ed:SupportedDataViews> |
| 360 | <ed:SupportedDataView id="hits" delivery-policy="send-by-default">application/x-clarin-fcs-hits+xml</ed:SupportedDataView> |
| 361 | </ed:SupportedDataViews> |
| 362 | <!-- ADV-FCS --> |
| 363 | <ed:SupportedLayers> |
| 364 | <ed:SupportedLayer id="l1" result-id="http://endpoint.example.org/Layers/orth1">orth</ed:SupportedLayer> |
| 365 | <ed:SupportedLayer id="l2" result-id="http://endpoint.example.org/Layers/pos1" qualifier="x">pos</ed:SupportedLayer> |
| 366 | <ed:SupportedLayer id="l3" result-id="http://endpoint.example.org/Layers/pos2" qualifier="y" |
| 367 | alt-value-info="STTS tagset" |
| 368 | alt-value-info-uri="http://repos.example.org/tagset_doc.html">pos</ed:SupportedLayer> |
| 369 | <ed:SupportedLayer id="l4" result-id="http://endpoint.example.org/Layers/word" type="empty">word</ed:SupportedLayer> |
| 370 | <ed:SupportedLayer id="l5" result-id="http://endpoint.example.org/Layers/lemma1">lemma</ed:SupportedLayer> |
| 371 | </ed:SupportedLayers> |
| 372 | |
| 373 | <ed:Resources> |
| 374 | <!-- just one top-level resource at the Endpoint --> |
| 375 | <ed:Resource pid="http://hdl.handle.net/4711/0815"> |
| 376 | <ed:Title xml:lang="de">Goethe Korpus</ed:Title> |
| 377 | <ed:Title xml:lang="en">Goethe Corpus</ed:Title> |
| 378 | <ed:Description xml:lang="de">Der Goethe Korpus des IDS Mannheim.</ed:Description> |
| 379 | <ed:Description xml:lang="en">The Goethe corpus of IDS Mannheim.</ed:Description> |
| 380 | <ed:LandingPageURI>http://repos.example.org/corpus1.html</ed:LandingPageURI> |
| 381 | <ed:Languages> |
| 382 | <ed:Language>deu</ed:Language> |
| 383 | </ed:Languages> |
| 384 | <ed:AvailableDataViews ref="hits" /> |
| 385 | <ed:AvailableLayers ref="l1 l2 l3 l4 l5" /> |
| 386 | </ed:Resource> |
| 387 | </ed:Resources> |
| 388 | </ed:EndpointDescription> |
| 389 | }}} |
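Example 6 declares two `pos` Layers that differ in their `@qualifier` and `@result-id`. As a rough sketch of how a Client could resolve the `<ed:AvailableLayers>` references of a resource, the hypothetical helper below groups the referenced Layers by Layer identifier. It assumes that the Layer elements are in the same `ed` namespace as the rest of the Endpoint Description; `layers_for_resource` is not part of any FCS library.

```python
import xml.etree.ElementTree as ET

ED_NS = "http://clarin.eu/fcs/endpoint-description"
NS = {"ed": ED_NS}


def layers_for_resource(xml_text, pid):
    """Map Layer identifier -> [(qualifier, result-id)] for one resource."""
    root = ET.fromstring(xml_text)
    declared = {
        layer.get("id"): layer
        for layer in root.findall("ed:SupportedLayers/ed:SupportedLayer", NS)
    }
    # Search top-level resources and sub-resources alike.
    for resource in root.iter(f"{{{ED_NS}}}Resource"):
        if resource.get("pid") != pid:
            continue
        available = resource.find("ed:AvailableLayers", NS)
        refs = available.get("ref", "").split() if available is not None else []
        grouped = {}
        for ref in refs:
            layer = declared[ref]  # raises KeyError for a dangling reference
            name = (layer.text or "").strip()
            grouped.setdefault(name, []).append(
                (layer.get("qualifier"), layer.get("result-id")))
        return grouped
    raise KeyError(pid)
```

Layers without a `@qualifier` attribute appear with `None` in the first position of the tuple.
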