Changes between Version 2 and Version 3 of FCS-specification


Timestamp:
04/17/12 11:57:29 (12 years ago)
Author:
dietuyt
Comment:

--

  • FCS-specification

    v2 v3  
    1515
    1616In general, each participating CLARIN center will provide at least the following services:
    17 •       Provide one or more resources
    18 •       Support Content-search within those resources
    19 •       Return search-hits in the agreed-upon format
    20 •       Support query-expansion if possible
    21 •       Support the selection of a sub-part of the offered resources to perform content-search on that sub-part
    22 •       Provide support for the sub-part selection by providing CMDI metadata at the same, reasonable granularity
     17 * Provide one or more resources
     18 * Support Content-search within those resources
     19 * Return search-hits in the agreed-upon format
     20 * Support query-expansion if possible
     21 * Support the selection of a sub-part of the offered resources to perform content-search on that sub-part
     22 * Provide support for the sub-part selection by providing CMDI metadata at the same, reasonable granularity
    2323 
    2424= Global Design Thoughts / The Aggregator =
     
    7474This basic request serves to announce the server's capabilities and should allow the client to configure itself automatically. The explain response should, ideally, provide a list of ISOcatted indexes as possible search indexes. If there is no ISOcat equivalent, the CCS-context*  set is to be used. We provide a telling example (as seen within the context of the explain response as defined on the SRU/CQL website):
    7575
    76 {{{
     76{{{#!xml
    7777<indexInfo>
    7878<set identifier="isocat.org/datcat" name="isocat"/>
     
    105105=== Scan ===
    106106
    107 We foresee the scan operation as a way of signaling to the calling program/user/aggregator the available resources available for searching at the endpoint. This in contrast to the definition in SRU, where scan is a way to browse a list of keywords. The value of the scanClause parameter  should be cmd.collection.
     107We foresee the scan operation as a way of signaling to the calling program/user/aggregator which resources are available for searching at the endpoint. This is in contrast to the definition in SRU, where scan is a way to browse a list of keywords. The value of the scanClause parameter should be '''fcs.resource'''.
    108108
    109109In response, the endpoint will return a list of terms, each of which is a searchable collection. Their identifiers can then be used to restrict the search by passing one (or more) as parameters in x-cmd-context in the searchRetrieve operation.
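
As a minimal sketch of how a client might restrict a search to one collection, the following Python builds a searchRetrieve URL that carries a scan identifier in x-cmd-context. The endpoint placeholder and the sample identifier are taken from the examples in this section; the function name `build_search_url` and the query `dog` are hypothetical, and only a single identifier is passed because the text does not fix how several identifiers are combined.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(endpoint, query, context_id, version="1.2"):
    """Build a searchRetrieve URL restricted to one resource.

    context_id is a value taken from an sru:value element of a scan
    response. Passing exactly one identifier is an assumption of this
    sketch, not something the specification prescribes.
    """
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": query,
        "x-cmd-context": context_id,
    }
    # urlencode percent-encodes ':' and '/' inside the identifier.
    return endpoint + "?" + urlencode(params)


url = build_search_url(
    "http://clarin_srucql_endpoint",      # placeholder endpoint
    "dog",                                # hypothetical query
    "hdl:1839/00-0000-0000-0001-53A5-2",
)
# Decode the query string again to check what the server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded["x-cmd-context"][0])
```

Percent-encoding the identifier keeps handles and URLs intact when they contain characters such as `:` and `/`.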
    110110
    111111Again, we provide a telling example:
    112 {{{
     112{{{#!xml
    113113<sru:scanResponse xmlns:sru="http://www.loc.gov/zing/srw/" >
    114114<sru:version>1.2</sru:version>
    115115  <sru:terms> 
    116116    <sru:term>
    117           <sru:value>MPI86949#</sru:value>
     117          <sru:value>hdl:1839/00-0000-0000-0001-53A5-2</sru:value>
    118118          <sru:numberOfRecords>12098</sru:numberOfRecords>
    119119          <sru:displayTerm>The CGN-Corpus (Corpus Gesproken Nederlands)</sru:displayTerm>
    120120    </sru:term>
    121121    <sru:term>
    122           <sru:value>MPI1296694#</sru:value>
     122          <sru:value>http://corpus1.mpi.nl/qfs1/media-archive/mirrored_corpora/childes/Corpusstructure/childes.imdi</sru:value>
    123123          <sru:numberOfRecords>42</sru:numberOfRecords>
    124124          <sru:displayTerm>Childes corpus</sru:displayTerm>
     
    127127  <sru:echoedScanRequest>
    128128    <sru:version>1.2</sru:version>
    129     <sru:scanClause>cmd.collections</sru:scanClause>
     129    <sru:scanClause>fcs.resource</sru:scanClause>
    130130    <sru:responsePosition></sru:responsePosition>
    131131    <sru:maximumTerms>42</sru:maximumTerms>
     
    134134}}}
    135135
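
A client could read such a scan response roughly as follows. This is a sketch only, using Python's standard `xml.etree.ElementTree`; the function name `parse_scan_terms` and the shortened sample document are assumptions made for illustration, not part of the specification.

```python
import xml.etree.ElementTree as ET

# Namespace used by the scanResponse examples in this section.
SRU_NS = {"sru": "http://www.loc.gov/zing/srw/"}

# Shortened sample, modelled on the first term of the example response.
SAMPLE = """<sru:scanResponse xmlns:sru="http://www.loc.gov/zing/srw/">
  <sru:version>1.2</sru:version>
  <sru:terms>
    <sru:term>
      <sru:value>hdl:1839/00-0000-0000-0001-53A5-2</sru:value>
      <sru:numberOfRecords>12098</sru:numberOfRecords>
      <sru:displayTerm>The CGN-Corpus (Corpus Gesproken Nederlands)</sru:displayTerm>
    </sru:term>
  </sru:terms>
</sru:scanResponse>"""


def parse_scan_terms(xml_text):
    """Return one dict per sru:term: identifier, record count, label."""
    root = ET.fromstring(xml_text)
    terms = []
    for term in root.findall("sru:terms/sru:term", SRU_NS):
        terms.append({
            "value": term.findtext("sru:value", default="", namespaces=SRU_NS),
            "records": int(term.findtext("sru:numberOfRecords", default="0",
                                         namespaces=SRU_NS)),
            "label": term.findtext("sru:displayTerm", default="",
                                   namespaces=SRU_NS),
        })
    return terms


terms = parse_scan_terms(SAMPLE)
for t in terms:
    print(t["value"], t["records"], t["label"])
```

The `value` field is the identifier a client would later pass in x-cmd-context; `label` is only meant for display.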
    136 Note that the values in the sru:value elements should be valid PID. These PIDs are ideally also available from within the matching CMDI metadata file . (see also below under “Restricting the search”.
     136Note that the values in the sru:value elements should be valid [http://www.clarin.eu/faq/3460 MdSelfLink]. These MdSelfLinks should also be available from within the matching CMDI metadata file (via a reference in the Header section - see also below under "Restricting the search").
     137
     138Additionally it is possible (but not obligatory) to perform extra Scan operations to retrieve subcollections, as in a [http://en.wikipedia.org/wiki/Tree_traversal tree traversal] algorithm.
     139
     140For example, to find the subcollections of the CGN-Corpus in the example above, one would perform the following scan operation: http://clarin_srucql_endpoint?operation=scan&version=1.2&scanClause=fcs.resource=hdl:1839/00-0000-0000-0001-53A5-2
     141
     142{{{#!xml
     143<sru:scanResponse xmlns:sru="http://www.loc.gov/zing/srw/" >
     144<sru:version>1.2</sru:version>
     145  <sru:terms> 
     146    <sru:term>
     147          <sru:value>hdl:1839/00-0000-0000-0003-467E-9</sru:value>
     148          <sru:numberOfRecords>300</sru:numberOfRecords>
     149          <sru:displayTerm>Annotation types</sru:displayTerm>
     150    </sru:term>
     151    <sru:term>
     152          <sru:value>hdl:1839/00-0000-0000-0003-4682-F</sru:value>
     153          <sru:numberOfRecords>400</sru:numberOfRecords>
     154          <sru:displayTerm>Components</sru:displayTerm>
     155    </sru:term>
     156    <sru:term>
     157          <sru:value>hdl:1839/00-0000-0000-0003-4692-D</sru:value>
     158          <sru:numberOfRecords>350</sru:numberOfRecords>
     159          <sru:displayTerm>Regions</sru:displayTerm>
     160    </sru:term>
     161  </sru:terms>
     162  <sru:echoedScanRequest>
     163    <sru:version>1.2</sru:version>
     164    <sru:scanClause>fcs.resource=hdl:1839/00-0000-0000-0001-53A5-2</sru:scanClause>
     165    <sru:responsePosition></sru:responsePosition>
     166    <sru:maximumTerms>42</sru:maximumTerms>
     167  </sru:echoedScanRequest>
     168</sru:scanResponse>
     169}}}
     170
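
The repeated scan operations described above amount to a pre-order tree traversal. The sketch below illustrates this in Python under several assumptions: the names `build_scan_url`, `walk_resources` and `fake_fetch` are hypothetical, a resource without subcollections is assumed to answer with an empty term list, and the network call is replaced by an in-memory stand-in so that the example runs offline. The operation value is written in lower case (`scan`), as in SRU 1.2.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
import xml.etree.ElementTree as ET

SRU_NS = {"sru": "http://www.loc.gov/zing/srw/"}


def build_scan_url(endpoint, resource_id=None, version="1.2"):
    """Scan URL for the top level, or for the children of resource_id."""
    clause = "fcs.resource"
    if resource_id is not None:
        clause += "=" + resource_id
    return endpoint + "?" + urlencode(
        {"operation": "scan", "version": version, "scanClause": clause})


def walk_resources(endpoint, fetch, resource_id=None, depth=0, max_depth=3):
    """Yield (depth, identifier, label) in pre-order.

    fetch is any callable mapping a URL to response XML. max_depth
    bounds the recursion in case an endpoint ignores the scanClause
    and keeps returning the same terms.
    """
    if depth > max_depth:
        return
    root = ET.fromstring(fetch(build_scan_url(endpoint, resource_id)))
    for term in root.findall("sru:terms/sru:term", SRU_NS):
        value = term.findtext("sru:value", default="", namespaces=SRU_NS)
        label = term.findtext("sru:displayTerm", default="", namespaces=SRU_NS)
        yield depth, value, label
        yield from walk_resources(endpoint, fetch, value, depth + 1, max_depth)


def _response(terms):
    """Build a tiny scanResponse from (identifier, label) pairs."""
    items = "".join(
        "<sru:term><sru:value>%s</sru:value>"
        "<sru:displayTerm>%s</sru:displayTerm></sru:term>" % t for t in terms)
    return ('<sru:scanResponse xmlns:sru="http://www.loc.gov/zing/srw/">'
            "<sru:terms>%s</sru:terms></sru:scanResponse>" % items)


# Canned answers keyed by scanClause; identifiers reuse the examples above.
FAKE = {
    "fcs.resource": _response(
        [("hdl:1839/00-0000-0000-0001-53A5-2", "The CGN-Corpus")]),
    "fcs.resource=hdl:1839/00-0000-0000-0001-53A5-2": _response(
        [("hdl:1839/00-0000-0000-0003-467E-9", "Annotation types")]),
}


def fake_fetch(url):
    """Stand-in for an HTTP GET: look up the decoded scanClause."""
    clause = parse_qs(urlsplit(url).query)["scanClause"][0]
    return FAKE.get(clause, _response([]))


tree = list(walk_resources("http://clarin_srucql_endpoint", fake_fetch))
for depth, value, label in tree:
    print("  " * depth + label + " <" + value + ">")
```

In a real client, `fake_fetch` would be replaced by a function that performs the HTTP request; keeping the fetch step injectable also makes the traversal easy to test.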
    137171
    138172=== SearchRetrieve ===
     
    154188There are several dataviews agreed upon. Each dataView has a “type” attribute whose value names the type of dataView contained. Further dataviews can be added in the future if required. Supporting the KWIC dataview is mandatory (as this type is fairly straightforward to show as a list of results).
    155189Our KWIC dataView looks as follows:
    156 {{{
     190{{{#!xml
    157191<ccs:DataView type="kwic">
    158192        <c type="left">Some text with </c>