= FCS - Feature Matrix =

This page is meant to be a comprehensive list of the individual features that every FCS-compliant search engine must or may implement. However, it is currently out of sync with the specification, which is discussed in detail in [[FCS-specification]].

SRU/CQL [[http://www.loc.gov/standards/sru/specs/base-profile.html|proposes conformance levels]], but these are too coarse-grained and fuzzy. We try to be more precise here and go down to the level of individual features.

''Obviously, the list is not complete. '''TODO:''' complete it.''

 [[FCS-spec#Explainoperation|explain response]] :: Provide information about the configuration of the service, mainly about available indices and defaults.
 ''We have to examine whether the `explain` record can be the single authoritative source of information about a repository, carrying all the configuration information discussed below.''

== Query ==

 simple term search (Conformance level 0) :: accept a simple term query: a word or a phrase.
 {{{
 query=fish
 query="system"
 query="language acquisition"
 query="She said \"Yes\""
 }}}
 This is a search for occurrences of a full word or a phrase. In the example queries above, ''fish'' and ''"system"'' are instances of a word, while ''"language acquisition"'' and ''"She said \"Yes\""'' are instances of a phrase.
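
The quoting used in the example queries above can be sketched as follows. This is only an illustration, not part of the specification: the helper names `cql_term` and `search_url` and the endpoint `http://example.org/sru` are hypothetical, and the parameters `operation=searchRetrieve` and `version=1.2` are assumed from plain SRU. Wildcard characters are passed through untouched.

```python
from urllib.parse import urlencode

# Characters that force a term to be quoted (whitespace and CQL syntax characters).
_NEEDS_QUOTES = set(' \t\n"()=<>/')


def cql_term(term: str) -> str:
    """Return `term` as a CQL search term, adding quotes only when needed."""
    if term and not any(ch in _NEEDS_QUOTES for ch in term):
        return term  # a bare word such as: fish
    # Escape backslashes first, then embedded double quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def search_url(endpoint: str, term: str) -> str:
    """Build a searchRetrieve URL for a simple term query (hypothetical helper)."""
    params = {
        "operation": "searchRetrieve",  # assumed SRU parameter values
        "version": "1.2",
        "query": cql_term(term),
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    for example in ["fish", "language acquisition", 'She said "Yes"']:
        print(search_url("http://example.org/sru", example))
```

A word is sent as is, while a phrase is wrapped in double quotes with any embedded quotes escaped, matching the four example queries.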
 '''''TODO:''' decide whether a search in lexica should cover only the lemma list or the explanations as well (full text).''

 honor search context (Conformance level 0) :: Allow the search to be restricted to specific collections/resources specified as a parameter (see [[SearchContext]])
 {{{
 ?x-cmd-context={list[Resource-PID]}
 }}}

 wildcards (Conformance level 1-2) :: support wildcards like
 {{{
 wor* *ove *oo* ?ool b?t
 }}}
 There could also be a version encoded with the help of a `relation`, like: `index startsWith wor`

 index search (Conformance level 1-2) :: support searchClause queries:
 {{{
 index relation searchTerm
 }}}
 Examples are
 {{{
 dc.creator = anderson
 title adj "wonderful feelings"
 bib.dateIssued < 1998
 }}}
 See more about indices in the next section.

 supported relations single term :: which relations are allowed in a query of the form `index rel single-search-term`:
  * `=` - exact match (''on token? on annotation?'')
  * `<` - comparison on numeric values
  * `regex`? - regular expression

 supported relations multi-term :: for a search clause with multiple terms (like "rat hat cat") SRU proposes:
  * `any` - any of the terms (OR)
  * `all` - all of the terms (AND)
  * `adj` - terms in that order - a phrase
  * `all/window/#N` - terms within a given window (#N) (SRU/CQL 2.0 proposal)
 '''TODO''': discuss the lists of relations above in more detail: what exactly do we want?

 honor VC as search context (Conformance level 1-2) :: be able to process a virtual collection as a means of restricting the search context.

 boolean search (Conformance level 1-2) ::
 {{{
 AND OR PROX
 }}}

== Indices ==

The following operations on indices exist:
 * `search`
 * `scan` (/`aggregate`)
 * `output` (/`group`, /`sort`)

The description of the repository needs to state explicitly which operations are supported on which indices.

In SRU/CQL, indices are defined in context sets.
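
The requirement that a repository description state which operations each index supports could be recorded as a simple lookup table. The following is a minimal sketch under stated assumptions: the names `INDEX_CAPABILITIES` and `supports` are hypothetical, and the operations assigned to each index are invented purely for illustration, not taken from any real repository.

```python
# The three operations on indices named in this section.
OPERATIONS = {"search", "scan", "output"}

# Hypothetical capability table: index name -> supported operations.
# The assignments below are illustrative only.
INDEX_CAPABILITIES = {
    "kwic": {"search", "output"},
    "lemma": {"search", "scan"},
    "pos": {"search", "scan", "output"},
}


def supports(index: str, operation: str) -> bool:
    """Tell whether `operation` is declared for `index`."""
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    # An index that is not declared at all supports nothing.
    return operation in INDEX_CAPABILITIES.get(index, set())


if __name__ == "__main__":
    print(supports("lemma", "scan"))
    print(supports("kwic", "scan"))
```

A client could consult such a table before sending a request, instead of discovering unsupported operations through error responses.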
We plan to introduce [[FCS-spec#ContextSets|the following context sets]]:

 ccs :: content indices like:
 {{{
 kwic, pos, lemma
 }}}
 '''''TODO:''' we need some distinction between a '''lemma''' as one annotation layer in the full-text content and as the head of a lexicon entry.''

 isocat :: supporting index search on `isocat` data categories. (Mapping internal indices to `isocat`.)

 cmd :: content search that also supports MD filters ("intensional filter" - see [[CDMDC]]).

We should agree on a basic set of MD indices, at least for `output`, that "should" be implemented to make life easier for software and users. Hot candidates are:
 * title/name (default string representation of the resource)
 * language
 * resource-type
 * creation/publication date
 * or simply take inspiration from the VLO facets

Additionally, especially regarding metadata indices (the `cmd` context set), we have to agree on how to use existing context sets like ''dublincore''.

== Search Result ==

 provide result in ccs:Resource-format :: [source:FederatedSearch/ccsResource.xsd]

 provide resource reference :: `Resource@pid`

 provide !DataView@type=X :: In which formats can you provide the results?

 provide !DataView@type=kwic :: Can you provide a basic keyword-in-context view?

 provide !DataView@type=metadata :: Do you provide information about the Resource (and/or !ResourceFragment)? What kind of information (which metadata fields - described by the @schema parameter?)

 provide link to CMD-metadata :: Fill ccs:DataView@pid/@ref with a reference to the CMD record.

 provide CMD-metadata :: alternatively, resolve and embed the CMD record
 {{{
 }}}

== Supporting Operations ==

 scan indices :: implement the `scan` operation, which allows querying the terms available in individual indices
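
A request for the `scan` operation could be assembled as in the sketch below. It assumes the plain SRU parameter names `operation`, `version`, `scanClause` and `maximumTerms`. The helper `scan_url` and the endpoint `http://example.org/sru` are hypothetical, and the start term is assumed to contain no double quotes.

```python
from urllib.parse import urlencode


def scan_url(endpoint: str, index: str, start_term: str = "", maximum_terms: int = 10) -> str:
    """Build a scan URL listing terms of `index`, starting at `start_term`."""
    params = {
        "operation": "scan",
        "version": "1.2",  # assumed SRU version
        # scanClause has the form: index relation term
        "scanClause": f'{index} = "{start_term}"',
        "maximumTerms": maximum_terms,
    }
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # List up to 5 terms of the lemma index, starting at "a".
    print(scan_url("http://example.org/sru", "lemma", "a", 5))
```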