Towards a FCS Specification
In which we list things we agree should be in our future standard / specification.
SRU-CQL servlets
Explain response
Ideally, the explain response should list the available search indexes, each mapped to an ISOcat data category where one exists.
Example (tentative):
<indexInfo>
  <set identifier="http://www.isocat.org/datcat" name="isocat"/>
  <set identifier="clarin.eu/schema/ccs-v1.0" name="ccs"/>
  <index id="?">
    <title lang="en">Part of Speech</title>
    <map><name set="isocat">partOfSpeech</name></map>
  </index>
  <index id="?">
    <title lang="en">Words</title>
    <map><name set="ccs">words</name></map>
  </index>
  <index id="?">
    <title lang="en">Phonetics</title>
    <map><name set="ccs">phonetics</name></map>
  </index>
</indexInfo>
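As a rough illustration of how a client might consume such an explain fragment, the following sketch turns the tentative indexInfo example into a list of CQL index names of the form set.index. It assumes the fragment is un-namespaced, exactly as shown above; a real SRU explain record wraps indexInfo in further elements and a namespace, which this sketch does not handle. The function name list_indexes is hypothetical.

```python
import xml.etree.ElementTree as ET

# The tentative example from this page; the index ids are still undecided ("?").
EXPLAIN_INDEX_INFO = """
<indexInfo>
  <set identifier="http://www.isocat.org/datcat" name="isocat"/>
  <set identifier="clarin.eu/schema/ccs-v1.0" name="ccs"/>
  <index id="?">
    <title lang="en">Part of Speech</title>
    <map><name set="isocat">partOfSpeech</name></map>
  </index>
  <index id="?">
    <title lang="en">Words</title>
    <map><name set="ccs">words</name></map>
  </index>
  <index id="?">
    <title lang="en">Phonetics</title>
    <map><name set="ccs">phonetics</name></map>
  </index>
</indexInfo>
"""


def list_indexes(index_info_xml):
    """Return one dict per <index>, with its title and qualified CQL name."""
    root = ET.fromstring(index_info_xml)
    # Map the short context-set name to its identifier.
    sets = {s.get("name"): s.get("identifier") for s in root.findall("set")}
    indexes = []
    for index in root.findall("index"):
        name = index.find("map/name")
        set_name = name.get("set")
        indexes.append({
            "title": index.findtext("title"),
            "cql_index": f"{set_name}.{name.text}",
            "set_identifier": sets.get(set_name),
        })
    return indexes


if __name__ == "__main__":
    for entry in list_indexes(EXPLAIN_INDEX_INFO):
        print(entry["cql_index"], "-", entry["title"])
```

A client could use the resulting names directly in a CQL query, for example isocat.partOfSpeech = "noun".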
Search context
Restricting the search space shall be done via the x-cmd-context parameter (which obsoletes x-cmd-collections).
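A minimal sketch of how a client might attach the search context to a searchRetrieve request. The operation, version, query and maximumRecords parameters are standard SRU; the endpoint URL, the context value and the helper names are assumptions made only for illustration, since this page does not yet define what a context value looks like.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(endpoint, cql_query, context=None, maximum_records=10):
    """Build an SRU searchRetrieve URL, optionally restricted by x-cmd-context."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,
        "maximumRecords": str(maximum_records),
    }
    if context:
        # x-cmd-context replaces the obsolete x-cmd-collections parameter.
        params["x-cmd-context"] = context
    return f"{endpoint}?{urlencode(params)}"


def context_of(url):
    """Return the x-cmd-context value of a request URL, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("x-cmd-context")
    return values[0] if values else None


if __name__ == "__main__":
    # Hypothetical endpoint and context identifier.
    print(build_search_url("http://example.org/sru",
                           'isocat.partOfSpeech = "noun"',
                           context="hdl:example/collection-1"))
```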
Scan response
Search result
<ccs:Resource pid="{pid of the resource}">
  <!-- cmd-link is optional -->
  <ccs:Metadata cmd-link="{PID of the CMD-record}">
    <!-- this is for metadata provided directly by the content provider,
         NOT the CMD metadata -->
    <ccs:f key="{any metadata-key}">{any metadata value}</ccs:f>
  </ccs:Metadata>
  <ccs:ResourceFragment pid="{offset relative to the parent resource PID}">
    <ccs:Metadata>
      <!-- metadata pertaining to the specific (matching) fragment,
           like metadata on the "current" speaker -->
    </ccs:Metadata>
    <ccs:DataView type="text/xml">
      <meertens:any/>
    </ccs:DataView>
    <ccs:DataView type="image/jpeg">
      <link>{optional link to the data}</link>
    </ccs:DataView>
  </ccs:ResourceFragment>
</ccs:Resource>
- Resource
- An element representing a resource and carrying its identifier. It may represent anything that has a PID (and an MDRecord); in particular, it may be a collection that aggregates other Resources. Allowed children are: Resource, ResourceFragment, Metadata and DataView.
- ResourceFragment
- A part of a resource without a PID of its own, i.e. something addressable as the PID of the Resource plus a fragment identifier. The fragment identifier to use depends on the resource type; it may be an XPointer, a timecode, a sequence offset, etc. Allowed children are: Metadata and DataView.
- DataView
- The element carrying the typed data. The content can be anything in another namespace, and it must be possible to provide it either inline or by reference, which is important for images and AV files.
- Metadata
- An optional element carrying metadata about the Resource or ResourceFragment. It can carry an optional attribute cmd-link with the PID of a CMD-record. (This only makes sense for Resource/Metadata?)
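The allowed-children rules above are simple enough to check mechanically. The following is a sketch only, not a substitute for a proper schema: it walks a search result and reports any child that the element descriptions do not permit. The namespace URI bound to the ccs prefix, the sample PID and the function names are assumptions made for this illustration; this page does not fix them.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the ccs prefix (hypothetical, not fixed by this page).
CCS_NS = "http://clarin.eu/schema/ccs-v1.0"

# Allowed children, as listed in the element descriptions.
ALLOWED_CHILDREN = {
    "Resource": {"Resource", "ResourceFragment", "Metadata", "DataView"},
    "ResourceFragment": {"Metadata", "DataView"},
}


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]


def in_ccs(tag):
    return tag.startswith("{" + CCS_NS + "}")


def check(element, path=""):
    """Return a list of violations found below a Resource or ResourceFragment."""
    problems = []
    name = local_name(element.tag)
    here = f"{path}/{name}"
    allowed = ALLOWED_CHILDREN.get(name)
    if allowed is None:
        # Metadata and DataView content is free-form, so it is not checked.
        return problems
    for child in element:
        child_name = local_name(child.tag)
        if not in_ccs(child.tag) or child_name not in allowed:
            problems.append(f"{here}: unexpected child {child_name}")
        elif child_name in ALLOWED_CHILDREN:
            problems.extend(check(child, here))
    return problems


SAMPLE = f"""
<ccs:Resource xmlns:ccs="{CCS_NS}" pid="hdl:example/1">
  <ccs:Metadata><ccs:f key="title">Example</ccs:f></ccs:Metadata>
  <ccs:ResourceFragment pid="42">
    <ccs:DataView type="kwic">left <kw>hit</kw> right</ccs:DataView>
  </ccs:ResourceFragment>
</ccs:Resource>
"""

if __name__ == "__main__":
    print(check(ET.fromstring(SAMPLE)) or "no violations")
```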
Data Views
All dataviews of a given type should be the same in all implementations. That is, if a service presents results as a KWIC, that should be the same KWIC encoding in all services. Here we propose one encoding for each of several dataview types. All implementing services must conform.
- kwic
- Keyword in context
<ccs:DataView type="kwic"> Junker Frauenlob , purre knix plautz - Ihr seid ein komischer Kauz - Habt ein Bärtlein von Haaren schwarz , Ziehet es aus mit einem Tropfen Harz Prrrr - ho wird das lang , Kling klang - g - a - d - <kw>e</kw> , Scheiden thut weh - der Daus </ccs:DataView>
- Geolocation
- A location on a map.
- Annotations
- A bunch of annotated text.
We start by supporting the TCF and EAF formats, as they have existing viewers.
<ccs:DataView type="annotations/eaf"> <link>http://corpus1.mpi.nl/qfs1/media-archive/demo/pewi/Annotations/elan-example1.eaf</link> </ccs:DataView>
For an example EAF file see: sample file
Formats and viewers:
Type | Format | Viewer | URL |
Annotations | EAF | Annexviewer provided by MPI | sample view |
Annotations | TCF | AnnotationViewer provided by Tübingen | sample view |
Geolocation | KML | Google Maps | |
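To illustrate why a single shared encoding is useful, here is a sketch of how any client could split a kwic dataview into left context, keyword and right context. It assumes exactly one un-namespaced kw element per dataview, as in the kwic example above; the namespace URI for the ccs prefix and the function name split_kwic are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the ccs prefix (hypothetical).
CCS_NS = "http://clarin.eu/schema/ccs-v1.0"

# Shortened version of the kwic example from this page.
KWIC = (
    f'<ccs:DataView xmlns:ccs="{CCS_NS}" type="kwic">'
    "Kling klang - g - a - d - <kw>e</kw> , Scheiden thut weh - der Daus"
    "</ccs:DataView>"
)


def split_kwic(data_view_xml):
    """Return (left context, keyword, right context) of a kwic DataView."""
    data_view = ET.fromstring(data_view_xml)
    kw = data_view.find("kw")
    if kw is None:
        raise ValueError("kwic DataView contains no <kw> element")
    left = (data_view.text or "").strip()   # text before <kw>
    keyword = kw.text or ""                 # the match itself
    right = (kw.tail or "").strip()         # text after </kw>
    return left, keyword, right


if __name__ == "__main__":
    left, keyword, right = split_kwic(KWIC)
    print(f"{left} [{keyword}] {right}")
```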
FCS webservice
A REST web interface that does the aggregation (metasearch, map/reduce). It should support x-cmd-context as supplied by the metadata search service or by the user in the federated search website.
PazPar2 is a federated search service (metasearch).
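The map/reduce idea behind the aggregation can be sketched as follows: the same query, with the same x-cmd-context, is sent to every endpoint in parallel (map), and the per-endpoint results are merged into one list (reduce). This is only an illustration of the pattern, not a description of how PazPar2 or any existing service works. The function federated_search and the search_one callback are hypothetical; the callback stands in for whatever code performs the actual SRU request.

```python
from concurrent.futures import ThreadPoolExecutor


def federated_search(endpoints, query, search_one, context=None):
    """Query all endpoints in parallel and merge their results.

    search_one(endpoint, query, context) must return a list of records.
    Returns (merged records, failures keyed by endpoint).
    """
    def run(endpoint):
        # One failing endpoint must not break the whole federated search.
        try:
            return endpoint, search_one(endpoint, query, context), None
        except Exception as exc:
            return endpoint, [], exc

    # Map step: one worker per endpoint; pool.map keeps the endpoint order.
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        outcomes = list(pool.map(run, endpoints))

    # Reduce step: flatten into one list, remembering where each record came from.
    merged = [
        {"endpoint": endpoint, "record": record}
        for endpoint, records, _ in outcomes
        for record in records
    ]
    failures = {ep: str(err) for ep, _, err in outcomes if err is not None}
    return merged, failures


if __name__ == "__main__":
    # Stand-in for a real SRU client; returns canned data.
    def demo_search(endpoint, query, context):
        return [f"{endpoint} matched {query!r} in {context}"]

    print(federated_search(["repo-a", "repo-b"], "words = Kauz",
                           demo_search, context="hdl:example/collection-1"))
```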
Clarin Metadata Search
Metadata Search