
CLARIN European Demonstrator Case and Federated Search

In two meetings of the so-called EDC (European Demonstrator Case), held in January 2010 at the Meertens Institute and in May 2011 at the MPI for Psycholinguistics, EU and national CLARIN partners committed to creating a common federated content search infrastructure that demonstrates the power of the emerging CLARIN infrastructure. The Dutch Search & Develop CLARIN NL subproject, which has the same ambitions, takes part in this effort and contributes its four CLARIN NL centers to this infrastructure.

At the last EDC meeting a prototype was demonstrated that connected search engines from four prospective CLARIN centers through a simple UI, which allows the corpora housed by the contributors to be selected for searching. At the meeting new contributors offered to participate: OTA, BAS, University of Tuebingen, IDS. The following functional enhancements for the prototype were also decided:

  • implementation of a default keyword search mode (substring)
  • integration with metadata search (CMDI)

The work is currently coordinated by: Daan Broeder (MPI for Psycholinguistics), Matej Durco (ICLTT), Marc Kemps-Snijders (Meertens Institute)

The following pages break down the work into individual issues:

EDC-Architecture
tentative reference architecture
FCS
Federated Content Search is mainly about the unifying interface to be implemented by individual Content Providers on top of their corpus search systems, and about the FCS-Aggregator, the actual service that performs the distributed search.
CDMDC
Combined Distributed Metadata Content Search is the next step; it shall allow the content search to be restricted based on metadata.
EDC-Workbench
The web application (web user interface) that allows the user to interact with the FederatedSearch and other related components, with the core use case being a combined metadata and content query. There is already an external wiki page about this topic (vronk.net EDC-Workbench). It is slightly outdated, but a good starting point. A separate page, Viewable, elaborates on the issue of data types, formats and corresponding viewers.
EDC-Agenda
To collect and discuss tasks and assignments.
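
To make the split between Content Providers and the FCS-Aggregator more concrete, here is a minimal sketch in Python. It is purely illustrative and not the actual FCS interface: the names ContentProvider, InMemoryProvider, Aggregator and Hit are hypothetical, the providers are in-memory stand-ins for real corpus search systems, and the case-insensitive substring match is only an assumption about how the default keyword search mode might behave.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Hit:
    provider: str  # name of the content provider that returned the hit
    text: str      # the matching text unit


class ContentProvider(Protocol):
    """Hypothetical unifying interface that each provider would implement."""

    name: str

    def search(self, query: str) -> list[Hit]: ...


class InMemoryProvider:
    """Toy provider standing in for a real corpus search system."""

    def __init__(self, name: str, texts: list[str]) -> None:
        self.name = name
        self._texts = texts

    def search(self, query: str) -> list[Hit]:
        # Assumed keyword mode: case-insensitive substring match.
        needle = query.lower()
        return [Hit(self.name, t) for t in self._texts if needle in t.lower()]


class Aggregator:
    """Fans a query out to the selected providers and merges their hits."""

    def __init__(self, providers: list[ContentProvider]) -> None:
        self._providers = {p.name: p for p in providers}

    def search(self, query: str, selected: list[str] | None = None) -> list[Hit]:
        # With no explicit selection, query every registered provider.
        names = selected if selected is not None else list(self._providers)
        targets = [self._providers[n] for n in names]
        if not targets:
            return []
        # Query the providers concurrently; map() keeps the target order.
        with ThreadPoolExecutor(max_workers=len(targets)) as pool:
            results = pool.map(lambda p: p.search(query), targets)
        return [hit for hits in results for hit in hits]
```

The optional selected argument mirrors the prototype's ability to choose which corpora to search. A real aggregator would talk to remote endpoints and would need timeouts and error handling, which this sketch leaves out.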

The following diagram gives an overview of the wiki pages on this topic and their linking (implying some kind of dependence relationship). Dashed nodes stand for external resources, dashed edges for "less important" (back-)links.

Linking between pages and resources within the EDC/FCS category.

Last modified on 05/26/11 13:41:37
