Version 2 (modified by vronk, 15 years ago)

CMDI Portal

CmdiPortal aims to be an integrated web-based working environment for the CLARIN user. At a minimum it must provide a MetadataBrowser, a web interface for querying CLARIN metadata stored in the MetadataRepository. Fully developed, it should also provide personal workspace facilities and an extensible component for viewing the resources themselves (as opposed to only viewing their metadata), and it should interoperate tightly with the WorkflowEngine and MetadataEditor components.

Web User Interface

The interface should/could integrate various components. From the point of view of the user it could be a comprehensive toolbox, where everything is reachable within a click. :)

a sketch of a possible portal user interface

This is a tentative sketch of a possible user interface already in a highly evolved stage. The building blocks are:

Search/Browse
This is where every workflow starts. It has multiple tabs: Browsing (Catalogue), Searching, Bookmarks etc. A more "organic", hybrid approach shall also be examined (i.e. mixing Browsing and Searching). Controls MDList.
MDList
Here the results of a search or the members of a category are listed.
Roughly one MD record per row.
As we want searching to be very flexible, the distinction between the Search block and MDList may not be that clear, i.e. controls for refining the search may, for example, be integrated into the list.
MDDetail
A detail view for one Metadata record. Controlled by the MDList.
PrivateWorkspace
This block is similar to the Browse block and could probably even be integrated into the main Search/Browse control. The important distinction is that this block does not show data from the (central) MetaDataRepository but local data, i.e. data available on the user's machine: private resources, resources downloaded from the repository, or results of processing existing resources. Unlike all the other components, this component would therefore need access to the local file system. This will require the user to install it on her machine, be it as a Firefox extension, a small Java application or some other kind of standalone application. To avoid pressuring the user to install anything, this component has to be optional and installable on demand. It shall be very small, concerned only with managing and enriching the local data and providing that information to the rest of the system.
ResourceViewer
An extensible container for displaying the resources. See the section Resource Viewer.
WorkflowEngine
In an advanced stage of evolution, there should be integration or tight interoperability between the MD-Browser and the Workflow-Engine. The utopian scenario is to use the MD-Browser to find resources, both content and tools, and simply drag them into the graphical WorkflowEngine pane, rearrange them, equip them with the necessary parameters and run the workflow, with the results automatically added to the Results tab of the user's workspace.
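
The control relationships between the first three blocks (Search/Browse controls MDList, MDList controls MDDetail) could be sketched as follows. This is only an illustration under assumptions: the class names mirror the block names, but all methods, the MDRecord fields and the title-substring search are hypothetical and not part of any existing implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MDRecord:
    """Hypothetical minimal metadata record."""
    record_id: str
    title: str
    fields: Dict[str, str] = field(default_factory=dict)


class MDDetail:
    """Detail view for one metadata record; controlled by MDList."""

    def __init__(self) -> None:
        self.current: Optional[MDRecord] = None

    def show(self, record: MDRecord) -> None:
        self.current = record


class MDList:
    """Lists search results, one record per row; controls MDDetail."""

    def __init__(self, detail: MDDetail) -> None:
        self.detail = detail
        self.rows: List[MDRecord] = []

    def load(self, records: List[MDRecord]) -> None:
        self.rows = list(records)

    def select(self, index: int) -> None:
        # Selecting a row pushes that record into the detail view.
        self.detail.show(self.rows[index])


class SearchBrowse:
    """Entry point of every workflow; controls MDList."""

    def __init__(self, md_list: MDList, repository: List[MDRecord]) -> None:
        self.md_list = md_list
        self.repository = repository  # stand-in for the MetadataRepository

    def search(self, term: str) -> int:
        # Assumption: a naive case-insensitive match on the title.
        needle = term.lower()
        hits = [r for r in self.repository if needle in r.title.lower()]
        self.md_list.load(hits)
        return len(hits)
```

A search fills the list, and selecting a row fills the detail view, so each block only needs to know about the block it controls.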

Resource Viewer

While CMDI is primarily about metadata, it seems necessary in the long run to integrate a resource viewer if the portal is to be more than just a collection of links. Of course every resource type requires a different view, and there are many ways to look at any given resource. The idea here is to provide a kind of extensible container which, based on the provided metadata, tries to reach the actual resources and display them as well as it can. The important word here is "extensible", meaning that new ways of displaying resources can be plugged into the infrastructure at this hook.

So ResourceViewer is an abstract class/interface which will be the base for different implementations. The first two implementations, which would already make good showcases, are CorpusViewer and ResourceStats.

ResourceStats
A generic viewer that tries to give an overall picture of the resource. It probably also needs to be specialized by resource type. It should give information about all kinds of counts (tokens, sentences, documents, entries, bib-usage, etc.).
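
A minimal sketch of the extensible container idea, assuming a registry keyed by resource type: ResourceViewer is the abstract base, and ResourceStats is registered as the generic fallback. The `register` and `for_type` helpers, the `render` method and the "generic" key are hypothetical names chosen for illustration, and the token and sentence counting is deliberately naive.

```python
import re
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class ResourceViewer(ABC):
    """Abstract base for all viewers; new viewers plug in via register()."""

    registry: Dict[str, Type["ResourceViewer"]] = {}

    @classmethod
    def register(cls, resource_type: str) -> Callable:
        def decorator(viewer_cls: Type["ResourceViewer"]):
            cls.registry[resource_type] = viewer_cls
            return viewer_cls
        return decorator

    @classmethod
    def for_type(cls, resource_type: str) -> "ResourceViewer":
        # Fall back to the generic viewer when no specific one is registered.
        viewer_cls = cls.registry.get(resource_type, cls.registry["generic"])
        return viewer_cls()

    @abstractmethod
    def render(self, content: str) -> dict:
        """Return a displayable summary of the resource content."""


@ResourceViewer.register("generic")
class ResourceStats(ResourceViewer):
    """Generic viewer reporting simple counts for a text resource."""

    def render(self, content: str) -> dict:
        tokens = content.split()  # whitespace tokenization (assumption)
        sentences = [s for s in re.split(r"[.!?]+", content) if s.strip()]
        return {"tokens": len(tokens), "sentences": len(sentences)}
```

A new viewer would be added by decorating another subclass with `@ResourceViewer.register("some-type")`, without touching the container itself.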

Corpus Viewer

A considerable share of the language resources are text corpora. Corpora require a specific user interface, the minimum being some kind of query interface and a KWIC (keyword in context) display of the results. This can be extended in many ways, but it is the necessary minimum. On the server side there also has to be a Corpus Query System capable of answering the queries exactly and quickly, based on its prebuilt indices.
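
To illustrate what a KWIC display involves, here is a minimal sketch. It scans an in-memory token list, which is an assumption made for brevity; a real Corpus Query System would answer the query from its prebuilt indices. The function names `kwic` and `format_kwic` and the `width` parameter are hypothetical.

```python
from typing import List, Tuple


def kwic(tokens: List[str], keyword: str, width: int = 3) -> List[Tuple[str, str, str]]:
    """Return (left context, keyword, right context) for every hit."""
    needle = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == needle:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


def format_kwic(lines: List[Tuple[str, str, str]]) -> str:
    """Right-align the left contexts so that the keywords form one column."""
    if not lines:
        return ""
    pad = max(len(left) for left, _, _ in lines)
    return "\n".join(
        f"{left:>{pad}}  [{kw}]  {right}" for left, kw, right in lines
    )
```

Calling `format_kwic(kwic(text.split(), "corpus"))` on some text would print one hit per line with the keyword vertically aligned, which is the essence of the display.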

There are already a few (very good) implementations of such a system, and every corpus has to use one. Some of them even provide an API that allows the corpus to be queried from one's own or third-party applications. The idea for the CorpusViewer is to provide a query interface + KWIC display (+ everything else in the long run) based on the APIs provided, querying each corpus where it is, i.e. communicating with the corpus query system at the content provider's site. (Let's forget problems with workload and performance for now.)

There are two implementations (that I know of) which seem well suited to this scenario:

manatee/bonito
ddc

Both are open source, provide a very expressive query language, scale well, and are robust and proven in production environments. manatee/bonito is used by both the Slovak and Czech National Corpora and by Korpus Südtirol; ddc is used by DWDS - Berlin, Schweizer Text Korpus and a unique corpus cooperation project, C4 (and most probably a few more). Furthermore, manatee is the basis of SketchEngine - a next generation query tool - and ddc comes with integrated functionality for distributed corpora, i.e. the data are spread across the network transparently to the user. They both provide thin APIs in perl and/or python.

With such a Corpus Viewer implemented, the CLARIN user would be (technically) able to query the existing corpora running on these Corpus Query Systems. A Corpus Service could also be introduced to accommodate various "homeless" corpora from CLARIN member research groups, which would then be reachable via the same component, integrated in the CMDI Portal.

Wouldn't that be nice? ;)

Profile

The proposed workspace shall be highly personalizable. The personalization settings shall be stored in a profile. One user can manage multiple profiles, corresponding to different workspaces.

The internal format for profiles shall be JSON.
We can have personal and project/team profiles.
A project team could/should share a common project profile.

We need to think of overlaying multiple profiles.
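
One possible way to overlay profiles is a recursive merge of the JSON objects, sketched below. The precedence rule (the personal profile overrides the project profile) and all profile keys shown are assumptions for illustration; no profile schema has been defined yet.

```python
import json
from typing import Any, Dict


def overlay(base: Dict[str, Any], top: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new profile in which values from `top` override `base`.

    Nested objects are merged recursively; lists and scalars are replaced.
    """
    merged = dict(base)
    for key, value in top.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = overlay(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical example profiles.
project_profile = json.loads(
    '{"layout": {"tabs": ["Search", "Bookmarks"], "theme": "light"},'
    ' "language": "en"}'
)
personal_profile = json.loads('{"layout": {"theme": "dark"}}')

effective_profile = overlay(project_profile, personal_profile)
```

Here the effective profile keeps the project's tabs and language but uses the personal theme. More than two profiles could be overlaid by applying `overlay` repeatedly, from the most general to the most specific.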
