Changes between Version 2 and Version 3 of VLO/CMDI data workflow framework


Timestamp:
11/05/15 14:27:32
Author:
go.sugimoto@oeaw.ac.at
Comment:

--

  • VLO/CMDI data workflow framework

    v2 v3  
     1
     2== VLO/CMDI data workflow framework ==
     3
     4
    15Note: The same content is also available as a Google Doc [https://docs.google.com/document/d/1OoxDEFoZKhmotk7tbrElqcn79acKnj4T897sNctMYH8/edit?usp=sharing]
    26
     
    2428'''The dashboard''' is the key development of the core VLO framework. It will integrate the entire data ingestion pipeline into one environment, providing a user-friendly web GUI with which VLO curators can manage data more efficiently, coherently, and uniformly. Handling the complex VLO data life cycle in one environment will also make data integrity easier to guarantee. The Dashboard approach is based on the well-known OAIS model, encompassing its three information packages: Submission Information Package (SIP), Archival Information Package (AIP), and Dissemination Information Package (DIP). It offers an intuitive data management view that illustrates the step-by-step process of the entire data life cycle, from harvesting, converting, and validating to indexing and distributing. Users without strong technical skills should be able to use it much as they would organise a mailbox in an email client. The functionalities should include (but are not limited to):
    2529
    26 List of the datasets (OAI-PMH sets) bundled per data provider and per CLARIN centres/countries (MUST)
    27 Status and statistics of the sets within the ingestion pipeline (errors, progress indicator) (MUST) (export as PDF, XML, CSV etc (SHOULD))
    28 Simple visualisation of the statistics from item 2, including pie charts, bar charts etc (COULD)
     30*List of the datasets (OAI-PMH sets) bundled per data provider and per CLARIN centres/countries (MUST)
     31*Status and statistics of the sets within the ingestion pipeline (errors, progress indicator) (MUST) (export as PDF, XML, CSV etc (SHOULD))
     32*Simple visualisation of the statistics from item 2, including pie charts, bar charts etc (COULD)
    2933Browse the data quality reports per set (MUST) (export as PDF, XML, CSV etc (SHOULD))
    3034Send email of the data quality report to a data provider/CLARIN centre (SHOULD) (automatic email (COULD))
     
    5660In addition to the data management view, the Dashboard can offer a very simple GUI tool (e.g. a web form interface) to create and edit the concept mapping and the value mapping and normalisation (see figures below). It should translate the data entered in the form into XSLT files (or equivalent), with which the data transformation/mapping will be carried out. Direct XML editing is also possible. The tool will reduce the work involving CSV, TXT, XML, and XSLT files by introducing a simple yet powerful collaborative web service within the Dashboard framework. Ideally, editing in the form will not be limited to typing a value, but will also provide a modal window to search and select from relevant extra services (CCR, CLAVAS) (note: there is no mock-up below for this function), so that the curator can double-check the mapping and normalisation values against these controlled sources via API/auto-complete.
    5761
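As an illustration of the form-to-XSLT translation described above, the following minimal sketch generates an XSLT 1.0 stylesheet from a value-mapping table. It is not part of the proposed framework: the function name `build_value_mapping_xslt`, the element name `language`, and the sample mapping are all hypothetical, and the sketch ignores the XML namespaces that real CMDI records use.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
ET.register_namespace("xsl", XSL_NS)


def xsl(tag):
    """Return the namespace-qualified name of an XSLT element."""
    return f"{{{XSL_NS}}}{tag}"


def build_value_mapping_xslt(element_name, mapping):
    """Build an XSLT 1.0 stylesheet that normalises the text of one element.

    Both arguments are hypothetical form inputs: the element to normalise
    and a dict of {source value: normalised value}. Assumes that source
    values contain no single quotes and that the element has no namespace.
    """
    root = ET.Element(xsl("stylesheet"), {"version": "1.0"})

    # Identity template: copy everything that is not explicitly remapped.
    identity = ET.SubElement(root, xsl("template"), {"match": "@*|node()"})
    copy = ET.SubElement(identity, xsl("copy"))
    ET.SubElement(copy, xsl("apply-templates"), {"select": "@*|node()"})

    # Template for the text nodes of the chosen element.
    template = ET.SubElement(
        root, xsl("template"), {"match": f"{element_name}/text()"}
    )
    choose = ET.SubElement(template, xsl("choose"))
    for source, target in mapping.items():
        # One xsl:when per mapped value; its literal text is the output.
        when = ET.SubElement(choose, xsl("when"), {"test": f". = '{source}'"})
        when.text = target
    # Unmapped values pass through unchanged.
    otherwise = ET.SubElement(choose, xsl("otherwise"))
    ET.SubElement(otherwise, xsl("value-of"), {"select": "."})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical mapping as a curator might enter it in the web form.
    print(build_value_mapping_xslt("language", {"Eng": "English", "en": "English"}))
```

The resulting string could be saved as an `.xsl` file and applied by any XSLT 1.0 processor; a production tool would additionally need namespace handling and proper escaping of the mapped values.
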
    58 
    59 == The enhanced MD authoring tool ==
    60  will communicate with the extra CLARIN services including the (Centre Registry), Component Registry, CLAVAS, and CCR. The base of this tool already exists in different CLARIN centres. COMEDI in Norway and DSpace in Czech/Poland are two good examples. The new tool may include the following functionalities (but is not limited to them):
     62'''The enhanced MD authoring tool''' will communicate with the extra CLARIN services including the (Centre Registry), Component Registry, CLAVAS, and CCR. The base of this tool already exists in different CLARIN centres. COMEDI in Norway and DSpace in Czech/Poland are two good examples. The new tool may include the following functionalities (but is not limited to them):
    6163
    6264Import of Component Registry CMDI profiles (especially the recommended profiles which will be defined by the curation team soon) and selection of them to create metadata (MUST)