Changes between Version 28 and Version 29 of VLO/CMDI data workflow framework


Timestamp:
11/13/15 11:57:59 (9 years ago)
Author:
go.sugimoto@oeaw.ac.at
Comment:

--

  • VLO/CMDI data workflow framework

    v28 v29  
    124 124 == After a meeting 2015-11-11 (Menzo, Dieter, Davor, Matej and Go) ==
    125 125
    126 This is the new diagram that Go comes up with after a technical discussion, which focuses on the Dashboard part. It may be different from what others think, but he thinks that the more modualisation of data ingestion steps will make the process much more clear. In his opinion, the roles of current harvester+viewer and curation modules are a little underspecified and overlapping, therefore, the diagram is an attempt to clarify a bit more what module/component will work on what in order to produce the expected results in Dashboard.
     126 This is the new diagram that Go came up with after a technical discussion; it focuses on the Dashboard part. It may differ from what others think, but he believes that further modularisation of the data ingestion steps will make the process much clearer. In his opinion, the roles of the current harvester+viewer and curation modules, in relation to the other parts of the VLO ingestion, are a little under-specified and overlapping. The diagram is therefore an attempt to clarify which module/component will work on what in order to produce the expected results in the Dashboard in the future.
    127 127
    128 128 [[Image( Dashboard workflow.png)]]
    129 129
    130 One of the important aspect toward the Dashboard is that it has two primary functionalities. The first functionality is to control all the ingestion modules (harvester, ), so that it can, for example, manually stop the harvesting, or changes the mapping definition, and re-index the published data sets. The second functionality is to
    131 monitor the ingestion process. That means each module will communicate with the reports database to provide statistics about a particular data transaction. For instance, harvester will supply the statistics about the outcome of the harvesting, while mapper/normaliser will tell the coverage of facets and controlled vocabularies. Indexer will tell the total number of indexed records and broken links. Based on this database, the Dashboard will be able to produce data quality reports which can not only be viewed in the Dashboard itself, but also in a PDF file which each data provider can access. The reports database could provide API for internal and external services to create a viewer (eg harvesting viewer), but it is optional, because the Dashboard is the main interface.
     130 One of the important aspects of the Dashboard is that it has two primary functionalities. The first is to control all the ingestion modules (harvester to indexer), so that one can, for example, manually stop the harvesting, change the mapping definition, or re-index the published data sets. It will serve as an additional service alongside the current, almost fully automatic ingestion. The second is to monitor the ingestion process: each module will communicate with the reports database to provide statistics about a particular data transaction. For instance, the harvester will supply statistics about the outcome of the harvesting, while the mapper/normaliser will report the coverage of facets and controlled vocabularies. The indexer will report the total number of indexed records and broken links. Based on this database, the Dashboard will be able to produce data quality reports which can be viewed not only in the Dashboard itself, but also as a PDF file that each data provider can access. The reports database could provide an API for internal and external services to create a viewer (e.g. a harvesting viewer), but this is optional, because the Dashboard is the main interface.
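The monitoring side described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the agreed design: the table layout, the function names (`open_reports_db`, `report`, `quality_report`), the metric names and the sample numbers are all hypothetical, and an in-memory SQLite database stands in for the reports database.

```python
import sqlite3


def open_reports_db():
    """Create an in-memory stand-in for the reports database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE report (provider TEXT, module TEXT, metric TEXT, value REAL)"
    )
    return conn


def report(conn, provider, module, metric, value):
    """Called by an ingestion module to record one statistic for a data provider."""
    conn.execute(
        "INSERT INTO report VALUES (?, ?, ?, ?)", (provider, module, metric, value)
    )


def quality_report(conn, provider):
    """Dashboard side: collect one provider's statistics, grouped by module."""
    rows = conn.execute(
        "SELECT module, metric, value FROM report WHERE provider = ? ORDER BY rowid",
        (provider,),
    )
    summary = {}
    for module, metric, value in rows:
        summary.setdefault(module, {})[metric] = value
    return summary


# Made-up sample values, purely for illustration.
conn = open_reports_db()
report(conn, "example-provider", "harvester", "records_harvested", 1200)
report(conn, "example-provider", "mapper", "facet_coverage", 0.85)
report(conn, "example-provider", "indexer", "records_indexed", 1180)
report(conn, "example-provider", "indexer", "broken_links", 12)
print(quality_report(conn, "example-provider"))
```

Keeping one row per (provider, module, metric) means a new module or metric can start reporting without a schema change, and the same query could back both the Dashboard view and a per-provider PDF export.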
    132 131
    133 132 == Reference ==