Changes between Version 17 and Version 18 of IMDI2CMDI


Timestamp:
12/22/14 15:05:58
Author:
Twan Goosen
Comment:

description

  • IMDI2CMDI

    v17 v18  
    3939== Contents ==
    4040
    41 [[PageOutline(1-3, , inline)]]
     41[[PageOutline(2-3, , inline)]]
    4242
    4343----
     
    4545
    4646* ISLE2CLARIN
    47  * You can browse the code [https://github.com/TheLanguageArchive/ISLE2CLARIN here]
     47 * You can browse the code on !GitHub: [https://github.com/TheLanguageArchive/ISLE2CLARIN here]
    4848 * Clone from: {{{https://github.com/TheLanguageArchive/ISLE2CLARIN.git}}}
    4949* !MetadataTranslator stylesheets
    50  * You can browse the code [https://github.com/TheLanguageArchive/MetadataTranslator/tree/master/Translator/src/main/resources/templates here]
     50 * You can browse the code on !GitHub: [https://github.com/TheLanguageArchive/MetadataTranslator/tree/master/Translator/src/main/resources/templates here]
    5151 * Clone from: {{{https://github.com/TheLanguageArchive/MetadataTranslator.git}}}
    5252
    5353----
    54 == Usage information ==
     54== Description ==
    5555
    56 This is a version of {{{imdi2clarin.xsl}}} that batch processes a whole directory structure of IMDI files. Call it from the command line like this:
     56[http://tla.mpi.nl The Language Archive] has developed a tool to batch-convert ISLE metadata (IMDI) files to Component Metadata (CMDI), primarily for the purpose of migrating their own archive to CMDI. This tool, [https://github.com/TheLanguageArchive/ISLE2CLARIN ISLE2CLARIN], performs the conversion on the basis of an XSLT stylesheet and optionally validates both the IMDI input and the CMDI output. It can also be configured to skip certain files.
     57
     58The [https://github.com/TheLanguageArchive/MetadataTranslator/tree/master/Translator/src/main/resources/templates/imdi2cmdi stylesheet] that defines the actual transformation is contained in a separate project called the [https://github.com/TheLanguageArchive/MetadataTranslator MetadataTranslator]. This project contains a REST service, which can be run in a servlet container to perform (bi-directional for selected profiles) translations between IMDI and CMDI on the fly, and a library of stylesheets that define the underlying transformations, which is also used by the ISLE2CLARIN tool.
     59
     60An IMDI file is transformed into an instance of one of several CMDI profiles, depending on the 'profile' of the IMDI. The main distinction is between '''IMDI Corpus''' and '''IMDI Session'''. Furthermore, there are a number of specialised Session profiles that map to distinct CMDI profiles:
     61* CGN (Spoken Dutch Corpus)
     62* CNGT (Dutch Sign Language)
     63* DBD (Dutch Bilingual Database)
     64
     65The imdi2cmdi stylesheet depends on a set of language code mappings. For each language code that it encounters, it attempts to output the ISO-639-3 representation.
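
The lookup described above can be illustrated with a minimal Python sketch. It is not the stylesheet's actual implementation: the table {{{ISO639_1_TO_3}}} and the function {{{to_iso639_3}}} are hypothetical names, the table holds only three sample entries, and returning {{{None}}} for an unmapped code is an assumption (the page only says the stylesheet "attempts" the conversion).

```python
from typing import Optional

# Hypothetical excerpt of a mapping table (ISO 639-1 -> ISO 639-3).
# The real mappings are part of the stylesheet project, not this sketch.
ISO639_1_TO_3 = {
    "nl": "nld",  # Dutch
    "en": "eng",  # English
    "de": "deu",  # German
}


def to_iso639_3(code: str) -> Optional[str]:
    """Return the ISO 639-3 code for `code`, or None if no mapping is known."""
    normalized = code.strip().lower()
    # A code that is already a known ISO 639-3 value passes through unchanged.
    if normalized in ISO639_1_TO_3.values():
        return normalized
    # Otherwise attempt the two-letter lookup; unknown codes yield None.
    return ISO639_1_TO_3.get(normalized)


if __name__ == "__main__":
    for sample in ("NL", "eng", "xx"):
        print(sample, "->", to_iso639_3(sample))
```

Keeping the failure case explicit makes it visible which codes could not be mapped, instead of silently passing them through.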
     66
     67The transformation from IMDI to CMDI is '''not''' guaranteed to be lossless. However, it has been tested on a representative selection of the IMDI archive of TLA and was found to retain all essential values. By design, some information is discarded: the links and specifications with respect to external vocabularies, for example, are not kept. The embedded history information is updated in the transformation process.
     68
     69TLA has also developed a set of CMDI2IMDI transformations, which are available as a part of the Metadata Translator as well. These are provided 'as is' and transformations are likely to be lossy.
     70
     71----
     72=== Usage information ===
     73The easiest way to convert one or more IMDI files to CMDI is by running the ISLE2CLARIN executable jar:
    5774{{{
    58 java -jar saxon8.jar -it main batch-imdi2clarin.xsl
     75java -jar isle2clarin.jar <DIR with IMDI files>
    5976}}}
    60 The last template in {{{imdi2clarin.xsl}}} has to be modified to reflect the actual directory name.
    6177
    62 Two optional parameters can be provided:
    63 
    64  * ''collection'': a collection name can be specified for each record. This information is extrinsic to the IMDI file, so it is given as an external parameter. Omit this if you are unsure.
    65  * ''uri-base'': if this optional parameter is defined, the behaviour of this stylesheet changes in the following ways: If no archive handle is available for !MdSelfLink, the base URI is inserted there instead. All links (!ResourceProxy elements) that contain relative paths are resolved into absolute URIs in the context of the base URI. Omit this if you are unsure.
    66 
    67 === Technical information ===
    68 
    69 No Saxon-specific extensions are used. However, the exact syntax used by a {{{collection()}}} invocation is implementation-defined, so it might need to be adapted when using an XSLT 2.0 engine other than Saxon.
     78For more information and options, see the documentation of [https://github.com/TheLanguageArchive/ISLE2CLARIN ISLE2CLARIN].
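
For scripted use, the invocation shown above could be wrapped as in the following Python sketch. It passes only the single directory argument that appears on this page. The helper names {{{build_command}}} and {{{convert}}} are hypothetical, the jar is assumed to be in the working directory under the name {{{isle2clarin.jar}}}, and no additional options are assumed; consult the ISLE2CLARIN documentation for those.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(imdi_dir: str, jar: str = "isle2clarin.jar") -> List[str]:
    """Build the argument list for: java -jar isle2clarin.jar <DIR with IMDI files>."""
    return ["java", "-jar", jar, imdi_dir]


def convert(imdi_dir: str, jar: str = "isle2clarin.jar") -> int:
    """Run the converter on a directory and return the process exit code."""
    if not Path(imdi_dir).is_dir():
        raise NotADirectoryError(imdi_dir)
    # An argument list (rather than a shell string) avoids quoting problems
    # with directory names that contain spaces.
    completed = subprocess.run(build_command(imdi_dir, jar), check=False)
    return completed.returncode


if __name__ == "__main__":
    # "corpus/imdi" is a placeholder directory name.
    print(" ".join(build_command("corpus/imdi")))
```

Calling {{{convert("corpus/imdi")}}} requires a Java runtime on the path and the jar at the assumed location.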
    7079
    7180=== Questions and Answers ===