wiki:CLARIN-ISO liaison

This is a working space for preparing the material needed to establish and maintain CLARIN's liaison with ISO. The result will be moved to the public space on GitHub.

Goal

The goal is a Category A liaison between ISO Technical Committee 37, Subcommittee 4 "Language resource management" (ISO TC37 SC4) and CLARIN-ERIC. Such liaisons are open to "Organizations that make an effective contribution to the work of the technical committee or subcommittee for questions dealt with by this technical committee or subcommittee. Such organizations are given access to all relevant documentation and are invited to meetings. They may nominate experts to participate in a [Working Group]" (ISO Directives 1.17.2).

Rationale

  • CLARIN researchers have participated in ISO TC37 SC4 projects (see the -- incomplete! -- table below), but CLARIN-* as such does not get credited for this, because CLARIN-ERIC has no formal standing with respect to ISO. To engage in ISO projects, CLARIN researchers have had to take off their CLARIN hats, so to speak.
  • CLARIN researchers who were not members of national standards bodies (e.g. ANSI or DIN) have had no obvious way to influence the ISO process; with the liaison, CLARIN attains an official advisory role and designated members of CLARIN centres will be able to participate in ISO meetings.
  • As a consequence of the above, CLARIN will gain greater visibility in standardization circles, and (designated) CLARIN researchers will gain access to the relevant documentation. One proviso: this does not include published standards, which are the only source of financing for ISO and are therefore never circulated freely. Experts are, however, offered access to documentation at the draft stages and are able to publish the results of the work they have performed in shaping the final form of ISO standards.

Additionally, several CLARIN Centres (Czech Wordnet, EstWN, GermaNet, SWordNet) make their WordNets available in the form of WordNet-LMF (a dialect of ISO LMF), and several others (DanNet, FinnWordNet, Polish WordNet) intend to do so or are at least considering it, provided there is a guarantee that the format will remain standardized and interoperable. Given the current CLARIN initiative towards enhancing the interoperability of lexico-semantic resources (cf. the Tartu workshop, Jan 31 -- Feb 1, 2017) and the fact that "WordNet-LMF" already denotes several similar but not identical proposals, this is the right time to enable CLARIN researchers to feed into the ISO process of LMF revision, which is also taking place this year. All the more so because some of the centres (e.g. Wrocław Polytechnic) are preparing to decide whether to join the group of centres exporting WordNet-LMF, on condition that the result is interoperable with the others; that can best be guaranteed by a uniform and unified standardization effort.

The fact that ISO specifications are not free could make the proposed ISO-CLARIN liaison a less attractive prospect than it would otherwise be. However, part of the revision of LMF aims at providing an LMF serialization in the TEI (see the table below). By virtue of the TEI-ISO agreement on open access, this guarantees that the content of the otherwise costly ISO specifications will be mirrored in the TEI documents, which are (by the TEI statute) always open-access (CC BY). In other words, the input of CLARIN researchers to the standardisation of WordNet-LMF will remain open and free for CLARIN centres (and others) to implement.

Eligibility criteria

(from ISO Directives, 1.17.3)

When an organization applies for a liaison with an ISO technical committee/subcommittee, the Central Secretariat will check with the member body in the country in which the organization is located. If the member body does not agree that the eligibility criteria have been met, the matter will be referred to the TMB to define the eligibility.

The Central Secretariat will also ensure that the organization meets the following eligibility criteria:

  • it is not-for-profit;
  • is a legal entity — the Central Secretariat will request a copy of its statutes;
  • it is membership-based and open to members worldwide or over a broad region;
  • through its activities and membership demonstrates that it has the competence and expertise to contribute to the development of International Standards or the authority to promote their implementation; and
  • has a process for stakeholder engagement and consensus decision-making to develop the input it provides to ISO (see Guidance for ISO liaison organizations — Engaging stakeholders and building consensus http://www.iso.org/iso/guidance_liaison-organizations.pdf).

Resolution 2016-03.2 of ISO TC37 SC4, June 2016:

Liaison Type-A with CLARIN-ERIC

SC 4 requests Key-Sun Choi, secretary of SC 4, assisted by Andreas Witt, WG 6 convenor, to initiate a Type-A liaison with CLARIN-ERIC (Common Language Resources and Technology Infrastructure—European Research Infrastructure Consortium) for cooperative work in the area of language resource management.

Steps necessary to establish the liaison now (December 2016)

(Quoting/paraphrasing the SC4 Secretary, Mr. Key-Sun Choi, cf. ISO Directives 1.17.2):

  1. letter from the CLARIN-ERIC Director about CLARIN-ERIC’s willingness to be A-Liaison with ISO/TC37/SC4,
  2. containing information on whether CLARIN-ERIC is a legal entity (in what kind of legal seat)
  3. appointing CLARIN-ERIC’s internal committee for liaison with SC4, to deal with all of SC4 documents
  4. appointing a contact person in CLARIN-ERIC

Items 3 and 4 appear clear: the relevant CLARIN committee for the liaison is the CSC, and the liaison officer (Andreas Witt) was named in the SC4 resolution of June 2016; this nomination will hopefully be upheld on the CLARIN side by both the CSC and the BoD.

ISO standards that CLARIN researchers are involved with

"TC/SC/WG" serves as the organizing column (Technical Committee, Subcommittee, optionally Working Group)

The "remarks" column may further specify the nature of involvement.

ISO spec (title) | TC/SC/WG | institutions (names) involved | remarks
ISO 639-* "Codes for the representation of names of languages" | TC37 SC2 WG1 | MPI, later CLARIN-ERIC (Sebastian Drude) |
ISO 12620:2009 "Specification of data categories and management of a Data Category Registry for language resources" | TC37 SC3 | MPI, later Meertens Instituut (Daan Broeder, Mark Kemps-Snijders, Menzo Windhouwer), Universiteit Utrecht (Ineke Schuurman) | current status: withdrawn (95.99) as of 2016-01-19; revived as CLARIN Concept Registry
ISO 24613:2008 "Lexical markup framework (LMF)" | TC37 SC4 WG4 | --> | currently in the process of being revised and split into several focused specifications; work on TEI serialization of LMF done in the context of DARIAH, PARTHENOS and ENeL, with representatives from CLARIN, see below
WordNet-LMF | | Uni Tübingen (Erhard Hinrichs), others+other centres | Several centres (to be named) develop their own WordNet-oriented flavours of LMF; others may join them if this is properly standardized
ISO/PWI 24613-2 "Lexical markup framework (LMF) - Part 2: Diachronic module" | | ILC-CNR (Monica Monachini, Fahad Khan) |
ISO/PWI 24613-4 "Lexical markup framework (LMF) - Part 4: TEI serialisation" | | Jožef Stefan Institute (Tomaž Erjavec), Austrian Academy (Charly Mörth, Jack Bowers), IDS (Piotr Banski, Andreas Witt) |
ISO/PWI 24615-3 "Syntactic annotation framework (SynAF) -- Part 3: TEI serialization (ISO TEIger)" | TC37 SC4 WG6 | IDS (Piotr Banski -- project co-leader) | proposed work item accepted at the ISO Conference 2016
ISO 24619:2011 "Persistent identification and sustainable access (PISA)" | TC37 SC4 | IDS (Marc Kupietz, Andreas Witt), MPI (Peter Wittenburg, Daan Broeder) |
ISO 24622-1:2015 "Component Metadata Infrastructure (CMDI) -- Part 1: The Component Metadata Model" | TC37 SC4 | MPI (Daan Broeder), Uni Tübingen (Erhard Hinrichs, Thorsten Trippel), IDS (Oliver Schonefeld) | This norm started out as CLARIN best practice and subsequently entered an SC4 standardisation track (more names from a presentation at CLARIN-2014: Twan Goosen, Oddrun Ohren, Axel Herold, Thomas Eckart, Matej Ďurčo)
ISO/AWI 24622-2 "Component metadata infrastructure (CMDI) -- Part 2: The component metadata specific language" | TC37 SC4 | Uni Tübingen (Thorsten Trippel) | current status: working draft; CLARIN-ERIC taskforce exists
ISO/PWI 24622-3 "Component metadata infrastructure (CMDI) -- Part 3: META-SHARE metadata schema as an instantiation of the CMDI model" | TC37 SC4 | ILSP/R.C. ‘Athena’ (Maria Gavrilidou) | The acceptance of this item was conditioned on the acceptance of Part 2
ISO/CD 24623-1 "Corpus Query Lingua Franca (CQLF) -- Part 1: Metamodel" | TC37 SC4 WG6 | IDS (Piotr Banski, Andreas Witt -- project co-leaders) | current status: approved for registration as Draft International Standard
ISO 24624:2016 "Transcription of spoken language" | TC37 SC4 WG6 | IDS (Thomas Schmidt -- project leader) | joint TEI+ISO specification

...

ISO standards in the scope of interest of CLARIN

This is an auxiliary list. Items from here may turn out to belong in the table above.

Standards of ISO TC37 SC4 should be considered as belonging here automatically; some of them may need to be singled out and commented on.

ISO spec (title) | TC/SC/WG | remarks
ISO 24610-1:2006 "Feature structures -- Part 1: Feature structure representation" | TC37 SC4 | due for renewal/update; needs experts! An FS validator is being produced.

See also: ISO TC37 home page

Last modified on 02/01/17 18:36:14