
Version 9 (modified by twagoo, 11 years ago)

added section 'Considerations for tools' and some other minor changes

The purpose of this page is to come to a design for an extension of the ComponentRegistry that will enable versioning and deprecation of components and profiles.

  1. Component versioning
    1. Definitions
    2. Introduction/Use case
    3. Design
    4. Implementation
      1. Storage / internal representation
      2. Client information
    5. Considerations for tools further down the chain
      1. Metadata editing
      2. Search & catalogue tools
    6. Notes
    7. Discussion

Component versioning

Definitions

  • Component refers to both components and profiles as specified by the general component schema that exist in the ComponentRegistry;
  • Versioning refers to the precursor/successor relation that may exist between two components. It does not imply a structural or functional extension or inheritance, i.e. the structure of the precursor component does not put any constraints on the structure of the successor component; no assumptions about the structure of the one can be made on the basis of the structure of the other. Versioning does not imply deprecation (see below) of the precursor.
  • Deprecation refers to a state in which a component is no longer 'advertised' by the registry (it will not appear in the list of published components) and is explicitly marked as deprecated to the client. However, it will still be accessible by its URI, so that instances based on deprecated components remain valid. Apart from this, the properties of published components apply. Deprecation can optionally be combined with versioning.

Introduction/Use case

The goal of designing components and profiles in the ComponentRegistry is to eventually make them available in the public space so that everyone can use them as a basis for metadata creation. However, once published a component gets 'frozen' so that all instantiations can rely on it not changing and thus are guaranteed to retain their validity. Also, after a certain 'cooling down' period, published components cannot be removed from the public space (except by administrators).

Robust though this practice may be, it leads to issues when changes in the domain, or simply new insights, need to be incorporated in the component. It is easy to create a new component based on an existing one, extend it and finally make it public. But then one would like to communicate to the users of the component that they should use the new component for new metadata (possibly even convert existing metadata) instead of the old one. However, there are a few problems:

  • There is no reliable way to find out who is instantiating specific components, so the users are unknown and cannot simply be informed
  • The old component cannot be deleted, so new users might accidentally start using the old one
  • All of this is very informal, and requires a lot of work with few guarantees

A formal way of deprecating components and specifying versioning information would solve these issues. The owner of a component can simply:

  1. Create a new component (typically copying the existing component as a basis)
  2. Edit until satisfied
  3. Publish into public space
  4. Deprecate the 'precursor' component
  5. Designate the new component as 'successor' to the old one

The ComponentRegistry can then provide this information to all clients that request the deprecated component, and remove it from the public list. Clients should of course still be able to use the deprecated component since it is not always possible to upgrade.
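The five steps above can be sketched with a small in-memory model. This is only an illustration: the `Registry` class, its method names and the IDs `p_old` and `p_new` are hypothetical and do not correspond to the actual ComponentRegistry API.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    component_id: str
    name: str
    status: str = "private"
    successors: list = field(default_factory=list)


class Registry:
    """Hypothetical in-memory registry, not the real ComponentRegistry."""

    def __init__(self):
        self.components = {}

    def create(self, component_id, name):
        self.components[component_id] = Component(component_id, name)
        return self.components[component_id]

    def publish(self, component_id):
        self.components[component_id].status = "public"

    def deprecate(self, component_id):
        self.components[component_id].status = "deprecated"

    def add_successor(self, precursor_id, successor_id):
        self.components[precursor_id].successors.append(successor_id)

    def public_listing(self):
        # Deprecated components are no longer advertised.
        return [c.component_id for c in self.components.values()
                if c.status == "public"]

    def get(self, component_id):
        # Lookup by ID keeps working regardless of status.
        return self.components[component_id]


registry = Registry()
registry.create("p_old", "LrtInventoryResource")
registry.publish("p_old")

registry.create("p_new", "LrtInventoryResource")  # steps 1 and 2
registry.publish("p_new")                          # step 3
registry.deprecate("p_old")                        # step 4
registry.add_successor("p_old", "p_new")           # step 5
```

After these calls the public listing contains only `p_new`, while `p_old` can still be retrieved by its ID and reports its successor.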

Design

For each component the following will be specified and communicated to the client:

  • A status indication, one value out of
    • private (to be accessed (read/edit) only by owner(s))
    • development (editable by owner(s), readable by anyone, queryable but not in public listing)
    • public (listed publicly, read-only to everyone)
    • deprecated (no longer in public listing, queryable and read-only to everyone)
  • A (possibly empty) set of direct successors
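As a rough illustration, the four status values and the access rules listed above could be captured in a lookup table. The names `Status`, `RULES`, `can_read` and `can_edit` are assumptions made for this sketch, not part of the registry.

```python
from enum import Enum


class Status(Enum):
    PRIVATE = "private"
    DEVELOPMENT = "development"
    PUBLIC = "public"
    DEPRECATED = "deprecated"


# Per status: whether anyone may read, whether owners may edit, and
# whether the component appears in the public listing.
RULES = {
    Status.PRIVATE:     {"readable_by_anyone": False, "editable_by_owner": True,  "publicly_listed": False},
    Status.DEVELOPMENT: {"readable_by_anyone": True,  "editable_by_owner": True,  "publicly_listed": False},
    Status.PUBLIC:      {"readable_by_anyone": True,  "editable_by_owner": False, "publicly_listed": True},
    Status.DEPRECATED:  {"readable_by_anyone": True,  "editable_by_owner": False, "publicly_listed": False},
}


def can_read(status, is_owner):
    """Owners can always read; others only if the status allows it."""
    return is_owner or RULES[status]["readable_by_anyone"]


def can_edit(status, is_owner):
    """Only owners can edit, and only in an editable status."""
    return is_owner and RULES[status]["editable_by_owner"]
```

Keeping the rules in one table makes it easy to check that only `public` components are listed publicly and that `public` and `deprecated` components are read-only to everyone.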

The ComponentRegistry web service (and as a result this will also apply to the Flex-based component browser) will not display deprecated components in the currently existing public or private listings (e.g. /registry/profiles). However, they will still be available by their ID (like /registry/profiles/clarin.eu:cr1:p_1297242111880).

There will be calls to query for development and deprecated components.

There will be a call to list all direct (and possibly also recursive) successors of a specific component (see XML example below).
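A recursive listing could be computed from the direct successor relations with a breadth-first traversal. The sketch below assumes the relations are available as a plain mapping from component ID to direct successor IDs; the function name and the sample IDs are hypothetical.

```python
from collections import deque


def all_successors(direct, component_id):
    """Return direct and indirect successors of component_id, breadth-first.

    `direct` maps a component ID to the list of its direct successor IDs.
    """
    seen = []
    queue = deque(direct.get(component_id, []))
    while queue:
        current = queue.popleft()
        if current in seen or current == component_id:
            continue  # guard against cycles and shared successors
        seen.append(current)
        queue.extend(direct.get(current, []))
    return seen


direct = {"p_1": ["p_2"], "p_2": ["p_3", "p_4"], "p_4": ["p_3"]}
```

With this sample data, `all_successors(direct, "p_1")` visits `p_2` first and then its successors, and each ID is reported only once even if it is reachable along several paths.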

Both the XML representation of a component and the XSD that results from the comp2schema transformation should contain deprecation and versioning information in some form (see below for a proposal).

Implementation

Storage / internal representation

The ComponentRegistry web service has a PostgreSQL back-end in which it stores both the component specifications in their verbatim XML formatting and 'component metadata' (referred to as 'descriptions' to avoid ambiguity) containing fields such as 'creator' and 'group'. The 'status' property can be added to this. To accommodate the successor relations, an additional table has to be added simply linking precursors and successors, and optionally storing a comment on each relation.
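A possible shape for these tables is sketched below. To keep the example self-contained it uses SQLite through Python's `sqlite3` module instead of PostgreSQL, and all table and column names (`description`, `successor`, `group_name` and so on) are assumptions rather than the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE description (
    component_id TEXT PRIMARY KEY,
    creator      TEXT,
    group_name   TEXT,
    status       TEXT NOT NULL DEFAULT 'private'
        CHECK (status IN ('private', 'development', 'public', 'deprecated'))
);
-- Links precursors to successors, with an optional comment per relation.
CREATE TABLE successor (
    precursor_id TEXT NOT NULL REFERENCES description(component_id),
    successor_id TEXT NOT NULL REFERENCES description(component_id),
    comment      TEXT,
    PRIMARY KEY (precursor_id, successor_id)
);
""")

conn.executemany(
    "INSERT INTO description (component_id, creator, status) VALUES (?, ?, ?)",
    [("p_old", "alice", "deprecated"), ("p_new", "alice", "public")],
)
conn.execute(
    "INSERT INTO successor VALUES (?, ?, ?)",
    ("p_old", "p_new", "adds missing fields"),
)


def direct_successors(connection, component_id):
    """Return the IDs of the direct successors of a component."""
    rows = connection.execute(
        "SELECT successor_id FROM successor WHERE precursor_id = ? "
        "ORDER BY successor_id",
        (component_id,),
    )
    return [row[0] for row in rows]
```

The `CHECK` constraint restricts the status column to the four values from the design, and the composite primary key prevents the same precursor/successor pair from being stored twice.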

Client information

Because most clients will consume the XSD transformation of components, the status and versioning information will have to be represented in these profile schemata. It can be provided within an xs:appinfo element. This information can be taken from four proposed new optional header elements in the component specification: Status, StatusDate, StatusComment and SuccessorsList. The contents of these will come from the database.

Example:

<CMD_ComponentSpec isProfile="true">
   <Header>
      <ID>clarin.eu:cr1:p_1289827960126</ID>
      <Name>LrtInventoryResource</Name>
      <Description>Resources as stored before in the CLARIN LRT inventory</Description>
      <Status>deprecated</Status>
      <StatusDate>2012-06-11</StatusDate>
      <StatusComment>The following fields were missing: actor age, content language</StatusComment>
      <SuccessorsList>http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1289827960126/successors</SuccessorsList>
   </Header>
   <!-- remainder of the component specification omitted -->
</CMD_ComponentSpec>

In the XSD this would be transformed to:

<xs:schema xmlns:cmd="http://www.clarin.eu/cmd/"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      xmlns:dcr="http://www.isocat.org/ns/dcr"
      xmlns:ann="http://www.clarin.eu"
      targetNamespace="http://www.clarin.eu/cmd/"
      elementFormDefault="qualified">
   <xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2001/xml.xsd"/>
   <xs:annotation>
      <xs:appinfo>
         <cmd:Status>deprecated</cmd:Status>
         <cmd:StatusDate>2012-06-11</cmd:StatusDate>
         <cmd:StatusComment>The following fields were missing: actor age, content language</cmd:StatusComment>
         <cmd:SuccessorsList>http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1289827960126/successors</cmd:SuccessorsList>
      </xs:appinfo>
   </xs:annotation>
   <!-- remainder of the schema omitted -->
</xs:schema>
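A client could read this information from a profile schema along the following lines. The sketch assumes the proposed element names and the `http://www.clarin.eu/cmd/` namespace from the example; the function name `read_status_info` is hypothetical, and the embedded schema is a shortened copy of the example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
CMD = "http://www.clarin.eu/cmd/"

# Shortened version of the example schema (StatusComment left out).
XSD = """<xs:schema xmlns:cmd="http://www.clarin.eu/cmd/"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      targetNamespace="http://www.clarin.eu/cmd/">
   <xs:annotation>
      <xs:appinfo>
         <cmd:Status>deprecated</cmd:Status>
         <cmd:StatusDate>2012-06-11</cmd:StatusDate>
         <cmd:SuccessorsList>http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1289827960126/successors</cmd:SuccessorsList>
      </xs:appinfo>
   </xs:annotation>
</xs:schema>"""


def read_status_info(xsd_text):
    """Collect the proposed status elements from the schema's appinfo."""
    root = ET.fromstring(xsd_text)
    appinfo = root.find(f"{{{XS}}}annotation/{{{XS}}}appinfo")
    info = {}
    if appinfo is None:
        return info  # schema carries no status information
    for name in ("Status", "StatusDate", "StatusComment", "SuccessorsList"):
        element = appinfo.find(f"{{{CMD}}}{name}")
        if element is not None:
            info[name] = element.text
    return info


info = read_status_info(XSD)
```

Since all four elements are optional, the function returns only those that are present; a schema without the annotation yields an empty dictionary.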

Considerations for tools further down the chain

Metadata editing

Deprecated versions of profiles should not be advertised to the user (e.g. should not appear in the profiles list in Arbil), but there should be no restriction on using them (i.e. they can still be added by their schema URL) and existing metadata will be left untouched. The metadata editor could, however, indicate the deprecated status of the profiles in use, including (non-obtrusive) notifications about the status of the profile on which the metadata being edited is based. For example, the icon shown for the metadata documents could reflect the state of the profile. Similarly, the list of active profiles in Arbil could mark those profiles that are deprecated. In case a successor is available, the editor should notify the user about that as well and ideally offer to load that profile instead (or in addition). Automatic upgrading to the new version should not happen, but assisting the user in attempting a migration could be considered (in that case the user should be warned about potential implications for the semantics of the existing values!).
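The editor behaviour described above might be reduced to a small decision function. This is a sketch only: the function name, the dictionary keys and the message text are invented for illustration and do not describe Arbil's actual code.

```python
def profile_notice(status, successors):
    """Decide what an editor could show for the profile a document uses.

    Returns None when no notice is needed, otherwise a dict describing
    a non-obtrusive notification.
    """
    if status != "deprecated":
        return None
    return {
        "message": "The profile this document is based on is deprecated.",
        # Successors the editor could offer to load instead or in addition.
        "offer_to_load": list(successors),
        # Existing metadata is never migrated without the user asking.
        "auto_upgrade": False,
    }
```

The result deliberately never requests an automatic upgrade; it only lists successors the user may choose to load.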

Search & catalogue tools

Search tools that explicitly distinguish between profiles in the user interface could apply clustering with respect to different versions of the same profile (traversing the succession chain). In those cases where profile names are shown, it could make sense to show the status of the profile (e.g. under development, deprecated) if this is considered to be of interest to the user.
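Clustering along the succession chain could be implemented by treating successor relations as links and grouping connected profiles, for instance with a union-find structure. The function name and sample IDs below are hypothetical.

```python
def cluster_by_succession(profile_ids, direct):
    """Group profile IDs that are connected through successor links.

    `direct` maps a profile ID to the list of its direct successor IDs.
    """
    parent = {pid: pid for pid in profile_ids}

    def find(pid):
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]  # path halving
            pid = parent[pid]
        return pid

    for precursor, successors in direct.items():
        for successor in successors:
            if precursor in parent and successor in parent:
                parent[find(successor)] = find(precursor)

    clusters = {}
    for pid in profile_ids:
        clusters.setdefault(find(pid), []).append(pid)
    return sorted(sorted(group) for group in clusters.values())


profiles = ["p_1", "p_2", "p_3", "p_9"]
links = {"p_1": ["p_2"], "p_2": ["p_3"]}
clusters = cluster_by_succession(profiles, links)
```

Here `p_1`, `p_2` and `p_3` end up in one cluster because they form a chain, while `p_9` has no relations and forms a cluster of its own.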

Notes

  • A specific version of a profile will always keep its unique profile ID
  • Only owners (and admins) can change the status of a component (as is the case with publishing now)
  • Deprecation and succession will require a reason (for deprecation) or comment (for new version)
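The three notes above could be enforced at the point where a status change is requested. The following sketch assumes a component is represented as a dictionary with `id`, `owners` and `status` keys; that representation and the function name are assumptions made for the example.

```python
def deprecate(component, user, reason, admins=()):
    """Return a deprecated copy of `component`, enforcing the notes above."""
    if user not in component["owners"] and user not in admins:
        raise PermissionError("only owners and admins can change the status")
    if not reason or not reason.strip():
        raise ValueError("a reason is required for deprecation")
    # The ID is copied unchanged: a version always keeps its profile ID.
    return dict(component, status="deprecated", status_comment=reason.strip())


profile = {"id": "p_old", "owners": ["alice"], "status": "public"}
deprecated = deprecate(profile, "alice", "missing fields: actor age")
```

Returning a copy rather than modifying the input keeps the original record intact until the change has been validated and stored.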

Discussion

  • Should 'bi-directional' versioning information be provided or just successors? (should a successor component refer to its precursor?)
    • Consensus seems to be: no
  • What to do with many-to-many relations in versioning? (merging and branching of components)
    • Consensus seems to be: no special support, this is only relevant in the context of inheritance
  • ...