Virtual Collection Registry: functional requirements

The purpose of this page is to gather the functional and high-level technical requirements for the Virtual Collection Registry. These requirements stem from the following sources:

Description

Description adapted from CmdiVirtualCollection:

A virtual collection (VC) is a collection that results from browsing or searching repositories, rather than from construction and first-time publication by an organization. Put differently, the resources in a VC are already available in other collections, so the VC can be considered a derived collection. Nevertheless, these VCs need to be citable for future use, and it should therefore be possible to register them in a registry, the virtual collection registry (VCR).

Use cases

Creation use case 1: manual creation

  • User browses to VCR and logs in
  • User creates new VC
  • User adds items
    • Provide URI or PID
    • Provide metadata
      • Type (intensional/extensional)
      • Name, description, purpose, keywords
      • Creators
      • Reproducibility
    • Stored in addition
      • Creation date
      • User id
  • User publishes this VC
  • VCR assigns a PID to this VC
  • VCR provides this VC via REST and over OAI-PMH

Creation use case 2: VC from VLO selection

  • User makes a selection in the VLO
    • (has to be below a set maximum size?)
  • User selects ‘add to virtual collection’ option in the VLO
  • The VLO posts the URLs of the selection and some metadata (selection/query, VLO version information?) to the VCR (by means of an HTML form)
  • The user authenticates (if not already authenticated)
  • The VCR presents the user with the following options:
    • Add the items to an existing unpublished collection
    • Add the items to a new collection
  • In the case of a new collection, the manual collection creation steps have to be followed by the user (some values are already filled in based on the metadata provided by the VLO)
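
The hand-over from the VLO to the VCR could be sketched as follows. This is only an illustration: the field names (`resourceUri`, `query`, `vloVersion`) and the helper `build_vlo_submission` are hypothetical, since the requirements only state that the URLs and some metadata are posted by means of an HTML form.

```python
from urllib.parse import urlencode, parse_qs


def build_vlo_submission(uris, query=None, vlo_version=None):
    """Encode a VLO selection as an application/x-www-form-urlencoded body.

    All field names are illustrative assumptions, not part of the requirements.
    """
    # One repeated form field per selected item.
    fields = [("resourceUri", uri) for uri in uris]
    if query is not None:
        fields.append(("query", query))
    if vlo_version is not None:
        fields.append(("vloVersion", vlo_version))
    return urlencode(fields)


# Placeholder URIs; a real selection would carry the URLs chosen in the VLO.
body = build_vlo_submission(
    ["http://example.org/record/1", "http://example.org/record/2"],
    query="language:Dutch",
)
decoded = parse_qs(body)
```

On the receiving side, the VCR would decode the same fields (as `parse_qs` does here) and use them to pre-fill the collection form.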

(Creation use case 3: dynamic VC by search query)

Not considered at this stage

Exploitation use case 1: Citation

  • A user has created a VC for a set of resources (e.g. a number of metadata records each describing a recording session and the resulting media files), and published it so that it has a PID
  • The user includes the PID as a reference to this VC in a publication
  • Other users can use a handle resolver to resolve this PID, which will lead them to the VCR displaying a presentation of the identified VC
  • Links in this presentation will guide these users to the items contained in the VC at their original location

Exploitation use case 2: Processing tool

  • A user needs to process a set of documents with a tool, e.g. an automated annotator or workflow tool that has VC support; the documents to be processed are gathered in a VC
  • The user accesses the tool and enters the PID of the VC to be processed
  • The tool will connect to the VCR using its REST service and request the VC metadata
  • The VCR provides an XML representation of the VC metadata
  • The tool follows the links present in the VC and accesses the items directly at their original location, and retrieves them for processing
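
The link-following step on the tool side might look like the sketch below. The XML layout (`VirtualCollection`, `Resource`, `Type`, `ResourceRef`) is an assumption made for illustration only; the actual representation is whatever the VCR serves. The sample document is inlined so that no network access is needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML representation of a VC; element names are assumptions.
SAMPLE_VC_XML = """\
<VirtualCollection>
  <Name>Example collection</Name>
  <Resources>
    <Resource><Type>Metadata</Type><ResourceRef>http://example.org/md/1</ResourceRef></Resource>
    <Resource><Type>Resource</Type><ResourceRef>http://example.org/media/1.wav</ResourceRef></Resource>
  </Resources>
</VirtualCollection>
"""


def extract_links(vc_xml):
    """Return (type, uri) pairs for every item listed in the VC document."""
    root = ET.fromstring(vc_xml)
    links = []
    for res in root.iter("Resource"):
        link_type = res.findtext("Type", default="").strip()
        uri = res.findtext("ResourceRef", default="").strip()
        if uri:  # skip entries without a reference
            links.append((link_type, uri))
    return links


links = extract_links(SAMPLE_VC_XML)
```

A processing tool would then retrieve each returned URI from its original location.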

Requirements

Core requirements

Virtual collection model

  • Type
    • Extensional
    • Intensional
  • State
    • Unpublished
    • Published
  • Metadata
    • Name
    • Description
    • Purpose
    • Keywords
    • Creators
      • name
      • email
      • address
      • telephone
      • website
      • organisation
      • role
      • ORCID?
    • Reproducibility
      • intended
      • fluctuated
      • untended
  • Links
    • Type
      • Metadata
      • Resource
    • URI
    • (Query profile)
    • (Query value)
    • Optionally: A (multilingual) description in case of a metadata item
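
As a rough illustration, the model above could be expressed with data classes along the following lines. The class and field names, the choice of optional fields and the default values are assumptions; only the structure follows the list. The creator fields are abbreviated and the query profile/value fields are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CollectionType(Enum):
    EXTENSIONAL = "extensional"
    INTENSIONAL = "intensional"


class State(Enum):
    UNPUBLISHED = "unpublished"
    PUBLISHED = "published"


class LinkType(Enum):
    METADATA = "metadata"
    RESOURCE = "resource"


@dataclass
class Creator:
    name: str
    email: Optional[str] = None
    organisation: Optional[str] = None
    role: Optional[str] = None


@dataclass
class Link:
    type: LinkType
    uri: str
    description: Optional[str] = None  # optional, for metadata items


@dataclass
class VirtualCollection:
    name: str
    type: CollectionType = CollectionType.EXTENSIONAL  # assumed default
    state: State = State.UNPUBLISHED  # assumed: new VCs start unpublished
    description: Optional[str] = None
    purpose: Optional[str] = None
    keywords: List[str] = field(default_factory=list)
    creators: List[Creator] = field(default_factory=list)
    reproducibility: Optional[str] = None  # one of the values listed above
    links: List[Link] = field(default_factory=list)


vc = VirtualCollection(name="Example collection")
vc.links.append(Link(LinkType.METADATA, "http://example.org/md/1"))
```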

Storage of VCs

Relational database backend

Rights & lifecycle management

  • Each VC has an owner
  • An owner can be a single user or a group
  • Ownership can be transferred from one user to another
  • Ideally, groups are provided by a centralised service (but no such service exists at the moment)
    • Alternatively, the VCR could store groups for its users but should be prepared for adaptation to an external group provider
  • A VC is in one of the following states:
    • Development (public or private/unlisted)
    • Published
    • Deprecated
  • A deprecated VC can be superseded by a published (or deprecated) VC
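
The lifecycle could be modelled as a small state machine, as in the sketch below. The one-way transitions (development, then published, then deprecated) are an assumption: the requirements list the states but do not spell out which moves are allowed. Only the supersession rule is taken directly from the list.

```python
from enum import Enum


class LifecycleState(Enum):
    DEVELOPMENT = "development"
    PUBLISHED = "published"
    DEPRECATED = "deprecated"


# Assumed one-way transitions; not stated explicitly in the requirements.
ALLOWED = {
    LifecycleState.DEVELOPMENT: {LifecycleState.PUBLISHED},
    LifecycleState.PUBLISHED: {LifecycleState.DEPRECATED},
    LifecycleState.DEPRECATED: set(),
}


def can_transition(current, target):
    """Return True if a collection may move from `current` to `target`."""
    return target in ALLOWED[current]


def can_supersede(old_state, new_state):
    """A deprecated collection may be superseded by a published or deprecated one."""
    return (
        old_state is LifecycleState.DEPRECATED
        and new_state in {LifecycleState.PUBLISHED, LifecycleState.DEPRECATED}
    )
```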

PID assignment

  • Each (published) VC should be assigned a PID
  • One or more PID providers
    • Implementation currently supports the GWDG PID Handle Service
    • CLARIN handle has been requested. Server will run at …?
    • DataCite?? https://mds.datacite.org/

REST service

  • /VCR
    • GET gets all VCs (can be filtered using VCRQL (see Protocol.txt))
    • POST VC creates a new VC (can be incomplete, e.g. only metadata)
  • /VCR/{ID}
    • GET gets representation of a single VC
    • PUT modifies existing unpublished VC
    • DELETE deletes existing unpublished VC
  • /VCR/{ID}/cmdi
    • GET gets a CMDI representation of a single VC
  • /VCR/{ID}/URI
    • PUT adds a URI to a VC
    • POST updates URI

VCs can be retrieved as XML or JSON serialisations of their native representation, or alternatively as a CMDI document that is valid according to a dedicated VC profile defined in the CMDI Component Registry.

This is largely in line with Protocol.txt.
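
The endpoint list above can be condensed into a route table, as in the following sketch. It is not the service implementation: the regular expressions and the helper names are illustrative, and the URI sub-resource is assumed to live under /VCR like the other paths. The check that PUT and DELETE on /VCR/{ID} are refused for published VCs follows from the "existing unpublished VC" wording.

```python
import re

# Route table transcribed from the endpoint list; patterns are illustrative.
ROUTES = [
    (re.compile(r"^/VCR$"), {"GET", "POST"}),
    (re.compile(r"^/VCR/(?P<id>[^/]+)$"), {"GET", "PUT", "DELETE"}),
    (re.compile(r"^/VCR/(?P<id>[^/]+)/cmdi$"), {"GET"}),
    (re.compile(r"^/VCR/(?P<id>[^/]+)/URI$"), {"PUT", "POST"}),
]

SINGLE_VC = ROUTES[1][0]


def allowed_methods(path):
    """Return the set of HTTP methods defined for `path` (empty if unknown)."""
    for pattern, methods in ROUTES:
        if pattern.match(path):
            return methods
    return set()


def is_allowed(method, path, published=False):
    """Check a request against the route table.

    PUT and DELETE on /VCR/{ID} apply to unpublished VCs only.
    """
    if method not in allowed_methods(path):
        return False
    if published and method in {"PUT", "DELETE"} and SINGLE_VC.match(path):
        return False
    return True
```

For example, `is_allowed("PUT", "/VCR/42")` holds for an unpublished VC, while the same call with `published=True` is rejected.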

OAI Provider

  • Provides published VCs as CMDI

Web based front-end

  • Browse public VCs
  • Browse private VCs
  • VC viewer
  • VC creation/editing
  • Shibbolised

Integration into secondary tools

  • Resource providers (POST/PUT clients)
    • VLO
    • Centre catalogues
      • e.g. MPI-PL
    • ...
  • Exploitation (GET clients)
    • WebLicht

Proposed additional features

Last modified on 05/30/14 12:31:29