[[PageOutline(2-5)]]

= CLARIN Virtual Collection Registry =

== Description ==

From the [[./Requirements|requirements description]]:

A virtual collection (VC) is a collection that is the result of browsing or searching repositories rather than being the result of construction and first-time publication by an organisation. Another characterisation is that the resources in a VC are already available in other collections, and the VC can be considered a derived collection. Nevertheless these VCs may need to be citable for future use and should therefore be able to be registered in a register, the VCR.

The Virtual Collection Registry (VCR) is an online registry that provides a RESTful web service (REST service) as well as a web-based graphical user interface for the creation, publication, management and retrieval of virtual collections. Published collections are made available as CMDI records over OAI-PMH, and get assigned a persistent identifier by means of the [[EPIC|EPIC service]].

These components are available at the following locations '''(currently referencing the alpha deployment!)''':
 * [http://catalog-clarin.esc.rzg.mpg.de/vcr/app /vcr/app]: GUI
 * [http://catalog-clarin.esc.rzg.mpg.de/vcr/service /vcr/service]: REST service
 * [http://catalog-clarin.esc.rzg.mpg.de/vcr/oai /vcr/oai]: OAI-PMH endpoint

The [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/java/eu/clarin/cmdi/virtualcollectionregistry/gui/pages/HelpPage.html help page] explains the core concepts and terminology relevant to the VCR (based on the information gathered on [[./Help|this page]]).

=== REST service ===

The VCR REST service provides CRUD operations on a 'virtualcollection' resource. Its internal XML format for input and output is defined by an [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/resources/META-INF/VirtualCollection.xsd XML schema].
In addition, it supports JSON for input and output, as well as CMDI and HTML output for individual collections (via content negotiation). The CMDI is based on the new !VirtualCollection profile ([http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1404130561238/xml xml], [http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1404130561238/xsd xsd]). In addition to the standard 'Accept' header method, the content type of the response can also be requested by extending the URL with the desired format, e.g. ''/service/virtualcollections/1.cmdi''.

A full description of the service can be found in [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/doc/Protocol.txt Protocol.txt].

==== Form submission service ====

A special endpoint of the VCR REST service accepts input from HTML forms to create new virtual collections. This way, other web applications (such as the [[CmdiVirtualLanguageObservatory|VLO]]) can prepare a collection based on resources gathered in their workflows, and allow the user to send it to the VCR with a single click while preserving the ability to authenticate via Shibboleth.

The submission endpoint is ''/vcr/service/submit''. It accepts the following form parameters:
 * type (required)
 * name (required)
 * metadataUri (list, required)
 * resourceUri (list, required)
 * description (required)
 * keyword
 * purpose
 * reproducibility
 * reproducibilityNotice
 * creationDate
 * queryDescription
 * queryUri
 * queryProfile
 * queryValue

An example implementation (based on Wicket) can be found in the [source:/vlo/branches/vlo-3.1-vcr/vlo-web-app/src/main/java/eu/clarin/cmdi/vlo/wicket/pages 'VirtualCollectionSubmissionPage' in the experimental branch of the VLO].
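
For clients that do not use an HTML form directly, the request body for the submission endpoint can also be assembled programmatically. The following is a minimal sketch only: the class ''!VcrSubmissionForm'' is a hypothetical helper that is not part of the VCR or VLO sources, the ''example.org'' URIs are placeholders, and the value used for ''type'' is an assumption (check Protocol.txt for the accepted values). It shows how list parameters such as ''metadataUri'' can be sent by repeating the parameter name, which is the usual form encoding for multi-valued fields.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of the VCR or VLO sources) that builds an
 * application/x-www-form-urlencoded body for the /vcr/service/submit endpoint.
 */
final class VcrSubmissionForm {

    // Encoded name=value pairs, kept in insertion order.
    private final List<String> pairs = new ArrayList<String>();

    /** Adds one parameter; call repeatedly with the same name for list parameters. */
    VcrSubmissionForm add(String name, String value) {
        pairs.add(encode(name) + "=" + encode(value));
        return this;
    }

    /** Joins all pairs with '&' to form the request body. */
    String toBody() {
        StringBuilder body = new StringBuilder();
        for (String pair : pairs) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(pair);
        }
        return body.toString();
    }

    // Percent-encodes a name or value as required for form submissions.
    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = new VcrSubmissionForm()
                .add("type", "extensional") // assumed value, not confirmed by this page
                .add("name", "My collection")
                .add("description", "Resources gathered in a search session")
                .add("metadataUri", "http://example.org/md/1.cmdi")
                .add("metadataUri", "http://example.org/md/2.cmdi")
                .add("resourceUri", "http://example.org/data/1.wav")
                .toBody();
        // The body would be POSTed to /vcr/service/submit with
        // Content-Type: application/x-www-form-urlencoded.
        System.out.println(body);
    }
}
```

Because the user has to authenticate via Shibboleth, a browser-submitted form (as in the VLO example) remains the more natural fit; the sketch mainly illustrates the parameter encoding.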
== Technical notes ==

=== Requirements ===

The following requirements apply to the development environment as well as to testing/production environments:
 * MySQL server
   * A dedicated schema is required. A connection resource with write permissions needs to be configured in the servlet container and made available to the application. The application will create/update the database schema automatically; see the [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/doc/README.txt README file] for more details.

=== Development ===

Additional requirements for building the application from source:
 * JDK 7 or higher
 * Maven 3 or higher

The VCR sources are contained in a single Maven project. Most of the code is written in Java, complemented by HTML, (S)CSS and some images for the graphical user interface and a number of XML documents. Structurally, it consists of three largely independent functional components, each providing an interface to a shared '!VirtualCollectionRegistry' core service. The sections below describe the shared 'core' and the individual components.

Source code: [source:/VirtualCollectionRegistry]

==== VCR core ====

The core of the VCR is the '''collection store''', which is implemented through JPA with annotated classes in the ''eu.clarin.cmdi.virtualcollectionregistry.model'' package. The [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/resources/META-INF/persistence.xml persistence.xml] file configures the Hibernate Persistence Provider to use a MySQL datasource for storage. The 'update' value of the ''hibernate.hbm2ddl.auto'' property instructs Hibernate to update the database schema at runtime if needed. A singleton object of the ''!DataStore'' class provides access to the JPA Entity Manager.
The ''!DataStore'' is primarily used in the '''service''' implemented by the ''!VirtualCollectionRegistry'' class, which provides high-level CRUD methods, using the ''!VirtualCollection'' model class for the representation of individual collections. It also calls the configured PID service to register and link a '''persistent identifier''' upon publication of a collection. These methods are used by the REST service, OAI provider and Wicket application classes to access and manipulate the stored collections.

The '''Spring framework''' is used to define '''beans''' for singleton service objects, which are injected into various components throughout the application. The [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/webapp/WEB-INF/applicationContext.xml applicationContext.xml] file bootstraps the bean definition, while most beans are discovered and constructed by means of 'component scanning'. For example, a singleton bean for ''!DataStore'' exists because it is annotated with ''@Repository''. The instance is made available in the ''!VirtualCollectionRegistry'' and ''Application'' (for the Wicket UI) instances because both are managed by Spring too (annotated with ''@Service'' and ''@Component'' respectively) and contain a ''!DataStore'' field annotated with ''@Autowired''. In addition, some beans are defined in configuration classes (annotated with ''@Configuration''), by methods annotated with ''@Bean''. Note that all beans are singletons by default - so-called 'prototype beans' are not used in the VCR.

There are a number of application context '''profiles''', the activation of which determines which beans get instantiated by Spring. The context parameter ''spring.profiles.active'' can be used to activate one or more profiles. The profiles are defined by means of the ''@Profile'' annotation.
At the moment one profile exists for each PID provider implementation; for example, the ''vcr.pid.epic'' profile activates the ''EPICPersistentIdentifierProvider'' implementation and the ''EPICPersistentIdentifierConfiguration'' configuration. It is technically valid to activate multiple profiles, but activating more than one PID provider profile will lead to a clash of service implementations. See the deployment information for more details on using profiles to select the PID provider for a VCR instance after deployment.

A number of core services for marshalling, validation, etc. can be found in the ''eu.clarin.cmdi.virtualcollectionregistry.service'' package (generally interfaces with implementations in the .impl subpackage).

More info:
 * [http://docs.oracle.com/javaee/6/tutorial/doc/bnbpz.html JPA introduction]
 * [http://stackoverflow.com/questions/438146/hibernate-hbm2ddl-auto-possible-values-and-what-they-do Hibernate hbm2ddl.auto (stackoverflow)]
 * [http://www.mkyong.com/spring/spring-auto-scanning-components/ Spring component scanning and autowiring example and explanation]

==== JAX-RS REST service ====

The package ''eu.clarin.cmdi.virtualcollectionregistry.rest'' defines a RESTful web service by means of JAX-RS annotations. It uses [https://jersey.java.net/ Jersey] as an implementation and depends on a number of Jersey extensions, primarily for the injection of beans.
Four classes define the following '''(sub)resources''':
 * /virtualcollections (class ''!VirtualCollectionsResource'') allows listing (GET) of published collections and POSTing of new collections
 * /virtualcollections/{id} (class ''!VirtualCollectionResource'') is a subresource that allows GETting, PUTting and DELETEing individual collections
 * /my-virtualcollections (class ''!MyVirtualCollectionsResource'') only allows retrieval (GET) of the list of private and published collections owned by the authenticated user
 * / (class ''!BaseResource'') is a dummy resource that renders an informative HTML page about the service (if content negotiation allows)

Two ''!BodyWriter'' classes are implemented and registered, taking care of the '''rendering''' of !VirtualCollection objects as CMDI or XML/JSON respectively, by means of different methods of the (injected) ''!VirtualCollectionMarshaller'' service. This way, a single method in ''!VirtualCollectionResource'' can handle these various serialisation options (depending on the HTTP Accept header in the client request). An additional method handles requests accepting HTML by returning a redirect ('see other') response pointing to the collection details page of the Wicket UI.

The '''CMDI output''' is generated by marshalling instances of the classes generated by the JAXB Maven plugin at build time (XJC), based on the [http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/profiles/clarin.eu:cr1:p_1404130561238/xsd VirtualCollection profile schema], a copy of which is included in the sources.

The collection space can be queried by means of a domain-specific '''query language''', as described in the [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/doc/Protocol.txt REST documentation]. The grammar is defined in the file [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/jjtree/eu/clarin/cmdi/virtualcollectionregistry/query/QueryParser.jjt QueryParser.jjt].
The ''javacc-maven-plugin'' generates the parser at build time (with the JJTree preprocessor). Query strings (as optional GET parameters) are passed from the REST resources to the ''!VirtualCollectionRegistry'' service, which uses a static method to parse them into a ''!ParsedQuery'' object. This object is in turn used to obtain JPA query objects that retrieve the concrete results.

More info:
 * [https://jersey.java.net/documentation/latest/jaxrs-resources.html JAX-RS resources documentation from Jersey]
 * [https://jersey.java.net/documentation/latest/message-body-workers.html information on custom body providers from Jersey]
 * [http://mojo.codehaus.org/jaxb2-maven-plugin/ jaxb2-maven-plugin]
 * [http://mojo.codehaus.org/javacc-maven-plugin/ javacc-maven-plugin] with links to information about JavaCC and JJTree

==== Wicket application (UI) ====

The graphical user interface is implemented as a [http://wicket.apache.org Wicket] (version 1.4) web application, which can be found in ''eu.clarin.cmdi.virtualcollectionregistry.gui'' and subpackages. The ''Application'' class is the entry point to the application, and is registered as a bean that gets picked up by the ''!SpringWebApplicationFactory'' defined in the ''web.xml''.

Each '''page''' is defined by a pair consisting of a class and an '''HTML template''' of the same name (minus extension). Most pages consist of '''components''', many of which are custom to the VCR and are also defined by a class and HTML template. Each page and component typically represents a ''model''; the VCR has a number of custom Wicket component model implementations in the ''gui'' package. All pages extend a ''!BasePage'', which defines the '''common layout''' (header and footer, including some common components).

The VCR user interface makes quite intensive use of '''!JavaScript'''. First of all, it uses Wicket's out-of-the-box functionality for partial updates via AJAX.
In addition, [http://jquery.com/ jQuery] is used for enhanced client-side interaction. There is some custom jQuery-based code, which is integrated with the Wicket components using the [https://code.google.com/p/wiquery/ WiQuery] library. The base page includes the [http://getbootstrap.com/ Bootstrap] !JavaScript library to provide some out-of-the-box layout features.

The '''style''' of the UI is defined by a combination of Bootstrap CSS classes (like the !JavaScript, the style is included in the base page) and a number of .scss ([http://sass-lang.com/ SASS]) files that get compiled to CSS at build time. The main file is [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/webapp/css/vcr.scss vcr.scss], which includes the base [[CLARIN style]] and adds a number of VCR-specific style definitions.

More info:
 * [https://github.com/Jasig/sass-maven-plugin SASS Maven Plugin]

==== OAI provider ====

The VCR OAI provider uses the [source:/OAIProvider OAIProvider library], which consists of an out-of-the-box singleton ''OAIProvider'' instance and a servlet, in addition to an application-specific ''Repository''. The repository is implemented by ''VirtualColletionRegistryOAIRepository'' in the ''eu.clarin.cmdi.virtualcollectionregistry.oai'' package, which also contains a Spring configuration class that links the repository with a ''!VirtualCollectionRegistry'' service instance on the one hand and the ''OAIProvider'' on the other. The servlet is mounted (via web.xml) at ''/oai'' and picks up the OAI provider instance automatically.

The provider supports Dublin Core (DC) and CMDI output. The DC is generated by the provider based on information provided by the ''Repository'' implementation. The CMDI output is generated via the same methods as the CMDI output of the REST service (see above).

=== Deployment ===

The VCR builds into a single WAR file that can be deployed onto a servlet container (tested on and developed for Tomcat 7).
There is an alternative '''build profile''' called ''development'' (use ''mvn install -Pdevelopment''), which sends logging output to the console and enables Tomcat user realm authentication. By default, however, the application is configured for Shibboleth authentication and logging to a file on disk. To allow switching between authentication mechanisms, the web.xml file is duplicated and one of the two copies is selected at package time - see [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/src/main/webapp/WEB-INF].

Building the VCR at least up to the ''package'' phase (preferably ''mvn install'') creates a .tar.gz file in the ''target'' directory. This archive contains the deployable WAR file as well as the project documentation and licensing information, in a directory named after the artifact name and version number.

Further deployment instructions (primarily adding a datasource and a number of parameters to ''context.xml'') can be found in the [source:/VirtualCollectionRegistry/trunk/VirtualCollectionRegistry/doc/README.txt README file].

== Tickets ==

Milestones:
 * [milestone:VirtualCollectionRegistry-1.0]

Open tickets:

[[TicketQuery(status=accepted|assigned|new|reopened, component=VCRegistry, order=priority, format=table, col=summary|priority|owner|reporter|milestone)]]