Changeset 3847 for CMDI-Interoperability
- Timestamp: 10/23/13 08:09:30 (11 years ago)
- File: 1 edited
CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex
r3844 r3847
 112  112    %
 113  113    The main added value of LOD\cite{TimBL2006} is the interconnecting of disparate datasets.
 114       - In the broader context of LOD, there is meanwhile a subgroup/subcommunity specializing
 115       - in linking linguistic data \cite{ldl2012}, that renders an obvious pool of candidate
      114  + In the broader context of LOD, there is meanwhile a Open Knowledge Foundation’s Working Group on Open Data in Linguistics, that renders an obvious pool of candidate
 116  115    datasets to link the CMD data with\footnote{\url{http://linguistics.okfn.org/resources/llod/}}.
 117  116    Within these \xne{lexvo} seems most promising starting point, as it features URIs like \url{http://lexvo.org/id/term/eng/}, i.e. for the ISO-639-3 language identifiers which are also used in CMD records.
   …    …
 132  131    \end{itemize}
 133  132
 134       - \subsection{CMD specification}
      133  + \subsection{CMD specification}\label{sec:CMDM}
 135  134
 136  135    The main entity of the meta model is the CMD component modelled as A \code{rdfs:Class}. A CMD profile is basically a CMD component with some extra features, implying a specialization relation. It may seem natural to translate a CMD element to a RDF property (as it holds the literal value), but given its complexity (e.g. attributes\footnote{Due to space considerations the remainder of the paper will not discuss attributes.}, relation to the containing component) it to has to be a \code{rdfs:Class}. The actual literal value is a property of given element of type \code{cmdm:ElementValue}. For values that can be mapped to entities defined in external vocabularies/ semantic resources, the references to these entities are expressed in parallel properties of type \code{cmdm:hasElementEntity}. The containment relation between components and elements is expressed with a dedicated property \code{cmdm:contains}.
   …    …
 191  190    \end{example3}
 192  191
 193       - \subsection{Data Categories}
      192  + \subsubsection{Data Categories}
 194  193    One of the semantic registries in use by CMDI for its concept links is ISOcat. In \cite{Windhouwer2012_LDL} proposes to link to the data categories via an annotation property.
 195  194
   …    …
 305  304    \end{example3}
 306  305
 307       - \subsection{Elements, Fields, Values}
      306  + \subsubsection{Elements, Fields, Values}
 308  307    Finally, we want to integrate also the actual field values in the CMD records into the ontology.
 309  308    As explained before, CMD elements have to be typed as \code{rdfs:Class}, the actual value expressed as \code{cmds:ElementValue} property and the corresponding data category expressed as annotation property.
   …    …
 404  403    \section{Implementation}
 405  404
 406       - The transformation of profiles and instances into RDF/XML is accomplished by a set of XSL-stylesheets, that are currently being tested on a sample dataset. In the near future, a test with the whole CMD dataset will be performed.
 407       -
 408       - Once the data is available it has to be stored and published in a RDF triple store. The most promising solution seems to be \xne{Virtuoso}, an integrated feature-rich hybrid data store, able to deal with different types of data (``Universal Data Store''). \cite{Haslhofer2011europeana}
      405  + The transformation of profiles and instances into RDF/XML is accomplished by a set of XSL-stylesheets, that are currently being tested on a sample dataset. The mappings described for the CMD specification (see section \ref{sec:CMDM}) have to be integrated into the CMDI core infrastructure, e.g., the CR. And in the near future, a test on the instances in the complete CLARIN joint metadata domain will be performed.
      406  +
      407  + Once the linked data is available it has to be stored and published in a RDF triple store. The most promising solution seems to be \xne{Virtuoso}, an integrated feature-rich hybrid data store, able to deal with different types of data (``Universal Data Store''). \cite{Haslhofer2011europeana}
 409  408
 410  409    % Although the distributed nature of the data is one of the defining features of LOD and theoretically one should be able to follow the data by dereferencable URIs, in practice it is mostly necessary to pool into one data store linked datasets from different sources that shall be queried together due to performance reasons. This implies that the data to be kept by the data store will be decisively larger, than ``just'' the original dataset.