Changeset 4427 for CMDI-Interoperability


Timestamp:
02/04/14 10:10:21
Author:
vronk
Message:

reworked the examples to fit into the two column layout

File:
1 edited

  • CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex

    r4401 r4427  
    88
    99\usepackage{verbatim} % adds environment for commenting out blocks of text & for better verbatim
     10\usepackage{multicol}
    1011
    1112%\newcommand{\comment}[1]{}
     
    2930\newenvironment{example2}
    3031{ \footnotesize
    31 \begin{ttfamily} \begin{shaded*} \noindent
    32  \begin{tabular}{p{0.4\textwidth}  p{0.6\textwidth} } }
    33 {\end{tabular} \end{shaded*} \end{ttfamily} }
     32\begin{sffamily} \begin{shaded*} \noindent
     33 \begin{tabular}{@{\hspace{-1mm}} p{0.35\textwidth}  p{0.65\textwidth} } }
     34{\end{tabular} \end{shaded*} \end{sffamily} }
    3435
    3536\newenvironment{example3}
    3637{ \footnotesize
    37  \begin{ttfamily} \begin{shaded*} \noindent
     38 \begin{sffamily} \begin{shaded*} \noindent
    3839 \begin{tabular}{@{\hspace{-1mm}} p{0.25\textwidth}  p{0.25\textwidth}  p{0.45\textwidth}}
    3940}
    40 {  \end{tabular} \end{shaded*} \end{ttfamily} }
     41{  \end{tabular} \end{shaded*} \end{sffamily} }
    4142
    4243\definecolor{shadecolor}{rgb}{0.95,0.95,1.0}
     
    144145The main entity of the meta model is the CMD component, modelled as an \code{rdfs:Class}. A CMD profile is basically a CMD component with some extra features, implying a specialization relation. It may seem natural to translate a CMD element to an RDF property (as it holds the literal value), but given its complexity (e.g., attributes\footnote{Due to space considerations the abstract will not further discuss attributes.}, relation to the containing component)  it too has to be expressed as an \code{rdfs:Class}. The actual literal value is a property of the given element of type \code{cmdm:ElementValue}. For values that can be mapped to entities defined in external semantic resources, the references to these entities are expressed in parallel object properties of type \code{cmdm:hasElementEntity} (constituting outbound links). The containment relation between components and elements is expressed with a dedicated property \code{cmdm:contains}.
    145146
     147\begin{figure*}
     148\begin{center}
    146149\label{table:rdf-spec}
    147150\begin{example3}
     
    187190%              & rdfs:range & :Entity .  \\
    188191\end{example3}
    189 
     192\end{center}
     193\caption{The CMD meta model in RDF}
     194\label{fig:final-example}
     195\end{figure*}
    190196
    191197\subsection{CMD profile and component definitions}
    192198These top-level classes and properties are subsequently used for modelling the actual profiles, components and elements as they are defined in the CR.
    193 For stand-alone components, the IRI is the exact path into the CR to get the RDF representation for the profile/component\furl{http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/components/clarin.eu:cr1:c\_1271859438125/rdf}. For ``inner'' components (that are defined as part of another component) and elements the identifier is a concatenation of the nearest ancestor stand-alone component's IRI and the dot-path to the given component/element (e.g., Actor:\\ \code{cr:clarin.eu:cr1:c\_1271859438197/rdf\#Actor.Actor\_Languages.Actor\_Language}\footnote{For the sake of readability, in the examples we will collapse the component IRIs, referring to them just by their name, prefixed with \code{cmd:}.}).
    194 
    195 \begin{example3}
    196 cmd:collection& a & cmdm:Profile; \\
    197  & rdfs:label & "collection"; \\
    198  & dcterms:identifier & cr:clarin.eu:cr1:p\_1345561703620. \\
    199 cmd:Actor  & a &cmdm:Component. \\
    200 \end{example3}
     199For stand-alone components, the IRI is the exact path into the CR to get the RDF representation for the profile/component\furl{http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/components/clarin.eu:cr1:c\_1271859438125/rdf}. For ``inner'' components (that are defined as part of another component) and elements the identifier is a concatenation of the nearest ancestor stand-alone component's IRI and the dot-path to the given component/element (e.g., Actor:\\ \code{cr:clarin.eu:cr1:c\_1271859438197/rdf \#Actor.Actor\_Languages.Actor\_Language}\footnote{For the sake of readability, in the examples we will collapse the component IRIs, referring to them just by their name, prefixed with \code{cmd:}.}).
     200
     201\begin{example2}
     202cmd:collection \\
     203$\;$ a & cmdm:Profile; \\
     204$\;$ rdfs:label & "collection"; \\
     205$\;$ dc:identifier & cr:clarin.eu:cr1:p\_1345561703620. \\
     206cmd:Actor \\
     207$\;$ a &cmdm:Component. \\
     208\end{example2}
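
The identifier scheme described above (the stand-alone component's CR path, plus a dot-path fragment for inner components and elements) can be sketched in a few lines. This is only an illustration, not the CMD2RDF implementation: the function names are hypothetical, and it assumes that the `cr:` prefix expands to the Component Registry REST path quoted in the paper's URL footnote.

```python
from typing import Sequence

# Assumption: the cr: prefix expands to the Component Registry REST path
# that appears in the paper's example URL.
CR_BASE = "http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/components/"


def standalone_iri(component_id: str) -> str:
    """IRI of a stand-alone profile/component: the CR path to its RDF form."""
    return f"{CR_BASE}{component_id}/rdf"


def inner_iri(ancestor_id: str, dot_path: Sequence[str]) -> str:
    """IRI of an inner component or element.

    Concatenates the nearest stand-alone ancestor's IRI with the dot-path
    leading to the inner component/element, joined by '#'.
    """
    if not dot_path:
        raise ValueError("an inner component needs a non-empty dot-path")
    return f"{standalone_iri(ancestor_id)}#{'.'.join(dot_path)}"


# Hypothetical usage with the Actor example from the text.
actor_language = inner_iri(
    "clarin.eu:cr1:c_1271859438197",
    ["Actor", "Actor_Languages", "Actor_Language"],
)
print(actor_language)
```

Keeping the dot-path as a sequence of names rather than a pre-joined string makes it straightforward to derive the identifier while walking a nested component definition.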
    201209
    202210\subsubsection{Data Categories}
    203211The primary concept registry in use by CMDI for its concept links is ISOcat. The recommended approach to link to the data categories is via an annotation property \cite{Windhouwer2012_LDL}.
    204212
    205 \begin{example3}
    206 dcr:datcat & a  & owl:AnnotationProperty ; \\
    207  & rdfs:label  & "data category"@en ; \\
     213\begin{example2}
     214dcr:datcat \\
     215$\;$ a  & owl:AnnotationProperty ; \\
     216$\;$ rdfs:label  & "data category"@en ; \\
    208217% & rdfs:comment  & "This resource is equivalent to  this data category."@en ; \\
    209218% & skos:note  & "The data category should be identified by its PID."@en ; \\
    210 \end{example3}
     219\end{example2}
    211220
    212221Consequently, the \code{@ConceptLink} attribute on CMD elements and components referencing the data category can be modelled as:
    213222
    214 \begin{example3}
    215 cmd:LanguageName & dcr:datcat & isocat:DC-2484. \\
    216 \end{example3}
     223\begin{example2}
     224cmd:LanguageName \\
     225$\;$ dcr:datcat & isocat:DC-2484. \\
     226\end{example2}
    217227
    218228%\subsection{RELcat - Ontological relations}
     
    243253
    244254\begin{example3}
    245 <lr1> & a  & cmdm:Resource; \\
    246 & cmdm:hasMimeType & "audio/wav". \\
     255\textless lr1\textgreater \\
     256$\enspace \,$ a  & & cmdm:Resource; \\
     257\multicolumn{2}{l}{cmdm:hasMimeType } & "audio/wav". \\
    247258\end{example3}
    248259
     
    256267(Note that one MD record can describe multiple resources; this can also be easily accommodated in OpenAnnotation.)
    257268
    258 \begin{example3}
    259 \_:anno1  & a & oa:Annotation ; \\
    260  & oa:hasTarget  & <lr1a>, <lr1b> ; \\
    261  & oa:hasBody  & \_:topComponent1 ; \\
    262  & oa:motivatedBy  & oa:describing . \\
    263 \end{example3}
     269\begin{example2}
     270\_:anno1  \\
     271 $\:$ a & oa:Annotation ; \\
     272 $\:$ oa:hasTarget  & \textless lr1a \textgreater, \textless lr1b\textgreater ; \\
     273 $\:$ oa:hasBody  & \_:topComponent1 ; \\
     274  $\:$ oa:motivatedBy  & oa:describing . \\
     275\end{example2}
    264276
    265277\subsubsection{Provenance}
     
    267279The information from the CMD record \code{cmd:Header} represents the provenance information about the modelled data.
    268280
    269 \begin{example3}
    270 \_:topComponent1 & dcterms:identifier  & <lr1.cmd> ;  \\
    271  & dcterms:creator  & \var{\{cmd:MdCreator\}} ;  \\
    272  & dcterms:publisher  & <http://clarin.eu> ; \\
    273  & dcterms:created & \var{\{cmd:MdCreated\}} .  \\
    274 \end{example3}
     281\begin{example2}
     282\_:topComponent1  \\
     283 $\:$ dc:identifier  & \textless lr1.cmd \textgreater ;  \\
     284 $\:$ dc:creator  & \var{\{cmd:MdCreator\}} ;  \\
     285 $\:$ dc:publisher  & \textless http://clarin.eu\textgreater  ; \\
     286 $\:$ dc:created & \var{\{cmd:MdCreated\}} .  \\
     287\end{example2}
    275288
    276289\subsubsection{Collection hierarchy}  % ( Resource Proxy – IsPartOf)}
     
    280293
    281294\begin{example3}
    282 <lr0.cmd>  & a   & ore:ResourceMap . \\
    283 <lr0.cmd> & ore:describes & <lr0.agg> . \\
    284 <lr0.agg> & a   & ore:Aggregation ; \\
    285 & ore:aggregates  & <lr1.cmd>, <lr2.cmd> . \\
     295\textless lr0.cmd \textgreater  & a   & ore:ResourceMap . \\
     296\textless lr0.cmd\textgreater & ore:describes & \textless  lr0.agg\textgreater . \\
     297\textless lr0.agg\textgreater & a   & ore:Aggregation ; \\
     298& ore:aggregates  & \textless lr1.cmd\textgreater, \textless lr2.cmd\textgreater . \\
    286299\end{example3}
    287300
     
    304317\begin{example3}
    305318\_:actor1  & a & cmd:Actor . \\
    306 \_:actor1lang1  & a & cmd:Actor.Actor\_Language . \\
     319\_:actor1lang1  & a & cmd:Actor \\
     320 &  & .Actor\_Language . \\
    307321\_:actor1  & cmd:contains & \_:actor1lang1 . \\
    308322\end{example3}
     
    318332\end{comment}
    319333
    320 \subsubsection{Elements, Fields, Values}
    321 Finally, we also want to integrate the actual field values in the CMD records into the linked data.
    322 As explained before, CMD elements have to be typed as \code{rdfs:Class}, the actual value is expressed as \code{cmds:ElementValue}, and the two are related by a \code{cmdm:hasElementValue} property.
    323 
    324 While generating triples with literal values seems straightforward, the more challenging but also more valuable aspect is to generate object property triples (predicate \code{cmdm:hasElementEntity}) with the literal values mapped to semantic entities. The following example shows the whole chain of statements from metamodel to literal value and corresponding semantic entity.
    325 
    326 The actual mapping process from values to entities is a complex and challenging task and will be tackled in more detail in the full paper. The main idea is to find entities in selected reference datasets (controlled vocabularies, ontologies) corresponding to the literal values in the metadata records. The obtained entity identifiers are then used to generate new RDF triples, representing outbound links.
    327 
     334
     335\begin{figure*}
     336\begin{center}
    328337\begin{example3}
    329338cmd:Person & a & cmdm:Component . \\
     
    349358<http://www.mpi.nl/> & a  & cmd:OrganisationElementEnity .
    350359\end{example3}
     360\end{center}
     361\caption{Chain of statements from metamodel to literal value and corresponding semantic entity}
     362\label{fig:final-example}
     363\end{figure*}
     364
     365
     366\subsubsection{Elements, Fields, Values}
     367Finally, we also want to integrate the actual field values in the CMD records into the linked data.
     368As explained before, CMD elements have to be typed as \code{rdfs:Class}, the actual value is expressed as \code{cmds:ElementValue}, and the two are related by a \code{cmdm:hasElementValue} property.
     369
     370While generating triples with literal values seems straightforward, the more challenging but also more valuable aspect is to generate object property triples (predicate \code{cmdm:hasElementEntity}) with the literal values mapped to semantic entities. The following example shows the whole chain of statements from metamodel to literal value and corresponding semantic entity.
     371
     372The actual mapping process from values to entities is a complex and challenging task and will be tackled in more detail in the full paper. The main idea is to find entities in selected reference datasets (controlled vocabularies, ontologies) corresponding to the literal values in the metadata records. The obtained entity identifiers are then used to generate new RDF triples, representing outbound links.
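
The main idea of the value-to-entity mapping can be illustrated with a deliberately naive sketch: every literal yields a `cmdm:hasElementValue` triple, and a parallel `cmdm:hasElementEntity` triple is added only when the literal is found in a reference vocabulary. The exact-match lookup, the function name, and the vocabulary entry are all assumptions made for illustration; the paper leaves the real mapping procedure to the full version.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def element_triples(
    element_node: str, literal: str, vocabulary: Dict[str, str]
) -> List[Triple]:
    """Build triples for one element instance.

    Always emits the literal-value triple; emits an entity triple only when
    the normalised literal is present in the (hypothetical) vocabulary.
    Note: the literal is quoted naively, without Turtle escaping.
    """
    triples: List[Triple] = [
        (element_node, "cmdm:hasElementValue", f'"{literal}"')
    ]
    entity_iri = vocabulary.get(literal.strip().lower())
    if entity_iri is not None:
        triples.append((element_node, "cmdm:hasElementEntity", f"<{entity_iri}>"))
    return triples


# Hypothetical reference vocabulary: normalised label -> entity IRI.
VOCABULARY = {"mpi for psycholinguistics": "http://www.mpi.nl/"}

for triple in element_triples("_:org1", "MPI for Psycholinguistics", VOCABULARY):
    print(" ".join(triple), ".")
```

A real mapping would need fuzzy matching and disambiguation against the selected reference datasets; the sketch only shows where the outbound link enters the generated graph.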
    351373
    352374\begin{comment}