Changeset 4427
- Timestamp: 02/04/14 10:10:21 (10 years ago)
- File: 1 edited
CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex
r4401 r4427
    8     8
    9     9    \usepackage{verbatim} % adds environment for commenting out blocks of text & for better verbatim
         10  + \usepackage{multicol}
   10    11
   11    12    %\newcommand{\comment}[1]{}
    …     …
   29    30    \newenvironment{example2}
   30    31    { \footnotesize
   31        - \begin{ttfamily} \begin{shaded*} \noindent
   32        - \begin{tabular}{ p{0.4\textwidth} p{0.6\textwidth} } }
   33        - {\end{tabular} \end{shaded*} \end{ttfamily} }
         32  + \begin{sffamily} \begin{shaded*} \noindent
         33  + \begin{tabular}{@{\hspace{-1mm}} p{0.35\textwidth} p{0.65\textwidth} } }
         34  + {\end{tabular} \end{shaded*} \end{sffamily} }
   34    35
   35    36    \newenvironment{example3}
   36    37    { \footnotesize
   37        - \begin{ttfamily} \begin{shaded*} \noindent
         38  + \begin{sffamily} \begin{shaded*} \noindent
   38    39    \begin{tabular}{@{\hspace{-1mm}} p{0.25\textwidth} p{0.25\textwidth} p{0.45\textwidth}}
   39    40    }
   40        - { \end{tabular} \end{shaded*} \end{ttfamily} }
         41  + { \end{tabular} \end{shaded*} \end{sffamily} }
   41    42
   42    43    \definecolor{shadecolor}{rgb}{0.95,0.95,1.0}
    …     …
  144   145    The main entity of the meta model is the CMD component modelled as a \code{rdfs:Class}. A CMD profile is basically a CMD component with some extra features, implying a specialization relation. It may seem natural to translate a CMD element to a RDF property (as it holds the literal value), but given its complexity (e.g., attributes\footnote{Due to space considerations the abstract will not further discuss attributes.}, relation to the containing component) it too has to be expressed as \code{rdfs:Class}. The actual literal value is a property of given element of type \code{cmdm:ElementValue}. For values that can be mapped to entities defined in external semantic resources, the references to these entities are expressed in parallel object properties of type \code{cmdm:hasElementEntity} (constituting outbound links). The containment relation between components and elements is expressed with a dedicated property \code{cmdm:contains}.
  145   146
        147  + \begin{figure*}
        148  + \begin{center}
  146   149    \label{table:rdf-spec}
  147   150    \begin{example3}
    …     …
  187   190    % & rdfs:range & :Entity . \\
  188   191    \end{example3}
  189        -
        192  + \end{center}
        193  + \caption{The CMD meta model in RDF}
        194  + \label{fig:final-example}
        195  + \end{figure*}
  190   196
  191   197    \subsection{CMD profile and component definitions}
  192   198    These top-level classes and properties are subsequently used for modelling the actual profiles, components and elements as they are defined in the CR.
  193        - For stand-alone components, the IRI is the exact path into the CR to get the RDF representation for the profile/component\furl{http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/components/clarin.eu:cr1:c\_1271859438125/rdf}. For ``inner'' components (that are defined as part of another component) and elements the identifier is a concatenation of the nearest ancestor stand-alone component's IRI and the dot-path to given component/element (e.g., Actor:\\ \code{cr:clarin.eu:cr1:c\_1271859438197/rdf\#Actor.Actor\_Languages.Actor\_Language}\footnote{For the sake of readability, in the examples we will collapse the component IRIs, referring to them just by their name, prefixed with \code{cmd:}})
  194        -
  195        - \begin{example3}
  196        - cmd:collection& a & cmdm:Profile; \\
  197        - & rdfs:label & "collection"; \\
  198        - & dcterms:identifier & cr:clarin.eu:cr1:p\_1345561703620. \\
  199        - cmd:Actor & a &cmdm:Component. \\
  200        - \end{example3}
        199  + For stand-alone components, the IRI is the exact path into the CR to get the RDF representation for the profile/component\furl{http://catalog.clarin.eu/ds/ComponentRegistry/rest/registry/components/clarin.eu:cr1:c\_1271859438125/rdf}.
             + For ``inner'' components (that are defined as part of another component) and elements the identifier is a concatenation of the nearest ancestor stand-alone component's IRI and the dot-path to given component/element (e.g., Actor:\\ \code{cr:clarin.eu:cr1:c\_1271859438197/rdf \#Actor.Actor\_Languages.Actor\_Language}\footnote{For the sake of readability, in the examples we will collapse the component IRIs, referring to them just by their name, prefixed with \code{cmd:}})
        200  +
        201  + \begin{example2}
        202  + cmd:collection \\
        203  + $\;$ a & cmdm:Profile; \\
        204  + $\;$ rdfs:label & "collection"; \\
        205  + $\;$ dc:identifier & cr:clarin.eu:cr1:p\_1345561703620. \\
        206  + cmd:Actor \\
        207  + $\;$ a &cmdm:Component. \\
        208  + \end{example2}
  201   209
  202   210    \subsubsection{Data Categories}
  203   211    The primary concept registry in use by CMDI for its concept links is ISOcat. The recommended approach to link to the data categories is via an annotation property \cite{Windhouwer2012_LDL}.
  204   212
  205        - \begin{example3}
  206        - dcr:datcat & a & owl:AnnotationProperty ; \\
  207        - & rdfs:label & "data category"@en ; \\
        213  + \begin{example2}
        214  + dcr:datcat \\
        215  + $\;$ a & owl:AnnotationProperty ; \\
        216  + $\;$ rdfs:label & "data category"@en ; \\
  208   217    % & rdfs:comment & "This resource is equivalent to this data category."@en ; \\
  209   218    % & skos:note & "The data category should be identified by its PID."@en ; \\
  210        - \end{example3}
        219  + \end{example2}
  211   220
  212   221    Consequently, the \code{@ConceptLink} attribute on CMD elements and components referencing the data category can be modelled as:
  213   222
  214        - \begin{example3}
  215        - cmd:LanguageName & dcr:datcat & isocat:DC-2484. \\
  216        - \end{example3}
        223  + \begin{example2}
        224  + cmd:LanguageName \\
        225  + $\;$ dcr:datcat & isocat:DC-2484. \\
        226  + \end{example2}
  217   227
  218   228    %\subsection{RELcat - Ontological relations}
    …     …
  243   253
  244   254    \begin{example3}
  245        - <lr1> & a & cmdm:Resource; \\
  246        - & cmdm:hasMimeType & "audio/wav". \\
        255  + \textless lr1\textgreater \\
        256  + $\enspace \,$ a & & cmdm:Resource; \\
        257  + \multicolumn{2}{l}{cmdm:hasMimeType } & "audio/wav". \\
  247   258    \end{example3}
  248   259
    …     …
  256   267    (Note, that one MD record can describe multiple resources. This can be also easily accommodated in OpenAnnotation.)
  257   268
  258        - \begin{example3}
  259        - \_:anno1 & a & oa:Annotation ; \\
  260        - & oa:hasTarget & <lr1a>, <lr1b> ; \\
  261        - & oa:hasBody & \_:topComponent1 ; \\
  262        - & oa:motivatedBy & oa:describing . \\
  263        - \end{example3}
        269  + \begin{example2}
        270  + \_:anno1 \\
        271  + $\:$ a & oa:Annotation ; \\
        272  + $\:$ oa:hasTarget & \textless lr1a \textgreater, \textless lr1b\textgreater ; \\
        273  + $\:$ oa:hasBody & \_:topComponent1 ; \\
        274  + $\:$ oa:motivatedBy & oa:describing . \\
        275  + \end{example2}
  264   276
  265   277    \subsubsection{Provenance}
    …     …
  267   279    The information from the CMD record \code{cmd:Header} represents the provenance information about the modelled data.
  268   280
  269        - \begin{example3}
  270        - \_:topComponent1 & dcterms:identifier & <lr1.cmd> ; \\
  271        - & dcterms:creator & \var{\{cmd:MdCreator\}} ; \\
  272        - & dcterms:publisher & <http://clarin.eu> ; \\
  273        - & dcterms:created & \var{\{cmd:MdCreated\}} . \\
  274        - \end{example3}
        281  + \begin{example2}
        282  + \_:topComponent1 \\
        283  + $\:$ dc:identifier & \textless lr1.cmd \textgreater ; \\
        284  + $\:$ dc:creator & \var{\{cmd:MdCreator\}} ; \\
        285  + $\:$ dc:publisher & \textless http://clarin.eu\textgreater ; \\
        286  + $\:$ dc:created & \var{\{cmd:MdCreated\}} . \\
        287  + \end{example2}
  275   288
  276   289    \subsubsection{Collection hierarchy} % ( Resource Proxy - IsPartOf)}
    …     …
  280   293
  281   294    \begin{example3}
  282        - <lr0.cmd>& a & ore:ResourceMap . \\
  283        - <lr0.cmd> & ore:describes & <lr0.agg>. \\
  284        - <lr0.agg>& a & ore:Aggregation ; \\
  285        - & ore:aggregates & <lr1.cmd>, <lr2.cmd>. \\
        295  + \textless lr0.cmd \textgreater & a & ore:ResourceMap . \\
        296  + \textless lr0.cmd\textgreater & ore:describes & \textless lr0.agg\textgreater . \\
        297  + \textless lr0.agg\textgreater & a & ore:Aggregation ; \\
        298  + & ore:aggregates & \textless lr1.cmd\textgreater, \textless lr2.cmd\textgreater . \\
  286   299    \end{example3}
  287   300
    …     …
  304   317    \begin{example3}
  305   318    \_:actor1 & a & cmd:Actor . \\
  306        - \_:actor1lang1 & a & cmd:Actor.Actor\_Language . \\
        319  + \_:actor1lang1 & a & cmd:Actor \\
        320  + & & .Actor\_Language . \\
  307   321    \_:actor1 & cmd:contains & \_:actor1lang1 . \\
  308   322    \end{example3}
    …     …
  318   332    \end{comment}
  319   333
  320        - \subsubsection{Elements, Fields, Values}
  321        - Finally, we want to integrate also the actual field values in the CMD records into the linked data.
  322        - As explained before, CMD elements have to be typed as \code{rdfs:Class}, the actual value expressed as \code{cmds:ElementValue}, and they are related by a \code{cmdm:hasElementValue} property.
  323        -
  324        - While generating triples with literal values seems straightforward, the more challenging but also more valuable aspect is to generate object property triples (predicate \code{cmdm:hasElementEntity}) with the literal values mapped to semantic entities. Following example shows the whole chain of statements from metamodel to literal value and corresponding semantic entity.
  325        -
  326        - The actual mapping process from values to entities is a complex challenging task and will be tackled in more detail in the full paper. The main idea is to find entities in selected reference datasets (controlled vocabularies, ontologies) corresponding to the literal values in the metadata records. The obtained entity identifiers are further used to generate new RDF triples, representing outbound links.
  327   334
        335  + \begin{figure*}
        336  + \begin{center}
  328   337    \begin{example3}
  329   338    cmd:Person & a & cmdm:Component . \\
    …     …
  349   358    <http://www.mpi.nl/> & a & cmd:OrganisationElementEnity .
  350   359    \end{example3}
        360  + \end{center}
        361  + \caption{Chain of statements from metamodel to literal value and corresponding semantic entity}
        362  + \label{fig:final-example}
        363  + \end{figure*}
        364  +
        365  +
        366  + \subsubsection{Elements, Fields, Values}
        367  + Finally, we want to integrate also the actual field values in the CMD records into the linked data.
        368  + As explained before, CMD elements have to be typed as \code{rdfs:Class}, the actual value expressed as \code{cmds:ElementValue}, and they are related by a \code{cmdm:hasElementValue} property.
        369  +
        370  + While generating triples with literal values seems straightforward, the more challenging but also more valuable aspect is to generate object property triples (predicate \code{cmdm:hasElementEntity}) with the literal values mapped to semantic entities. Following example shows the whole chain of statements from metamodel to literal value and corresponding semantic entity.
        371  +
        372  + The actual mapping process from values to entities is a complex challenging task and will be tackled in more detail in the full paper. The main idea is to find entities in selected reference datasets (controlled vocabularies, ontologies) corresponding to the literal values in the metadata records. The obtained entity identifiers are further used to generate new RDF triples, representing outbound links.
  351   373
  352   374    \begin{comment}