Changeset 4460 for CMDI-Interoperability
- Timestamp: 02/06/14 11:02:56 (10 years ago)
- Location: CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC
- Files: 2 edited
CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.bib
--- CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.bib (r4451)
+++ CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.bib (r4460)
@@ -193,4 +193,19 @@
 }
 
+@STANDARD{ISODIS24622-1_2013,
+title = {Language resource management -- Component Metadata Infrastructure
+-- Part 1: The Component Metadata Model (CMDI-1)},
+organization = {ISO},
+author = {{ISO/DIS 24622-1}},
+type = {International Standard},
+number = {24622-1},
+address = {Geneva, Switzerland},
+year = {2013},
+url = {http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37336},
+owner = {m},
+publisher = {ISO},
+timestamp = {2014.02.05}
+}
+
 @STANDARD{ISO12620:2009,
 title = {Specification of data categories and management of a Data Category
@@ -215,5 +230,5 @@
 
 @INPROCEEDINGS{Joerg2010,
-author = {Brigitte J\'{o}rg and Hans Uszkoreit and Alastair Burt},
+author = {Brigitte J\"{o}rg and Hans Uszkoreit and Alastair Burt},
 title = {LT World: Ontology and Reference Information Portal},
 booktitle = {LREC},
@@ -389,5 +404,5 @@
 
 @INPROCEEDINGS{SchuurmanWindhouwer2011,
-author = {I. Schuurman and M.A. Windhouwer.},
+author = {Ineke Schuurman and Menzo Windhouwer},
 title = {Explicit Semantics for Enriched Documents. What Do ISOcat, RELcat
 and SCHEMAcat Have To Offer?},
@@ -604,20 +619,4 @@
 }
 
-@STANDARD{ISODIS24622-1_2013,
-title = {Language resource management -- Component Metadata Infrastructure -- Part 1: The Component Metadata Model (CMDI-1)},
-organization = {ISO},
-author = {{ISO/DIS 24622-1}},
-type = {International Standard},
-number = {24622-1},
-address = {Geneva, Switzerland},
-year = {2013},
-url = {http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=37336},
-abstract = {},
-owner = {m},
-publisher = {ISO},
-timestamp = {2014.02.05}
-}
-
-
 @comment{jabref-meta: selector_publisher:}
 
CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex
--- CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex (r4451)
+++ CMDI-Interoperability/CMD2RDF/trunk/docs/papers/2014-LREC/CMD2RDF.tex (r4460)
@@ -65,9 +65,9 @@
 \title{From CLARIN Component Metadata to Linked Open Data}
 
-\name{Matej Durco, Menzo Windhouwer}
+\name{Matej \v{D}ur\v{c}o, Menzo Windhouwer}
 
 \address{ Institute for Corpus Linguistics and Text Technology (ICLTT), The Language Archive - DANS \\
 Vienna, Austria, The Hague, The Netherlands \\
-matej.durco@assoc.oeaw.ac.at, menzo.windhouwer@dans.knaw.nl\\}
+matej.durco@oeaw.ac.at, menzo.windhouwer@dans.knaw.nl\\}
 
 \abstract{In the european CLARIN infrastructure a growing number of resources are described with Component Metadata. In this paper we
@@ -81,5 +81,5 @@
 \section{Motivation}
 %
-Although semantic interoperability has been one of the main motivations for CLARIN's Component Metadata Infrastructure (CMDI), until now there has been no work on the obvious -- bringing CMDI to the Semantic Web. We believe that providing the CLARIN CMD records as Linked Open Data (LOD) interlinked with external semantic resources, will open up new dimensions of processing and exploring of the CMD data by employing the power of semantic technologies. In this paper, we lay out how individual parts of the CMD Infrastructure can be expressed in RDF and made ready to be interlinked with existing external semantic resources (ontologies, knowledge bases,vocabularies).
+Although semantic interoperability has been one of the main motivations for CLARIN's Component Metadata Infrastructure (CMDI), until now there has been no work on the obvious -- bringing CMDI to the Semantic Web. We believe that providing the CLARIN CMD records as Linked Open Data (LOD) interlinked with external semantic resources, will open up new dimensions of processing and exploring of the CMD data by employing the power of semantic technologies. In this paper, we lay out how individual parts of the CMD data domain can be expressed in RDF and made ready to be interlinked with existing external semantic resources (ontologies, taxonomies, knowledge bases, vocabularies).
 %This conversion lays a foundation / is groundwork for providing the original dataset as a \emph{Linked Open Data} nucleus within the \emph{Web of Data}\cite{TimBL2006} as well as for real semantic (ontology-driven) search and exploration of the data.
 
@@ -107,5 +107,5 @@
 
 \subsubsection{CMD Profiles }
-Currently 133 public profiles and 772 components are defined in the CR.
+Currently 146 public profiles and 857 components are defined in the CR.
 Next to the `native' ones a number of profiles have been created that implement existing metadata formats, like OLAC/DCMI-terms, TEI Header or the META-SHARE schema.
 %The resulting profiles proof the flexibility/expressi\-vi\-ty of the CMD metamodel.
@@ -119,6 +119,6 @@
 
 The main CLARIN OAI-PMH harvester\footnote{\url{http://catalog.clarin.eu/oai-harvester/}}
-regularly collects records from the -- currently 69 -- providers, all in all over 550.000 records.
-16 of the providers offer CMDI records, the other 53 provide around 140.000 OLAC/DC records, that are converted into the corresponding CMD profile.
+regularly collects records from the -- currently 58 -- providers, all in all over 600.000 records.
+Some 20 of the providers offer CMDI records, the rest provides around 140.000 OLAC/DC records, that are converted into the corresponding CMD profile.
 %Next to these 81.226 original OLAC records, there a few providers offering their OLAC or DCMI-terms records already converted into CMDI, thus all in all OLAC, DCMI-terms records amount to 139.152.
 %On the other hand, some
@@ -312,6 +312,5 @@
 \begin{example3}
 \_:actor1 & a & cmd:Actor . \\
-\_:actor1lang1 & a & cmd:Actor. \\
-& & Actor\_Language . \\
+\_:actor1lang1 & a & cmd:Actor.Language . \\
 \_:actor1 & cmd:contains & \_:actor1lang1 . \\
 \end{example3}
@@ -349,7 +348,7 @@
 \_:org & a & cmd:Person.Organisation ; \\
 & \multicolumn{2}{l}{cmd:hasPerson.OrganisationElementValue \quad 'MPI'\^{}\^{}xs:string ;} \\
-& \multicolumn{2}{l}{ cmd:hasPerson.OrganisationElementEntity \quad <http://www.mpi.nl/>. }\\
-
-<http://www.mpi.nl/>& a & cmd:OrganisationElementEnity .
+& \multicolumn{2}{l}{ cmd:hasPerson.OrganisationElementEntity \quad \textless http://www.mpi.nl/\textgreater . }\\
+
+\textless http://www.mpi.nl/\textgreater & a & cmd:OrganisationElementEnity .
 \end{example3}
 \end{center}
@@ -370,6 +369,4 @@
 \section{Mapping field values to semantic entities}
 \label{sec:values2entities}
-
-\commentx{this is probably definitely too much for one abstract - so we could just anounce the need for this mapping process.}
 
 This task is a prerequisite to be able to express also the CMD instance data in RDF. The main idea is to find entities in selected reference datasets (controlled vocabularies, ontologies) matching the literal values in the metadata records. The obtained entity identifiers are further used to generate new RDF triples, representing outbound links.
@@ -452,5 +449,5 @@
 Of course, language is just one dimension to use for mapping.
 Step by step we will link other categories like countries, geographica, organisations, etc.
-to some of the central nodes of the LOD cloud
+to some of the central nodes of the LOD cloud, like \xne{dbpedia}, \xne{Yago} or \xne{geonames},
 but also to domain-specific semantic resource like the ontology for language technology \xne{LT-World} \cite{Joerg2010} developed at DFKI.
 