Opened 13 years ago
Closed 13 years ago
#61 closed enhancement (fixed)
add a resource type facet
Reported by: | dietuyt | Owned by: | patdui |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | VLO web app | Version: | |
Keywords: | Cc: |
Description
See http://trac.clarin.eu/wiki/CmdiVirtualLanguageObservatory for the mapping from the LRT inventory to this facet, which is still to be created.
For IMDI we should look at the mime types of the linked resources and map them to:
- audio
- video
- text (txt, PDF)
- annotation (EAF, shoebox, toolbox, CHAT)
- image (png, jpg, gif, tiff)
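A minimal sketch of the IMDI mapping listed above, in Python for illustration only (this is not the importer's code). The function name and the annotation mime type strings are assumptions; the ticket names only the formats (EAF, shoebox, toolbox, CHAT), so the exact strings found in IMDI records may differ.

```python
# Hypothetical mime type strings for the annotation formats named in the
# ticket (EAF, shoebox, toolbox, CHAT); adjust to what the records contain.
ANNOTATION_MIME_TYPES = {
    "text/x-eaf+xml",
    "text/x-shoebox-text",
    "text/x-toolbox-text",
    "text/x-chat",
}


def resource_type_from_mime(mime_type):
    """Map a mime type to one of the facet values, or None if unmapped."""
    mime = mime_type.strip().lower()
    # Annotation formats are checked first because they also start with "text/".
    if mime in ANNOTATION_MIME_TYPES:
        return "annotation"
    # PDF counts as text even though its top-level type is "application".
    if mime == "application/pdf":
        return "text"
    top_level = mime.split("/", 1)[0]
    if top_level in {"audio", "video", "text", "image"}:
        return top_level
    return None
```

Checking the annotation set before the generic top-level type keeps EAF and similar files from ending up under "text".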
For OLAC we should map the following values of /CMD/Components/OLAC-DcmiTerms/type[@dcterms-type="DCMIType"]:
(based on http://dublincore.org/documents/dcmi-type-vocabulary/)
Still Image > image
Image > image
Text > text
Sound > audio
Moving Image > video
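The OLAC mapping above could be expressed as a simple lookup table. This is a sketch only: the names `DCMI_TO_FACET` and `normalize_dcmi_type` are hypothetical, and passing unmapped values through unchanged is an assumption about the desired behaviour.

```python
# DCMI Type values from the list above, mapped to the facet values.
DCMI_TO_FACET = {
    "Still Image": "image",
    "Image": "image",
    "Text": "text",
    "Sound": "audio",
    "Moving Image": "video",
}


def normalize_dcmi_type(value):
    """Return the facet value for a DCMI type, or the trimmed input if unmapped."""
    key = value.strip()
    # Assumption: values outside the table are kept as they are in the record.
    return DCMI_TO_FACET.get(key, key)
```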
All these mappings will result in a heterogeneous facet ("Speech corpus" next to "video"), but that is because of the differences in the metadata we are importing.
Attachments (1)
Change History (5)
comment:1 Changed 13 years ago by
comment:2 Changed 13 years ago by
OLAC still needs to be mapped (in the importer), then it is fine :)
Changed 13 years ago by
Attachment: | oai_crdo_fr_crdo000001.xml.cmdi added |
---|
CMDI file with a type element that should be used for the ResourceType facet
comment:3 Changed 13 years ago by
I attached an example file of an OLAC CMDI description where the filename extension cannot be used as guidance and where the only information about the resource type is in <type dcterms-type="DCMIType">Sound</type>.
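As an illustration of picking up that element, here is a small Python sketch using the standard library XML parser. The sample document is made up for this example (it is not the attached file), the namespace shown is an assumption, and the function matches on the element's local name anywhere in the tree rather than on the full /CMD/Components/OLAC-DcmiTerms path.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def dcmi_types(cmdi_xml):
    """Collect the text of all <type dcterms-type="DCMIType"> elements."""
    root = ET.fromstring(cmdi_xml)
    values = []
    for element in root.iter():
        if (local_name(element.tag) == "type"
                and element.get("dcterms-type") == "DCMIType"
                and element.text):
            values.append(element.text.strip())
    return values


# Made-up sample record; the namespace is an assumption for illustration.
SAMPLE = """<CMD xmlns="http://www.clarin.eu/cmd/">
  <Components>
    <OLAC-DcmiTerms>
      <type dcterms-type="DCMIType">Sound</type>
      <type>primary_text</type>
    </OLAC-DcmiTerms>
  </Components>
</CMD>"""

print(dcmi_types(SAMPLE))
```

Only the element carrying the dcterms-type="DCMIType" attribute is returned; the other type element in the sample is skipped.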
comment:4 Changed 13 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
I added the resource type pattern for the OLAC DCMITypes and a postprocessor to map e.g. Still Image to image.
Currently the resourceType facet shows all resourceType values without normalizing them to "real" mimetypes. E.g. 'Lexicon / Knowledge Source' is a value; normalizing it would result in an "unknown type" mimetype. This mimetype is stored with the resource URL. I think this is both good and bad. Bad because we show a list of resource types that look weird, and with the postprocessor we remap some resource types but not all. Good because we show the data that is in the metadata record, and that is perhaps what people will search for. If we should add the normalized mimetype, I propose opening a new ticket for it, because this ticket is getting a little sidetracked.
The value of /CMD/Components/OLAC-DcmiTerms/type[@dcterms-type="DCMIType"]
should be put in the mimetype attribute of the ResourceType element:
<ResourceType mimetype="">Resource</ResourceType>
At least that is what is being used for IMDI now; it is probably a good idea to get that all out when generating the CMDI files.