VLO Taskforce Virtual meeting April 23, 2015

Topics: Recommendations for the VLO-Facets; VLO development; Resource selection; Use Cases; Reports (not documented here)

Participants: Sebastian Drude, Thomas Eckart, Peter Fankhauser, Twan Goosen, Susanne Haaf, Hanna Hedeland, Axel Herold, Jörg Knappen, Dieter van Uytvanck

(1) VLO-WG-Recommendations on facets

Issue: Resource Type vocabulary does not yet include a term for "session/bundle/media session" in spoken corpora
- decision: include term "session" for this (in charge: Susanne --> [update:] done)
Issue: Include "speaker" in the vocabulary as well? (raised by Peter)
- General question: Do we consider a speaker to be a linguistic resource (and hence include the term "speaker" in the Resource Type vocabulary)?
- If not: How do we treat CMDI files documenting speakers? Should they be excluded from the VLO during the harvesting process? Or should they be implicitly "excluded" by not being interpreted for any of the facets?
- Concerns: The Resource Type vocabulary should be kept simple and small, with fairly open terms that are as inclusive as possible; note that in general it is rather difficult to extract the correct resource type from a CMDI file
- (ToDo1:) please vote via email (speaker yes/no?)
Open Issues
- (Susanne) CLARIN-DK: e-mailed them about how they want to be named; (pending:) still waiting for an answer --> [update:] decision is: change from "CLARIN-DK-UCPH" to "CLARIN-DK"
- (Susanne) Modality values: asked F-AG 6 whether they could give feedback on the proposed vocabulary; (pending:) still collecting feedback --> [update:] got feedback from Petra Wagner (F-AG 6)
- (Susanne) Ask Dieter about mimetypes list for linguistic resources:
  - such a list exists, cf. MimetypesForLinguisticResources
  - seems to be a good starting point
  - might need some clarification or, at some point, homogenization, though (e.g. "Open Document" is in there twice; XML is missing completely)
  - (ToDo2:) Should we use this list? If yes, are any changes to it necessary? Add your feedback as a comment to ticket #764
- (Thomas & Peter) Mapping for Availability facet
  - Mapping has been provided by Paweł Kamocki and Erik Ketzan
  - Test instance --> http://aspra11.informatik.uni-leipzig.de:8080/vlo/ or (soon) http://catalog-clarin.esc.rzg.mpg.de/vlo/
  - Mapping
  - Twan: DC5439 should be included in the Recommendations for the Availability facet (in charge: Susanne --> [update:] done)
  - add a (prose) clarification about what Availability values mean (in charge: Susanne --> [update:] done)
Comments on Facet Recommendations
- Dieter:
  - document should be added to the CMDI creation best practices (in charge: Dieter)
  - MD-curation related issues within the document should be passed on to the MD curation taskforce (in charge: Susanne)
  - new tickets based on the Facet Recommendations: #246, #575, #734, #750

New and updated documents
- New version of the VLO-Facets-Recommendations-Document (cf. attachment VLO-AG_RecommendationsOnFacets.pdf)
- Availability vocabulary Mapping
- Draft for Modality vocabulary Mapping

(2) Perspective on VLO development (raised by Twan)

In the longer term, plan smaller steps for VLO development
Aim for quicker and smaller releases

(3) Resources to be harvested for the VLO (raised by Dieter)

The European Library offers its metadata
- plan to selectively import data sets into the VLO
- VLO-WG should help with the selection
- for more information cf. #755
- (ToDo3:) Please take a look at the list of resources; add your comments about which resources to select to ticket #755
Discussion: Do we want to allow libraries to make their metadata available via the VLO at all?
- Jörg: problem of filtering out resources which are not interesting from a linguistic point of view
- possible solution: Maybe somehow reflect that resources originate from libraries by their "collection" name or "resource type"?
Discussion: How do we separate "good metadata" from "bad metadata" within the VLO?
- solution 1: VLO-WG provides restrictions on "good metadata", i.e. on which information should be present; records that do not meet these restrictions will be shown less prominently (ranking); MD-curation taskforce should decide which data are included in the VLO
- solution 2: implement a ranking based on how many facets can be filled by a metadata record
- solution 3: implement a ranking based on whether the recommended vocabulary has been used (all other values will be shown behind that)
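
Solutions 2 and 3 could be combined roughly as in the following sketch. It is only an illustration and not the VLO implementation: the facet names, the recommended terms (apart from "session", which was agreed on above) and the flat record layout are assumptions made for this example.

```python
from typing import Dict, List, Set

# Hypothetical facet list and recommended vocabularies, chosen for
# illustration only; they are not the actual VLO configuration.
FACETS: List[str] = ["resourceType", "modality", "availability", "language"]
RECOMMENDED: Dict[str, Set[str]] = {
    "resourceType": {"session", "text", "corpus"},
    "availability": {"public", "academic", "restricted"},
}

Record = Dict[str, str]  # assumed shape: facet name -> single value


def facet_coverage(record: Record) -> float:
    """Solution 2: share of facets that have a non-empty value."""
    filled = sum(1 for facet in FACETS if record.get(facet, "").strip())
    return filled / len(FACETS)


def vocabulary_conformance(record: Record) -> float:
    """Solution 3: share of controlled facets using a recommended term."""
    hits = sum(
        1
        for facet, terms in RECOMMENDED.items()
        if record.get(facet, "").strip().lower() in terms
    )
    return hits / len(RECOMMENDED)


def rank(records: List[Record]) -> List[Record]:
    """Sort records so that vocabulary-conformant, well-filled ones come first."""
    return sorted(
        records,
        key=lambda r: (vocabulary_conformance(r), facet_coverage(r)),
        reverse=True,
    )
```

Whether vocabulary use should outweigh facet coverage, as the sort key above assumes, is an open design choice that was not decided in the meeting.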

(4) Issue of Use Cases for the VLO (raised by Sebastian)

It would be good if we could collect typical use cases for the VLO
This way it would be easier for us to evaluate whether the VLO performs well enough or not
Use cases could be collected by:
- (1) us; focus:
  - Can I find our center's resources within the VLO using search terms and filters that I consider intuitive?
  - Collect false negatives
- (2) representatives of the discipline specific working groups (for typical research questions coming from their disciplines)
(ToDo4:) Create a short questionnaire that we can all follow --> Who could take care of that?

Last modified on 06/04/15 14:23:10

Attachments (2)

VLO-AG_RecommendationsOnFacets.pdf (265.8 KB) - added by haaf@bbaw.de 9 years ago.
Modality_Mapping.xml (33.5 KB) - added by haaf@bbaw.de 9 years ago.
