Virtual meeting on VLO development planning 2020-01-29
- What?
- VLO progress and planning meeting
- Who?
- Can, Dieter, Matej, Tariq, Thomas, Twan
- When?
- 29 January 2020 10:00 - 11:00 CET (based on the Doodle poll)
- Where?
Documents
- Milestones (GitHub)
Other
Agenda
VLO
VLO 4.9
New features/enhancements previously considered for 4.9
- #209: Include metadata as Structured data for datasets
- #36: Conditional facets
- Pilot cases:
- hide multilingual until at least one value has been selected for language
- hide genre until there are fewer than N (=50?) values OR a collection value has been selected
- ...
- #115: Temporal coverage
- (...) a pragmatic approach might be to extract all temporal information and store only the extrema for the whole instance. This would be robust regarding alignment problems (...), but would also create a larger timespan out of multiple discontinuous timespans.
- #189: Display original values
- #140: Allow for mapping based on a profile's concept link
- #169: Migrating uniform maps and other post-processors to value mapping
- #240: Show more precise but user friendly media type for resources
- #141: Use alternative labels from vocabularies and maps as synonyms in Solr
- Improved VCR integration
- depends on VCR use cases...
- Improved LRS integration (information exchange, preflight?)
- Try to improve Solr performance + investigate possibilities for scaling up
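The two pilot cases listed under #36 can be read as simple visibility rules. The following is only a minimal sketch in Python of how such rules might be evaluated, not the VLO's implementation: the function name, the argument shapes and the constant are hypothetical, and the threshold of 50 is the tentative N from the agenda.

```python
from typing import Dict, Set

# Tentative threshold N from the agenda ("N(=50?)"); an assumption, not a decision.
GENRE_VALUE_LIMIT = 50


def is_facet_visible(
    facet: str,
    selections: Dict[str, Set[str]],
    value_counts: Dict[str, int],
) -> bool:
    """Decide whether a facet should be shown (hypothetical helper).

    selections   maps facet name -> currently selected values
    value_counts maps facet name -> number of distinct values in the result set
    """
    if facet == "multilingual":
        # Hidden until at least one language value has been selected.
        return bool(selections.get("language"))
    if facet == "genre":
        # Hidden until fewer than N values remain OR a collection is selected.
        few_values = value_counts.get("genre", 0) < GENRE_VALUE_LIMIT
        return few_values or bool(selections.get("collection"))
    # All other facets are unconditional in this sketch.
    return True
```

For example, with no selections and 120 genre values both pilot facets stay hidden; selecting a language reveals multilingual, and selecting a collection reveals genre.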
VLO 5.0
Rethink/redesign/reimplement UI
Curation module & link checking
- Development, release and operation of Apache Storm based link checker
- Hosting/operation moved to CLARIN central infra?
- Possible integration of CMDI validator (https://github.com/clarin-eric/cmdi-instance-validator) in curation module?
Planning
- Releases & deployments
- From CLARIN internal roadmap:
- 4.9 ideally before EOSC-hub week (release end of April/early May)
- 5.0 design complete ~end of summer, release around the end of 2020
- Next meeting
Notes
4.9 issues
Issues to include in milestone with highest priority
- #209 "Include metadata as Structured data for datasets"
- Should be included but requires investigation:
- Conditions for showing?
- There has to be a landing page, or another reference to point to as the original resource (which may or may not serve structured data)
- We want to be able to explicitly exclude specific collections
- Minimal information required
- Based on requirements of Dataset profile (see Google)
- Also for non-CLARIN collections?
- Europeana case: check whether they already do this and/or would be ok with us doing this
- #36: Conditional facets
- 'Basic' and clear cases like language -> multilingual could already be included in 4.9
- Some cases like the genre example are more 'aggressive' and can be opaque to the user. To be taken up in the investigation for the re-design (VLO 5.0)
- #115: Temporal coverage
- Round to year and map to extrema (earliest and latest year mentioned). ~19k records have information + many from Europeana.
- Also introduces some UI opportunities/challenges: new type of facet value selector; possibly a date histogram could be shown as well.
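The approach noted for #115 (round to year, keep only the earliest and latest year) can be illustrated with a short sketch. This is an assumption-laden example in Python, not the importer's actual logic: the function name and the regular expression are hypothetical, and it simply treats any standalone four-digit number as a year.

```python
import re
from typing import Iterable, Optional, Tuple

# Hypothetical pattern: a run of exactly four digits, not part of a longer number.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def temporal_extrema(values: Iterable[str]) -> Optional[Tuple[int, int]]:
    """Return (earliest_year, latest_year) over all values, or None if no year is found."""
    years = [int(match) for value in values for match in YEAR_PATTERN.findall(value)]
    if not years:
        return None
    return min(years), max(years)
```

As the agenda already points out, storing only the extrema merges discontinuous timespans: a record mentioning 1950-1955 and 1990 would be indexed as covering 1950 to 1990.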
Issues for 4.9 milestone with lower priority
- #140: Allow for mapping based on a profile's concept link
- #169: Migrating uniform maps and other post-processors to value mapping
- #189: Display original values
- maybe better for 5.0, rethink UI aspects
- risk of sacrificing transparency
- maybe do user testing
- ask Martin to make the case for this?
- What does Europeana do?
- Need to distinguish between types of mapping
- #240: Show more precise but user friendly media type for resources
- Need to find or construct this vocabulary
- Would also be useful for switchboard and VCR
- Also value mapping to join types?
- Would require a separate mapping into separate field
- Solr tuning
- Make an issue + try out things before 4.9
- Prioritise on understanding bottlenecks
- Profiling
Issues to be excluded from 4.9 milestone
- VCR integration
- Probably not right now. Check with Willem
- Existing VCR integration could be taken out?
- LRS integration
- 'Widget' and preflight will not be implemented and available before 4.9 release
- #141: Use alternative labels from vocabularies and maps as synonyms in Solr
- Possibly close permanently
VLO 5.0
More ideas are to be gathered in the design document; more concrete ideas that can be implemented should start crystallising around the summer. Any input is welcome.
Curation module & link checking
Link checker
- Strong suspicion of source of performance issues (update: source confirmed and path to resolution identified)
- New, testable version of 'stormy checker' should be available in about 2 weeks
- To be checked with a development version of the VLO -> #285
Curation module
- Next steps:
- Improve situation with '0' status code
- Can plans to replace status code as success indicator with an explicit status field (ok/broken/undetermined); status code field can become nullable
- Statistics reporting to be updated; probably two levels will be introduced: the first breaks down according to overall status; the second gives details on status codes, response times etc. (similar to what is available now)
- Later: integration of CMDI instance validator into curation module
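To make the planned data model change concrete, here is a rough sketch in Python of an explicit status field with a nullable status code, plus the two-level statistics breakdown. It is illustrative only: the class, field and function names are hypothetical and do not reflect the curation module's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Tuple


class LinkStatus(Enum):
    OK = "ok"
    BROKEN = "broken"
    UNDETERMINED = "undetermined"


@dataclass
class CheckedLink:
    url: str
    status: LinkStatus                 # explicit success indicator
    status_code: Optional[int] = None  # nullable: no artificial '0' needed


def summarise(links: Iterable[CheckedLink]) -> Tuple[Counter, Counter]:
    """Return (first level: counts per overall status,
               second level: counts per (status, status code) pair)."""
    links = list(links)
    overall = Counter(link.status for link in links)
    by_code = Counter((link.status, link.status_code) for link in links)
    return overall, by_code
```

A link that could not be reached at all would then be stored as undetermined with no status code, rather than with a '0' code that has to be interpreted separately.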
Planning
- Status & planning meeting for 4.9 development and roadmap: early March
- Doodle
- chosen date: 11 March 2020
Last modified on 02/05/20 15:08:49