FCS Taskforce Video Conference 2015-06-24
- What?
- Extension of FCS, advanced dataview in focus.
- Who?
- Members of the FCS taskforce
- When?
- 24 June 2015: 14.00 - 15.00 CET
- Where?
- FlashMeeting: http://fm.ea-tel.eu/fm/7aec81-40982
Agenda
- Welcome
- DataView(s)
Outcome: Minutes, Decisions and Actions (currently draft)
Main focus today: the Dataview return format
Not many examples apart from Oliver's. We need more.
Matej: are these example formats for the Advanced dataview ad hoc or based on something?
Oliver: the mental model is that one dataview will convey all layers etc. E.g. no separate MSD dataview or explicit elements for some layers. It is a container format.
Pavel: looks nice. Reason behind it?
Any tree? Complex tree view? Multiple layers?
Press stop broadcasting 2-3 secs after you stop talking.
Questions on Hierarchical structure
We need to define a proper scope of work.
Matej: hierarchy is really another complexity level. Start with query annotation layers, and only when we master that move forward.
Pavel: second suggestion: multiple annotations. Fine with forgetting trees for now. Structured/hierarchical attributes: I think they can be nice for ease of understanding, showing that some things belong together.
Matej: what do you mean? Please provide a concrete case example.
Oliver: question of course is, if structured, how would it be structured ;) a client like the aggregator needs to make sense of that
We need to harmonize any structured content. No good solution yet.
Dieter: Pavel: is there a standard way for that in the universal dependencies?
Matej: If we go this route we need to come up with even more controlled vocabularies, like for POS the UD-17 we decided on.
Jörg: Not every endpoint will provide UD-17. We are not going to retag all corpora.
Pavel: I agree hierarchical attributes are too complex for now.
Jörg: bidirectional translation is needed. LJO: Sometimes mappings cannot be two-way, but preferably yes.
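A minimal sketch of why a tagset mapping may not be two-way. It assumes, purely for illustration, an endpoint tagged with a few Penn Treebank tags that are mapped to UD POS tags; the tag selection and the helper names `invert` and `is_two_way` are hypothetical and not part of any FCS proposal.

```python
# Hypothetical, incomplete mapping from a few Penn Treebank tags to UD POS tags.
PTB_TO_UD = {
    "NN": "NOUN",
    "NNS": "NOUN",
    "NNP": "PROPN",
    "NNPS": "PROPN",
    "VB": "VERB",
    "VBD": "VERB",
}


def invert(mapping):
    """Group source tags by target tag; the reverse direction may be one-to-many."""
    reverse = {}
    for source, target in mapping.items():
        reverse.setdefault(target, []).append(source)
    return reverse


def is_two_way(mapping):
    """A mapping is losslessly reversible only if no two source tags share a target."""
    return all(len(sources) == 1 for sources in invert(mapping).values())


UD_TO_PTB = invert(PTB_TO_UD)
```

Here `UD_TO_PTB["NOUN"]` holds both `NN` and `NNS`, so a query for `NOUN` can be expanded to the endpoint's native tags, but a native tag cannot be recovered from `NOUN` alone.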
Dieter: Start with limited list of layers before making a choice on any proposal
Question about first proposal, syllables
Oliver: Finer granularity. In the worst case, every single character.
Dieter: Candidate also for speech or primarily for textual resources?
Basically time-aligned "textual" annotations of the signal.
Dieter: provides concrete example. Silence seems to be best covered by proposal 1. Transcribed speech corpora.
Matej: Not really, proposal 2 is better but needs other unit like time stamps. Generic way to describe atomic units.
Oliver: yes, offsets can well be timestamps
Silence and background noise are interpretations/computed annotations.
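A minimal sketch of the idea that offsets can be timestamps, assuming a layer simply declares which unit its offsets use. The names `Span`, `Layer`, `unit` and `overlapping`, and the unit values `"char"` and `"ms"`, are hypothetical and not taken from either proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # offset in the unit declared by the enclosing layer
    end: int    # exclusive end offset in the same unit
    value: str


@dataclass
class Layer:
    name: str
    unit: str   # hypothetical values: "char" for text, "ms" for time-aligned signals
    spans: list


def overlapping(layer, start, end):
    """Return the spans of a layer that overlap the half-open range [start, end)."""
    return [s for s in layer.spans if s.start < end and s.end > start]


# The same two words, once with character offsets and once with timestamps.
text_words = Layer("word", "char", [Span(0, 4, "good"), Span(5, 12, "morning")])
speech_words = Layer("word", "ms", [Span(0, 480, "good"), Span(480, 900, "morning")])
# Silence as a computed annotation in a layer of its own.
pauses = Layer("pause", "ms", [Span(900, 1500, "silence")])
```

The span structure is identical in both cases; only the declared unit changes how the offsets are read, which is why a client would need that attribute before combining layers.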
Question about combination of dataviews
Allow items in one layer to reference items in another layer. This was discussed at the workshop as one alternative. LJO wants to provide a proposal.
Oliver: non-superior layers, order is not defined in the current proposals
Dieter: I have the impression we are re-inventing formats like EAF: https://corpus1.mpi.nl/media-archive/demo/Ams_Demo/versioning_demo/Annotations/118_fishing2-fire-2011.eaf
LJO action: Add EAF to comparison matrix when we have more examples.
Keep things simple. We need to keep it fairly straightforward and still keep non-textual formats in mind.
We need to work on getting more concrete proposals available. More examples are needed.
The transport format and its data could be transformed to a standard format if needed.
Question about parallel corpora
Adds another annotation layer, potentially translation also
Consensus for actions:
- More discussion is needed.
- Concrete proposals.
- Concrete examples.
Hanna: propose new layers? Mappings are welcome.
Decision: Use second proposal for new examples or another proposal with examples
Dieter: Need attribute for unit.
Question about lexical resources
Matej raises the question about how to handle lexical resources.
Actions summary
- ALL: Toy with the current proposals and examples
- ALL: Provide proposals
- ALL: Provide examples
- LJO: Add page for advanced dataview to Trac where these can be put for reference; still important to use the mailing list for discussion
- LJO: Add EAF to comparison matrix when we have more examples.
- Next meeting in 2-3 weeks, might be FlashMeeting again or Adobe Connect
Provide your examples on this wiki page.
Documents
N/A