| 73 | === Wednesday 26 October === |
| 74 | |
| 75 | Joint TF meeting (raw notes: [https://docs.google.com/document/d/1Y2wb4jl32y9uUAo-Z8wlR0v-eLvdkYUYfl7ENtc-UGU/edit?usp=sharing Google Doc]) |
| 76 | |
| 77 | ==== CMDI 1.2 ==== |
| 78 | CMDI 1.2 rolled out - supported in the production environment |
| 79 | CMDI toolkit finished - validate, create, migrate CMD records |
| 80 | Spec is available online |
| 81 | CompRegistry supports 1.2 (and still 1.1), not all 1.2 features are implemented yet. -> Twan working on CR 2.2 with more complete coverage of 1.2 features |
| 82 | support for 1.2 in tools: ?? |
| 83 | COMEDI? (could become the default/recommended CMDI editor instead of Arbil) |
| 84 | Centres partly already providing 1.2 (Leipzig) |
| 85 | |
| 86 | CLAVAS! |
| 87 | CMDI 1.2 supports the CLAVAS vocabularies (currently: Organisations, Language codes; new vocabularies could come in, but need commitment) |
| 88 | candidates for vocabularies: mime-types, licences (manager: legal issues committee?) |
| 89 | CLAVAS based on OpenSKOS provided by Meertens - new version of OpenSKOS coming soon. |
| 90 | governance/versioning/curation of vocabularies is a problem (to solve) |
| 91 | how to edit the vocabularies (google-spreadsheets) |
| 92 | how to deal with wild variants of concepts |
| 93 | Hidden labels - for spelling errors etc; |
| 94 | or separate ConceptSchemes: one with the official labels/concepts, one with all the misspellings, plus mapping relations between the concepts of the different concept schemes |
| 95 | all controlled vocabularies to be used in VLO should be in CLAVAS |
| 96 | |
| 97 | Best practices - a more practical guide to usage of CMDI |
| 98 | Current/previous effort: https://www.clarin.eu/content/cmdi-best-practice-guide |
| 99 | Issues: |
| 100 | Hierarchies |
| 101 | resource type |
| 102 | Input (non-exhaustive) |
| 103 | Existing best practices draft |
| 104 | CLARIN-D VLO-WG output https://trac.clarin.eu/wiki/VLO-Taskforce |
| 105 | Recommendations document |
| 106 | Granularity recommendations |
| 107 | FAQ |
| 108 | Paper by Thorsten Trippel and Claus Zinn ?? |
| 110 | Has to go in line/in sync with curation work and VLO development |
| 111 | |
| 112 | ==== VLO ==== |
| 113 | Brief status update |
| 114 | 4.0: CMDI 1.2 + UI |
| 115 | The VLO's 'Format' facet and the use of MIME types (Hanna) |
| 116 | To improve the 'Format' facet, we probably first need some agreement on its content, i.e. the use of MIME types, in particular for file formats that don't have an unambiguous standard MIME type, such as application/tei+xml or application/xml. In CLARIN-D there has been a decision to try out an approach based on Content-Type syntax: a standard MIME type with additional parameters for disambiguation, cf. MIME format variants for some info and links to the discussion on mailing lists. |
| 117 | VLO ‘format’ facet is fed from resource proxies (120 distinct values) |
| 118 | Also map from mime type component? E.g. in case of zips |
| 119 | Make an (open) vocabulary for media types |
| 120 | Cannot be validated using schema but could be part of the quality assessment procedure |
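A minimal sketch of how the Content-Type approach could be handled on the consuming side: split a value into the standard MIME type (usable as the facet value) and its disambiguating parameters. The parameter name `format-variant` and the sample value are assumptions for illustration only, not the agreed CLARIN-D syntax.

```python
from email.message import Message


def split_media_type(value: str) -> tuple[str, dict[str, str]]:
    """Split a Content-Type style value into (base MIME type, parameters)."""
    msg = Message()
    msg["Content-Type"] = value
    base = msg.get_content_type()          # lower-cased "type/subtype"
    # get_params() returns the type itself as the first pair; skip it.
    params = dict(msg.get_params()[1:])
    return base, params


# Hypothetical value; "format-variant" is an assumed parameter name.
base, params = split_media_type("application/xml; format-variant=example")
print(base, params)
```

Grouping the 'Format' facet on the base type while keeping the parameters for display or quality assessment would be one possible use of such a split.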
| 121 | Helpdesk responsibilities and workflows (Hanna) |
| 122 | Is our (and CLARIN-D's) idea of helpdesk responsibilities and workflows for reports and requests via the VLO OK with everyone involved? |
| 123 | Jan: quite satisfied, usually get an answer |
| 124 | Matej: receive reports, do not always act on them |
| 125 | Ticket can be addressed to multiple people if they are in the queue (or by sending an e-mail) |
| 126 | Define Workflow better between trac-tickets and OTRS-issues (and github-issues) |
| 127 | Hanna will come back to us with a new/updated approach |
| 128 | Priorities and planning (Twan) |
| 129 | https://github.com/clarin-eric/VLO/milestones |
| 130 | Maintenance release (VLO 4.0.2) - update of mappings! |
| 131 | Use GitHub to publish and distribute mappings |
| 132 | November? |
| 133 | Separate meeting in the next few weeks |
| 134 | Next minor release (VLO 4.1) |
| 135 | Use new dynamic mapping retrieval workflow |
| 136 | Various forks of a (to be created) mapping repository |
| 137 | Domain experts and other curators can work on a fork and send pull requests to curation experts |
| 138 | Curation experts can send pull requests to (production/beta/alpha) VLO maintainers |
| 139 | VLO maintainers can maintain testing and production branches/forks |
| 140 | Sync/transfer between CSV, CLAVAS, uniform mapping files (see below) |
| 141 | Q1 2017?? |
| 142 | Development process (Twan) |
| 143 | Code forks/synchronisation |
| 144 | Sustainable procedure for (near) future |
| 145 | |
| 146 | ==== Facet mapping, normalisation, curation module ==== |
| 147 | |
| 148 | Brief status update w.r.t. curation module, related work |
| 149 | Curation module, Curation instance of VLO |
| 150 | https://github.com/acdh-oeaw/vlo-curation |
| 151 | Facet coverage |
| 152 | E.g. resource type: map from profile name - big improvement |
| 153 | Tentative mapping(s) -> how to agree on (precise) application of this approach? |
| 154 | Jan: for organisations, mapping is not enough, fuzzy search is also required, too many spelling variations |
| 155 | So we would need to update the organisation vocabulary, |
| 156 | But a lot could also be achieved with fuzzy search (string normalisation, edit distance) |
| 157 | Reporting of new values (to curators) would be helpful - possibly with automatic (pre)processing on top |
| 158 | [VLO] For certain facets, apply a fuzzy factor by default (for free text search - maybe also facets???) |
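A rough sketch of the fuzzy matching idea mentioned above (string normalisation plus edit distance), e.g. for organisation names. The function names, the sample labels and the threshold of 2 are assumptions chosen for illustration; this is not how the VLO or the curation module currently does it.

```python
import unicodedata


def normalise(name: str) -> str:
    """Case-fold, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_marks.casefold())
    return " ".join(cleaned.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                 # deletion
                               current[j - 1] + 1,              # insertion
                               previous[j - 1] + (ca != cb)))   # substitution
        previous = current
    return previous[-1]


def closest_label(value, vocabulary, max_distance=2):
    """Return the vocabulary label nearest to value, or None if too far."""
    norm = normalise(value)
    best = min(vocabulary,
               key=lambda label: edit_distance(norm, normalise(label)),
               default=None)
    if best is None or edit_distance(norm, normalise(best)) > max_distance:
        return None
    return best


# Hypothetical vocabulary entries and input value.
labels = ["Universität Leipzig", "Meertens Instituut"]
print(closest_label("Universitat Leipzg", labels))
```

Values for which `closest_label` returns `None` would be candidates for the "reporting of new values to curators" step rather than being mapped automatically.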
| 159 | Workflow for application of mapping in production |
| 160 | Use github (with forks) for curating/editing the vocabularies |
| 161 | You can edit csv in github directly |
| 162 | github:vlo-curation/maps/csv |
| 163 | github:vlo-curation/maps/profileName -> resourceType |
| 164 | |
| 165 | |
| 166 | Short term (4.1) |
| 167 | Resource type |
| 168 | CLAVAS vocabulary |
| 169 | Map for profile name -> resource type mapping |
| 170 | People: Jan |
| 171 | Licence/availability |
| 172 | CLAVAS vocabulary |
| 173 | Licence URIs (instead of the handles that are assigned to concepts by OpenSKOS automatically) |
| 174 | Mapping files licence identifier (URI) -> licence name, licence URL |
| 175 | People: Legal Issues Committee, ask CLARINO people to curate/update (Oddrun?) |
| 176 | MIME type |
| 177 | Map ~120 types to descriptions (vocabularies) |
| 178 | Some names/descriptions already provided in “MIME type inventory” document |
| 179 | Map for cross-facet mapping (-> resource type) |
| 180 | Hanna (+SCCTC editorial team of interoperability document?) |
| 181 | How to get from CSV to SKOS? |
| 182 | How to get from SKOS to vlo-map? (fetch whole concept-scheme from CLAVAS and transform via XSL to a map) |
| 183 | Determine/implement transitions between CSV, CLAVAS, Mapping files |
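One possible shape for the CSV -> SKOS -> map transitions, sketched in Python instead of the XSL route mentioned above. The CSV column names (`uri`, `prefLabel`, `hiddenLabels` with `|` as separator), the sample row and the resulting dictionary are assumptions; the actual vlo-map file format and the CLAVAS/OpenSKOS import format are not covered here.

```python
import csv
import io
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical CSV layout and content, for illustration only.
SAMPLE_CSV = """uri,prefLabel,hiddenLabels
http://example.org/licence/cc-by,CC BY,CC-BY|cc by
"""


def csv_to_skos(csv_text: str) -> ET.Element:
    """Build an rdf:RDF element with one skos:Concept per CSV row."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("skos", SKOS)
    root = ET.Element(f"{{{RDF}}}RDF")
    for row in csv.DictReader(io.StringIO(csv_text)):
        concept = ET.SubElement(root, f"{{{SKOS}}}Concept",
                                {f"{{{RDF}}}about": row["uri"]})
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = row["prefLabel"]
        # Empty cells yield no hiddenLabel elements.
        for variant in filter(None, row["hiddenLabels"].split("|")):
            ET.SubElement(concept, f"{{{SKOS}}}hiddenLabel").text = variant
    return root


def skos_to_map(root: ET.Element) -> dict[str, str]:
    """Map every hidden label (variant) to its concept's preferred label."""
    mapping = {}
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        preferred = concept.findtext(f"{{{SKOS}}}prefLabel")
        for hidden in concept.findall(f"{{{SKOS}}}hiddenLabel"):
            mapping[hidden.text] = preferred
    return mapping


skos = csv_to_skos(SAMPLE_CSV)
print(ET.tostring(skos, encoding="unicode"))
print(skos_to_map(skos))
```

Keeping the variants as `skos:hiddenLabel` follows the "hidden labels" option discussed under CLAVAS; with separate concept schemes the second step would read mapping relations instead.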
| 184 | Long term (curation module) |