Opened 6 years ago
#1067 new defect
Extreme title duplication in Talkbank collection
Reported by: | Twan Goosen | Owned by: | matej.durco@oeaw.ac.at |
---|---|---|---|
Priority: | major | Milestone: | |
Component: | MetadataCuration | Version: | |
Keywords: | | Cc: | |
Description
As can be seen in the VLO, there is a high degree of name/title duplication in the included TalkBank records. This is due to the values in the 'name' and 'title' fields. For example, two distinct records have
Name: broed Title: "SamtaleBank Steensig Corpus"
and
Name: traenings_toej Title: "SamtaleBank Steensig Corpus"
As the VLO picks 'title' over 'name' for display purposes (a somewhat arbitrary convention but we cannot use both), the result is that both records show up with the same name in the VLO.
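To illustrate the effect, here is a minimal sketch of a title-over-name display rule applied to the two example records. This is not the VLO's actual code: the function `display_name` and the dictionary layout are hypothetical, and the fallback to 'name' when 'title' is empty is an assumption.

```python
from collections import Counter

# Hypothetical in-memory form of the two example records from this ticket.
records = [
    {"name": "broed", "title": "SamtaleBank Steensig Corpus"},
    {"name": "traenings_toej", "title": "SamtaleBank Steensig Corpus"},
]


def display_name(record):
    """Prefer 'title'; fall back to 'name' only if 'title' is missing or empty.

    The fallback is an assumption made for this sketch.
    """
    return record.get("title") or record.get("name") or ""


# Display label for each record, in the same order as `records`.
labels = [display_name(r) for r in records]

# Labels shared by more than one record, with the number of records using each.
duplicates = {label: n for label, n in Counter(labels).items() if n > 1}

print(labels)
print(duplicates)
```

Both records end up with the same label because the distinguishing value is only in 'name', which is never consulted when 'title' is set.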
TalkBank could improve the situation by not using 'title' as a place to store (just) the collection name.