Opened 6 years ago

#1067 new defect

Extreme title duplication in TalkBank collection

Reported by: Twan Goosen
Owned by: matej.durco@oeaw.ac.at
Priority: major
Milestone:
Component: MetadataCuration
Version:
Keywords:
Cc:

Description

As can be seen in the VLO, there is a high degree of name/title duplication in the included TalkBank records. This is due to the values in the 'name' and 'title' fields. For example, two distinct records have

Name: broed
Title: "SamtaleBank Steensig Corpus"

and

Name: traenings_toej
Title: "SamtaleBank Steensig Corpus"

As the VLO picks 'title' over 'name' for display purposes (a somewhat arbitrary convention but we cannot use both), the result is that both records show up with the same name in the VLO.
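
The effect can be illustrated with a minimal sketch. This is not the VLO's actual code: the `display_name` and `duplicated_display_names` helpers and the dictionary layout of the records are hypothetical, and the only behaviour assumed is the one described above, namely that 'title' is preferred over 'name' for display.

```python
from collections import Counter

# Hypothetical records modeled on the two examples quoted in this ticket.
records = [
    {"name": "broed", "title": "SamtaleBank Steensig Corpus"},
    {"name": "traenings_toej", "title": "SamtaleBank Steensig Corpus"},
]


def display_name(record):
    """Return 'title' if present and non-empty, otherwise fall back to 'name'.

    Assumed simplification of the display convention described in the ticket.
    """
    return record.get("title") or record.get("name") or ""


def duplicated_display_names(records):
    """Map each display name shared by more than one record to its count."""
    counts = Counter(display_name(r) for r in records)
    return {label: n for label, n in counts.items() if n > 1}


print(duplicated_display_names(records))
```

Although the two records have distinct 'name' values, both resolve to the same display name because 'title' takes precedence. A check of this kind could also be used on the curation side to flag collections where one title is shared by many records.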

TalkBank could improve the situation by not using 'title' as a place to store (just) the collection name.

Change History (0)
