Opened 8 years ago

Closed 8 years ago

#828 closed enhancement (fixed)

Check if unwanted files filter interferes with the hierarchy extraction

Reported by: teckart@informatik.uni-leipzig.de
Owned by: teckart@informatik.uni-leipzig.de
Priority: minor
Milestone: VLO-3.4
Component: VLO importer
Version:
Keywords:
Cc: Twan Goosen

Description

This could be a problem for CMDI files that contain only Metadata-typed ResourceProxy elements (Importer: MetadataImporter:356).
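To make the concern concrete, here is a minimal Python sketch of how such a filter could remove the intermediate nodes of a hierarchy. It is not the importer's code: the dict-based record model, the filter criterion in `only_metadata_proxies`, and the helper `build_children` are assumptions made purely for illustration.

```python
# Hypothetical, simplified model: self link -> list of (resource type, target).
harvest = {
    "root": [("Metadata", "collection")],
    "collection": [("Metadata", "leaf")],
    "leaf": [("Resource", "http://example.org/audio.wav")],
}


def only_metadata_proxies(proxies):
    """Assumed filter criterion: every ResourceProxy is of type Metadata."""
    return bool(proxies) and all(rtype == "Metadata" for rtype, _ in proxies)


def build_children(records):
    """Link each imported record to the imported records it references."""
    return {
        link: [
            target
            for rtype, target in proxies
            if rtype == "Metadata" and target in records and target != link
        ]
        for link, proxies in records.items()
    }


# With the filter active, "root" and "collection" are never imported.
filtered = {
    link: proxies
    for link, proxies in harvest.items()
    if not only_metadata_proxies(proxies)
}

print(build_children(harvest))   # full parent/child chain
print(build_children(filtered))  # only the leaf remains, without any parent
```

In this toy harvest, only the leaf record survives the filter, so no parent/child relation can be extracted at all.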

Change History (6)

comment:1 Changed 8 years ago by DefaultCC Plugin

Cc: Twan Goosen added

comment:2 Changed 8 years ago by teckart@informatik.uni-leipzig.de

Milestone: VLO-3.4

comment:3 Changed 8 years ago by teckart@informatik.uni-leipzig.de

The VLO test server in Leipzig has now imported all CMDI records, including files containing only Metadata RPs (at http://aspra11.informatik.uni-leipzig.de:8080/vlo).

Previously ignored files from the following centers are now included (Center + Number of now imported files):

  • Deutsches_Textarchiv: 23 (WebLicht web services that only reference their own mdSelfLink)
  • Huygens_Metadata_Repository: 1127 (Example)
  • IDS_Repository: 972 (Example)
  • Meertens_Institute_Metadata_Repository: 1
  • TalkBank: 1
  • The_Language_Archive_s_IMDI_portal: 19 (Example where the absence of the file broke a hierarchy)

I couldn't find any problems after disabling the filter (and didn't expect any). If there are no objections, I will push the commit to the development branch next week.

comment:4 Changed 8 years ago by Twan Goosen

I think that looks pretty good. I suppose that in the case of the Huygens example no hierarchy is shown because the referenced record is not present in the harvest results? I guess there is not much we can do about that other than excluding such records, but then the logic gets a bit too complicated for my taste.

comment:5 Changed 8 years ago by teckart@informatik.uni-leipzig.de

That's right, the only MD-RP points to this file. Dealing with these cases would indeed be somewhat complicated, because we would have to check every imported file of a center after the hierarchy tree is built and then delete the problematic ones.
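For illustration only, the post-import check described here could look roughly like the following Python sketch. It is not part of the importer: the dict-based record model and the function name `find_dangling_metadata_only` are assumptions made for this example. The point is that the check needs the complete set of a center's records, so it can only run as a second pass.

```python
def find_dangling_metadata_only(records):
    """Return self links of records whose Metadata proxies resolve to nothing.

    `records` is a hypothetical model of one center's import:
    self link -> list of (resource type, target) pairs.
    """
    dangling = []
    for self_link, proxies in records.items():
        # Records with a non-Metadata proxy (or no proxies at all) are kept.
        if not proxies or any(rtype != "Metadata" for rtype, _ in proxies):
            continue
        # A target counts only if it was imported and is not the record itself.
        resolvable = [
            target
            for _, target in proxies
            if target in records and target != self_link
        ]
        if not resolvable:
            dangling.append(self_link)
    return dangling


center = {
    "a": [("Metadata", "a")],          # references only its own self link
    "b": [("Metadata", "missing")],    # target not present in the harvest
    "c": [("Metadata", "d")],          # valid parent of "d"
    "d": [("Resource", "http://example.org/text.txt")],
}

print(find_dangling_metadata_only(center))  # ['a', 'b']
```

Even in this reduced form, the check adds a full extra pass over every center, which is the added complexity discussed above.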

comment:6 Changed 8 years ago by teckart@informatik.uni-leipzig.de

Resolution: fixed
Status: new → closed

Pushed to devel (commit)
