Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#750 closed enhancement (fixed)

Give higher score to metadata files higher in the hierarchy

Reported by: Dieter Van Uytvanck
Owned by: Twan Goosen
Priority: major
Milestone: VLO-3.3
Component: VLO web app
Version:
Keywords:
Cc: Twan Goosen

Description

It would be nice if, when all CMDI files that belong to a single collection are displayed, the topmost files were shown first. Of course this presupposes a hierarchy (via ResourceProxy links of the type Metadata).

It is not trivial to (re)construct such a tree for each collection upon importing, as basically you have to iterate twice over all the records:

  • once on the import
  • then afterwards again over all SOLR documents to assign the level in the hierarchy score

Example for the TLA: DoBeS archive collection:

score 1 = http://hdl.handle.net/1839/00-0000-0000-0001-305B-C@format=cmdi

child of previous CMDI: score 2 = http://hdl.handle.net/1839/00-0000-0000-0014-C52F-7@format=cmdi

child of previous CMDI: score 3 = http://hdl.handle.net/1839/00-0000-0000-0015-485F-2@format=cmdi

This would also help to get (at least partially, for the hierarchies that can correctly be reconstructed based on the harvested CMDIs) ticket:382 addressed.
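The two-pass approach described above can be sketched roughly as follows. This is only an illustration, not the importer's actual code: the function name `hierarchy_levels` and the `children_of` mapping (record handle to the handles referenced by its Metadata-type ResourceProxy links) are assumptions made for the example. The first pass notes which records are referenced as children; the second walks down from the top nodes and assigns each record its level, starting at 1 as in the DoBeS example.

```python
from collections import deque


def hierarchy_levels(children_of):
    """Assign a 1-based hierarchy level to every record reachable from a top node.

    children_of is a hypothetical mapping: record handle -> list of child handles.
    """
    # Pass 1: find every harvested record that some other record points to.
    has_parent = set()
    for rec_id, children in children_of.items():
        for child in children:
            # Ignore links to records that were not harvested, and self-links.
            if child in children_of and child != rec_id:
                has_parent.add(child)

    # Pass 2: top nodes get level 1; walk breadth-first to number the rest.
    levels = {rec_id: 1 for rec_id in children_of if rec_id not in has_parent}
    queue = deque(levels)
    while queue:
        current = queue.popleft()
        for child in children_of[current]:
            if child in children_of and child not in levels:
                levels[child] = levels[current] + 1
                queue.append(child)
    return levels


# The three DoBeS handles from the example above.
ROOT = "http://hdl.handle.net/1839/00-0000-0000-0001-305B-C@format=cmdi"
MID = "http://hdl.handle.net/1839/00-0000-0000-0014-C52F-7@format=cmdi"
LEAF = "http://hdl.handle.net/1839/00-0000-0000-0015-485F-2@format=cmdi"

children_of = {ROOT: [MID], MID: [LEAF], LEAF: []}

for handle, level in hierarchy_levels(children_of).items():
    print(level, handle)
```

Records that are only reachable through a cycle never receive a level in this sketch; a real importer would need to decide how to treat them.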

Change History (9)

comment:1 Changed 9 years ago by DefaultCC Plugin

Cc: Twan Goosen added

comment:2 Changed 9 years ago by Twan Goosen

As a note with respect to the implementation: if the importer stores such a 'hierarchy level' in a field numerically, this can be incorporated in the ranking score (see #575), which would have the desired effect in the search results, although search term relevance will be weighed in as well of course.
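To illustrate the point about relevance still being weighed in, here is a minimal sketch of a ranking that multiplies the search relevance by a factor that shrinks as the hierarchy level grows. The formula, the `strength` parameter and the sample hits are assumptions for illustration only; the actual ranking score from #575 is not shown in this ticket.

```python
def combined_score(relevance, hierarchy_level, strength=1.0):
    """Scale relevance by a boost that is largest for level 1 (hypothetical formula)."""
    return relevance * (1.0 + strength / hierarchy_level)


def rank(hits):
    """Sort (record id, relevance, hierarchy level) tuples by combined score, best first."""
    return sorted(hits, key=lambda hit: combined_score(hit[1], hit[2]), reverse=True)


# Made-up hits: (record id, search term relevance, hierarchy level).
hits = [
    ("child", 2.0, 3),
    ("top", 1.6, 1),
    ("strong-child", 4.5, 2),
]

for record_id, relevance, level in rank(hits):
    print(record_id, round(combined_score(relevance, level), 2))
```

In this example the top-level record overtakes a slightly more relevant child, while a child with much higher relevance still ranks first.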

comment:3 Changed 9 years ago by Twan Goosen

Milestone: VLO-3.2

VLO 3.2 for now; we need to split that milestone up so that the most urgent items (not sure if this should be included) get done first.

comment:4 Changed 9 years ago by Twan Goosen

Milestone: VLO-3.2 → VLO-3.3

Moved a number of existing VLO tickets to 3.3 milestone

comment:5 Changed 9 years ago by teckart@informatik.uni-leipzig.de

Component: VLO importer → VLO web app
Owner: set to Twan Goosen
Status: new → assigned

Support for the Solr field _hierarchyWeight added to vlo_importer and vlo_solr (r6212:6214). The field contains the number of edges to a top node in the CMDI hierarchy (hence a resource that is not part of another resource has a _hierarchyWeight of 0).

comment:6 Changed 9 years ago by Twan Goosen

Boosting of records with a lower '_hierarchyWeight' value has been implemented in the feature branch for #761 (r6278).
Once vlo-ticket761 is reintegrated, this ticket can probably be closed as well.

comment:7 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: assigned → closed

r6284 fixes #761, which includes (inverted) sorting by _hierarchyWeight

comment:8 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: closed → reopened

TODO: Further boosting on basis of the presence of _hasPart field value(s) to distinguish between isolated vertices and hierarchy top levels (both hierarchy weight 0)

comment:9 in reply to:  8 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: reopened → closed

Replying to twan.goosen@…:

TODO: Further boosting on basis of the presence of _hasPart field value(s) to distinguish between isolated vertices and hierarchy top levels (both hierarchy weight 0)

r6287:6289 adds the new _hasPartCount field, which is now used in a boosting function
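How such a boosting function might separate hierarchy top levels from isolated records can be sketched as below. The actual function added in r6287:6289 is not shown in this ticket, so the constants and the `boost` helper are purely assumptions. The `recip` helper mirrors the form of Solr's `recip(x,m,a,b)` function query, which computes a / (m*x + b).

```python
def recip(x, m, a, b):
    """Same form as Solr's recip(x,m,a,b) function query: a / (m*x + b)."""
    return a / (m * x + b)


def boost(hierarchy_weight, has_part_count, part_bonus=0.5):
    """Hypothetical boost: favour low _hierarchyWeight, then records that have parts."""
    weight_boost = recip(hierarchy_weight, 1, 1, 1)  # 1.0 at weight 0, 0.5 at weight 1
    part_boost = 1.0 + (part_bonus if has_part_count > 0 else 0.0)
    return weight_boost * part_boost


# Made-up records: (description, _hierarchyWeight, _hasPartCount).
records = [
    ("hierarchy top level", 0, 5),
    ("isolated record", 0, 0),
    ("intermediate node", 1, 3),
    ("leaf", 1, 0),
]

for description, weight, parts in records:
    print(description, boost(weight, parts))
```

With these assumed constants, a top-level record with parts outranks an isolated record even though both have a hierarchy weight of 0.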

Last edited 9 years ago by Twan Goosen