Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#387 closed defect (fixed)

CMDI instance not indexed correctly?

Reported by: herold
Owned by: teckart
Priority: minor
Milestone: VLO-2.16
Component: VLO web app
Version:
Keywords:
Cc: teckart

Description

This resource does not seem to be indexed correctly: full-text searches for e.g. "vu-dnc" or "Diachroon" return nothing:

http://catalog.clarin.eu/vlo/?wicket:bookmarkablePage=:eu.clarin.cmdi.vlo.pages.ShowResultPage&fq=dataProvider:CMDI+Providers&fq=organisation:VU+University+Amsterdam&docId=hdl:10032/5f75f714c90bec3b7011dc1139bee418

(I can download the metadata without problems using curl, so the second part of the report below, the CMDI record failing to load, seems to be a browser issue.)

The original report (via vlw@clarin.eu) follows below:

Issue description
Record with title "VU-DNC" cannot be found using full-text search.
It is also impossible to load the full CMDI record.
Your name: Daan Broeder
Your email: daan.broeder@mpi.nl

Change History (6)

comment:1 Changed 11 years ago by DefaultCC Plugin

Cc: teckart added

comment:2 Changed 11 years ago by herold

Status: changed from new to assigned

comment:3 Changed 11 years ago by herold

The metadata fields that are not indexed are the ones defined by the cmdi-generalinfo component (http://catalog.clarin.eu/ds/ComponentRegistry?item=clarin.eu:cr1:p_1320144203804). But this component is pretty common and shouldn't be problematic, should it? Maybe the CMDI file is too large (1.4M)?

comment:4 Changed 11 years ago by teckart

Owner: changed from keeloo to teckart

comment:5 Changed 11 years ago by teckart

Resolution: fixed
Status: changed from assigned to closed

Fixed in trunk (r3750): Fixing indexing problems for large files (ticket #387) by limiting the extracted data for the facet "text" (ResourceProxy elements are now avoided) and increasing the maxFieldSize parameter of the Solr server (thanks to Josef Willenborg for the hint).
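
To illustrate why both changes help, here is a minimal Python sketch. It is not the VLO importer code. It assumes that the Solr limit mentioned above caps how many tokens of a field are indexed, and it models that with a simple slice. The sample XML, the element names other than ResourceProxy, and the functions `extract_text` and `indexed_tokens` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up, heavily shortened CMDI-like record (assumption, not the real file).
SAMPLE = """<CMD xmlns="http://www.clarin.eu/cmd/">
  <Resources>
    <ResourceProxyList>
      <ResourceProxy id="r1"><ResourceRef>http://example.org/file1</ResourceRef></ResourceProxy>
      <ResourceProxy id="r2"><ResourceRef>http://example.org/file2</ResourceRef></ResourceProxy>
    </ResourceProxyList>
  </Resources>
  <Components><GeneralInfo><Name>VU-DNC</Name><Title>Diachroon</Title></GeneralInfo></Components>
</CMD>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{http://...}Name'."""
    return tag.rsplit("}", 1)[-1]


def extract_text(element, skip=("ResourceProxy",)):
    """Collect text from an element tree, skipping subtrees named in `skip`."""
    if local_name(element.tag) in skip:
        return []
    parts = []
    if element.text and element.text.strip():
        parts.append(element.text.strip())
    for child in element:
        parts.extend(extract_text(child, skip))
        if child.tail and child.tail.strip():
            parts.append(child.tail.strip())
    return parts


def indexed_tokens(text, max_field_length):
    """Model an indexer that keeps only the first max_field_length tokens."""
    return text.lower().split()[:max_field_length]


root = ET.fromstring(SAMPLE)
all_text = " ".join(extract_text(root, skip=()))   # old behaviour: everything
filtered_text = " ".join(extract_text(root))       # ResourceProxy skipped

limit = 2  # tiny limit so the effect shows up on the small sample
print("vu-dnc" in indexed_tokens(all_text, limit))       # False
print("vu-dnc" in indexed_tokens(filtered_text, limit))  # True
```

With many ResourceProxy entries at the top of a large file, the descriptive fields can fall beyond the token limit and never reach the index. Skipping those entries shrinks the text, and raising the limit gives the remaining text more room.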

comment:6 Changed 11 years ago by dietuyt

Milestone: set to VLO-2.16