Opened 9 years ago

Closed 9 years ago

#762 closed enhancement (fixed)

Advanced search via lucene syntax

Reported by: Twan Goosen
Owned by: Twan Goosen
Priority: major
Milestone: VLO-3.3
Component: VLO web app
Version:
Keywords:
Cc: teckart@informatik.uni-leipzig.de

Description (last modified by Twan Goosen)

The VLO search box should support (limited?) queries using the lucene syntax so that users can perform more advanced searches over the full text index.

See https://wiki.apache.org/solr/DisMaxQParserPlugin#Query_Syntax for a description of the support in Solr.

Change History (11)

comment:1 Changed 9 years ago by DefaultCC Plugin

Cc: teckart@informatik.uni-leipzig.de added

comment:2 Changed 9 years ago by Twan Goosen

Description: modified (diff)

As of VLO 3.3-SNAPSHOT r6284 the DisMax query parser is used. This already gives support for things like

  • exclusion: corpus -hugo (vs corpus)
  • phrases: "north america" (vs north america)

We should probably switch to (or allow switching to) ExtendedDisMax which has support for AND, OR, ...
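To illustrate the difference between the two parsers, here is a minimal sketch of how a request could select one via Solr's `defType` parameter. The endpoint URL and the helper name `build_query_url` are assumptions made for illustration only; they are not taken from the VLO code base.

```python
from urllib.parse import urlencode

# Hypothetical Solr select endpoint; the real VLO endpoint is not given in this ticket.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_query_url(user_query, parser="edismax"):
    """Return a select URL that asks Solr to parse user_query with the given parser."""
    if parser not in ("dismax", "edismax"):
        raise ValueError("parser must be 'dismax' or 'edismax'")
    # defType chooses the query parser; wt=json requests a JSON response.
    params = {"q": user_query, "defType": parser, "wt": "json"}
    return SOLR_SELECT_URL + "?" + urlencode(params)


# Exclusion is already available with DisMax.
print(build_query_url("corpus -hugo", parser="dismax"))
# Boolean operators and grouping need ExtendedDisMax.
print(build_query_url("Berlin AND (german OR deutsch*)"))
```

The sketch only builds the URL and does not contact a server, so it can be tried without a running Solr instance.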

comment:3 Changed 9 years ago by Twan Goosen

ExtendedDisMax is used as of r6300.
TODO: add some form of usage instructions/documentation.

comment:4 Changed 9 years ago by Twan Goosen

Some examples that currently work on the testing version:

  • German AND acquisition (notice collections on top)
  • Berlin AND (german OR deutsch*)
  • modality:*speech* OR modality:*spoken* (notice modality facet)
  • (modality:*speech* OR modality:*spoken*) AND German AND Dutch
  • genre:discourse AND Turkish AND (Dutch OR German)
  • languageCode:"code:swe" AND country:Finland
    • After potential addition of ‘language’ search field: language:Swedish AND country:Finland
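The examples above can also be assembled programmatically. The following is a rough sketch, not part of the VLO; the helper names `field_term`, `all_of` and `any_of` are hypothetical. It shows why a value such as code:swe has to be quoted (the colon would otherwise be read as a field separator), while wildcard values are left unquoted so that the `*` keeps its meaning.

```python
def field_term(field, value):
    """Format a field-restricted term, quoting values that contain spaces or colons."""
    if any(ch in value for ch in " :"):
        # Escape backslashes and double quotes before wrapping the value in quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{field}:"{escaped}"'
    return f"{field}:{value}"


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " AND ".join(clauses)


def any_of(*clauses):
    """Join clauses so that at least one must match; parenthesised for nesting."""
    return "(" + " OR ".join(clauses) + ")"


query = all_of(field_term("languageCode", "code:swe"), field_term("country", "Finland"))
print(query)  # languageCode:"code:swe" AND country:Finland

spoken = any_of(field_term("modality", "*speech*"), field_term("modality", "*spoken*"))
print(all_of(spoken, "German", "Dutch"))
```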

comment:5 Changed 9 years ago by Twan Goosen

Consider adding field aliases for some fields with unfriendly names.

Candidates:

  • languageCode -> lang
  • languageName -> language (#770)
  • resourceClass -> resourceType, type (though currently not queryable)

edismax documentation of the feature that should allow for this: http://wiki.apache.org/solr/ExtendedDisMax#Field_aliasing_.2BAC8_renaming
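As a sketch of how such aliases could be expressed: edismax accepts per-field parameters of the form `f.<alias>.qf=<field>`. The example below passes them as request parameters; whether the VLO sets them per request or as defaults in its Solr configuration is not stated in this ticket, and the helper name `alias_params` is hypothetical.

```python
from urllib.parse import urlencode

# Alias -> index field, taken from the candidates listed above.
FIELD_ALIASES = {"lang": "languageCode", "language": "languageName"}


def alias_params(aliases):
    """Map each alias to an edismax field-aliasing parameter (f.<alias>.qf)."""
    return {f"f.{alias}.qf": field for alias, field in aliases.items()}


params = {"q": "language:Swedish AND country:Finland", "defType": "edismax"}
params.update(alias_params(FIELD_ALIASES))

# URL-encoded query string that could be appended to a Solr select URL.
query_string = urlencode(params)
print(query_string)
```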

comment:6 Changed 9 years ago by Twan Goosen

Alias 'language' for 'languageName' was added, so you can now also do, e.g.,

modality:*spoken* AND language:German AND language:Polish AND NOT language:English

comment:7 Changed 9 years ago by Twan Goosen

NOTE: For the instructions, the tooltip functionality of jQuery UI can be used; see this post for instructions on how to include links (for sample queries).

comment:8 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: new → closed

Added syntax documentation and a link to it with a JS tooltip in [6337:6345/vlo/trunk/vlo-web-app]

comment:9 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: closed → reopened

In current trunk version, the language field search examples with wildcards do not work, and it's also case sensitive. Probably best to make (a copy of) language name a text field instead of String.

comment:10 Changed 9 years ago by Twan Goosen

Owner: set to Twan Goosen
Status: reopened → assigned

comment:11 in reply to:  9 Changed 9 years ago by Twan Goosen

Resolution: fixed
Status: assigned → closed

Replying to twan.goosen@…:

In current trunk version, the language field search examples with wildcards do not work, and it's also case sensitive. Probably best to make (a copy of) language name a text field instead of String.

This works with the current schema.xml (currently importing on http://catalog-clarin.esc.rzg.mpg.de/vlo).
Updated one erroneous example in r6406.
