MDService Example Queries

A collection of real-world queries (not all of them tested yet) and related issues.

Both the CQL queries and their XPath translations are shown.

MDRepository provides three interfaces:

  getCollections()
  queryModel()
  searchRetrieve()

Basic facets

! Requires queryModel to return values.

profile overview
An important starting question is which profiles are actually used:
   ?operation=queryModel&q=//CMD/Components
However, this returns only the root elements, which may not refer to the profiles unambiguously.
CMD/Header/MdProfile should refer to the correct profile, but it may not always be filled in (and this requires queryModel to return values!).
   ?operation=queryModel&q=//MdProfile
   ?operation=queryModel&q=//CMD/Components/MdProfile
language overview
   ?operation=queryModel&q=language
continent/country
   ?operation=queryModel&q=Location
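
The queryModel requests above can also be assembled programmatically. The following is a minimal Python sketch, not part of MDService itself: the base URL and the function name query_model_url are assumptions made for illustration; only the parameters operation and q are taken from the examples above.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the address of an actual MDService instance.
BASE_URL = "http://example.org/MDService"


def query_model_url(q, base_url=BASE_URL):
    """Build a queryModel request URL for the given q expression."""
    params = {"operation": "queryModel", "q": q}
    # urlencode percent-encodes characters such as '/' in XPath-like expressions.
    return base_url + "?" + urlencode(params)


for q in ("//CMD/Components", "//MdProfile", "language", "Location"):
    print(query_model_url(q))
```

Note that the slashes of path-like expressions are percent-encoded in the resulting URL (//MdProfile becomes %2F%2FMdProfile), whereas the examples above show them unencoded for readability.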

View collections

There is a special parameter collection, but it is currently unstable, as it relies on eXist's own collection mechanism. A safe alternative is to use the element IsPartOf (full path: /CMD/Resources/IsPartOfList/IsPartOf):

  cql:   IsPartOf = test-hdl:1839/00-0000-0000-0001-494F-5

  xpath: //IsPartOf[.='test-hdl:1839/00-0000-0000-0001-494F-5']
         //IsPartOf[.='clarin-at:aac-test-corpus:sozialismus']

So to get the whole collection (all resources of the collection), we use operation=searchRetrieve and query the element IsPartOf:
collection(clarin-at:aac-test-corpus:sozialismus)

As the IsPartOf relation is fully expanded (every resource references every ancestor collection's CMD record via IsPartOf, not just its parent), the simple query introduced above returns all descendants, not only the children, although returning only the children should actually be the default behaviour. One possible resolution would be to add an attribute such as @ancestor-level, which would make it possible to distinguish the depth of the IsPartOf relation. Querying just the children would then be accomplished by:

  //IsPartOf[@ancestor-level=1][.='clarin-at:aac-test-corpus:sozialismus']

This would even allow for a flexible maximum depth of the query with respect to the collection hierarchy.
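
The difference between all descendants and direct children can be tried out on a small sample. The following Python sketch uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath. The sample records, their id attributes, the handles coll:root and coll:sub, and the function name are all made up for illustration, and @ancestor-level is only the attribute proposed above, not something existing records carry.

```python
import xml.etree.ElementTree as ET

# Made-up sample: "child" sits directly in coll:root, "grandchild" sits in
# coll:sub, which itself belongs to coll:root (IsPartOf is fully expanded).
SAMPLE = """<records>
  <CMD id="child">
    <Resources><IsPartOfList>
      <IsPartOf ancestor-level="1">coll:root</IsPartOf>
    </IsPartOfList></Resources>
  </CMD>
  <CMD id="grandchild">
    <Resources><IsPartOfList>
      <IsPartOf ancestor-level="1">coll:sub</IsPartOf>
      <IsPartOf ancestor-level="2">coll:root</IsPartOf>
    </IsPartOfList></Resources>
  </CMD>
</records>"""


def records_in_collection(root, handle, level=None):
    """Return the ids of CMD records that reference handle via IsPartOf.

    With level=None every depth matches; with level=1 only direct children.
    """
    # ElementTree needs the attribute value quoted, unlike [@ancestor-level=1].
    level_pred = f"[@ancestor-level='{level}']" if level is not None else ""
    # The [.='text'] predicate requires Python 3.7 or later.
    path = f".//IsPartOf{level_pred}[.='{handle}']"
    return [cmd.get("id") for cmd in root.iter("CMD") if cmd.find(path) is not None]


root = ET.fromstring(SAMPLE)
print(records_in_collection(root, "coll:root"))     # all descendants
print(records_in_collection(root, "coll:root", 1))  # direct children only
```

Without the level restriction both sample records are returned for coll:root; with level 1 only the record with id "child" remains.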

Querying multiple collections translates to an OR query:

   xpath: //IsPartOf[.='{handle-coll1}' or .='{handle-coll2}']
          //IsPartOf[@ancestor-level=1][.='{handle-coll1}' or .='{handle-coll2}']
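
The pattern for one or several collections can be generated mechanically. A minimal Python sketch follows; the function name is_part_of_xpath is hypothetical, the optional ancestor_level argument corresponds to the proposed (not yet existing) @ancestor-level attribute, and handles containing a single quote are simply rejected rather than escaped.

```python
def is_part_of_xpath(handles, ancestor_level=None):
    """Build an //IsPartOf query matching any of the given collection handles."""
    handles = list(handles)
    if not handles:
        raise ValueError("at least one collection handle is required")
    if any("'" in h for h in handles):
        raise ValueError("handles containing single quotes are not handled in this sketch")
    # Optional restriction to a given depth in the collection hierarchy.
    level = f"[@ancestor-level={ancestor_level}]" if ancestor_level is not None else ""
    # Several collections are combined with 'or' inside one predicate.
    condition = " or ".join(f".='{h}'" for h in handles)
    return f"//IsPartOf{level}[{condition}]"


print(is_part_of_xpath(["clarin-at:aac-test-corpus:sozialismus"]))
print(is_part_of_xpath(["{handle-coll1}", "{handle-coll2}"], ancestor_level=1))
```

The two calls reproduce the single-collection query and the two-collection query with @ancestor-level=1 shown above.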

Simple Search Clauses

Index ANY term
 Title any the
Phrase
 Country = "United States" /* not working yet in MDService*/
 Description any "Free play in a family context"
 "data collection"
  (: but works in XPath:  :)
  //title[contains(.,'a machine-readable transcription')] 
Comparators
do not work yet
 Date < 2005 /* especially tricky because of the date format YYYY-MM-DD (or even others) */
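
One conceivable translation of such a comparison, sketched in Python under the assumption that the values start with a four-digit year (as in YYYY-MM-DD): only the year is compared, converted to a number. The function name is hypothetical, and whether MDService should translate comparators this way is an open question; other date formats would need different handling.

```python
def year_comparison_xpath(index, operator, year):
    """Translate e.g. `Date < 2005` into an XPath test on the leading year."""
    if operator not in ("<", "<=", ">", ">=", "="):
        raise ValueError(f"unsupported operator: {operator!r}")
    # substring(.,1,4) takes the first four characters; number() converts them.
    # Values that do not start with digits become NaN and never match.
    return f"//{index}[number(substring(.,1,4)) {operator} {int(year)}]"


print(year_comparison_xpath("Date", "<", 2005))
```

For the example above this yields //Date[number(substring(.,1,4)) < 2005].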

Boolean

Boolean CQL queries are easily converted into XPath; the conditions just have to be contextualised with a common ancestor node, like this:

 //Session[contains(Date,'2005')][contains(.//Description,'participant')]

Currently this common node may not be the root node CMD itself, because MDRepository searches for the ancestor::CMD of the nodes matched by the XPath, and the root node has no such ancestor. So this will return 0 results (irrespective of the actual conditions):

 ! //CMD[contains(.//Title,'year')][.//Date='2001-11-21']

But this would work:

 //CMD/Components[contains(.//Title,'year')][.//Date='2001-11-21']
 //CMD/*[contains(.//Title,'year')][.//Date='2001-11-21']
 //Components[contains(.//Title,'year')][.//Date='2001-11-21']

Using Components as the starting point is usually best when searching in the actual metadata. However, this does not cover the /CMD/Header and /CMD/Resources parts of the MD records. This matters especially when constraining the collections with IsPartOf:

 //CMD/*[contains(.//Title,'year')][.//Date='2001-11-21'][.//IsPartOf='{handle-coll1}']
 (: this wouldn't work: :)
 !  //CMD/Components[contains(.//Title,'year')][.//Date='2001-11-21'][.//IsPartOf='{handle-coll1}']
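
The choice of context node can be written down as a small rule. The following Python sketch only reproduces the two query patterns given above; the function name contextualise and its arguments are hypothetical, and whether a generated query actually matches depends on the repository.

```python
def contextualise(conditions, collections=()):
    """Attach conditions as predicates to a context node below /CMD.

    //CMD itself is never used as the context, since MDRepository looks for
    the ancestor::CMD of the matched nodes.
    """
    collections = list(collections)
    # Components covers only the metadata proper; IsPartOf lives under
    # /CMD/Resources, so the wildcard child is used when collections are given.
    context = "//CMD/*" if collections else "//CMD/Components"
    predicates = list(conditions)
    if collections:
        predicates.append(" or ".join(f".//IsPartOf='{h}'" for h in collections))
    return context + "".join(f"[{p}]" for p in predicates)


conditions = ["contains(.//Title,'year')", ".//Date='2001-11-21'"]
print(contextualise(conditions))
print(contextualise(conditions, ["{handle-coll1}"]))
```

The first call yields the //CMD/Components form, the second the //CMD/* form with the additional IsPartOf predicate.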

Boolean queries:

AND
 biling_data: ( ( Date any 2005 ) ) and ( ( Description any participant ) )
 biling_data: ( ( Date any 2005 ) ) and ( ( Description any part ) ) and ( ( Country any United ) ) 
Brackets can be omitted wherever they are not needed for grouping ;). This should work as well:
 biling_data: Date any 2005 and Description any participant
AND OR
 biling_data: ( ( Date any 2005 ) or ( Country any United ) ) and ( ( Description any part ) ) 
 //CMD/Components[.//Date[contains(.,'2005')] or .//Country[contains(.,'United')]][.//Description[contains(.,'part')]]
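
The AND/OR translation above follows a regular pattern: each OR group becomes one predicate, and the AND-ed groups become successive predicates on the context node. A minimal Python sketch of that pattern follows. It is not a CQL parser; the function names are hypothetical, only single-term `any` clauses are covered (translated with contains(), as in the example), and terms containing single quotes are rejected.

```python
def any_clause(index, term):
    """XPath test for the CQL clause `index any term`, using contains()."""
    if "'" in term:
        raise ValueError("terms containing single quotes are not handled in this sketch")
    return f".//{index}[contains(.,'{term}')]"


def and_of_ors(groups, context="//CMD/Components"):
    """Translate AND-ed groups of OR-ed (index, term) clauses into one XPath.

    Each inner list is joined with 'or' inside a single predicate; the
    predicates are then chained on the context node, which acts as AND.
    """
    predicates = [
        " or ".join(any_clause(index, term) for index, term in group)
        for group in groups
    ]
    return context + "".join(f"[{p}]" for p in predicates)


# ( ( Date any 2005 ) or ( Country any United ) ) and ( ( Description any part ) )
print(and_of_ors([[("Date", "2005"), ("Country", "United")],
                  [("Description", "part")]]))
```

For the clause groups in the comment, the function produces the same XPath as the hand-written translation above.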
AND NOT
Seems very expensive; the syntax is not certain yet.
 biling_data: ( ( Date any 2005 ) not ( Country = Japan) )

 //CMD[.//imdi-corpus][not(.//ResourceType[.='Metadata'])]
Last modified on 09/13/10 21:35:35