Changes between Version 4 and Version 5 of MDService Example Queries


Timestamp: 09/13/10 21:35:35 (14 years ago)
Author: vronk
Comment: adding various sample queries, explanations (collections, facets)

  • MDService Example Queries

Collection of real-world queries (not all of them tested yet) and related issues.

Showing both CQL queries and their XPath translations.

MDRepository provides three interfaces:
{{{
  getCollections()
  queryModel()
  searchRetrieve()
}}}

== Basic facets ==
(!) Needs `queryModel` to return values.

 profile overview::
  An important starting question is which profiles are actually used:
{{{
   ?operation=queryModel&q=//CMD/Components
}}}
   However, this returns only the root elements, which may not unambiguously identify the profiles.[[BR]]
   `CMD/Header/MdProfile` should refer to the correct profile, but may not always be filled in (and it requires `queryModel` to return values).
{{{
   ?operation=queryModel&q=//MdProfile
   ?operation=queryModel&q=//CMD/Components/MdProfile
}}}

 language overview::
{{{
   ?operation=queryModel&q=language
}}}
 continent/country::
{{{
   ?operation=queryModel&q=Location
}}}
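
The facet requests above differ only in the value of `q`, so they can be generated from a small table. The following is a minimal sketch, not part of MDService itself. It assumes that the demo endpoint used in the collection link on this page also answers `queryModel` requests; the names `BASE_URL`, `FACETS` and `query_model_url` are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the demo endpoint from the collection link on this page.
BASE_URL = "http://demo.spraakdata.gu.se/clarin/cmd/model/stats"

# Facet name -> value of the q parameter, as listed above.
FACETS = {
    "profile": "//MdProfile",
    "language": "language",
    "continent/country": "Location",
}


def query_model_url(q, base_url=BASE_URL):
    """Return a queryModel request URL for an index name or XPath."""
    # urlencode percent-encodes reserved characters such as '/'.
    return base_url + "?" + urlencode({"operation": "queryModel", "q": q})


if __name__ == "__main__":
    for name, q in FACETS.items():
        print(name, "->", query_model_url(q))
```

Note that `urlencode` also escapes the slashes of an XPath (`//MdProfile` becomes `%2F%2FMdProfile`), which is an equivalent encoding of the raw form shown above.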

== View collections ==
There is a special parameter `collection`, but it is currently unstable, as it relies on eXist's own collection mechanism.
A safe alternative is to use the element `IsPartOf` (full path: `/CMD/Resources/IsPartOfList/IsPartOf`):
{{{
  cql:   IsPartOf = test-hdl:1839/00-0000-0000-0001-494F-5

  xpath: //IsPartOf[.='test-hdl:1839/00-0000-0000-0001-494F-5']
         //IsPartOf[.='clarin-at:aac-test-corpus:sozialismus']
}}}
So to get the whole collection (all resources of the collection), we use `operation=searchRetrieve` and query the element `IsPartOf`:[[BR]]
[http://demo.spraakdata.gu.se/clarin/cmd/model/stats?operation=searchRetrieve&query=//IsPartOf%5B.=%27clarin-at:aac-test-corpus:sozialismus%27%5D collection(clarin-at:aac-test-corpus:sozialismus)]
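
A link like the one above can be built programmatically from a collection handle. This is a minimal sketch under two assumptions: the endpoint and the `query` parameter name are taken from the link above, and the helper names `collection_xpath` and `collection_url` are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the collection link above.
BASE_URL = "http://demo.spraakdata.gu.se/clarin/cmd/model/stats"


def collection_xpath(handle):
    """Return the XPath selecting the IsPartOf elements equal to handle."""
    if "'" in handle:
        # The handle is embedded in a single-quoted XPath string literal.
        raise ValueError("handle must not contain a single quote")
    return f"//IsPartOf[.='{handle}']"


def collection_url(handle, base_url=BASE_URL):
    """Return a searchRetrieve URL listing all resources of a collection."""
    params = {"operation": "searchRetrieve", "query": collection_xpath(handle)}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(collection_url("clarin-at:aac-test-corpus:sozialismus"))
```

The generated URL escapes more characters than the hand-written link (`/`, `:` and `=` are percent-encoded as well), which should decode to the same query on the server side.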

As the `IsPartOf` relation is fully expanded (every resource references every ancestor-collection CMD record via `IsPartOf`, not just its parent), the simple query introduced above returns all descendants, not only the children (returning only the children should actually be the default behaviour). One possible resolution would be to add an attribute `@ancestor-level` or similar that records the depth of the `IsPartOf` relation. Querying just the children would then be accomplished by:
{{{
  //IsPartOf[@ancestor-level=1][.='clarin-at:aac-test-corpus:sozialismus']
}}}
This would even allow a flexible maximum depth of the query with respect to the collection hierarchy.

Querying multiple collections translates to an `OR` query:
{{{
   xpath: //IsPartOf[.='{handle-coll1}' or .='{handle-coll2}']
          //IsPartOf[@ancestor-level=1][.='{handle-coll1}' or .='{handle-coll2}']
}}}
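
The `OR` pattern above generalises to any number of collections. The sketch below only assembles the XPath string. The function name `collections_xpath` is hypothetical, and `@ancestor-level` is the attribute proposed on this page, not something the records are known to carry.

```python
def collections_xpath(handles, ancestor_level=None):
    """Return an XPath matching IsPartOf for any of the given handles.

    ancestor_level adds the proposed (not yet existing) @ancestor-level
    predicate, e.g. 1 to restrict the result to direct children.
    """
    if not handles:
        raise ValueError("at least one collection handle is required")
    for handle in handles:
        if "'" in handle:
            # Handles are embedded in single-quoted XPath string literals.
            raise ValueError("handle must not contain a single quote")
    alternatives = " or ".join(f".='{handle}'" for handle in handles)
    level = "" if ancestor_level is None else f"[@ancestor-level={ancestor_level}]"
    return f"//IsPartOf{level}[{alternatives}]"


if __name__ == "__main__":
    print(collections_xpath(["{handle-coll1}", "{handle-coll2}"]))
    print(collections_xpath(["{handle-coll1}", "{handle-coll2}"], ancestor_level=1))
```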

== Simple Search Clauses ==
     
 Phrase ::
{{{
 Country = "United States" /* not working yet in MDService */
 Description any "Free play in a family context"
 "data collection"
  (: but works in XPath:  :)
  //title[contains(.,'a machine-readable transcription')]
}}}

 comparators ::
   does not work yet
     
== Boolean ==
Boolean CQL queries are easily converted into XPath; they just have to be contextualised with a common ancestor node, like this:
{{{
 //Session[contains(Date,'2005')][contains(.//Description,'participant')]
}}}
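
To make the effect of the common ancestor concrete, the sketch below re-implements the two predicates of the XPath above in plain Python, because the standard library's `xml.etree.ElementTree` does not support `contains()`. The sample record is invented for illustration and is not a real CMD record. The helper follows XPath 1.0 semantics, where a node-set argument is reduced to the string value of its first node.

```python
import xml.etree.ElementTree as ET

# Invented sample data, only meant to exercise the two predicates.
SAMPLE = """
<CMD>
  <Components>
    <Session>
      <Date>2005-03-14</Date>
      <Content><Description>interview with one participant</Description></Content>
    </Session>
    <Session>
      <Date>2005-06-01</Date>
      <Content><Description>word list</Description></Content>
    </Session>
    <Session>
      <Date>2001-11-21</Date>
      <Content><Description>participant diary</Description></Content>
    </Session>
  </Components>
</CMD>
"""


def first_text(element, tag, descendants=False):
    """String value of the first child (or descendant) named tag, '' if none."""
    if descendants:
        node = next(element.iter(tag), None)  # document order, like .//tag
    else:
        node = element.find(tag)              # first direct child, like tag
    return "" if node is None else "".join(node.itertext())


def matching_sessions(root):
    """Emulate //Session[contains(Date,'2005')][contains(.//Description,'participant')]."""
    return [
        session
        for session in root.iter("Session")
        if "2005" in first_text(session, "Date")
        and "participant" in first_text(session, "Description", descendants=True)
    ]


if __name__ == "__main__":
    for session in matching_sessions(ET.fromstring(SAMPLE)):
        print(session.find("Date").text)
```

Both conditions are tested against the same `Session` element, which is what turns the two predicates into an AND: the second session matches only the date and the third only the description, so neither is returned.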
Currently the context node may not be the root node `CMD` itself, because MDRepository searches for the `ancestor::CMD` of the nodes matched by the XPath, and `CMD` has no `CMD` ancestor.
So this will return 0 results (irrespective of the actual conditions):
{{{
 ! //CMD[contains(.//Title,'year')][.//Date='2001-11-21']
}}}
But these would work:
{{{
 //CMD/Components[contains(.//Title,'year')][.//Date='2001-11-21']
 //CMD/*[contains(.//Title,'year')][.//Date='2001-11-21']
 //Components[contains(.//Title,'year')][.//Date='2001-11-21']
}}}
Using `Components` as the starting point is usually best when searching in the actual metadata. However, this does not cover the `/CMD/Header` and `/CMD/Resources` parts of the MD records. This matters especially when constraining the collections with `IsPartOf`:
{{{
 //CMD/*[contains(.//Title,'year')][.//Date='2001-11-21'][.//IsPartOf='{handle-coll1}']
 (: this wouldn't work: :)
 !  //CMD/Components[contains(.//Title,'year')][.//Date='2001-11-21'][.//IsPartOf='{handle-coll1}']
}}}

Boolean queries:
 AND::
{{{
 biling_data: ( ( Date any 2005 ) ) and ( ( Description any participant ) )
 biling_data: ( ( Date any 2005 ) ) and ( ( Description any part ) ) and ( ( Country any United ) )
}}}
 Brackets can be omitted as long as they are not needed for grouping ;). This should work as well:
{{{
 biling_data: Date any 2005 and Description any participant
}}}
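
The AND queries above map onto chained XPath predicates in a mechanical way. The following toy translator illustrates that mapping. It is not the translation MDService actually performs: it handles only `index any term` clauses joined by `and`, treats each term as a plain substring (as the `contains()` examples on this page do), ignores redundant brackets, and expects the query without the `biling_data:` prefix. The name `cql_and_to_xpath` and the default context `Components` are assumptions.

```python
import re

# One clause of the form: <index> any <term>
CLAUSE = re.compile(r"^(\w+)\s+any\s+(.+)$")


def cql_and_to_xpath(cql, context="Components"):
    """Translate 'A any x and B any y' into //context[contains(.//A,'x')][...]."""
    predicates = []
    for raw in re.split(r"\s+and\s+", cql.strip(), flags=re.IGNORECASE):
        clause = raw.strip("() ")  # drop redundant outer brackets and spaces
        match = CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {raw!r}")
        index, term = match.group(1), match.group(2).strip().strip('"')
        if "'" in term:
            # The term is embedded in a single-quoted XPath string literal.
            raise ValueError("term must not contain a single quote")
        predicates.append(f"[contains(.//{index},'{term}')]")
    return f"//{context}" + "".join(predicates)


if __name__ == "__main__":
    print(cql_and_to_xpath("Date any 2005 and Description any participant"))
    print(cql_and_to_xpath(
        "( ( Date any 2005 ) ) and ( ( Description any participant ) )",
        context="Session",
    ))
```

Every clause becomes one predicate on the same context node, so adding a further `and` clause simply appends another predicate.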