source: MDRepository/trunk/xquery/README @ 1080

Last change on this file since 1080 was 1080, checked in by vronk, 13 years ago

minor update

== CLARIN MDRepository ==
Steps to set up and run the repository

0. Prerequisites and installation
        a) Use Java JDK 1.6 (we experienced strange Java errors with 1.5).

        b) Install eXist, following:
           http://exist-db.org/quickstart.html#sect2

           java -jar eXist-{version}.jar -p {install-dir}

        c) Set the admin password.

        d) You may want to give the JVM more memory
           in bin/functions.d/eXist-settings.sh, in set_java_options().

        e) You may also want to grow the cache in conf.xml:
             <db-connection cacheSize="48M" collectionCache="24M" database="native"
           where @cacheSize could be around 512M
           and @collectionCache should be around one third of @cacheSize.
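
The one-third rule of thumb for the cache attributes can be sketched as a small helper. This is only an illustration: the function name cache_settings is hypothetical, and rounding down to whole megabytes is an assumption, not something the eXist configuration requires.

```python
def cache_settings(cache_size_mb: int) -> dict:
    """Return conf.xml attribute values following the one-third rule of thumb.

    Hypothetical helper: collectionCache is taken as one third of cacheSize,
    rounded down to whole megabytes (an assumption made for this sketch).
    """
    collection_cache_mb = cache_size_mb // 3
    return {
        "cacheSize": f"{cache_size_mb}M",
        "collectionCache": f"{collection_cache_mb}M",
    }


if __name__ == "__main__":
    # Print the attribute values for the cacheSize suggested in step 0 e).
    print(cache_settings(512))
```

The resulting values would then be entered by hand into the db-connection element of conf.xml.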

1. Add the scripts to /db/clarin:

                + cmd-model.xqm contains all the logic
                + cmd-model.xql is the script called as the interface
                + groups.xsl
              (+) cmd-stats.xql is meant for testing, but is not integrated yet
              (+) init-cache.xql refreshes the cache by running some long-running (resource-intensive) queries; run it once after each dataset change

2. Add a clarin user in /db/system/users.xml
   (needed for writing into the cache),
   and add /db/clarin/writer.xml naming that user, like this:
   <write>
    <write-user>clarin</write-user>
    <write-user-cred>{PASSWORD}</write-user-cred>
   </write>
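
If the password contains characters such as & or <, writing writer.xml by hand is error-prone. A minimal sketch that generates the file content with proper XML escaping could look as follows. The element names are taken from the example above; the function name build_writer_xml and the sample credentials are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_writer_xml(user: str, password: str) -> str:
    """Build the content of writer.xml as a string.

    Element names follow the README example; ElementTree escapes
    special characters in the text nodes when serializing.
    """
    root = ET.Element("write")
    ET.SubElement(root, "write-user").text = user
    ET.SubElement(root, "write-user-cred").text = password
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder credentials, for illustration only.
    print(build_writer_xml("clarin", "change&me"))
```

The output still has to be stored as /db/clarin/writer.xml in eXist, for example through the admin client.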

3. Create a collection for caching,
        e.g. /db/cache.
        This has to correspond to the entry in cmd-model.xqm:
        declare variable $cmd-model:commonFreqsPath as xs:string := "/db/cache";

        If you change something, you have to clear the cache collection manually.

        Queries on the queryModel and getCollections interfaces are cached.
        The cache key is:
          for getCollections: collection{maxdepth}-{hash({collection-handle})}
          for queryModel:     values{maxdepth}-{hash({simple xpath from q-param})}
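
To make the key layout concrete, here is a minimal sketch that assembles keys of the shape given above. The README does not say which hash function cmd-model.xqm uses, so MD5 is purely an assumption here, and the keys produced by this sketch will not necessarily match the names found in the cache collection. All function names and the sample handle are hypothetical.

```python
import hashlib


def cache_key(prefix: str, maxdepth: int, value: str) -> str:
    """Assemble a key of the form {prefix}{maxdepth}-{hash(value)}.

    ASSUMPTION: MD5 stands in for the unspecified hash function
    used by cmd-model.xqm.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return f"{prefix}{maxdepth}-{digest}"


def collections_key(maxdepth: int, collection_handle: str) -> str:
    """Key shape for the getCollections interface."""
    return cache_key("collection", maxdepth, collection_handle)


def query_model_key(maxdepth: int, simple_xpath: str) -> str:
    """Key shape for the queryModel interface."""
    return cache_key("values", maxdepth, simple_xpath)


if __name__ == "__main__":
    print(collections_key(2, "hdl:0000/example"))  # hypothetical handle
    print(query_model_key(1, "Components"))
```

Because maxdepth is part of the key, the same collection or query requested with a different maxdepth is cached as a separate entry.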

4. Define the indices:
        copy cmdi-mirror.xconf into /db/system/config/db/cmdi-mirror

5. Add data to /db/cmdi-mirror
        The file-system structure will be reflected in the collection structure within eXist;
        however, this is irrelevant for the MDRepository methods.
        Those rely on the linking via handles in the MdSelfLink/ResourceRef and <IsPartOf> elements of the MDRecords.
        The handles in <IsPartOf> are redundant (necessary for faster collection-constraint search)
        and can be derived from the ResourceRef/MdSelfLink link.
        This can be done before storing the data in the repository,
        or after the import, directly in the repository (XUpdate scripts for this will be available soon).

        The top-level collection record is by convention called colleciton_root.cmdi
        and is marked with: <IsPartOf>root</IsPartOf>
        (So every dataset (olac, lrt, imdi) has one such MDRecord.)

6. Depending on your server setup (port), you should be able to run your first query at a URL like:

        http://localhost:8680/exist/rest/db/clarin/cmd-model.xql?q=Components
        (queryModel is the default operation)

        http://localhost:8680/exist/rest/db/clarin/cmd-model.xql?operation=getCollections&collection=

        These queries may take some time when run for the first time, so be patient.
        Avoid starting them multiple times.
        You can check the cache collection to see whether the results are ready.
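
The two example requests can also be issued from a script. The sketch below builds the URLs with the standard library and fetches them with a generous timeout, since the first run may be slow. The base URL is the one from the examples above and depends on your server setup; the function names and the 600-second timeout are assumptions made for this sketch.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL from the README examples; adjust host and port to your setup.
BASE_URL = "http://localhost:8680/exist/rest/db/clarin/cmd-model.xql"


def query_model_url(q: str, base: str = BASE_URL) -> str:
    """URL for the default operation (queryModel) with the q parameter."""
    return f"{base}?{urlencode({'q': q})}"


def get_collections_url(collection: str = "", base: str = BASE_URL) -> str:
    """URL for the getCollections operation; an empty collection is allowed."""
    params = {"operation": "getCollections", "collection": collection}
    return f"{base}?{urlencode(params)}"


def fetch(url: str, timeout: float = 600.0) -> bytes:
    """Fetch a URL once, waiting up to `timeout` seconds (assumed value)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    print(query_model_url("Components"))
    print(get_collections_url())
    # Uncomment once the repository is running:
    # print(fetch(query_model_url("Components"))[:200])
```

Calling fetch only once per URL and waiting for it to return is in line with the advice not to start the same query multiple times.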


== test suite ==
THIS IS CURRENTLY BEING DEVELOPED! NOT SAFELY USABLE YET!

The test suite has its own build file, build-tests.xml,
based on eXist's performance.xml sub-build file;
it imports the main eXist build file.
This causes problems with basedir for the imported build files.

The simplest solution I could find is to set basedir as a property on the command line:

ant -f build-tests.xml -Dbasedir=C:/apps/exist benchmark

The other options are to be set in build-tests.properties!

The actual queries for testing/benchmarking are written in cmd-test.xml