Notes about MDService and MDBrowser Implementation

(Implementation notes about MDRepository)

The service can be found under: http://clarin.aac.ac.at/MDService2

This page is rather outdated and needs an update. In the meantime, see http://clarin.aac.ac.at/MDService2/docs/htmlpage/info, which is more current.

MDService

MDService is a REST-style web service implemented in Java, currently based on the Apache Struts 2 framework (built around the MVC design pattern). A later migration to Spring is an option. Given the current processing model, which just passes XML streams around, a look at Apache Cocoon (Spring-based since its latest version, 2.2) could also be worthwhile.

Class Diagram of the current implementation of MDService

MDService serves primarily as a proxy: it passes the (possibly recoded) request to the appropriate repositories or registries (source services) and passes the result back. The current implementation even passes the incoming result stream directly back to the user without any further processing (i.e. without knowing anything about the content).

The default response format is the "raw" XML received from the source service. The request can contain a format-parameter. If the server recognizes its value as valid, it applies the appropriate XSLT transformation to the original XML and returns an HTML snippet. This is meant to be used primarily by the MDBrowser.
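The format handling described above can be sketched as a small dispatch step. This is only an illustration in Python, not the service's Java code: the `TRANSFORMS` registry and the `render` and `to_html_list` functions are hypothetical names, and a plain ElementTree rewrite stands in for the real XSLT stylesheets, which are not shown on this page. Returning the raw XML for an unrecognized value is an assumption based on the description above.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional


def to_html_list(raw_xml: str) -> str:
    """Stand-in for an XSLT stylesheet: one <li> per child of the root element."""
    root = ET.fromstring(raw_xml)
    ul = ET.Element("ul")
    for child in root:
        ET.SubElement(ul, "li").text = (child.text or "").strip()
    return ET.tostring(ul, encoding="unicode")


# Hypothetical registry: format value -> transformation.
# Only "htmllist" is sketched; "htmldiv" and "htmltable" would be added the same way.
TRANSFORMS: Dict[str, Callable[[str], str]] = {"htmllist": to_html_list}


def render(raw_xml: str, fmt: Optional[str] = None) -> str:
    """Return the raw XML unless a recognized format value is given."""
    transform = TRANSFORMS.get(fmt) if fmt else None
    if transform is None:
        return raw_xml  # default: pass the source response through untouched
    return transform(raw_xml)
```

Keeping the raw pass-through as the default branch mirrors the proxy role: the service only needs to look at the content when a transformation is explicitly requested.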

Although the primary responsibility of MDService is to serve information from MDRepository, the user's interaction with MDBrowser (especially query formulation) requires information from other services as well. These are: ComponentRegistry, VirtualCollectionRegistry, DataCategoryRegistry and RelationRegistry.

MDBrowser

MDBrowser explained (status 2010-10)

The image is a snapshot of the current version of the MDBrowser together with description of individual features already implemented.

MDBrowser is the user interface to the MDService, currently served by the same web application. It is actually only a boilerplate index.jsp filled dynamically via AJAX (using the jQuery library). Although this approach seems very efficient and easy to work with, we have to stay aware of the distinction and keep the functionalities cleanly separated (modularization) within the application, allowing for easy refactoring at a later stage.

The basic interaction pattern is that the client (JavaScript in the user's browser) requests an XHTML snippet from the server, during initialization or upon user interaction, and places it in a block on the page. For this it uses the REST interface of the web service with the format-parameter set appropriately (it accepts values like htmllist, htmldiv, htmltable). This returns the same information as the default XML-based REST interface, just transformed into XHTML snippets.
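To make the request side concrete, here is a minimal Python sketch of how such a snippet URL could be assembled. The base URL and the three format values are taken from this page; the literal parameter name `format`, the resource names (`terms`, `collections`) and the extra `q` parameter are assumptions for illustration only and may not match the actual REST interface.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://clarin.aac.ac.at/MDService2"          # from this page
SNIPPET_FORMATS = ("htmllist", "htmldiv", "htmltable")   # from this page


def snippet_url(resource: str, fmt: str = "", **params: str) -> str:
    """Build a request URL; `resource` and the extra params are hypothetical."""
    if fmt and fmt not in SNIPPET_FORMATS:
        raise ValueError(f"unknown format: {fmt}")
    query = dict(params)
    if fmt:
        query["format"] = fmt  # assumed parameter name
    url = f"{BASE_URL}/{resource.strip('/')}"
    return f"{url}?{urlencode(query)}" if query else url
```

Without a format value the URL addresses the default XML interface; with one, the same resource is requested as an XHTML snippet.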

Four phases can be distinguished in the interaction between MDBrowser and MDService:

  1. populating the user interface on initialization
  2. contextual help during query building
  3. submit query and retrieve the resulting MDCollection
  4. retrieve one MDRecord

As stated before, MDBrowser also needs information from sources other than MDRepository. For the sake of uniform interaction patterns, and also due to JavaScript's same-origin restrictions on cross-site requests, all the information the browser needs is provided by the MDService acting as a proxy. So whether it is information from ComponentRegistry or VirtualCollectionRegistry, the browser asks for it via dedicated REST interfaces of the MDService, which passes the request through to the appropriate service, optionally caches the received information, applies the transformations and returns the result to the MDBrowser.
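The pass-through, cache, transform sequence described above could look roughly like the following Python sketch. It is not the actual implementation: the `CachingProxy` class, the time-to-live policy and the injected `fetch` and `clock` callables are all assumptions chosen to keep the example self-contained and free of network access.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachingProxy:
    """Fetch from a source service, cache the raw response, transform on the way out."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # hypothetical: performs the request to the source service
        self._ttl = ttl_seconds      # assumed expiry policy; the page only says "primitive cache"
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}

    def get(self, source_url: str,
            transform: Optional[Callable[[str], str]] = None) -> str:
        now = self._clock()
        hit = self._cache.get(source_url)
        if hit is not None and now - hit[0] < self._ttl:
            raw = hit[1]                              # still fresh: reuse the cached response
        else:
            raw = self._fetch(source_url)             # miss or stale: ask the source service
            self._cache[source_url] = (now, raw)
        return transform(raw) if transform else raw
```

Caching the raw response rather than the transformed one means a single cached entry can serve both the XML interface and every snippet format.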

Current status of implemented functionality

MDService currently connects to and provides information from ComponentRegistry, DataCategoryRegistry and, mainly, MetadataRepository. All this information is available by default as XML, or as an HTML snippet which the MDBrowser uses to display these pieces of information to the user. In detail:

collections
  Currently still served from a cached snapshot; see #1 for details.
elements/terms
  A compiled list of the elements used in all the profiles, with information on their usage in the profiles (think inverted index). Currently this lets the user search by a single element irrespective of the various contexts it is used in.
profiles/components/elements
  Select one profile to get all its components and elements listed.
data categories
  DataCats (currently 218) listed from the Metadata-profile#5 from isocat.
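The "inverted index" idea behind the elements/terms list can be illustrated with a short Python sketch. The profile names, the dotted context paths and the `build_term_index` function are made up for the example; the real list is compiled from ComponentRegistry data whose structure is not shown here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_term_index(profiles: Dict[str, List[str]]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each element name to the (profile, context path) pairs where it is used."""
    index: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for profile, element_paths in profiles.items():
        for path in element_paths:
            element = path.rsplit(".", 1)[-1]   # last path segment is the element name
            index[element].append((profile, path))
    return dict(index)


# Hypothetical sample data: profile name -> element paths within that profile.
sample_profiles = {
    "ProfileA": ["Session.Title", "Session.Language"],
    "ProfileB": ["Corpus.Title"],
}
term_index = build_term_index(sample_profiles)
```

Looking up `term_index["Title"]` then yields every context in which the element occurs, which is the information a user needs when searching by one element irrespective of profile.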

The functions are listed here in detail (we are trying out various mechanisms of interaction, so the handling is a bit chaotic, or "diverse"):

  • A basic search functionality is implemented with the following features:
    • multi-selection of collections as a basic filter for the queries (however, the handling of multiple collections does not seem to be solved in MDRepository, so we just send the first one)
    • you need to select a collection before submitting the query
    • simple query input
      • autocomplete in the query input field lists the available elements (does not work perfectly: you have to type at least two characters)
      • boolean queries don't work yet
    • a primitive Queryset for handling/viewing multiple queries in parallel. The queries are listed one below the other, can be expanded and collapsed, and can be closed individually

Sample queries:

 Peter  
 Title any the 
 Language = Unspecified
  • When formulating the query, the user is supported by the selection lists in the menu on the left:
    • clicking on an Element name in Terms
      => opens a bubble with details about the usage of the Element name in various profiles and contexts (based on information from ComponentRegistry)
    • source:MDService2/trunk/MDService2/WebContent/style/imgs/icon_filter_10.png [cmd_filter] in the Terms and Profiles/Components tabs
      => copies the name of the Term into the query input field
    • source:MDService2/trunk/MDService2/WebContent/style/imgs/icon_detail_10.png [cmd_detail] in the Profiles/Components tab
      => opens a detail view of the given subtree, with information from ComponentRegistry and with input boxes on elements (think: searchByProfile). If we want this, it should be integrated into the selection menu; under the result page it is not positioned well.
    • source:MDService2/trunk/MDService2/WebContent/style/imgs/icon_detail_10.png [cmd_detail] in Terms
      => should open statistical usage information for the given Element from MDRepository (currently for a fixed collection, I think olac)
  • clicking on a DatCat in the isocatDCR tab
    => shows detail information about that DatCat in a 'bubble' (floating div)
  • clicking on the position number of a record in the resultset may display the detail view of the record in the floating div (if you didn't already close the annoying thing before). Currently this takes some time and works only for the MD-records which are correctly identified by the MdSelfLink (try collection: silang_data).
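The autocomplete behaviour mentioned in the list above (suggestions only after at least two typed characters) can be sketched in Python as follows. The actual feature runs as JavaScript in the browser; the `suggest` function, the case-insensitive prefix matching and the result limit are assumptions made for this illustration.

```python
from typing import Iterable, List

MIN_CHARS = 2  # the page notes that at least two characters are needed


def suggest(typed: str, elements: Iterable[str], limit: int = 10) -> List[str]:
    """Return element names starting with the typed text (case-insensitive)."""
    prefix = typed.strip().lower()
    if len(prefix) < MIN_CHARS:
        return []  # too short: no suggestions yet
    matches = sorted(e for e in set(elements) if e.lower().startswith(prefix))
    return matches[:limit]
```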

MDService also implements a primitive cache, so the information it serves (from ComponentRegistry, DCR) may not be perfectly up to date.

The whole application is still very unstable, so expect anything. Unfortunately, the full feature set is currently available only in Firefox, with minor annoyances (missing command images) in Chrome and major annoyances in Internet Explorer. This should be easy to resolve, as the underlying JS library claims cross-browser support, but unfortunately we have neglected it until now.

Next steps

  • check the REST interface for consistency
  • feed collections live from MDRepository #1
  • usage statistics about Values in Elements #2
  • enhance query-input (support searchClause (::= index relation term), support boolean) - trying out various input methods
  • enhance response/result:
    • timing
    • paging
    • sorting
    • dynamic columns/fields in the resultset (served from the same terms available for the query)
  • serve MD-record-detail from the cached data of the recordset #3
  • serve IsPartOf-information for MDRecords #7 (option to group by this)
  • connect to VCR to read and register virtual collections
    But how to translate VCs to MDRepository-queries?
  • implement User/Session/User-Profile (based on ComponentRegistry solution)
  • connect to RelationRegistry
  • implement a simple ResourceViewer (just handle the ResourceRefs -> open in iframe or new window)
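The searchClause shape named in the list above (index relation term) can be tried out with a small parser. The following Python sketch handles the three sample queries from this page; the `SearchClause` type and `parse_search_clause` function are hypothetical, the relation set contains only the two relations that appear in the samples, and boolean operators are deliberately not covered.

```python
from typing import NamedTuple, Optional

# Only the relations seen in the sample queries; a real parser would likely know more.
RELATIONS = {"=", "any"}


class SearchClause(NamedTuple):
    index: Optional[str]      # element to search in, or None for a bare term
    relation: Optional[str]
    term: str


def parse_search_clause(query: str) -> SearchClause:
    """Parse 'index relation term'; anything else is treated as a bare term."""
    text = query.strip()
    if not text:
        raise ValueError("empty query")
    parts = text.split(None, 2)   # whitespace-separated, the term may contain spaces
    if len(parts) == 3 and parts[1].lower() in RELATIONS:
        return SearchClause(parts[0], parts[1].lower(), parts[2])
    return SearchClause(None, None, text)
```

Note that this sketch requires whitespace around the relation, so `Language=Unspecified` would be read as a bare term; whether that is desirable is one of the input-method questions still to be tried out.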

Open questions

  • How to interpret a basic one-term query?
    • search any substring in any field
    • exact match on selected fields
  • handling of data categories: they have to be resolved to Element names, as MDRepository is unaware of data categories. This is not exactly true: MDRecords should have the usual schema reference in the root element (at least those produced by Avril), and the schema references the components used via ComponentId, so it should be possible to work with identities (instead of just element names). However, the schema reference is not yet present in the test data and can hardly be guaranteed from other data providers. Furthermore, this would make querying a lot more complex, so we have to find intelligent ways to employ it (e.g. using identity-based querying only when dealing with ambiguous queries).
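The two candidate readings of a one-term query listed above differ noticeably in what they return. The Python sketch below contrasts them on flat records; the record structure, the field names and both function names are invented for the illustration and say nothing about how MDRepository evaluates queries.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]   # hypothetical flat record: field name -> value


def match_any_substring(term: str, records: Iterable[Record]) -> List[Record]:
    """Reading 1: the term may occur as a substring in any field (case-insensitive)."""
    needle = term.lower()
    return [r for r in records if any(needle in v.lower() for v in r.values())]


def match_exact_on_fields(term: str, records: Iterable[Record],
                          fields: Iterable[str]) -> List[Record]:
    """Reading 2: the term must equal the value of one of the selected fields."""
    selected = list(fields)
    return [r for r in records if any(r.get(f) == term for f in selected)]
```

For a term like "Peter", the first reading also finds records that merely mention the name in a title, while the second returns only records where a chosen field holds exactly that value.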

Broader (Long-term) Issues

  • Visual summaries (charts, graphs) of the results or whole repository (Treemap!)
Last modified on 04/23/11 14:08:18
