Changes between Version 61 and Version 62 of FCS-Specification-ScrapBook


Timestamp:
02/17/14 20:07:15 (10 years ago)
Author:
teckart
Comment:

--

  • FCS-Specification-ScrapBook

    v61 v62  
    767767}}}
    768768     If we do this, it is to be decided whether we want to keep the `<Profile>` element or decide that this information is redundant and that a profile will require a certain set of capabilities. However, if we ditch `<Profile>`, it's not so easy for a Client to decide spot-on what profile is supported by the Endpoint.
     769    * [[teckart|Thomas (ASV)]]: I am not sure if this is a problem, but we could interpret "Capabilities" just as a verbose listing of all capabilities of the endpoint. The `<ed:Profile>` is then just the shorter version, reflecting our definition of what is "basic" and what is "extended". That way our standard aggregator can just work with the profile name, whereas other clients (that don't care how we interpret these terms) could check whether an endpoint supports all the functionality they need.
    769770* [[teckart|Thomas (ASV)]]: The endpoint specification contains information about the supported profile and dataviews. This is no problem for an endpoint with only one resource or homogeneous resources. When we extend the profile by adding information about annotation tiers, this may not be applicable to all provided resources (like an endpoint that provides two corpora, where only one is annotated with POS tags). So maybe some of this information is not specific to the endpoint but to the resource.
    770771 * [[oschonef|Oliver (IDS)]]: My aim was to consider the list of data views as a hint about what data views are available at an Endpoint. The semantics is not that ''every'' resource/collection supports all these data views. It's a little influenced by what SRU does in explain with `<zr:schemaInfo>`. We could keep it this way, remove it, or extend it so that this information can be given on resource/collection level.
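The two readings discussed above (a short profile name versus a verbose capability listing) could coexist roughly as sketched below. This is only an illustration: the capability identifiers, the profile-to-capability mapping, and the function names are hypothetical and are not taken from any specification draft.

```python
# Hypothetical mapping from a profile name to the capabilities it implies.
PROFILE_CAPABILITIES = {
    "basic": {"basic-search"},
    "extended": {"basic-search", "advanced-search"},
}


def supports(endpoint_capabilities, required):
    """True if the endpoint advertises every capability the client needs."""
    return set(required) <= set(endpoint_capabilities)


def derive_profile(endpoint_capabilities):
    """Return the richest known profile covered by the capabilities, or None."""
    caps = set(endpoint_capabilities)
    best = None
    for name, needed in PROFILE_CAPABILITIES.items():
        if needed <= caps:
            if best is None or len(needed) > len(PROFILE_CAPABILITIES[best]):
                best = name
    return best
```

An aggregator would only look at the result of `derive_profile` (or at an explicit profile element), while a third-party client could call `supports` with exactly the capabilities it cares about.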
     
    806807}}}
    807808      What about Collections with Sub-Collections? Would the parent collection not indicate the supported data views, or would it be the union of the supported dataviews of its child-collections? Is there a use-case where collection `http://hdl.handle.net/1` contains more data (= resources) than the union of `http://hdl.handle.net/1/1` and `http://hdl.handle.net/1/2`, i.e. there are resources "in between" the child-collection and its parent? If so, could they have "conflicting" dataviews?
     809     * [[teckart|Thomas (ASV)]]: I am not sure about these "invisible" resources. But in general we could think of this as normal inheritance. Every root collection element could specify the minimal set of supported dataviews for all daughter nodes (or, if missing, it is assumed that all entries in SupportedDataViews are supported). Every node in the sub-collection tree can override this configuration. If we have Hits as a mandatory dataview, then even for otherwise disjoint sets of dataviews in the sub-collections the root collection can at least provide Hits for everything.
    808810    * [[oschonef|Oliver (IDS)]]: Shall we foresee some mechanism for Clients to tell the Endpoint in what data views they are interested? E.g. the Endpoint may support Geolocation, but the Client does not care about or support it and could ask the Endpoint at query time ''to not'' serialize the Geolocation data view (= fewer wasted bytes sent over the network).
     811     * [[teckart|Thomas (ASV)]]: I think that this is a good idea. Maybe in some cases it could also be useful to provide dataviews only if they are explicitly requested. This could allow adding data views that are too "expensive" (computationally or in terms of bandwidth) to generate for every request.
    809812* [[teckart|Thomas (ASV)]]: The current (old) solution for exposing the granularity and structure of supported collections is a multi-stage mechanism: the client queries for the first-level structure (= collections) and can explicitly ask the endpoint to give additional information about the internal structure of these collections (and so on...). This is very helpful for endpoints which support queries on detailed subcollections. The proposed solution above would force the endpoint to expose the complete structure of all provided resources in the explain response, which would lead (for example for the endpoint in Leipzig) to very large responses.
    810813 * [[oschonef|Oliver (IDS)]]: True, but the old approach is overly complex. If the response is large (> 100 MB), so be it. An efficient Endpoint implementation should use a streaming approach when serializing the response, and the Client should not assume that the response will fit in, let's say, 1 MB of memory. If it is hard for the endpoint to compile this list (e.g. it requires complex database queries), it's IMHO again a matter for the endpoint to cache this information (in memory or on disk) and just stream it into the explain response.
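The inheritance idea for collection trees could be resolved as in the following sketch. It assumes a hypothetical in-memory representation (dicts with `pid`, optional `dataviews`, and optional `children` keys); none of these names come from the specification, and the data view identifiers are placeholders.

```python
def effective_dataviews(node, inherited):
    """Map every collection PID in the tree to its effective data views.

    A node that declares its own "dataviews" overrides what it inherits;
    a node without a declaration takes over the set of its parent (or,
    for a root, the endpoint-wide list of supported data views).
    """
    own = node.get("dataviews")
    views = set(own) if own is not None else set(inherited)
    result = {node["pid"]: views}
    for child in node.get("children", []):
        result.update(effective_dataviews(child, views))
    return result


# Hypothetical endpoint-wide list and collection tree.
SUPPORTED = {"hits", "geolocation"}
tree = {
    "pid": "http://hdl.handle.net/1",
    "children": [
        {"pid": "http://hdl.handle.net/1/1", "dataviews": ["hits"]},
        {"pid": "http://hdl.handle.net/1/2"},
    ],
}
resolved = effective_dataviews(tree, SUPPORTED)
```

Here the root and `http://hdl.handle.net/1/2` fall back to the endpoint-wide set, while `http://hdl.handle.net/1/1` narrows it to the Hits data view only.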
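A streaming serialization of the collection structure, as argued for above, might look like the following sketch. The element names (`Collections`, `Collection`) and the `(pid, children)` input shape are assumptions made for illustration only; a real endpoint would feed the generator from its database or cache and write each fragment directly to the response stream.

```python
from xml.sax.saxutils import quoteattr


def stream_collections(collections):
    """Yield XML fragments for an iterable of (pid, children) pairs.

    "children" may itself be a lazy iterable, so the complete tree never
    has to be held in memory at once.
    """
    for pid, children in collections:
        yield f"<Collection pid={quoteattr(pid)}>"
        yield from stream_collections(children)
        yield "</Collection>"


def stream_response(collections):
    """Wrap the streamed collection fragments in a single root element."""
    yield "<Collections>"
    yield from stream_collections(collections)
    yield "</Collections>"
```

A client would consume such a response with an event-based parser instead of building a full document tree, which keeps memory use independent of the response size on both sides.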