 * [[oschonef|Oliver (IDS)]]: The question is which functionality we want to differentiate. And, of course, we need to encode these features for Clients. I think the basic profile is quite clear about what Endpoints need to support. All the "interesting" bits (query expansion, data cats, etc.) are nowhere near clear enough to be specified yet. I want to do this in the yet-to-be-defined "extended" profile, which could also foresee such a matrix. That could be, e.g., an additional list of `<feature>` elements that would be part of the endpoint description. However, I want to keep the distinction between a basic profile, which people can implement right away, and more advanced stuff. We really need to have a normative spec.
 * [[teckart|Thomas (ASV)]]: I absolutely agree. I just wanted to point out that maybe we can already think about how these features will be encoded (without deciding which features will make it into the future extended specification).
 * [[oschonef|Oliver (IDS)]]: We could already prepare for feature encoding by adding `<feature>` elements to the Endpoint Description. Or better, call them capabilities, so something along the lines of:
   {{{#!xml
   <ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/1.0/endpoint-description">
     <ed:Profile>basic</ed:Profile>
     <ed:Capabilities>
       <!-- capabilities should be identified by closed vocabulary; maybe encoded by URIs? -->
       <ed:Capability>http://clarin.eu/fcs/1.0/feature/basic-search</ed:Capability>
       <!-- actually the next would already be an extended capability beyond the basic profile -->
       <ed:Capability>http://clarin.eu/fcs/1.0/feature/query-expansion</ed:Capability>
     </ed:Capabilities>
     <!-- other EndpointDescription stuff ... -->
   </ed:EndpointDescription>
   }}}
   If we do this, it remains to be decided whether we want to keep the `<Profile>` element or whether we decide that this information is redundant and a profile will require a certain set of capabilities. However, if we ditch `<Profile>`, it's not so easy for a Client to decide spot-on which profile is supported by the Endpoint.
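To illustrate the trade-off, here is a minimal sketch of what a Client would have to do if `<Profile>` were dropped and the profile had to be derived from the advertised capabilities. The `PROFILE_REQUIREMENTS` table is purely hypothetical: the extended profile is not defined yet, so the capability set assigned to it here is only an assumption for the sake of the example.

```python
import xml.etree.ElementTree as ET

ED_NS = "http://clarin.eu/fcs/1.0/endpoint-description"
FEATURE_BASE = "http://clarin.eu/fcs/1.0/feature/"

# Hypothetical mapping of profile name -> capabilities the profile requires.
# The "extended" entry is an assumption; no such profile is specified yet.
PROFILE_REQUIREMENTS = {
    "basic": {FEATURE_BASE + "basic-search"},
    "extended": {FEATURE_BASE + "basic-search", FEATURE_BASE + "query-expansion"},
}


def read_capabilities(xml_text):
    """Collect the text of every <ed:Capability> element as a set of URIs."""
    root = ET.fromstring(xml_text)
    return {
        (cap.text or "").strip()
        for cap in root.iter("{%s}Capability" % ED_NS)
    }


def supported_profiles(capabilities):
    """Return the names of all profiles whose required capabilities are present."""
    return sorted(
        name
        for name, required in PROFILE_REQUIREMENTS.items()
        if required <= capabilities
    )


EXAMPLE = """<ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/1.0/endpoint-description">
  <ed:Capabilities>
    <ed:Capability>http://clarin.eu/fcs/1.0/feature/basic-search</ed:Capability>
    <ed:Capability>http://clarin.eu/fcs/1.0/feature/query-expansion</ed:Capability>
  </ed:Capabilities>
</ed:EndpointDescription>"""

print(supported_profiles(read_capabilities(EXAMPLE)))  # ['basic', 'extended']
```

With an explicit `<Profile>` element the Client could skip the lookup table entirely, which is the convenience argument for keeping it.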
 * [[oschonef|Oliver (IDS)]]: Yes, so do you think we should mark "supported dataviews" per collection/resource? Something along the lines of:
   {{{#!xml
   <ed:EndpointDescription xmlns:ed="http://clarin.eu/fcs/1.0/endpoint-description">
     <ed:Profile>basic</ed:Profile>
     <ed:SupportedDataViews>
       <ed:SupportedDataView id="dv1">application/x-clarin-fcs-hits+xml</ed:SupportedDataView>
       <ed:SupportedDataView id="dv2">application/x-cmdi+xml</ed:SupportedDataView>
       <ed:SupportedDataView id="dv3">image/png</ed:SupportedDataView>
     </ed:SupportedDataViews>
     <ed:Collections>
       <ed:Collection pid="http://hdl.handle.net/4711/0815">
         <!-- NB: regular stuff skipped -->
         <ed:SupportedDataViews ref="dv1" />
       </ed:Collection>
       <ed:Collection pid="http://hdl.handle.net/4711/0816">
         <!-- NB: regular stuff skipped -->
         <ed:SupportedDataViews ref="dv1 dv2" />
       </ed:Collection>
       <ed:Collection pid="http://hdl.handle.net/1">
         <!-- NB: regular stuff skipped -->
         <ed:SupportedDataViews ref="dv1 dv2 dv3" />
         <ed:Collections>
           <ed:Collection pid="http://hdl.handle.net/1/1">
             <!-- NB: regular stuff skipped -->
             <ed:SupportedDataViews ref="dv1 dv2" />
           </ed:Collection>
           <ed:Collection pid="http://hdl.handle.net/1/2">
             <!-- NB: regular stuff skipped -->
             <ed:SupportedDataViews ref="dv1 dv3" />
           </ed:Collection>
         </ed:Collections>
       </ed:Collection>
     </ed:Collections>
   </ed:EndpointDescription>
   }}}
   What about Collections with Sub-Collections? Would the parent collection not indicate the supported data views at all, or would it list the union of the supported data views of its child collections? Is there a use case where collection `http://hdl.handle.net/1` contains more data (= resources) than the union of `http://hdl.handle.net/1/1` and `http://hdl.handle.net/1/2`, i.e. there are resources "in between" the child collections and their parent? If so, could they have "conflicting" data views?
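As one possible reading of the "union" option, the sketch below computes the effective data views of a collection as its own `ref` list (if any) united with those of all its descendants. It assumes the element and attribute names of the proposal above and does not settle whether a parent may carry resources of its own.

```python
import xml.etree.ElementTree as ET

NS = {"ed": "http://clarin.eu/fcs/1.0/endpoint-description"}


def effective_views(collection):
    """Unite a collection's own data view refs with those of all descendants."""
    views = set()
    # Only look at the collection's direct <ed:SupportedDataViews ref="..."/> child.
    own = collection.find("ed:SupportedDataViews", NS)
    if own is not None:
        views.update(own.get("ref", "").split())
    # Recurse into direct sub-collections.
    for child in collection.findall("ed:Collections/ed:Collection", NS):
        views |= effective_views(child)
    return views


# A parent that declares nothing itself; its views follow from the children.
EXAMPLE = """<ed:Collection xmlns:ed="http://clarin.eu/fcs/1.0/endpoint-description"
    pid="http://hdl.handle.net/1">
  <ed:Collections>
    <ed:Collection pid="http://hdl.handle.net/1/1">
      <ed:SupportedDataViews ref="dv1 dv2" />
    </ed:Collection>
    <ed:Collection pid="http://hdl.handle.net/1/2">
      <ed:SupportedDataViews ref="dv1 dv3" />
    </ed:Collection>
  </ed:Collections>
</ed:Collection>"""

parent = ET.fromstring(EXAMPLE)
print(sorted(effective_views(parent)))  # ['dv1', 'dv2', 'dv3']
```

If a parent did hold resources "in between", its own `ref` list would simply add to the union under this reading; a stricter rule (e.g. rejecting a parent that declares fewer views than its children) would need an explicit decision in the spec.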
 * [[oschonef|Oliver (IDS)]]: Shall we foresee some mechanism for Clients to tell the Endpoint which data views they are interested in? E.g. the Endpoint may support Geolocation, but the Client does not care about or support it and could ask the Endpoint at query time ''not to'' serialize the Geolocation data view (= fewer wasted bytes sent over the network).
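One way such a mechanism could look is an extra request parameter on the search request. The sketch below is only an illustration: the parameter name `x-fcs-dataviews`, the comma-separated value format and the `http://example.org/sru` base URL are assumptions made for the example, not something decided in this discussion.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical extra request parameter; name and value format are assumptions.
DATAVIEW_PARAM = "x-fcs-dataviews"


def build_search_url(base_url, query, wanted_views):
    """Client side: build a searchRetrieve URL that names the wanted data views."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": query,
    }
    if wanted_views:
        params[DATAVIEW_PARAM] = ",".join(wanted_views)
    return base_url + "?" + urlencode(params)


def views_to_serialize(url, supported_views):
    """Endpoint side: keep only the supported views the Client asked for.

    If the Client sent no preference, every supported view is serialized.
    """
    requested = parse_qs(urlsplit(url).query).get(DATAVIEW_PARAM)
    if not requested:
        return list(supported_views)
    wanted = set(requested[0].split(","))
    return [view for view in supported_views if view in wanted]


supported = [
    "application/x-clarin-fcs-hits+xml",
    "application/x-cmdi+xml",
    "image/png",
]
url = build_search_url(
    "http://example.org/sru", "cat", ["application/x-clarin-fcs-hits+xml"]
)
print(views_to_serialize(url, supported))
```

Falling back to "all supported views" when the parameter is absent keeps Clients that know nothing about the parameter working unchanged.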
 * [[oschonef|Oliver (IDS)]]: True, but the old approach is overly complex. If the response is large (> 100 MB), so be it. An efficient Endpoint implementation should use a streaming approach when serializing the response, and the Client should not assume that the response will fit in, let's say, 1 MB of memory. If it is hard for the endpoint to compile this list (e.g. it requires complex database queries), it is IMHO again a matter for the endpoint to cache this information (in memory or on disk) and just stream it into the explain response.
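The "cache once, then stream" idea can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the function names, the on-disk cache file and the placeholder handles are all assumptions, and a real Endpoint would write the chunks to the HTTP response instead of joining them.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

ED_NS = "http://clarin.eu/fcs/1.0/endpoint-description"


def serialize_collections(pids):
    """Yield the collection list fragment by fragment instead of as one string."""
    yield "<ed:Collections xmlns:ed=%s>" % quoteattr(ED_NS)
    for pid in pids:
        yield "<ed:Collection pid=%s/>" % quoteattr(pid)
    yield "</ed:Collections>"


def write_cache(path, pids):
    """Run the (possibly expensive) enumeration once and keep the result on disk."""
    with open(path, "w", encoding="utf-8") as out:
        for fragment in serialize_collections(pids):
            out.write(fragment)


def stream_cache(path, chunk_size=64 * 1024):
    """Yield the cached document in fixed-size chunks for the explain response."""
    with open(path, "r", encoding="utf-8") as src:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Placeholder handles standing in for the result of a complex database query.
pids = ("http://hdl.handle.net/4711/%d" % n for n in range(3))
path = os.path.join(tempfile.mkdtemp(), "collections.xml")
write_cache(path, pids)

# Joined here only to check the result; a server would send each chunk as it comes.
body = "".join(stream_cache(path, chunk_size=16))
print(len(ET.fromstring(body)))  # 3
```

Neither side ever needs the whole list in memory: the generator feeds the cache one fragment at a time, and the reader hands out one chunk at a time.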