{{{ #!html

Contents

}}}
[[PageOutline(1-3, , inline)]]
{{{ #!comment Obviously, your page starts below this block }}}

= XSD Schema =

== Preamble ==

The XSD schema is designed according to the following paradigm:

-- There are 7 sorts of resources in DASISH: {{{CachedRepresentation}}}, {{{Source}}}, {{{User}}}, {{{Annotation}}}, {{{Notebook}}}, {{{Lists of Permissions}}}, {{{Lists of Versions}}}.
-- There are 6 XSD types corresponding to the serialisations of all the resource types above except {{{CachedRepresentation}}}. There is no XSD type for {{{CachedRepresentation}}} because a cached representation is a "pure" resource, like an image or a text file, that does not contain any meta-information about itself. The metadata of a cached representation are defined via an instance of {{{CachedRepresentationInfo}}}.
-- Each of these 6 types has an obligatory attribute "URI", which contains the DASISH identifier pointing to the location of the resource on the DASISH server.
-- There are corresponding list-of-references types: {{{CachedRepresentations}}}, {{{Sources}}}, {{{Users}}}, {{{Annotations}}}, {{{Notebooks}}}. Their names are simply the English plural forms of the corresponding types.
-- There are corresponding resource-info types: {{{CachedRepresentationInfo}}}, {{{SourceInfo}}}, {{{UserInfo}}}, {{{AnnotationInfo}}}, {{{NotebookInfo}}}. Each contains a reference to the corresponding resource plus the most important information about that resource.
-- There are corresponding list-of-resource-info types: {{{SourceInfos}}}, {{{UserInfos}}}, {{{AnnotationInfos}}}, {{{NotebookInfos}}}.

There are a number of auxiliary types as well. A commonly used one is ResourceREF, which contains the attribute "ref" of type {{{xs:anyURI}}}. It makes it possible to declare reference elements without mixing them up with resource elements.

=== Handling new (not yet in the DB) sources ===

Adding an annotation whose target sources are not yet in the DB needs special treatment.
This becomes clear when the POST body for a new annotation has to be serialized. Two approaches seem plausible. We will follow the FIRST option.

1) A "strongly-typed" schema. An annotation contains a list of "target" elements. Each of them can be either a source element or a new-source element. This is implemented with the xs:choice construct for elements. A source element and a new-source element differ in one attribute: a source has an obligatory "ref" attribute, and a new source has an obligatory "xml:id" attribute. See [source:DASISH/t5.6/schema/trunk/annotator-schema/src/main/resources/DASISH-schema.xsd DASISH-schema]

2) A "weakly-typed" schema. An annotation contains a list of "target" elements of a single type that has two non-obligatory attributes: "ref" and "xml:id". The check "''at least one of the attributes is present and they are mutually exclusive''" can be left to a later stage, e.g. to Schematron. See [source:DASISH/t5.6/docs/XMLandXSD/DASISH-schema-alternative.xsd DASISH-alternative-xsd].

The link to the second, "weakly-typed", version of the XSD schema is kept for reference; that version is no longer maintained.

= Scenario XML's validated vs the given schema =

See [[DASISH/Scenario]]

== Responding GET api/user/uid ==

{{{#!xml
}}}

== Retrieving annotations ==

=== Responding GET api/annotations?link="http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia"&access=read ===

The root element below is of type {{{AnnotationInfos}}} (a list of {{{AnnotationInfo}}} elements).
{{{#!xml
My client is not in a hurry Nativity Facade Nativity Facade (old site)
}}}

=== Responding GET api/annotations/AIDzzz (example of resolvable target sources) ===

{{{#!xml
Nativity Facade different http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#Nativity_Fa.C3.A7ade 20.04.2013 http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#Passion_Fa.C3.A7ade 20.04.2013
}}}

=== Responding GET api/annotations/AIDzyy (example usage for unresolvable target sources 1) ===

The response for an annotation with unresolved target sources and the response for an annotation with resolved target sources (see above) are both instances of the same schema element. Here, however, the annotation refers to an obsolete version of the page. Having the target source references, the client next asks for the source versions saved in the DB. Finally, having the info about the version in question, the client asks for the cached representations of that version.

{{{#!xml
<?xml version="1.0" encoding="UTF-8"?>
Nativity Facade (old page) different http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#Nativity_Fa.C3.A7ade 2010-01-29T23:59:59 http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#Passion_Fa.C3.A7ade 2010-01-29T23:59:59
}}}

=== Responding GET api/sources/SIDbbb (unresolvable target sources 2) ===

{{{#!xml
}}}

=== Responding GET api/sources/SIDbbb/versions (unresolvable target sources 3) ===

{{{#!xml
2010-01-29T23:59:59 20.04.2013
}}}

=== Responding GET api/sources/SIDbbb/cached/CIDtttt/metadata (unresolvable target sources 4) ===

{{{#!xml
}}}

== Making a new annotation ==

=== Request body for POST api/annotations ===

{{{#!xml
Comapring English and Catalan Wiki History http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#History 20.04.2012 http://ca.wikipedia.org/wiki/Temple_Expiatori_de_la_Sagrada_Fam%C3%ADlia#Hist.C3.B2tia 20.04.2013
}}}

=== Request body for POST api/annotations: another example ===

The serialization of the POST body for another example (UGOT):

{{{#!xml
Douglas Adams - 
Wikipedia, the free encyclopedia Adams was born 1952-03-11 http://en.wikipedia.org/wiki/Douglas_adams#xpointer(start-point(string-range(//div[@id="mw-content-text"]/table[1]/tbody[1]/tr[3]/td[1]/text()[1],'',12))/range-to(string-range(//div[@id="mw-content-text"]/table[1]/tbody[1]/tr[3]/td[1]/text()[1],'',25))) 2013-04-26T11:23:26.000Z
}}}

=== Response body (envelope) for POST api/annotations ===

The temporary id is replaced with the permanent reference. However, no cached representation is found for the Catalan web page. Therefore, the action part of the envelope contains an action CREATE_CACHED_REPRESENTATION whose object is the source for the Catalan web page.

{{{#!xml
Comapring English and Catalan Wiki History http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#History 20.04.2012 http://ca.wikipedia.org/wiki/Temple_Expiatori_de_la_Sagrada_Fam%C3%ADlia#Hist.C3.B2tia 20.04.2013
}}}

The client then sends the metadata of the cached representation in a POST body, together with the cached representation itself. An example of serialized metadata for a cached representation was given above, so it is not repeated here.

== Editing annotation body ==

=== Request: an updated body ===

{{{#!xml
History in English and Catalan
}}}

=== Enveloped respond: new (updated) annotation and a list of actions ===

The list of actions is empty because there are cached representations for all the target sources.

{{{#!xml
Comapring English and Catalan Wiki History in English and Catalan http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia#History 20.04.2012 http://ca.wikipedia.org/wiki/Temple_Expiatori_de_la_Sagrada_Fam%C3%ADlia#Hist.C3.B2tia 20.04.2013
}}}

== Managing permission lists of users ==

=== GET api/annotations/AIDzyy/permissions ===

{{{#!xml
}}}

=== GET api/users/info?email="alecor@mpi.nl" ===

{{{#!xml
}}}

=== PUT api/annotations/AIDzyy/permissions ===

Swapping the rights of the users xyxy and xxzz: xyxy becomes a reader, and xxzz becomes a writer. This is an update w.r.t.
the response to {{{GET api/annotations/AIDzyy/permissions}}} above.

Request body:

{{{#!xml
}}}

Response body: an envelope containing this list and no actions, since all the users are present in the DB.

=== PUT api/annotations/AIDzyy/permissions/UIDagc ===

{{{#!xml
writer
}}}

== Managing Notebooks ==

=== GET api/notebooks ===

{{{#!xml
Gaudi Douglas Adams
}}}

=== GET api/notebooks/NIDxyxy ===

{{{#!xml
Gaudi
}}}

=== GET api/notebooks/NIDxyxy/annotations/ ===

The response is a list of annotation infos, similar to the response to {{{GET api/annotations?link="http://en.wikipedia.org/wiki/Sagrada_Fam%C3%ADlia"&access=read}}}.

= Issues with the schema =

== Possible namespace pollution: Ticket 348 to be discussed with Peter ==

Peter: "It looks like there might be some namespace pollution or some other anomaly that causes the jaxb auto generated classes to omit the getter for notebooks from the ObjectFactory. This is an issue when a jaxb root node is required, such as in the rest interface. A work around has been added which makes it clear where the issue is and why the ObjectFactory is required, but this needs to be replaced when the schema is updated."

== External_id (DataBase) vs URI (schema) vs UUID-based class (Java code) ==

For the time being I treat them as "the same": URI is external_id. Both are strings. Moreover, there is a class "DasishIdentifier" (with a bunch of superclasses, for each resource), extending UUI. It "envelopes" external_id into a UUID.

== Body: must be some serialization/deserialization mechanism ==

-- In the DB, "body" is just text.
-- In the schema-generated class, "body" is a list of objects.

For now, I use the simple "serialize" and "deserialize" procedures from Helpers, which should be replaced by proper marshalling/unmarshalling. For this simple serialization I treat the first element of the list of objects above as text that corresponds to the DB column "body_xml".

== Source ==

Misprint in timeStamp: it is spelled timeSatmp.
== Cached Representation Info ==

Missing in the schema: the attribute/element "where_is_the_file", which points to the location from which the file can be downloaded. It is needed to fulfil GET api/sources//cached//content

== Version ==

MISSING in the schema: the attribute URI (corresponding to external_id in the DB) is absent. Therefore it does not appear in the JAXB-generated class "Version", and the Java class has one attribute fewer than the DB table "version":

{{{
CREATE TABLE version (
    version_id SERIAL UNIQUE NOT NULL,  -- internal id
    external_id UUID UNIQUE NOT NULL,   -- corresponds to the URI attribute missing in the schema
    version text
);
}}}

For now I am using the attribute "version:String" to keep "external_id/URI" in the Java class "Version".

== LISTS of Resources, like "PermissionS" and "CachedRepresentationS", and version-siblingS connected to a particular source cannot be standalone tables in the relational DB ==

According to the schema: a list of cached representations is declared as a standalone resource of type "CachedRepresentations". Every version refers to its own list of cached representations. Every such list has its own ID.

According to the relational database: it looks a bit strange to have such lists. Instead, I have made a common join table (version_id, cached_representation_id). A pair (a, b) is listed in this table iff the version with the internal id "a" has a cached representation with the internal id "b".
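
The join table described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual DASISH DDL: the table name {{{versions_cached_representations}}} and the minimal {{{cached_representation}}} table (including its columns) are hypothetical, and the sketch relies on the "version" table from the "Version" section and on PostgreSQL-style types like the ones used there.

{{{#!sql
-- Hypothetical minimal table of cached representations (columns are assumed).
CREATE TABLE cached_representation (
    cached_representation_id SERIAL UNIQUE NOT NULL,
    external_id UUID UNIQUE NOT NULL
);

-- Hypothetical join table: one row per (version, cached representation) pair.
CREATE TABLE versions_cached_representations (
    version_id integer NOT NULL REFERENCES version(version_id),
    cached_representation_id integer NOT NULL
        REFERENCES cached_representation(cached_representation_id),
    UNIQUE (version_id, cached_representation_id)
);

-- All cached representations of the version with internal id 1.
SELECT cached_representation_id
FROM versions_cached_representations
WHERE version_id = 1;
}}}

With such a table, the list that the schema models as a standalone "CachedRepresentations" resource would be computed by a query per version rather than stored under an ID of its own.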