Opened 11 years ago
Closed 10 years ago
#596 closed enhancement (fixed)
Use a VCR prefix in handles
Reported by: | Twan Goosen | Owned by: | Twan Goosen |
---|---|---|---|
Priority: | major | Milestone: | VirtualCollectionRegistry-1.0 |
Component: | VCRegistry | Version: | |
Keywords: | Cc: | teckart@informatik.uni-leipzig.de |
Description
Extend PID creation (see #581) so that it prefixes newly created handles with a VCR-specific part. An example of such a handle would be:
11372/VCR-0000-0003-0AE6-F
The "Leipzig" library that is used to consume the EPIC PID service does not support custom handles like this (it relies on the remote service to generate a unique ID), so it needs to be extended or wrapped.
Jozef Misultka notes the following:
One technical comment: if you decide to use custom PIDs, be aware that instead of creating a new PID you might update an already existing one if you do not use one of the special headers, e.g. If-None-Match:*.
This is a small difference to EPICv1 where specialised functions existed for creating/modifying.
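To illustrate the note above, here is a minimal sketch of a "create only" request built with the JDK's java.net.http API. The class name, the service base URL and the JSON body are assumptions made for the example and are not taken from the EPIC documentation or from the VCR code. The only point shown is the If-None-Match: * precondition on a PUT to a self-chosen handle.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, for illustration only.
final class CreateOnlyPidRequest {

    /**
     * Builds (but does not send) a PUT request for a custom handle.
     * serviceBase is an assumed base URL ending in '/', jsonBody an assumed payload.
     */
    static HttpRequest build(String serviceBase, String handle, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serviceBase + handle))
                .header("Content-Type", "application/json")
                // Precondition: only proceed if no resource exists at this URI yet,
                // so an existing PID is not silently updated.
                .header("If-None-Match", "*")
                .PUT(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build(
                "https://pid.example.org/handles/",   // placeholder, not a real endpoint
                "11372/VCR-0000-0003-0AE6-F",
                "[{\"type\":\"URL\",\"parsed_data\":\"https://example.org/vc/1\"}]");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map());
    }
}
```

Under standard HTTP semantics, a server that honours this precondition answers 412 Precondition Failed when the handle already exists, which the client can treat as "pick another ID". Whether the EPIC service behaves exactly this way should be verified against the service itself.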
Change History (9)
comment:1 Changed 11 years ago by
comment:2 Changed 10 years ago by
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:3 Changed 10 years ago by
Resolution: | fixed |
---|---|
Status: | closed → reopened |
Reopening: use a sequential postfix, so VCR-xxxx, not VCR-{uuid}. xxxx can probably just be the database ID
Consider including a checksum as suggested by Jörg
comment:4 follow-up: 5 Changed 10 years ago by
May I ask why you want to have this prefix? After all, a PID is just a stupid number persistently pointing to some content, wherever it moves. Things are simpler without that prefix, IMO.
The prefix has no immediate use to an end user, because the rest is just a number to be copied correctly.
P.S.: What about checksums with a "VCR" prefix? If there ever was one, the prefix will break it.
comment:5 Changed 10 years ago by
Replying to j.knappen@…:
May I ask why you want to have this prefix? After all, a PID is just a stupid number persistently pointing to some content, wherever it moves. Things are simpler without that prefix, IMO.
The prefix has no immediate use to an end user, because the rest is just a number to be copied correctly.
The reason for the "secondary" prefix is the fact that we will be sharing the actual ("primary") prefix among applications. I agree with you that technically an identifier merely needs to be unique, but we feel that it will be useful in practice to be able to distinguish between PID "scopes" on the basis of their form.
I do think that it is important not to be tempted to infer any information from a PID's form; the secondary prefix can serve as an indication of its source but this should not be relied upon.
P.S.: What about checksums with a "VCR" prefix? If there ever was one, the prefix will break it.
I don't quite understand. Checksum of what, and how does this relate to the 'local name' of the PID? Is it specified somewhere that the form of a PID is derived from a checksum of its reference or content?
In any case, both the service and the creator should take care not to override existing PIDs if that is your concern.
comment:6 follow-up: 7 Changed 10 years ago by
Concerning checksums: As far as I get it, the PIDs issued by the EPIC consortium have a checksum (similar to ISBNs) appended as their last digit. This significantly reduces the probability that you type in a PID and get the wrong resource in response. If you want, you can check a PID for validity.
Also Springer-Verlag DOIs have a checksum appended.
comment:7 Changed 10 years ago by
Replying to j.knappen@…:
Concerning checksums: As far as I get it, the PIDs issued by the EPIC consortium have a checksum (similar to ISBNs) appended as their last digit. This significantly reduces the probability that you type in a PID and get the wrong resource in response. If you want, you can check a PID for validity.
Also Springer-Verlag DOIs have a checksum appended.
In the case of the VCR, the bit after the prefix(es) is currently a UUID. It should not interfere with the PID generation mechanism of GWDG.
However, we have concluded that this is not user friendly (the PIDs are specifically intended to be used as references in (printed) publications) and it should be changed to something shorter (e.g. an incrementing integer). Including a checksum is a good suggestion.
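A rough sketch of what a short, sequential local name with a check character could look like. Everything here is an assumption for illustration: the class and method names are made up, the VCR-{id}-{check} layout is just one possible form, and ISO 7064 MOD 11-2 (the algorithm behind the last character of ORCID iDs) is picked only as an example. It is not claimed to be the scheme used by GWDG/EPIC, nor the one eventually implemented in the VCR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class VcrHandleSketch {

    // Assumed form: <primary prefix>/VCR-<decimal id>-<check character>
    private static final Pattern FORM = Pattern.compile("[^/]+/VCR-(\\d+)-([0-9X])");

    /** ISO 7064 MOD 11-2 check character over a string of decimal digits. */
    static char checkChar(String digits) {
        int total = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = Character.digit(digits.charAt(i), 10);
            if (d < 0) {
                throw new IllegalArgumentException("not a digit: " + digits.charAt(i));
            }
            total = (total + d) * 2 % 11; // reduce each step to keep the value small
        }
        int result = (12 - total) % 11;
        return result == 10 ? 'X' : (char) ('0' + result);
    }

    /** Builds a handle from the shared primary prefix and a database ID. */
    static String format(String primaryPrefix, long databaseId) {
        if (databaseId < 0) {
            throw new IllegalArgumentException("ID must be non-negative");
        }
        String digits = Long.toString(databaseId);
        return primaryPrefix + "/VCR-" + digits + "-" + checkChar(digits);
    }

    /** Returns true if the handle has the assumed form and its check character matches. */
    static boolean isValid(String handle) {
        Matcher m = FORM.matcher(handle);
        return m.matches() && checkChar(m.group(1)) == m.group(2).charAt(0);
    }

    public static void main(String[] args) {
        String handle = format("11372", 42);
        System.out.println(handle + " valid: " + isValid(handle));
        System.out.println("11372/VCR-24-3 valid: " + isValid("11372/VCR-24-3"));
    }
}
```

With this sketch, database ID 42 gives 11372/VCR-42-3, and the transposed 11372/VCR-24-3 is rejected, which is the kind of typing error a check character is meant to catch when a PID is copied from a printed publication.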
comment:8 Changed 10 years ago by
Owner: | set to Twan Goosen |
---|---|
Status: | reopened → accepted |
The LINDAT client code is available at
https://svn.ms.mff.cuni.cz/redmine/projects/dspace-modifications/repository/revisions/990e685c13c09ae4e1f9e4b85a7538d18ea36b3d/entry/sources/dspace-api/src/main/java/cz/cuni/mff/ufal/dspace/PIDServiceEPICv2.java