Opened 10 years ago

Closed 10 years ago

#596 closed enhancement (fixed)

Use a VCR prefix in handles

Reported by: Twan Goosen
Owned by: Twan Goosen
Priority: major
Milestone: VirtualCollectionRegistry-1.0
Component: VCRegistry
Version:
Keywords:
Cc: teckart@informatik.uni-leipzig.de

Description

Extend PID creation (see #581) so that it prefixes newly created handles with a VCR-specific part. An example of such a handle would be:

11372/VCR-0000-0003-0AE6-F

The "Leipzig" library that is used to consume the EPIC PID service does not support custom handles like this (it relies on the remote service to generate a unique ID), so it needs to be extended or wrapped.

Jozef Misultka notes the following:

One technical comment: if you decide to use custom PIDs, be aware that instead of creating a new PID you might update an already existing one unless you use one of the special headers, e.g. If-None-Match: *.
This is a small difference from EPICv1, where specialised functions existed for creating and modifying.
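
The point about If-None-Match can be sketched as a conditional PUT. This is only an illustration under assumptions: the base URL, the JSON payload and the helper name build_create_request are hypothetical and not taken from the EPIC documentation; only the header itself is standard HTTP. The request is built but deliberately not sent.

```python
import json
import urllib.request


def build_create_request(base_url: str, handle: str, record: list) -> urllib.request.Request:
    """Build (but do not send) a PUT that should only succeed for a new handle.

    base_url and the shape of `record` are placeholders, not the real EPIC API.
    """
    return urllib.request.Request(
        url=f"{base_url}/{handle}",
        data=json.dumps(record).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Standard HTTP precondition: only proceed if no resource exists
            # at this URL yet, so an existing PID is not silently updated.
            "If-None-Match": "*",
        },
    )


# Hypothetical service location and payload, for illustration only.
req = build_create_request(
    "https://pid.example.org/handles",
    "11372/VCR-0000-0003-0AE6-F",
    [{"type": "URL", "value": "https://example.org/vc/1"}],
)
print(req.get_method(), req.full_url)
```

A server that honours the precondition would normally answer such a request with 412 Precondition Failed when the handle already exists, which the caller can treat as a collision instead of an update.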

Change History (9)

comment:2 Changed 10 years ago by Twan Goosen

Resolution: fixed
Status: new → closed

comment:3 Changed 10 years ago by Twan Goosen

Resolution: fixed
Status: closed → reopened

Reopening: use a sequential postfix, so VCR-xxxx, not VCR-{uuid}. xxxx can probably just be the database ID.

Consider including a checksum, as suggested by Jörg.
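
A minimal sketch of the sequential form, assuming the database ID is a non-negative integer and that "xxxx" means zero-padding to four digits (the ticket does not specify the padding). The function name make_vcr_handle is hypothetical, and no checksum is appended here.

```python
def make_vcr_handle(primary_prefix: str, database_id: int, width: int = 4) -> str:
    """Return '<primary prefix>/VCR-<zero-padded database ID>'.

    The padding width is an assumption; IDs longer than `width` digits
    are kept in full rather than truncated.
    """
    if database_id < 0:
        raise ValueError("database_id must be non-negative")
    return f"{primary_prefix}/VCR-{database_id:0{width}d}"


print(make_vcr_handle("11372", 42))      # 11372/VCR-0042
print(make_vcr_handle("11372", 123456))  # 11372/VCR-123456
```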

Last edited 10 years ago by Twan Goosen

comment:4 Changed 10 years ago by Jörg Knappen

May I ask why you want to have this prefix? After all, a PID is just a stupid number persistently pointing to some content, wherever it moves. Things are simpler without that prefix, IMO.

The prefix has no immediate use to an end user, because the rest is just a number to be copied correctly.

P.S.: What about checksums with a "VCR" prefix? If there ever was one, the prefix will break it.

comment:5 in reply to:  4 Changed 10 years ago by Twan Goosen

Replying to j.knappen@…:

> May I ask why you want to have this prefix? After all, a PID is just a stupid number persistently pointing to some content, wherever it moves. Things are simpler without that prefix, IMO.
>
> The prefix has no immediate use to an end user, because the rest is just a number to be copied correctly.

The reason for the "secondary" prefix is that we will be sharing the actual ("primary") prefix among applications. I agree with you that technically an identifier merely needs to be unique, but we feel that it will be useful in practice to be able to distinguish between PID "scopes" on the basis of their form.

I do think that it is important not to be tempted to infer any information from a PID's form; the secondary prefix can serve as an indication of its source, but this should not be relied upon.

> P.S.: What about checksums with a "VCR" prefix? If there ever was one, the prefix will break it.

I don't quite understand. Checksum of what, and how does this relate to the 'local name' of the PID? Is it specified somewhere that the form of a PID is derived from a checksum of its reference or content?

In any case, both the service and the creator should take care not to override existing PIDs if that is your concern.

comment:6 Changed 10 years ago by Jörg Knappen

Concerning checksums: As far as I get it, the PIDs issued by the EPIC consortium have a checksum (similar to ISBNs) appended as their last digit. This significantly reduces the probability that you type in a PID and get the wrong resource in response. If you want, you can check a PID for validity.

Also Springer-Verlag DOIs have a checksum appended.
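
To illustrate how an appended check character catches copying mistakes, here is the well-known ISBN-10 check digit. It is shown only as an analogy: the ticket does not say which algorithm EPIC actually uses, so this should not be read as the EPIC scheme.

```python
def isbn10_check_digit(first_nine: str) -> str:
    """Compute the ISBN-10 check character for the first nine digits."""
    if len(first_nine) != 9 or not first_nine.isdigit():
        raise ValueError("expected exactly nine digits")
    # Weights run from 10 down to 2; the check digit makes the total divisible by 11.
    total = sum(int(d) * w for d, w in zip(first_nine, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(isbn: str) -> bool:
    """Return True if the last character matches the computed check digit."""
    return (
        len(isbn) == 10
        and isbn[:9].isdigit()
        and isbn10_check_digit(isbn[:9]) == isbn[9].upper()
    )


print(is_valid_isbn10("0306406152"))  # True: correct number
print(is_valid_isbn10("0304606152"))  # False: two adjacent digits swapped
```

A scheme like this detects every single-digit error and every swap of two adjacent digits, which is the kind of typing mistake the check is meant to guard against.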

comment:7 in reply to:  6 Changed 10 years ago by Twan Goosen

Replying to j.knappen@…:

> Concerning checksums: As far as I get it, the PIDs issued by the EPIC consortium have a checksum (similar to ISBNs) appended as their last digit. This significantly reduces the probability that you type in a PID and get the wrong resource in response. If you want, you can check a PID for validity.
>
> Also Springer-Verlag DOIs have a checksum appended.

In the case of the VCR, the bit after the prefix(es) is currently a UUID. It should not interfere with the PID generation mechanism of GWDG.
However, we have concluded that this is not user friendly (the PIDs are specifically intended to be used as references in (printed) publications) and that it should be changed to something shorter (e.g. an incrementing integer). Including a checksum is a good suggestion.

comment:8 Changed 10 years ago by Twan Goosen

Owner: set to Twan Goosen
Status: reopened → accepted

comment:9 Changed 10 years ago by Twan Goosen

Resolution: fixed
Status: accepted → closed

Fixed in [5515]
