This file describes the svndiff version 0 and 1 formats used by the
Subversion code. Its design borrows many ideas from the vdelta and
vcdiff encoding formats from AT&T Research Labs, but it is much
simpler and thus a little less compact.
5 | |
---|
6 | From the point of view of svndiff, a delta is a sequence of windows, |
---|
7 | each containing a list of instructions for reconstructing a contiguous |
---|
8 | section of the target using a contiguous section of the source as a |
---|
9 | reference. The section of the target being reconstructed is called |
---|
10 | the "target view"; the section of the source being referenced is |
---|
11 | called the "source view." Source views must not slide backwards from |
---|
12 | one window to the next; this allows svndiffs to be applied using a |
---|
13 | single pass through the source file. Instructions in a window direct |
---|
14 | copies to be made into the target view from one of three places: from |
---|
15 | the source view, from the portion of the target view which has already |
---|
16 | been reconstructed, or from a block of new data encoded inside the |
---|
17 | window. |
---|

An svndiff document begins with four bytes, "SVN" followed by a byte
which represents a format version number. After the header come one
or more windows, until the document ends. (So the decoder must have
external context indicating when there is no more svndiff data.)

A window is the concatenation of the following:

    The source view offset
    The source view length
    The target view length
    The length of the instructions section in bytes
    The length of the new data section in bytes
    [original length of the instructions section in bytes (version 1)]
    The window's instructions section
    [original length of the new data section in bytes (version 1)]
    The window's new data section
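
The header and window layout above can be sketched in Python as
follows. This is an illustration only, not the Subversion
implementation: the names decode_int, parse_header and parse_window
are made up for this sketch, and the five leading fields are assumed
to use the variable-length integer encoding described in this file.
The two sections are returned as raw bytes; in version 1 each still
begins with its original-length integer.

```python
from typing import Tuple


def decode_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Read one variable-length integer; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: final byte
            return value, pos


def parse_header(data: bytes) -> int:
    """Check the "SVN" magic and return the format version byte."""
    if len(data) < 4 or data[:3] != b"SVN":
        raise ValueError("not an svndiff document")
    return data[3]


def parse_window(data: bytes, pos: int) -> Tuple[dict, int]:
    """Parse one window at data[pos]; return (window, next position)."""
    fields = []
    for _ in range(5):  # offset, two view lengths, two section lengths
        value, pos = decode_int(data, pos)
        fields.append(value)
    sview_offset, sview_len, tview_len, ins_len, new_len = fields
    if pos + ins_len + new_len > len(data):
        raise ValueError("window is truncated")
    window = {
        "sview_offset": sview_offset,
        "sview_len": sview_len,
        "tview_len": tview_len,
        "instructions": data[pos:pos + ins_len],
        "new_data": data[pos + ins_len:pos + ins_len + new_len],
    }
    return window, pos + ins_len + new_len
```

A caller would invoke parse_header once and then call parse_window
repeatedly, starting at position 4, until the returned position
reaches the end of the data it was told to expect.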

In svndiff version 1, the instructions and new data sections may be
compressed by zlib. So that the decoder can determine the original
size, an integer is prepended to each of these sections. If the
original size matches the encoded size from the header (minus the
length of the original size integer), the data is not compressed. If
the original size differs from that encoded size, the remaining data
in the section is compressed with zlib.
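
As a rough sketch of the rule above, a version 1 section could be
decoded like this in Python. The function names are hypothetical, and
the sketch assumes the compressed payload is an ordinary zlib stream
of the kind zlib.decompress accepts; the final length check is an
extra sanity test, not something this file requires.

```python
import zlib
from typing import Tuple


def decode_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Read one variable-length integer; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: final byte
            return value, pos


def decode_section(section: bytes) -> bytes:
    """Return the original bytes of one svndiff1 section.

    `section` is the whole section as sized by the window header,
    including the leading original-length integer.
    """
    original_len, pos = decode_int(section, 0)
    payload = section[pos:]
    if len(payload) == original_len:
        return payload  # sizes match: stored without compression
    data = zlib.decompress(payload)
    if len(data) != original_len:  # sanity check added by this sketch
        raise ValueError("decompressed size does not match original size")
    return data
```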

Integers (including the offset and all of the lengths) are encoded using a
variable-length format. The high bit of each byte is used as a
continuation bit; 1 indicates that there is more data and 0 indicates
the final byte. The other seven bits of each byte are data.
Higher-order bits are encoded before lower-order bits. As an example,
130 would be encoded as two bytes, 10000001 followed by 00000010.
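
The integer encoding can be illustrated with a short Python sketch.
The names encode_int and decode_int are chosen here for illustration
and do not come from the Subversion sources.

```python
from typing import Tuple


def encode_int(value: int) -> bytes:
    """Encode a non-negative integer, highest-order 7-bit group first."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    groups = [value & 0x7F]  # last byte: continuation bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: bit set
        value >>= 7
    return bytes(reversed(groups))


def decode_int(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode the integer at data[pos]; return (value, next position).

    Raises IndexError if the data ends before the final byte.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos
```

With these definitions, encode_int(130) yields the two bytes 10000001
00000010 from the example above, and decode_int reverses it.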

Instructions are encoded as follows. The two high bits of the first
byte form an instruction selector:

    00  Copy from source view
    01  Copy from target view
    10  Copy from new data
    11  invalid

The remaining six bits of the first byte indicate the length of the
copy. If those six bits are all zero, then the length is encoded as
an integer immediately following the first byte of the instruction.
If the instruction selector is 00 or 01, then the instruction encoding
continues with an offset encoded as an integer. If the instruction
selector is 10, then the offset into the new data is implicit; each
copy from the new data is always for "the next <length> bytes" after
the last copy.
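
A minimal Python sketch of an instruction decoder following these
rules is shown here. The function names and the (kind, length, offset)
tuple layout are assumptions of this sketch; for new-data copies the
reported offset is the implicit running position in the new data.

```python
from typing import List, Tuple


def decode_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Read one variable-length integer; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: final byte
            return value, pos


def decode_instructions(data: bytes) -> List[Tuple[str, int, int]]:
    """Decode an instructions section into (kind, length, offset) tuples."""
    kinds = ("source", "target", "new")
    result = []
    pos = 0
    new_pos = 0  # implicit offset into the new data section
    while pos < len(data):
        first = data[pos]
        pos += 1
        selector = first >> 6
        if selector == 3:
            raise ValueError("invalid instruction selector 11")
        length = first & 0x3F
        if length == 0:  # length did not fit in six bits
            length, pos = decode_int(data, pos)
        if selector in (0, 1):  # source and target copies carry an offset
            offset, pos = decode_int(data, pos)
        else:
            offset = new_pos
            new_pos += length
        result.append((kinds[selector], length, offset))
    return result
```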

A copy from the target view must begin at a location before the
current position in the target view, but its length may extend past
the current position. In this case, the target data copied is
repeated, as happens naturally if the copy is performed byte by byte
starting at the beginning.
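
The repeating behaviour is easy to see in a byte-by-byte copy. The
following Python sketch uses a hypothetical helper, copy_from_target,
that appends to a bytearray holding the part of the target view
reconstructed so far.

```python
def copy_from_target(target: bytearray, offset: int, length: int) -> None:
    """Append `length` bytes to `target`, reading from target[offset:].

    The copy runs one byte at a time, so it may read bytes that the
    same copy appended a moment earlier; that is what repeats the data.
    """
    if offset >= len(target):
        raise ValueError("copy must begin before the current position")
    for i in range(length):
        target.append(target[offset + i])
```

For example, starting from a target view holding "ab", copying 5 bytes
from offset 0 extends it to "abababa".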

Following are some example instruction encodings.

    Copy 11 bytes from offset 0 in source view:
        00001011 00000000

    Copy 64 bytes from offset 128 in target view:
        01000000 01000000 10000001 00000000

    Copy the next 63 bytes of new data:
        10111111
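
The three encodings above can be reproduced with a small encoder
sketch in Python. The names encode_int and encode_instruction are
hypothetical. Rejecting lengths below 1 is a choice made for this
sketch; this file does not say whether zero-length copies are allowed.

```python
from typing import Optional

SOURCE, TARGET, NEW = 0, 1, 2  # instruction selectors 00, 01, 10


def encode_int(value: int) -> bytes:
    """Encode a non-negative integer, highest-order 7-bit group first."""
    groups = [value & 0x7F]  # last byte: continuation bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def encode_instruction(selector: int, length: int,
                       offset: Optional[int] = None) -> bytes:
    """Encode one instruction; `offset` is required for SOURCE and TARGET."""
    if selector not in (SOURCE, TARGET, NEW):
        raise ValueError("invalid selector")
    if length < 1:
        raise ValueError("this sketch only encodes lengths of 1 or more")
    out = bytearray()
    if length < 64:
        out.append((selector << 6) | length)  # length fits in six bits
    else:
        out.append(selector << 6)  # six zero bits: length follows
        out += encode_int(length)
    if selector in (SOURCE, TARGET):
        if offset is None or offset < 0:
            raise ValueError("source and target copies need an offset")
        out += encode_int(offset)
    return bytes(out)
```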

Following is a complete example of an svndiff between the source
document "aaaabbbbcccc" and the target document "aaaaccccdddddddd":

    01010011 01010110 01001110 00000000   Header ("SVN\0")

    00000000                              Source view offset 0
    00001100                              Source view length 12
    00010000                              Target view length 16
    00000111                              Instruction length 7
    00000001                              New data length 1

    00000100 00000000                     Source, len 4, offset 0
    00000100 00001000                     Source, len 4, offset 8
    10000001                              New, len 1
    01000111 00001000                     Target, len 7, offset 8

    01100100                              The new data: 'd'
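
To check the example, here is a minimal Python sketch of a version 0
decoder applied to those bytes. It is an illustration under stated
assumptions, not the Subversion implementation: the names are made up,
only version 0 (no zlib) is handled, and the end of the byte string
stands in for the external context that tells the decoder where the
svndiff data stops.

```python
from typing import Tuple


def decode_int(data: bytes, pos: int) -> Tuple[int, int]:
    """Read one variable-length integer; return (value, next position)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # continuation bit clear: final byte
            return value, pos


def apply_svndiff0(delta: bytes, source: bytes) -> bytes:
    """Apply a version 0 svndiff to `source` and return the target."""
    if delta[:4] != b"SVN\x00":
        raise ValueError("expected an svndiff version 0 header")
    pos = 4
    target = bytearray()
    while pos < len(delta):
        fields = []
        for _ in range(5):
            value, pos = decode_int(delta, pos)
            fields.append(value)
        sview_off, sview_len, tview_len, ins_len, new_len = fields
        ins = delta[pos:pos + ins_len]
        new = delta[pos + ins_len:pos + ins_len + new_len]
        pos += ins_len + new_len
        sview = source[sview_off:sview_off + sview_len]
        tview = bytearray()
        ipos = npos = 0
        while ipos < len(ins):
            first = ins[ipos]
            ipos += 1
            selector, length = first >> 6, first & 0x3F
            if selector == 3:
                raise ValueError("invalid instruction selector 11")
            if length == 0:
                length, ipos = decode_int(ins, ipos)
            if selector == 0:  # copy from source view
                offset, ipos = decode_int(ins, ipos)
                tview += sview[offset:offset + length]
            elif selector == 1:  # copy from target view, byte by byte
                offset, ipos = decode_int(ins, ipos)
                for i in range(length):
                    tview.append(tview[offset + i])
            else:  # copy the next `length` bytes of new data
                tview += new[npos:npos + length]
                npos += length
        if len(tview) != tview_len:
            raise ValueError("window did not fill its target view")
        target += tview
    return bytes(target)


EXAMPLE_DELTA = bytes([
    0x53, 0x56, 0x4E, 0x00,        # header "SVN\0"
    0x00, 0x0C, 0x10, 0x07, 0x01,  # window header
    0x04, 0x00, 0x04, 0x08, 0x81, 0x47, 0x08,  # instructions
    0x64,                          # new data: 'd'
])
assert apply_svndiff0(EXAMPLE_DELTA, b"aaaabbbbcccc") == b"aaaaccccdddddddd"
```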