Cyclic Redundancy Check sequences (CRC)[ Tutorial/Introduction | Background/Other checks ] As mentioned in the section on ISBN check digits, a CRC can be coarsely approximated as a check digit in a sequence where each input digit is multiplied with a different weight. But the confusion starts when different people start talking about the different ways one can view the CRC calculation:
All of them do the same: They produce a weighted sum of all the bits in the data stream which, when joined with the "checksum", will give a remainder of 0 after a division. To understand the process and the advantages of the different approaches, we look at the CRC problem from all four vantage points. Besides the four ways of performing the calculations, there are also different values that control the calculation and thus the final error-detection properties of the "checksum". This value has different names, depending on the viewpoint: generator function, generating polynomial, or characteristic function. This value has a width b associated with it, which determines the number of bits of the resulting checksum. CRC Hardware Operation
Serial hardware CRC: Starting with a zero register, the message is shifted in (<-M), followed by 4 zeros (<-0). Whenever the MSB is 1, the register is XORed with the generator (XOR). The low-order four bits of the register are then the CRC to be added. Sending: For the initial generation, the message is first considered to be padded with b zeros. These imaginary appended zeros are then replaced by the calculated CRC. The message (with the check digits appended) is then sent over the transmission channel to the recipient. Reception is even simpler: The entire message including check sequence is passed through the same circuit that was used to generate the check sequence. The result of the circuit should then be all-zeros, indicating that the CRC verification succeeded.1 The actual operation is illustrated in the image to the left. We assume an imaginary four-bit CRC, whose generator is 10011. Yes, you are right: These are five bits. But the most significant bit (MSB) is not really of interest, and is therefore not counted. First, the generator MSB is always 1. Second, whenever the register MSB becomes 1, the generator is XORed against the register, clearing the MSB. As a result, the MSB does not need to be transmitted. Also, magically (at least for now), the resulting message-plus-CRC sequence, when passed through the CRC hardware, will clear the register. As an exercise, try feeding the sequence "01101110110" (this is our message with the CRC attached, so you do not need to push in the four zeros that were used when generating the message) into the shift register according to the rules. The register will be zero in the end. When our CRC unit is to be used at very high data rates (today starting in the order of 1 Gbit/s), it is very hard to design circuits operating at that speed. Therefore, hardware engineers often process multiple bits at a time (e.g. Dittia 2001, Braun and Waldvogel 2001, Doering and Waldvogel 2004) or use lookup tables like our software engineers described below. Software CRC Computation/* Created with crc4tab[i] = crc(i); */ Lookup table and sample code for our four-bit CRC To make the examples small, the example to the left operates on four bits (aka nibble) at a time, which matches the generator size. In your application, you may chose the chunk size independently of the generator size2 (an example with 8-bit processing size and a 32-bit generator is shown in IETF RFC 1952: GZIP file format specification 4.3 (Deutsch 1996)). The process is sped up by using a lookup table returning the combined result of applying several single-bit operations, instead of using bit-by-bit processing. This is comparable to the fast hardware approach described above, but using the preferred mechanism available in software (lookup tables) instead of those suitable to hardware (combinatorial logic). First, a lookup table is used, with a size corresponding to the chunk size, s: For s bits, 2s entries are needed. For s=4, we thus need 16 entries in our table. This table is calculated using the bit-by-bit method, and can be precomputed (i.e., the table is already included in the code and has been computed at or before compile time) or at run-time, when the program first starts up. Each table entry consists to the CRC corresponding to each index. Second, we need code, that for each chunk translates the previous intermediate "sum" to what it would look like after shifting in s more bits. Then, the current data chunk is XORed to that value and gives the next intermediate sum. When creating the CRC, also the appropriate number of zero-filled chunks needs to be added to obtain the CRC to be placed in that zero-filled area. For verification, you do not zero-fill, but send the message and the transmitted CRC through the function, which returns 0 on a match.3
CRC of (0)011 01112 is 01102 In step two, the previous data has been shifted left, and needs to go through zero to four XORs with the generator (reduction), depending on the input sum. These XORs are performed by looking up the previous sum in the lookup table. As the last operation in step two, the new nibble is XORed in. These first two steps are (because the XOR operation is commutative) identical to having a register 2b bits wide, which starts out zeroed. Then, four bits of data are shifted in from the lower end, similar to the hardware approach. The value is guaranteed to still fit into b bits, so no reduction is necessary. Next, b more bits are then shifted in. This will not cause the register to overflow, as the top half was guaranteed to be zero before. Then, shifted versions of the generator are applied in turn, starting with the one shifted all the way left (by b-1 bits). After that, the upper half is again guaranteed to be zero. For a longer message, of course, the steps would be repeated, terminating with the passing of the zero-fill data into the function (in our case, this is step 3).
Output of (0)011 0111 01102 is 00002: Correct Mathematical BackgroundThis is again the hardware CRC
operation, but laid out similar to paper-and-pencil division known
from school In stage A, we try to see whether we can subtract a multiple of the generator from the topmost 5 bits. As we work on binary numbers, the only two multiples are 0 (resulting in no subtraction, represented by the light blue background) and 1 (resulting in subtraction of the generator, indicated by dark blue background). Subtraction (recall, this is XOR in this section) of the generator is possible if it makes the number smaller. With XOR, this is the case when the MSB is 1. Our number, 011 0111 00002 divided by 100112 using XOR instead of addition/subtraction, results in 01100102. This can be determined in the usual way. Here, you can also look at the light (0) vs. dark (1) blue, with the line in step A corresponding to the MSB and stage H delivering the LSB). This division results in a remainder of 01102. By using this analogy, we can also use many of the arithmetic transformations known from school. For example, it also becomes obvious why the replacement of the zero-fill with the CRC results in a zero remainder after passing: Recall that the CRC is the remainder after dividing the padded message by the generator. Instead of padding with zeros, we want to pad with a bit sequence that makes the padded message evenly divisible by the generator (i.e., no remainder). In normal arithmetic, you would subtract the remainder from the padded message.5 Here, we do the same, except that our subtraction is XOR. Thus, we are guaranteed to have the CRC-padded message evenly divisible by the generator. Operations in GF(2)What has been defined informally in the previous section is also known as operations in the finite field GF(2), also known as Galois field of two elements (hence the GF(2) notation).
Truth table of the addition in
GF(2)
Truth table of the multiplication in
GF(2) Both operators do have inverse counterparts, which are called subtraction and division, as usual. In our case, subtraction as the inverse of addition is — surprise! — identical to the addition; much the same as XOR is its own reverse (applying the operation a second time undoes the first operation). This view already allows operations to Polynomial viewReal-World CRCsTODO: Examples of generators, pre-/post-whitening, bit order. Next -->Footnotes1 This means that the checksum matches the data; it does not say that the data has not been tampered with. But for reasonable CRCs, the probability of accidental changes to the message that retain the validity of the CRC, the probability is negligible. To prevent "intelligent" tampering, cryptographic means are inevitable. 2 Generators smaller than the chunk size require splitting of the chunk into two parts, the more significant part consisting of s-b bits, which are XORed with the previous sum (shifted left by s-2b bits) and then used as an index into the table. The rest of the chunk is then applied as shown in the example code. 3 Of course, you may also pass the received message (without CRC) and again zero-filled through the CRC function and then compare the computed CRC with the one received. 4 Prepending zeros has no impact on the CRC, as shifting in zeros into a register that is zero has no impact, as it can never set the MSB of the register to one, which could make the contents non-zero. (Such modifications are prevented by using pre-whitening.) 5 This would result in changing the message unless the remainder was zero, as the subtraction would cause the borrow to be carried from the padding bits into the message. As XOR is addition/subtraction with neither carry nor borrow, CRCs do not have this problem. (Even in normal arithmetic, it could be overcome by adding the generator again for nonzero remainders.) |