August 18, 2020

Clearing Up CRC Terminology and Representations of Polynomials

In the first post of this series on CRCs, I'm just going to clarify the terminology used. I am not going to cover the maths of how CRCs work in these posts, which can get surprisingly complex for what appears to be a set of simple bitwise manipulations and instead defer to Ben Eater's excellent CRC tutorial on YouTube.

Small correction to the video, as noticed by commenter David W Smith, the length shown in Prof. Koopman's tables are in bits not bytes. They are for the dataword i.e. not including the FCS.

I'm going to follow Prof. Koopman in his terminology

Code Word, Data Word and FCS Frame Check Sequence

Code Word is the whole message, with the payload being the dataword. Note that in this definition, any header is included in the Data Word. The Frame Check Sequence is the addition to the payload which adds redundancy and hence allows errors to be detected. It is an error code of a type given in Error Coding. The FCS must be after the data word to get the burst error detection benefits.

Hamming Distance and Hamming Weight

Hamming distance (HD) is how many bits have to be changed to get from one valid codeword to another, as a minimum. This means that all changes (errors) less than HD will be detected, and some of the errors at the HD will be detected.
The number of bit patterns that are undetectable errors for a given number of bit errors is the Hamming Weight (HW). For example, the HW(4) = 4161 means that there are 4161 undetectable 4-bit errors (messages where 4 bits were changed in transit).

Error Coding

A generic term for a set of bits which is used to detect errors in a message. A single bit error code is a parity bit, as commonly seen in RS-232 and other simple protocols. The next level up is a checksum, which can be achieved by XOR-ing values together or summing them in various ways. The top level is CRC since this has the best properties of detecting busts of errors but is more work to calculate.

Checksum

Error code based on addition of data chunks. There are various forms used in different circumstances but it has been found that Fletcher (using 1s complement addition) is better than Adler (using modulo addition). Sometimes the word checksum is used as a synonym for CRC or FCS but it is best to be strict about definitions. Checksums are generally worse in performance than CRCs, and their error detection is data dependent, but they are cheaper to perform. Unless error code checking is really the limiting factor on application performance, it is worth using CRCs due to their increased power. For example, a 12-bit CRC can provide better error detection than 16-bit Fletcher checksum.

CRC

Error code using polynomial division and special maths rules which gives particularly good error detecting qualities. The term "CRC value" should be reserved for the value that is calculated by the CRC algorithm, as distinct from the FCS which is the value in the message. Of course the CRC value (after any post-processing) should match the FCS, but these need to be distinct variables in the implementation and not confused. Note that the FCS is just as likely to become corrupted by errors as the data word, but the CRC value is the result of an exact algorithm (despite operating on possibly erroneous data).

Polynomial

In the theory, CRC values are calculated by dividing the code word by a polynomial and only keeping the remainder, for example x + 1 which is CRC-1 or parity bit. A more typical example is x^5 + x^2 + 1 which is CRC-5-USB. The explicit form of the polynomial as an expression in x is unambiguous and should always be quoted as the specification. There are many ways to represent this polynomial which can lead to confusion if it is not clear which representation is being used.

Wikipedia names these representations as Normal, Reversed, Reciprocal and Reversed Reciprocal. Depending on the method used to convert the polynomial expression to a hexadecimal value, a different result is obtained.

Hence it is never sufficient to just quote a hex value like 0x8810 out of context to be sure that the polynomial is specified completely.

The following diagram shows different representations of CRC-5-EPC, x^5 + x^3 + 1. I have kept the polynomial factors bits in the same column, with the grey headers being adjusted to show the place-value interpretation of those bits. Where the font is grey, that bit is omitted from the representation. Light green background is the least significant nibble, mid green is the next most significant nibble.

Wikipedia called the representation used by Koopman "Reversed Reciprocal" despite it not being a bit or byte reversed representation.

Conversion of Polynomial Representations

I have created a spreadsheet where you can enter either the Normal form and size; or the Koopman/"Reversed Reciprocal" form and see the full bit representation and the converted representation. It uses the Excel standard colour codings for Input and Output cells.

Practical calculation of CRC Values

Calculating with the polynomial does not involve manipulating expressions in x directly but instead uses the hex representation. The calculation uses XOR operations and shifts instead of long division. It can be very efficiently implemented in hardware, with many modern microcontrollers having dedicated CRC peripherals which can be loaded with appropriate format of the polynomial representation, and can be 60 times faster than the software algorithm.

Examples are Microchip and ST.

If a hardware implementation is not available or does not meet the needs, table lookup versions of CRC algorithms can offer huge speed ups over bit-by-bit algorithms. The pycrc project will generate C source code for table and non-table options. Some CRC polynomials have been designed specifically for efficient table implementation.

Specifying a CRC

There is more to specifying a CRC algorithm than just the polynomial but let's look at that first before we get to the other factors.

Unfortunately it is insufficient to say "CRC-16" and assume that you have unambiguously specified the CRC algorithm and polynomial to be used. For a start, there are 22 conflicting definitions of "CRC-16". So the first step is to specify the polynomial as an expression in x e.g.

x^16 + x^12 + x^5 + 1

This can be followed up with a hex representation such as 0x1021 but that must specify which of the representations is being used, since the exact same expression is also called 0x8810 by Koopman. I prefer Koopman's format because the bit-size of the polynomial is derivable from the hex value, so you do not also need to state the CRC size. If you use the Normal format, the size does need to be specified because there are examples of the same hex representation being used at different sizes, eg 0x03 for bit-sizes 3, 4, 6.

Pre and Post Processing Steps

The initial seed value should be specified. This is commonly zero, but is sometimes all 1s ie 0xFFFF... . Any non-zero value has the advantages of making an all zero codeword impossible, and leading zeros in the message will affect the FCS, without additional computational cost. If the initial seed value is zero, then if the first few bytes of the message are zero nothing changes in the FCS so you lose information about how many bytes of the message were zero. Non zero seed values don't affect the other error detection properties, so it can be useful if you want to guard against problems with the first few bytes being zero.

It is usual to append a string of n 0-bits to the message before running the CRC algorithm, where n is size of the CRC, which allows fast table-driven implementations to operate.

Any modification of the CRC value should be specified, such as an XOR operation with a fixed bit pattern to give the final FCS. Avoid modifications which rearrange the order of the bits since this will affect the burst error detection capability, the CRC needs to be in the same bit order as the rest of the message. Post-processing such as XOR-ing is done after any bit order reversal.

A standardised method of specifying the CRC parameters is to use the method of the CRC-Catalogue also called the Rocksoft Model.

Test Vectors

It is very helpful to developers to have a few test vectors; that is CRC Values with their corresponding messages. This cannot be in any way an exhaustive list but even a couple of well chosen examples can help the developer determine if there are problems with the implementation. The format of the message (data word) and CRC Value must be unambiguous for this to be able to remove ambiguity in the specification. The CRC-Catalogue uses a message of "123456789" as a string of 8-bit UTF-8 characters with the resultant CRC value given as the "check" value.