34

Are checksums basically toned-down versions of cryptographic hashes? As in: they are supposed to detect errors that occur naturally or randomly, as opposed to being designed to thwart a knowledgeable attacker's meticulous engineering?

So, essentially, one could say they are non-secure versions of cryptographic hashes? And for the same reason, checksums are "cheaper" to compute than cryptographic hashes? (e.g. CRC32 vs. SHA-256)

Sorry for my poor English and the potentially trivial question. I just need to get the concepts straightened out.

Mike Edward Moras
AlanSTACK

4 Answers

42

Are checksums basically toned-down versions of cryptographic hashes? As in: they are supposed to detect errors that occur naturally or randomly, as opposed to being designed to thwart a knowledgeable attacker's meticulous engineering?

That is one way to look at it. However, cryptographic hash functions have many purposes. They are also meant to be one-way (an attacker cannot find a preimage except by guessing), a property that has no parallel in checksums.

So, essentially, one could say they are non-secure versions of cryptographic hashes? And for the same reason, checksums are "cheaper" to compute than cryptographic hashes? (e.g. CRC32 vs. SHA-256)

Due to their different requirements, checksums are not just "worse, but faster hashes". They are meant to detect particular kinds of errors. A cyclic redundancy check can detect, for example, all 1-2 bit errors in short inputs, as well as some other classes of errors that are common in typical applications (e.g. burst errors). This is better than a truncated cryptographic hash of similar length can do.


A cryptographic hash truncated to 32 bits can easily collide for two inputs that differ in only one or two bits, whereas a CRC will not. The CRC is geared towards reliably detecting error patterns that commonly occur in transit, so it does better on those kinds of errors and worse on others. The truncated hash behaves uniformly over all inputs, and as a result does worse than a CRC on the inputs a CRC is good at dealing with.
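
To make the nature of that guarantee concrete, here is a minimal Python sketch (the message, its length, and the use of the standard-library zlib.crc32 and hashlib are just illustrative choices) that flips every single bit of a short message and counts how often the 32-bit CRC or a 32-bit truncation of SHA-256 fails to change:

    import hashlib
    import zlib

    def crc32(data):
        return zlib.crc32(data) & 0xFFFFFFFF

    def sha256_32(data):
        # 32-bit truncation of SHA-256, for comparison only
        return hashlib.sha256(data).digest()[:4]

    msg = b"an arbitrary short test message!"   # 32 bytes; the content does not matter
    orig_crc, orig_hash = crc32(msg), sha256_32(msg)

    undetected_by_crc = undetected_by_hash = 0
    for i in range(len(msg)):
        for bit in range(8):
            corrupted = bytearray(msg)
            corrupted[i] ^= 1 << bit            # flip exactly one bit
            if crc32(bytes(corrupted)) == orig_crc:
                undetected_by_crc += 1          # guaranteed never to happen for CRC-32
            if sha256_32(bytes(corrupted)) == orig_hash:
                undetected_by_hash += 1         # merely very unlikely, not guaranteed

    print(undetected_by_crc, undetected_by_hash)

Both counters will print 0 here, but only the CRC's zero is a mathematical guarantee; the truncated hash's zero is merely overwhelmingly likely.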

AlanSTACK
otus
9

I think it's more helpful to think of checksums as toned-down versions of message authentication codes (not hashes).

Message authentication codes (MACs) are designed to detect any modification to a message while it is in transit. They are secure even against adversarially chosen modifications.

Checksums are designed to detect only some modifications to a message while it is in transit: the kinds of random modifications that might happen by chance (e.g., due to a burst of noise or interference), but not adversarial modifications.

As a result, checksums can be faster than MACs. But MACs can be made pretty fast, too.
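
As a rough illustration of that difference (a sketch only; the key, the messages, and the choice of HMAC-SHA256 and CRC32 are arbitrary), an attacker who can rewrite the whole frame simply recomputes a checksum, but cannot recompute a MAC without the key:

    import hashlib
    import hmac
    import zlib

    key = b"shared secret known only to the two endpoints"
    msg = b"pay alice 10"

    # Checksum: anyone can recompute it, so an attacker tampers and fixes up the CRC.
    forged = b"pay mallory 9999"
    forged_frame = forged + zlib.crc32(forged).to_bytes(4, "big")
    data, check = forged_frame[:-4], forged_frame[-4:]
    print(zlib.crc32(data).to_bytes(4, "big") == check)   # True: the forgery passes the CRC check

    # MAC: without the key the attacker cannot produce a valid tag for the new message.
    tag = hmac.new(key, msg, hashlib.sha256).digest()      # honest sender's tag for msg
    print(hmac.compare_digest(hmac.new(key, forged, hashlib.sha256).digest(), tag))  # False: rejected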

D.W.
4

From my point of view, they are extremely distant relatives. But I understand the point: both generate fixed-length values that can help indicate when integrity has somehow been compromised. They should be treated as distinct tools with different purposes that should not be confused, but CRC32 complicates the distinction.

CRC32 is a checksum that produces a 32-bit digest, used, for instance, to check whether a compressed file was damaged in transfer. However, the fact that it produces a 32-bit digest has led to the belief that it can be used as a cryptographic hash for integrity control. In particular, it is sometimes used as a hash function in industrial networks, where hardware capability is usually heavily constrained and real cryptographic hashes can be too heavy a choice. That does not mean it can actually replace a cryptographic hash function to any extent, but it shows that the descriptions of the two families of functions are similar enough to be confused by an inattentive observer.
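
To make the gap concrete, here is a small sketch (random 16-byte inputs; the workload follows from the birthday bound on a 32-bit digest) that finds a CRC32 collision almost instantly, something that is utterly infeasible for a real cryptographic hash such as SHA-256:

    import os
    import zlib

    seen = {}
    attempts = 0
    while True:
        attempts += 1
        m = os.urandom(16)                      # arbitrary random 16-byte message
        c = zlib.crc32(m)
        if c in seen and seen[c] != m:
            print("collision after", attempts, "messages:")
            print(seen[c].hex(), "and", m.hex(), "both have CRC32", hex(c))
            break
        seen[c] = m

By the birthday bound this terminates after roughly 2^16 random messages, usually in well under a second; nothing comparable is feasible against a full-length cryptographic hash.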

Sergio A. Figueroa
0

A MAC assures a receiver that a message was authentically generated by a sender using a shared secret (key). A MAC is a pair of algorithms: generate and verify. Typically, the sender generates a tag and appends it to the message. The tag is produced by mixing the message, the secret key and (often) a counter, and then hashing the result. The receiver can then run the same process to verify that the message is authentic. The MAC makes no guarantee other than that two messages have only a low probability (about 1/2^n for an n-bit tag) of sharing the same tag, which is what makes a MAC good for authentication.
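
A minimal sketch of that generate/verify pair, assuming HMAC-SHA256 as the underlying construction and a caller-supplied counter (the counter handling is illustrative, not taken from any particular standard):

    import hashlib
    import hmac

    def generate_tag(key, counter, message):
        # Mix a counter into the authenticated data so that replays are detectable.
        return hmac.new(key, counter.to_bytes(8, "big") + message, hashlib.sha256).digest()

    def verify_tag(key, counter, message, tag):
        expected = generate_tag(key, counter, message)
        # Constant-time comparison avoids leaking where the tags first differ.
        return hmac.compare_digest(expected, tag)

    key = b"shared secret"
    tag = generate_tag(key, 1, b"hello")
    print(verify_tag(key, 1, b"hello", tag))    # True
    print(verify_tag(key, 1, b"hellp", tag))    # False: any modification invalidates the tag
    print(verify_tag(key, 2, b"hello", tag))    # False: replay under a different counter fails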

According to the study by Koopman et al.*, the 15-bit CRC in CAN guarantees that all message collisions are at least 6 bits apart in Hamming distance. This means that an error consisting of 1-5 random bit flips in a message frame (82 bits) will always be detected. The same error-detection guarantee does not hold if you compress an 82-bit message into a 15-bit MAC. This is why CRCs are still used today for error detection.

* Koopman, Philip, and Tridib Chakravarty. "Cyclic Redundancy Code (CRC) Polynomial Selection for Embedded Networks." Proceedings of the International Conference on Dependable Systems and Networks (DSN 2004), IEEE, 2004.
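
A rough bit-level sketch of the kind of check described above; the polynomial 0x4599 and the zero initial value are the commonly cited CRC-15-CAN parameters (an assumption here, not taken from the paper), and the exhaustive search is limited to 1-3 bit flips to keep the run short:

    import itertools
    import random

    def crc15(bits, poly=0x4599):
        # Plain MSB-first bitwise CRC over a 15-bit register, zero initial value.
        crc = 0
        for b in bits:
            top = (crc >> 14) & 1
            crc = (crc << 1) & 0x7FFF
            if top ^ b:
                crc ^= poly
        return crc

    frame = [random.randint(0, 1) for _ in range(82)]    # an arbitrary 82-bit frame
    reference = crc15(frame)

    undetected = 0
    for weight in range(1, 4):                           # all 1-, 2- and 3-bit error patterns
        for positions in itertools.combinations(range(82), weight):
            corrupted = list(frame)
            for p in positions:
                corrupted[p] ^= 1
            if crc15(corrupted) == reference:
                undetected += 1

    print(undetected)                                    # 0: every such error changes the CRC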