Is it possible that two different messages have the same hash code?

Question

As far as I know, a very common hash code size is 256 bits.

Given a message, the hash function outputs a hash code that's 256 bits long. That hash code should be unique to that message. The message can be something like an email.

But a message can be very long, far longer than 256 bits.

Theoretically there can be 2^256 different hash codes, and that's an insanely large number.

But if a message contains 1000 letters, each letter being 8 bits, that's 8000 bits, so 2^8000 different messages are possible. Even if we just talk about 2^1000 possible messages, that's still huge. So we put in a long string of bits and get out 256 bits called a "hash code".

If we divide 2^1000 messages by 2^256 hash codes, there are 2^744 messages for each hash code.

How is it possible that a hash code is unique to a message? Shouldn't there be some collisions, i.e. two different messages having the same hash code?

score 0 · Answer 1 · answered Dec 08 '22 at 17:17

The number of possible messages is much larger than the number of possible hash codes, not the other way around, so collisions (two different messages with the same hash code) must exist. Your estimate is right: with 2^1000 messages and 2^256 hash codes, each hash code is shared by about 2^744 messages on average, because a good hash function distributes messages roughly evenly across the space of all possible hash codes. What a well-designed hash function aims to ensure is not that collisions are impossible, but that they are extremely unlikely to occur by chance and infeasible to find on purpose: for a 256-bit hash you would expect to hash on the order of 2^128 different messages before two of them collide. Therefore, we can say that a hash code is "unique" to a message only in the practical sense that it is highly unlikely for two different messages anyone actually produces to have the same hash code.
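
To make this concrete, here is a minimal sketch in Python, not a definitive implementation. It assumes SHA-256 as the 256-bit hash and keeps only the first 24 bits of its output, so that a collision turns up within a few thousand tries instead of roughly 2^128. The helper names truncated_hash and find_collision and the "message N" inputs are made up for illustration.

```python
from __future__ import annotations

import hashlib
from itertools import count


def truncated_hash(message: bytes, bits: int) -> int:
    """Return the first `bits` bits of SHA-256(message) as an integer."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two of them share a truncated hash.

    Returns the two colliding messages and how many messages were hashed.
    """
    seen = {}  # truncated hash -> first message that produced it
    for i in count():
        message = f"message {i}".encode()
        h = truncated_hash(message, bits)
        if h in seen:
            return seen[h], message, i + 1
        seen[h] = message


# Pigeonhole count from the question: 2^1000 messages over 2^256 hash codes.
assert 2**1000 // 2**256 == 2**744

# With only 24 bits there are 2^24 possible outputs, so a collision is
# guaranteed after at most 2^24 + 1 messages and is expected after a few
# thousand (on the order of 2^12, by the birthday bound).
first, second, tried = find_collision(bits=24)
print(first, second, tried)
```

The same loop with bits=256 would be searching for a real SHA-256 collision. It is still guaranteed to find one eventually, but the expected number of messages grows to about 2^128, which is why this does not happen in practice.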
