Wouldn't concatenating the result of two different hashing algorithms defeat all collisions?

Question

Let's say I have three messages: A B C

And I run each of these through two different Hashing algorithms: MD5 and SHA1 for this example

MD5(A) = X
MD5(B) = Y
MD5(C) = Y
SHA1(A) = N
SHA1(B) = N
SHA1(C) = M

Notice the MD5 hash of B and C collide. And the SHA hash of A and B collide.

If I simply concatenate the digests, however, the results would be unique:

Combined Digest of A:  XN
Combined Digest of B:  YN
Combined Digest of C:  YM

The underlying principle would be that whatever pair of messages could be found or constructed to form a collision with one hashing algorithm, wouldn't also form a collision with another hashing algorithm.

The combined digest length (for MD5/SHA1) would be 288 bits (128+160) -- but unless I'm missing something, this would be significantly more secure than a single hashing algorithm with a 288-bit digest.

Granted, in the example above I'm using MD5 and SHA1 which are both known to be effectively broken, but I'm hoping an answer exists that applies more conceptually to the premise than simply the choice of algorithms.

i.e., In a situation where collision resistance is critical, wouldn't the combination of SHA2-256 + SHA3-256 concatenated be more secure than a single iteration of SHA2-512, or SHA3-512?

score 4 · Answer 1 · answered Mar 01 '23 at 19:25

No, concatenating two hashes gives you at least the collision resistance of either but in many practical cases it will give you little more.

This is especially truely for MD hash functions where we know how to convert collisions into many way multi collisions. We can make 2^64 multi way sha1 collision and expect one will collide also in MD5.

score 4 · Accepted Answer · answered Mar 02 '23 at 04:55

No, concatenating the result of two different hashing algorithms does not defeat all collisions. You've overlooked the case where $\text{MD5}(A)=\text{MD5}(B)=X$ and $\text{SHA1}(A)=\text{SHA1}(B)=N$. In English, that's when a pair of inputs collides for both hash functions.

Furthermore, assuming a hash function's output is truly uniformly distributed for any given set of inputs (this isn't actually true, but for our purposes, it's close enough to true for modern cryptographic has functions), the collision resistance of $\text{HASH}^P_\text{256-bit}(A) +\text{HASH}^Q_\text{256-bit}(A)$ is exactly equal to the collision resistance of $\text{HASH}_\text{512-bit}(A)$.

Again, assuming a uniform distribution, the chance two inputs collide for an N-bit hash is $\left(\frac{1}{2}\right)^N$, or one in two for each bit of output. Assuming the chance that a pair of inputs collides for hash $P$ is independent of the chance of collision of the pair for hash $Q$, the chance a pair collides for both is the product of the chance it collides for each hash individually. Given this, it's clear the chance of collision is identical either way, since $\left(\frac{1}{2}\right)^{256}\cdot\left(\frac{1}{2}\right)^{256}=\left(\frac{1}{2}\right)^{512}$.

Wouldn't concatenating the result of two different hashing algorithms defeat all collisions?

2 Answers2