Use slow hashing to reduce digest size?

Question

I have seen this question on replacing MD5 for 128-bit digests. The answers say repeatedly that a 128-bit digest is no longer viable today because finding a collision would only require about $2^{64}$ operations, which is within the means of big organisations nowadays. However, I think that these answers miss a fairly important point.

The time to brute-force a hash function depends not only on the number of bits in its digest but also on the time needed to compute a single hash. This is why slow hash functions such as bcrypt (184-bit) can afford a shorter digest than fast ones such as SHA-256 (256-bit).

In this case, would it not be possible to have a collision-resistant 128-bit slow-hashing function to replace MD5?

Thank you for your help.

fgrieu · Accepted Answer · 2021-04-20T12:09:03.063

Would it not be possible to have a collision-resistant 128-bit slow-hashing function to replace MD5?

That's possible.

We could use Argon2 parameterized for 128-bit output and (say) 10 ms of computation on a Raspberry Pi 3. Even if something could speed this up a hundredfold and we parallelized on 1 million units, giving $10^{10}$ hashes per second, there's a <40% chance of finding a collision with $2^{128/2}/10^{10}/86400/365.25\approx58$ years of computation.
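The arithmetic behind that estimate can be checked with a short Python sketch. The function names are made up for illustration, the hardware figures are the ones assumed in the paragraph above, and the probability uses the standard birthday approximation $1-e^{-q^2/2^{n+1}}$ for $q$ hashes of an $n$-bit function.

```python
import math

def collision_search_years(output_bits, seconds_per_hash, speedup, units):
    """Years needed to evaluate 2^(output_bits/2) hashes on the assumed hardware."""
    hashes = 2 ** (output_bits / 2)
    # Overall throughput: each unit does speedup / seconds_per_hash hashes per second.
    rate = units * speedup / seconds_per_hash
    return hashes / rate / 86400 / 365.25

def birthday_probability(output_bits, hashes):
    """Approximate chance of at least one collision among `hashes` random outputs."""
    return 1 - math.exp(-(hashes * hashes) / (2 * 2 ** output_bits))

# 128-bit output, 10 ms per hash, hundredfold speedup, one million units.
years = collision_search_years(128, 0.010, 100, 1_000_000)
prob = birthday_probability(128, 2 ** 64)
print(f"{years:.1f} years, collision probability {prob:.3f}")
```

Changing `seconds_per_hash` shows how directly the cost per hash scales the attack time, which is the point of the question.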

That would replace MD5 in some applications where its 128-bit size matters and collision resistance is essential, but where neither speed for small inputs nor the 512-bit block size is (e.g. HMAC-MD5, but this one is not broken as far as we know).

Not coincidentally, that's current best practice for password hashing.

Note: I have simplified this answer after discovering Argon2 version 0x13 already includes a fast hash as the first step, namely Blake2b.

kelalaka · Answer 2 · 2021-04-20T11:21:46.163

Bcrypt is a password hashing function, like PBKDF2, scrypt, and Argon2. In password hashing, collisions are not important; pre-images are.

If you just iterate MD5 $n$ times, $\operatorname{MD5^n}(x)=\operatorname{MD5}(\operatorname{MD5}(\ldots(\operatorname{MD5}(x))\ldots))$, then we run into an already well-known problem. A collision in the innermost MD5 is a collision for $\operatorname{MD5^n}$. So does simple iteration, while not secure, at least slow down collision finding? Not really: just find a collision for $\operatorname{MD5}(x)$ and you have a collision for the whole construction. In other words, the cost of finding a collision is not affected!

An easy fix for the case $n=2$ is $\operatorname{MD5^2}(x)=\operatorname{MD5}(\operatorname{MD5}(x)\,\|\,x)$, or similar approaches.
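The difference between the two constructions can be seen in a small Python sketch. It does not use a real MD5 collision: as a loudly labelled assumption, `toy_hash` is MD5 truncated to 24 bits so that a birthday search finds a collision in a few thousand tries. All function names here are hypothetical.

```python
import hashlib
from itertools import count

def toy_hash(data: bytes) -> bytes:
    """MD5 truncated to 24 bits: a deliberately weak stand-in so collisions are cheap."""
    return hashlib.md5(data).digest()[:3]

def find_collision():
    """Birthday search: return two distinct inputs with the same toy_hash."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        digest = toy_hash(msg)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg

def iterated(data: bytes, n: int) -> bytes:
    """Plain iteration H(H(...H(x)...)), n times."""
    for _ in range(n):
        data = toy_hash(data)
    return data

def chained(data: bytes) -> bytes:
    """The n = 2 fix from the text: H(H(x) || x)."""
    return toy_hash(toy_hash(data) + data)

a, b = find_collision()
# One inner collision survives any number of plain iterations.
print(iterated(a, 1000) == iterated(b, 1000))
# With the original input mixed back in, the outer inputs differ again.
print(chained(a) == chained(b))
```

The first comparison is always true, because both chains are identical after the first step. The second is expected to be false except with probability about $2^{-24}$ for this toy hash, since the outer hash now sees two different inputs.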

We don't need to double MD5 or SHA-1 to improve security; we just need newer hash functions such as SHA-3 and the very fast BLAKE2b.

Finally, we want cryptographic hash functions to be secure and fast, not slow. Slowness is required in password hashing.

update for the comment

It turns out that hashing with MD5 is required for identification. In this case, the pre-image attack is more important if the public keys are meant to be kept secret. In a pre-image attack, given a hash value $h$, we look for an $x'$ such that $h = \operatorname{MD5}(x')$. The $x'$ may or may not be the original $x$ with $h = \operatorname{MD5}(x)$; attackers who want the original need to search more. MD5 has only one pre-image attack, and its practical cost is no better than the generic pre-image search, which takes $2^{128}$ time.

2009 - Yu Sasaki, Kazumaro Aoki, Finding Preimages in Full MD5 Faster Than Exhaustive Search. The cost is $2^{123.4}$ time and $2^{45}\times 11$ words of memory.

Collisions are relevant here only in the sense that you must not end up with two or more users who have the same identity.

So you can use a modern hash function with truncated output instead of MD5.
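A minimal Python sketch of two ways to get a 128-bit digest from a modern hash, using only `hashlib` from the standard library. The helper names and the sample input are hypothetical; which function to pick is left open by the answer.

```python
import hashlib

def blake2b_128(data: bytes) -> bytes:
    """BLAKE2b with a native 16-byte (128-bit) output length."""
    return hashlib.blake2b(data, digest_size=16).digest()

def sha256_trunc128(data: bytes) -> bytes:
    """SHA-256 with the digest cut down to its first 16 bytes."""
    return hashlib.sha256(data).digest()[:16]

# Hypothetical identifier input, e.g. an encoded public key.
key = b"example public key bytes"
print(blake2b_128(key).hex())
print(sha256_trunc128(key).hex())
```

Both outputs have the same size as an MD5 digest, so they fit wherever a 128-bit identifier is expected. The generic $2^{64}$ collision bound for 128-bit outputs still applies to them.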
