
I fully realize that MD5 should not be used in any new project, but in my particular situation I have severe CPU constraints, so MD5 is convenient. I have read a lot about MD5 security for this project, and I know it is broken in several ways: for instance, one can extend a file while keeping the same MD5 hash, and one can generate two different files with the same MD5 hash.

In my particular case, I have a Merkle tree. The root is validated with SHA-256, but for performance reasons the internal nodes use MD5. The key point is that in my application the MD5 hashing is done over fixed-length blocks. For instance, leaf nodes are MD5 over a 4096-byte block. Internal nodes are MD5 over a 16384-byte block.
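To make the layout concrete, here is a minimal sketch of such a tree in Python. Several details are assumptions rather than anything stated above: that a 16384-byte internal node is the concatenation of 1024 child MD5 digests (16 bytes each), that short blocks and nodes are zero-padded to their fixed length, and that the root is the SHA-256 of the top-level node. The names `build_level` and `merkle_root` are hypothetical.

```python
import hashlib

LEAF_SIZE = 4096     # bytes per data block (from the question)
NODE_SIZE = 16384    # bytes hashed per internal node (from the question)
DIGEST_SIZE = 16     # MD5 digest length in bytes
FANOUT = NODE_SIZE // DIGEST_SIZE  # 1024 children per node (assumption)


def md5(data: bytes) -> bytes:
    return hashlib.md5(data).digest()


def build_level(digests):
    """Pack child digests into fixed-length nodes and hash each node with MD5."""
    parents = []
    for i in range(0, len(digests), FANOUT):
        node = b"".join(digests[i:i + FANOUT])
        # Zero-pad a short final node to the fixed length (illustrative assumption).
        parents.append(md5(node.ljust(NODE_SIZE, b"\x00")))
    return parents


def merkle_root(data: bytes) -> bytes:
    """Return a SHA-256 root over a tree whose leaves and inner nodes use MD5."""
    if not data:
        raise ValueError("data must not be empty")
    # Leaves: MD5 over fixed 4096-byte blocks (last block zero-padded).
    level = [
        md5(data[i:i + LEAF_SIZE].ljust(LEAF_SIZE, b"\x00"))
        for i in range(0, len(data), LEAF_SIZE)
    ]
    # Collapse levels until the remaining digests fit in a single node.
    while len(level) > FANOUT:
        level = build_level(level)
    top_node = b"".join(level).ljust(NODE_SIZE, b"\x00")
    # Only the top node is protected by SHA-256 (assumption about the root).
    return hashlib.sha256(top_node).digest()
```

In this sketch every MD5 call sees an input of exactly 4096 or 16384 bytes, which is the property the question is about.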

So, my question is: Given a known block and its MD5 hash, is there an attack to generate a different block of the same length with the same MD5 hash?

I don't want to generate two 4096-byte blocks with the same hash. I want to know whether, given a 4096-byte block, one can replace it with a different 4096-byte block that has the same MD5 hash.
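Stated as code, the condition an attacker would have to satisfy looks roughly like the predicate below. This is only a sketch of the definition; the function name `is_second_preimage` is hypothetical.

```python
import hashlib


def is_second_preimage(original: bytes, candidate: bytes) -> bool:
    """True if candidate is a different block of the same length with the same MD5."""
    return (
        candidate != original
        and len(candidate) == len(original)
        and hashlib.md5(candidate).digest() == hashlib.md5(original).digest()
    )
```

A collision attack lets the attacker choose both blocks; here `original` is fixed in advance, which is the harder second-preimage setting.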

otus
jcea

1 Answer


Right now, the best published attack against MD5's preimage resistance (first preimage, actually, but it applies to second preimage resistance as well) finds preimages at an average cost of $2^{123.4}$, which is slightly better than the generic attack (average cost of $2^{128}$) but still far beyond what is technologically feasible. The attack rebuilds the preimage as a two-block value (128 bytes) and can be adjusted to any larger length.
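To put those two exponents in perspective, the short calculation below compares them. The hash rate of $10^{12}$ MD5 evaluations per second is a purely hypothetical figure chosen for illustration, and the function names are not from any library.

```python
GENERIC_LOG2 = 128.0   # log2 of the generic preimage cost
ATTACK_LOG2 = 123.4    # log2 of the best published attack cost
HASHES_PER_SECOND = 1e12            # hypothetical attacker throughput
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def speedup_factor() -> float:
    """How many times cheaper the attack is than brute force."""
    return 2.0 ** (GENERIC_LOG2 - ATTACK_LOG2)


def years_required(log2_cost: float, rate: float = HASHES_PER_SECOND) -> float:
    """Years needed to perform 2**log2_cost hash evaluations at the given rate."""
    return 2.0 ** log2_cost / rate / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"speedup over brute force: about {speedup_factor():.1f}x")
    print(f"years at the assumed rate: about {years_required(ATTACK_LOG2):.2e}")
```

The attack saves a factor of roughly $2^{4.6}$, which does not bring the work anywhere near a feasible range under this assumed rate.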

Therefore:

  • There is no practical attack against your 4096-byte hash tree nodes yet.
  • But the fact that they have a fixed 4096-byte length does not appear to improve the situation.

Still, the foundation of the security of MD5 appears somewhat flimsy, so it is not recommended (at all) for new systems. If you have severe CPU constraints, then you might want to consider the SHA-3 candidates. The 14 "round 2" candidate functions have been quite thoroughly investigated, and no break has been found in any of them; in that respect, they are all stronger than MD5, and arguably stronger than SHA-1, which has a known collision attack (but no known preimage attack). Some of them are quite fast, and competitive with MD5. Depending on your architecture (8-bit CPU, small 32-bit ARM, big modern 64-bit PC with AES-NI opcodes...), you would be most interested in BMW, ECHO, Shabal and Skein. "The" SHA-3 (Keccak) is not as fast as these (except on dedicated FPGA / ASIC).
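Since the right choice depends on the target architecture, it is worth measuring on the actual hardware. The sketch below times a few functions from Python's standard `hashlib` on 4096-byte blocks. Note the substitution: BMW, ECHO, Shabal and Skein are not in `hashlib`, so this uses BLAKE2b (a descendant of the SHA-3 finalist BLAKE) and SHA3-256 as stand-ins. The function name `benchmark` and the round count are arbitrary, and the numbers it produces only reflect the machine it runs on.

```python
import hashlib
import time

# Stand-in candidates available in hashlib (not the functions named above).
CANDIDATES = ("md5", "sha1", "sha256", "sha3_256", "blake2b")


def benchmark(block_size: int = 4096, rounds: int = 2000) -> dict:
    """Return approximate throughput in MB/s for each candidate hash."""
    block = bytes(block_size)
    results = {}
    for name in CANDIDATES:
        start = time.perf_counter()
        for _ in range(rounds):
            hashlib.new(name, block).digest()
        elapsed = time.perf_counter() - start
        results[name] = block_size * rounds / elapsed / 1e6
    return results


if __name__ == "__main__":
    for name, mb_per_s in benchmark().items():
        print(f"{name:10s} {mb_per_s:10.1f} MB/s")
```

Interpreter overhead is significant at this block size, so a native-code benchmark on the real target would be more representative.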

It is usually best, for public relations, if you use a standard, recommended function. However, if you decide otherwise, using a function which has survived some scrutiny by cryptographers with no known weakness is arguably better than using a function with big weaknesses.

Kornel
Thomas Pornin