In Merkle–Damgård, is there any reason why we use a fixed $IV$ at the beginning? Can we use the first block ($M_1$) right away instead of the $IV$ and feed it through the compression function together with $M_2$?

Maarten Bodewes
user43738

3 Answers

TL;DR: A simple reason why we use the IV is to mitigate second preimage attacks.

This is the typical Merkle–Damgård construction: you split your message into blocks of $k$ bits (the input size of the compression function).

    M1         M2 ...     Mn
    |          |          |
    +--|\      +--|\      +--|\
       | \        | \        | \
 IV ---|  |-------|  |-------|  |---  Hash
       +--+       +--+       +--+
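The diagram above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `compress` is a toy compression function built from truncated SHA-256, the all-zero `IV` and the 16-byte sizes are arbitrary choices, and padding is left out (the message must already be a whole number of blocks).

```python
import hashlib

BLOCK = 16          # message block size in bytes (assumed for illustration)
STATE = 16          # chaining value size in bytes (assumed for illustration)
IV = bytes(STATE)   # hypothetical fixed, public IV (all zeros here)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:STATE]

def md_hash(message: bytes) -> bytes:
    """Iterate the compression function over the blocks, starting from IV."""
    if len(message) % BLOCK:
        raise ValueError("this sketch omits padding: use a multiple of BLOCK bytes")
    state = IV
    for i in range(0, len(message), BLOCK):
        state = compress(state, message[i:i + BLOCK])
    return state

m1, m2 = b"A" * BLOCK, b"B" * BLOCK
digest = md_hash(m1 + m2)   # equals compress(compress(IV, m1), m2)
```

Because the chain always starts from `IV`, even a one-block message goes through `compress` once.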

Your proposal is the following:

    M2         M3 ...     Mn
    |          |          |
    +--|\      +--|\      +--|\
       | \        | \        | \
 M1 ---|  |-------|  |-------|  |---  Hash
       +--+       +--+       +--+

This assumes that the message is longer than $k$ bits! What if the message is shorter? You then get the following, because there is no IV to compress with:

 M1 ---  Hash

Also, this idea of using $M_1$ as the $IV$ assumes that both inputs of your compression function have the same size. For example, take $f(x,y) \to h$, with $x$ the message block and $y$ the chaining input, where:

  • $\texttt{sizeof}(x) = 128 \texttt{ bits}$ and
  • $\texttt{sizeof}(h) = 160 \texttt{ bits}$

This implies that $\texttt{sizeof}(y) = 160 \texttt{ bits}$, since $y$ must accept the previous output $h$. A 128-bit block $M_1$ does not fit that input directly, so the use of an IV is encouraged unless you want to manage padding for the first input.

What if we have a fixed last block compression?

Such that:

    M2         M3 ...     Mn ...     LB
    |          |          |          |
    +--|\      +--|\      +--|\      +--|\
       | \        | \        | \        | \
 M1 ---|  |-------|  |-------|  |-------|  |---  Hash
       +--+       +--+       +--+       +--+

And in the case of very short messages:

    LB
    |
    +--|\
       | \
 M1 ---|  |---  Hash
       +--+

In the end, don't you agree that this is the same as having a fixed first block, also called an IV? You still need to store this value. Whether it is at the end or at the beginning, it is needed.

A Second Pre-image attack

Quick reminder:

Given an input $m_1$ it should be difficult to find another input $m_2$ such that $m_1 \neq m_2$ and $hash(m_1)=hash(m_2)$. Functions that lack this property are vulnerable to second-preimage attacks.

Feeding $M_1$ directly in place of the IV, in either construction above (with or without a fixed last block), leads to a second-preimage attack:

Let $M = a || b || c || d$, where $||$ denotes concatenation and $a,b,c,d$ are each of the right size. Let $H(a || b || c || d) = h$.

    b          c          d
    |          |          |
    +--|\      +--|\      +--|\
       | \        | \        | \
  a ---|  |---h'--|  |---h"--|  |---  h
       +--+       +--+       +--+

Notice that the intermediate values $h'$ and $h''$ satisfy $h' = H(a||b)$ and $h'' = H(a||b||c)$.

What happens if we compute $H(h'||c||d)$ or $H(h''||d)$?

   c         d
   |         |
   +--|\     +--|\
      | \       | \
  h'--|  |------|  |---  h
      +--+      +--+


    d
    |
    +--|\
       | \
 h'' --|  |---  h
       +--+

In both cases you get $h$, which leads to: $$\begin{align*} H(a || b || c || d) &= h\\ &= H(h' || c || d)\\ &= H(h''||d) \end{align*}$$

Thus we have a collision through a second preimage attack (note that the compression function may be perfectly secure!).
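The attack is easy to reproduce. The sketch below is an illustration only: `compress` is a toy compression function (truncated SHA-256), the 16-byte size `N` is an arbitrary assumption, and `hash_no_iv` is a hypothetical name for the IV-less construction from the question, without padding or a last block.

```python
import hashlib

N = 16  # block size = chaining value size in bytes (assumed equal for this sketch)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:N]

def hash_no_iv(message: bytes) -> bytes:
    """IV-less iteration: the first block is used as the initial state."""
    if not message or len(message) % N:
        raise ValueError("message must be a non-empty multiple of N bytes")
    blocks = [message[i:i + N] for i in range(0, len(message), N)]
    state = blocks[0]
    for block in blocks[1:]:
        state = compress(state, block)
    return state

# Four distinct blocks a, b, c, d of N bytes each.
a, b, c, d = (bytes([x]) * N for x in b"abcd")

h = hash_no_iv(a + b + c + d)
h1 = hash_no_iv(a + b)       # h'  in the text
h2 = hash_no_iv(a + b + c)   # h'' in the text

# Shorter messages that start with an intermediate state reach the same hash.
assert hash_no_iv(h1 + c + d) == h
assert hash_no_iv(h2 + d) == h
```

Nothing about `compress` is broken here: the second preimages come purely from letting the attacker choose the initial chaining value.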

How to solve this?

Multiple solutions exist:

  1. Use an IV.
  2. Encode the length of the input in the last block.
  3. Add a domain separation encoding at the end of each block.

Solutions have been discussed in the following paper (see section 8: Implications for sequential hashing).
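As a rough check of the first solution, the sketch below repeats the intermediate-state trick against a chain that starts from a fixed IV. The names `compress`, `iterate` and `hash_with_iv`, the toy compression function (truncated SHA-256), the IV value and the 16-byte size are all assumptions made for illustration; padding is omitted.

```python
import hashlib

N = 16  # block size = chaining value size in bytes (assumed)
IV = hashlib.sha256(b"hypothetical IV").digest()[:N]  # arbitrary fixed public value

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-256 of state || block."""
    return hashlib.sha256(state + block).digest()[:N]

def iterate(state: bytes, message: bytes) -> bytes:
    """Run the compression chain over the message blocks from a given state."""
    for i in range(0, len(message), N):
        state = compress(state, message[i:i + N])
    return state

def hash_with_iv(message: bytes) -> bytes:
    """The chain always starts from the fixed IV, never from attacker data."""
    return iterate(IV, message)

a, b, c, d = (bytes([x]) * N for x in b"abcd")
h = hash_with_iv(a + b + c + d)
h1 = hash_with_iv(a + b)     # intermediate state after a and b

# The forged message now pushes h1 through compress(IV, h1) first,
# so the chain no longer resumes from the state h1 itself.
forged = h1 + c + d
collides = hash_with_iv(forged) == h
```

With a fixed IV, resuming from an intermediate state would require a block $x$ with $C(\text{IV}, x) = \text{IV}$, which is a hard problem for a good compression function.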

Biv

If we used this idea of performing $C(M_1,M_2)$ rather than $C(\text{IV},M_1)$ as the first compression step, then for some reasonable choices of the compression function $C$ there would be devastating attacks. These attacks work even with the explicit input length encoded in the last block, unlike the second-preimage attack in the other answer.

An example is when the compression function $C$ is an (assumed ideal) block cipher, with $C(S,M)=E_M(S)$ (that is, encryption of the hash state using a message block as key). For a message $M$ leading to a 2-block padded message $M_1\|M_2$, with the proposed shortcut, the hash is $h=H(M)=C(M_1,M_2)=E_{M_2}(M_1)$ rather than the usual $h=H(M)=E_{M_2}(E_{M_1}(\text{IV}))$.

To reach any desired hash $h$, we choose any valid $M_2$ (the choice is constrained only by the length padding), and compute $M_1=E^{-1}_{M_2}(h)$ (that is, we decipher $h$ under key $M_2$). We then form $M$ from $M_1$ and $M_2$, and $H(M)=h$ holds. This first-preimage attack is trivially extended to second-preimage and collision attacks.
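This preimage attack can be sketched as follows. Everything here is an assumption made for illustration: the block cipher is a toy 4-round Feistel network built from SHA-256 (a stand-in for an ideal cipher, not a real one), the block is 16 bytes, and `pad` is a simplified length padding (a `0x80` byte, zeros, then the 64-bit bit length).

```python
import hashlib

HALF = 8           # Feistel half-block size in bytes (assumed)
BLOCK = 2 * HALF   # cipher block size = message block size (assumed)
ROUNDS = 4

def _f(key: bytes, i: int, half: bytes) -> bytes:
    """Toy round function derived from SHA-256."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]

def _xor(x: bytes, y: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(x, y))

def encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

def pad(message: bytes) -> bytes:
    """Simplified length padding: 0x80, zero bytes, 64-bit big-endian bit length."""
    zeros = -(len(message) + 9) % BLOCK
    return message + b"\x80" + bytes(zeros) + (8 * len(message)).to_bytes(8, "big")

def hash_shortcut(message: bytes) -> bytes:
    """IV-less chain with C(S, M) = E_M(S): the first padded block is the state."""
    padded = pad(message)
    blocks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
    state = blocks[0]
    for block in blocks[1:]:
        state = encrypt(block, state)   # the message block is the key
    return state

def preimage(target: bytes, tail: bytes = b"7 bytes") -> bytes:
    """Build a message whose padded form is exactly two blocks and hashes to target."""
    if len(tail) > BLOCK - 9:
        raise ValueError("tail too long for a 2-block padded message")
    m2 = pad(bytes(BLOCK) + tail)[BLOCK:]   # M_2 depends only on tail and length
    m1 = decrypt(m2, target)                # M_1 = E^{-1}_{M_2}(h)
    return m1 + tail

target = bytes(range(BLOCK))   # any desired hash value
message = preimage(target)
assert hash_shortcut(message) == target
```

Calling `preimage` with a different `tail` yields another message with the same hash, which is how the attack extends to second preimages and collisions.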

Sure, common hashes tend to use $C(S,M)=E_M(S)\oplus S$ per the Davies-Meyer construction (or a small variation), and it is unusual to directly use a block cipher as the compression function. But assuming we do, with the usual choice of IV (an arbitrary public random-like value clearly independent of the block cipher), we still get a reasonably secure construct (collision resistance to usual levels can be proven with standard arguments; preimage resistance could be weaker than usual due to greater freedom in multi-collision attacks).

This example illustrates that the shortcut weakens our security assurance, and that making a security argument for the construction would require stronger hypotheses on $C$, and/or more complex reasoning.


Note: a comparatively minor difficulty with the proposed shortcut is that we'd need a special case for messages short enough to fit in 1 block after length padding, for example $h=H(M)=C(M_1,M_1)$.

fgrieu

Actually, the result will be collision resistant. I'm afraid the answer that Biv gave is wrong, since he didn't pad the message with the length.