In Merkle–Damgård, is there any reason why we use a fixed $IV$ at the beginning? Can we use the first block ($M_1$) right away instead of the $IV$ and feed it through the compression function with $M_2$?
3 Answers
TL;DR: A simple reason why we use the IV is to mitigate second preimage attacks.
This is the typical Merkle–Damgård construction: you split your message into blocks of $k$ bits (the block size accepted by the compression function).
       M1         M2   ...   Mn
       |          |          |
       +--|\      +--|\      +--|\
          | \        | \        | \
    IV ---|  |-------|  |-------|  |--- Hash
          +--+       +--+       +--+
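As a point of reference, the chaining in the diagram can be sketched in a few lines of Python. This is a toy illustration only: the 16-byte block size, the all-zero `IV`, and the `compress` function (truncated SHA-256 of state and block) are assumptions made for the sketch, and padding is left out, so the message length must be a multiple of the block size.

```python
import hashlib

BLOCK = 16          # toy block size in bytes (k = 128 bits), an assumption
IV = bytes(BLOCK)   # hypothetical fixed public IV (all zeros here)

def compress(state, block):
    """Toy compression function: SHA-256 of state || block, truncated to BLOCK bytes."""
    return hashlib.sha256(state + block).digest()[:BLOCK]

def split_blocks(msg):
    """Split an already block-aligned message into BLOCK-byte pieces (no padding)."""
    if len(msg) % BLOCK != 0:
        raise ValueError("this sketch omits padding; length must be a multiple of BLOCK")
    return [msg[i:i + BLOCK] for i in range(0, len(msg), BLOCK)]

def md_hash(msg):
    """Chain the compression function over the blocks, starting from the IV."""
    state = IV
    for block in split_blocks(msg):
        state = compress(state, block)
    return state
```

Note that even a one-block message goes through one call to `compress`, because the IV supplies the missing chaining input.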
Your proposal is the following:
       M2         M3   ...   Mn
       |          |          |
       +--|\      +--|\      +--|\
          | \        | \        | \
    M1 ---|  |-------|  |-------|  |--- Hash
          +--+       +--+       +--+
This assumes that the message is longer than $k$ bits! What if the message is shorter? Then you get the following, because there is no IV to compress $M_1$ with:
    M1 --- Hash
Using $M_1$ as the $IV$ also assumes that both inputs of your compression function have the same size. For example, take $f(x,y) \to h$, where $x$ is the message block and $y$ is the chaining value, with:
- $\texttt{sizeof}(x) = 128 \texttt{ bits}$ and
- $\texttt{sizeof}(h) = 160 \texttt{ bits}$
Since the output $h$ is fed back as $y$ in the next step, this implies that $\texttt{sizeof}(y) = 160 \texttt{ bits}$, so a 128-bit $M_1$ does not fit the chaining input. Using an IV is therefore the simpler choice, unless you want to manage padding for the first input...
What if we have a fixed last block compression?
Such that:
       M2         M3   ...   Mn   ...   LB
       |          |          |          |
       +--|\      +--|\      +--|\      +--|\
          | \        | \        | \        | \
    M1 ---|  |-------|  |-------|  |-------|  |--- Hash
          +--+       +--+       +--+       +--+
And in the case of very short messages:
       LB
       |
       +--|\
          | \
    M1 ---|  |--- Hash
          +--+
In the end, don't you agree that this is the same as having a fixed first block, also called an IV? You still need to store this value. Whether it is at the end or at the beginning, it is needed.
A Second Pre-image attack
Given an input $m_1$ it should be difficult to find another input $m_2$ such that $m_1 \neq m_2$ and $hash(m_1)=hash(m_2)$. Functions that lack this property are vulnerable to second-preimage attacks.
Feeding $M_1$ directly in place of the IV leads to a second-preimage attack, whichever construction above is used (with or without a fixed last block):
Let $M$ be a message such that $M = a || b || c || d$, where $||$ denotes concatenation and $a,b,c,d$ are blocks of the right size. We set $H(a || b || c || d) = h$.
      b          c          d
      |          |          |
      +--|\      +--|\      +--|\
         | \        | \        | \
    a ---|  |---h'--|  |---h"--|  |--- h
         +--+       +--+       +--+
Notice that we have intermediate values $h'$ and $h''$ such that $h' = H(a||b)$ and $h'' = H(a||b||c)$.
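Writing $C(\cdot,\cdot)$ for the compression function, with the chaining value as its first argument (a symbol introduced here only for illustration), the chain in the diagram unrolls as: $$\begin{align*} h' &= C(a, b) = H(a || b)\\ h'' &= C(h', c) = H(a || b || c)\\ h &= C(h'', d) = H(a || b || c || d) \end{align*}$$ Every chaining value is therefore also the hash of a shorter message, and nothing prevents it from being reused as a first block.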
What happens if we compute $H(h'||c||d)$ or $H(h''||d)$?
     c         d
     |         |
     +--|\     +--|\
        | \       | \
    h'--|  |------|  |--- h
        +--+      +--+

       d
       |
       +--|\
          | \
    h'' --|  |--- h
          +--+
In both cases you get $h$, which leads to: $$\begin{align*} H(a || b || c || d) &= h\\ &= H(h' || c || d)\\ &= H(h''||d) \end{align*}$$
Thus we have a collision through a second preimage attack (note that the compression function may be perfectly secure!).
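The splicing argument above can be checked with a small Python sketch. The 16-byte block size and the `compress` function (truncated SHA-256) are stand-ins chosen for the example, and `no_iv_hash` is a hypothetical name for the proposed variant without an IV, length encoding, or fixed last block.

```python
import hashlib

BLOCK = 16  # toy block size in bytes; the chaining value has the same size

def compress(state, block):
    """Toy compression function: SHA-256 of state || block, truncated to BLOCK bytes."""
    return hashlib.sha256(state + block).digest()[:BLOCK]

def no_iv_hash(blocks):
    """Proposed variant: the first block is used as the initial chaining value."""
    state = blocks[0]
    for block in blocks[1:]:
        state = compress(state, block)
    return state

# Four distinct blocks a, b, c, d of the right size.
a, b, c, d = (bytes([x]) * BLOCK for x in b"abcd")

h = no_iv_hash([a, b, c, d])
h1 = no_iv_hash([a, b])      # h'  in the text
h2 = no_iv_hash([a, b, c])   # h'' in the text

# Replacing a prefix by its own hash gives a different message with the same digest.
assert no_iv_hash([h1, c, d]) == h
assert no_iv_hash([h2, d]) == h
```

The assertions hold for any choice of `compress`, since the attack only exploits the chaining structure and never looks inside the compression function.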
How to solve this?
Multiple solutions exist:
- Use an IV.
- Encode the length of the input in the last block.
- Add a domain separation encoding at the end of each block.
Solutions have been discussed in the following paper (see section 8: Implications for sequential hashing).
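To illustrate the second option, here is a minimal Python sketch that appends one extra block holding the number of message blocks. The block size, the `compress` function (truncated SHA-256), and the exact length encoding are assumptions made for the example. It only shows that the prefix-splicing trick stops producing the same digest; it is not an argument that the variant without an IV is secure in general.

```python
import hashlib

BLOCK = 16  # toy block size in bytes

def compress(state, block):
    """Toy compression function: SHA-256 of state || block, truncated to BLOCK bytes."""
    return hashlib.sha256(state + block).digest()[:BLOCK]

def no_iv_hash_with_length(blocks):
    """No-IV chaining followed by one final block encoding the number of blocks."""
    length_block = len(blocks).to_bytes(BLOCK, "big")  # hypothetical encoding
    state = blocks[0]
    for block in blocks[1:] + [length_block]:
        state = compress(state, block)
    return state

a, b, c, d = (bytes([x]) * BLOCK for x in b"abcd")

h = no_iv_hash_with_length([a, b, c, d])
h1 = compress(a, b)  # intermediate chaining value h' (anyone can compute it)

# The spliced message has 3 blocks instead of 4, so its final length block differs.
spliced = no_iv_hash_with_length([h1, c, d])
```

Both computations reach the same chaining value just before the length block, but they then absorb different length blocks, so `spliced` equals `h` only if the toy compression function itself collides.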
If we performed $C(M_1,M_2)$ rather than $C(\text{IV},M_1)$ as the first compression step, then for some reasonable choices of the compression function $C$ there would be devastating attacks. This holds even when the length of the input is encoded in the last block, unlike the second-preimage attack in the other answer.
An example is when the compression function $C$ is an (assumed ideal) block cipher, with $C(S,M)=E_M(S)$ (that is, encryption of the hash state using a message block as key). For a message $M$ leading to a 2-block padded message $M_1\|M_2$, with the proposed shortcut, the hash is $h=H(M)=E_{M_2}(M_1)$ rather than the usual $h=H(M)=E_{M_2}(E_{M_1}(\text{IV}))$.
To reach any desired hash $h$, we choose any valid $M_2$ (the choice is constrained only by the length padding), and compute $M_1=E^{-1}_{M_2}(h)$ (that is, we decipher $h$ under key $M_2$). We then form $M$ from $M_1$ and $M_2$, and $H(M)=h$ holds. This first-preimage attack is trivially extended to second-preimage and collision.
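The attack can be tried out with a toy cipher. In the Python sketch below, everything concrete is an assumption made for illustration: a 4-round Feistel network built from SHA-256 stands in for the ideal block cipher $E$, blocks are 16 bytes, and `pad` uses a `0x80` marker followed by a 64-bit bit length. The names `shortcut_hash` and `preimage` are hypothetical.

```python
import hashlib

HALF = 8           # bytes per Feistel half
BLOCK = 2 * HALF   # 16-byte hash state and 16-byte message block (toy sizes)
ROUNDS = 4

def _round(key, i, half):
    """Keyed round function of the toy Feistel cipher."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]

def _xor(x, y):
    return bytes(p ^ q for p, q in zip(x, y))

def encrypt(key, block):
    """Toy block cipher E_key(block); a stand-in, not a real cipher."""
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def decrypt(key, block):
    """Inverse of encrypt under the same key."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def pad(msg):
    """Append 0x80, zero bytes, and the 64-bit message length in bits."""
    zeros = (-len(msg) - 9) % BLOCK
    return msg + b"\x80" + bytes(zeros) + (8 * len(msg)).to_bytes(8, "big")

def shortcut_hash(msg):
    """Shortcut variant with C(S, M) = E_M(S) and M1 as the initial state."""
    padded = pad(msg)
    blocks = [padded[i:i + BLOCK] for i in range(0, len(padded), BLOCK)]
    if len(blocks) == 1:
        return encrypt(blocks[0], blocks[0])  # special case C(M1, M1)
    state = blocks[0]
    for block in blocks[1:]:
        state = encrypt(block, state)  # the message block is the key
    return state

def preimage(target, tail=b"chosen!"):
    """Build a message whose shortcut hash is `target` (BLOCK bytes)."""
    assert len(target) == BLOCK and len(tail) == BLOCK - 9
    msg_len = BLOCK + len(tail)          # M1 || tail pads to exactly two blocks
    m2 = tail + b"\x80" + (8 * msg_len).to_bytes(8, "big")
    m1 = decrypt(m2, target)             # M1 = E^{-1}_{M2}(h)
    return m1 + tail
```

For example, `shortcut_hash(preimage(bytes(range(16))))` returns `bytes(range(16))`: the attacker fixes $M_2$ first, length padding included, and a single decryption yields the matching $M_1$.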
Sure, common hashes tend to use $C(S,M)=E_M(S)\oplus S$ per the Davies–Meyer construction (or a small variation), and it is unusual to directly use a block cipher as the compression function. But assuming we do, with the usual choice of IV (an arbitrary public random-like value clearly independent of the block cipher), we still get a reasonably secure construct (collision resistance to usual levels can be proven with standard arguments; preimage resistance could be weaker than usual due to greater freedom in multi-collision attacks).
This example illustrates that the shortcut weakens our security assurance, and that making a security argument for the construction would require stronger hypotheses on $C$ and/or more complex reasoning.
Note: a comparatively minor difficulty with the proposed shortcut is that we'd need a special case for messages short enough to fit 1 block after length padding; like $h=H(M)=C(M_1,M_1)$.
Actually, the result will be collision resistant. I'm afraid the answer that Biv gave is wrong, since he didn't pad the message with the length.