
Let String1 and String2 be two different strings. The probability that they collide under SHA1 is

P{SHA1(`String1`) == SHA1(`String2`)} = p

What is the collision probability when the hash is applied repeatedly?

P{SHA1(SHA1(`String1`)) == SHA1(SHA1(`String2`))} //two times
P{SHA1(...(SHA1(`String1`))) == SHA1(...(SHA1(`String2`)))} //10^9 times
otus
Haradzieniec

1 Answer


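Assuming SHA1 behaves like a random function on these inputs, so that each round maps two distinct values to the same output with probability $2^{-160}$, you would expect: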

$$\begin{align} P[\operatorname{SHA1}(s_1) \not= \operatorname{SHA1}(s_2)] &= 1 - 2^{-160} \\ P[\operatorname{SHA1}^2(s_1) \not= \operatorname{SHA1}^2(s_2)] &= (1 - 2^{-160})^2 \\ &\;\;\vdots \\ P[\operatorname{SHA1}^n(s_1) \not= \operatorname{SHA1}^n(s_2)] &= (1 - 2^{-160})^n \approx 1 - n/2^{160}, \end{align}$$

where $\operatorname{SHA1}^2 = \operatorname{SHA1} \circ \operatorname{SHA1}$, etc.
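To make the notation concrete, here is a minimal Python sketch of $\operatorname{SHA1}^n$. The helper name `iterated_sha1` is my own, and I am assuming each round hashes the raw 20-byte digest of the previous round (not its hex encoding); the question does not say which is meant.

```python
import hashlib


def iterated_sha1(data: bytes, n: int) -> bytes:
    """Apply SHA-1 to `data` n times and return the final 20-byte digest.

    Assumption: each round is fed the raw digest bytes of the previous round.
    For n == 0 the input is returned unchanged.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    state = data
    for _ in range(n):
        state = hashlib.sha1(state).digest()
    return state


# SHA1^2 is SHA1 composed with itself.
once = hashlib.sha1(b"String1").digest()
twice = hashlib.sha1(once).digest()
print(iterated_sha1(b"String1", 2) == twice)  # True
```

Running this for $n = 10^9$ is possible but slow in pure Python, so treat it as an illustration of the definition rather than something to benchmark.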

Thus, $P[\operatorname{SHA1}^{10^9}(s_1) = \operatorname{SHA1}^{10^9}(s_2)] \approx 10^9 / 2^{160} \approx 2^{-130}$.
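If you want to check the arithmetic, the following sketch evaluates $1 - (1 - 2^{-160})^n$ numerically. The function name `collision_probability` is hypothetical, and the whole thing rests on the same random-function assumption as above. It uses `log1p` and `expm1` because $1 - 2^{-160}$ rounds to exactly 1 in double precision.

```python
import math


def collision_probability(n: int, bits: int = 160) -> float:
    """P[collision after n rounds] = 1 - (1 - 2^-bits)^n, random-function model.

    Computed as -expm1(n * log1p(-2^-bits)) to avoid catastrophic rounding.
    """
    per_round = 2.0 ** -bits
    return -math.expm1(n * math.log1p(-per_round))


p = collision_probability(10**9)
print(math.log2(p))               # about -130.1
print(p / (10**9 / 2.0**160))     # ratio to the linear approximation n / 2^160
```

The ratio printed on the last line is how you can see that the linear approximation $n/2^{160}$ is essentially exact at this scale.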

That's for a single pair $s_1 \not= s_2$. If you are looking for collisions, you would expect a collision with $\operatorname{SHA1}^{10^9}$ after something like $2^{130/2} = 2^{65}$ strings.
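The $2^{65}$ figure is the usual birthday estimate. As a rough sketch, assuming every pair collides independently with probability $q$, the chance of no collision among $k$ strings is about $e^{-k^2 q / 2}$, which gives $k \approx \sqrt{2 \ln 2 / q}$ for a 50% chance. The function name `strings_for_collision` is mine.

```python
import math


def strings_for_collision(pair_prob: float, target: float = 0.5) -> float:
    """Approximate number of strings needed so that some pair collides
    with probability `target`, given per-pair collision probability `pair_prob`.

    Birthday approximation: k = sqrt(2 * ln(1 / (1 - target)) / pair_prob).
    """
    return math.sqrt(2.0 * math.log(1.0 / (1.0 - target)) / pair_prob)


q = 10**9 / 2.0**160                          # per-pair probability after 10^9 rounds
print(math.log2(strings_for_collision(q)))    # a little above 65
print(strings_for_collision(1 / 365))         # classic birthday problem, about 22.5
```

The constant factor from $\sqrt{2 \ln 2}$ is why the exponent comes out slightly above 65; for an order-of-magnitude statement $2^{65}$ is fine.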

This approximation only holds while $n$ is small enough. It breaks down as $n$ approaches $2^{80}$, where cycles in the iterated function start to become an issue. However, for $10^9$ it should do all right.
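You can watch this saturation happen on a toy scale. The sketch below is only an analogy: it truncates SHA-1 to a single byte (my assumption, so that the whole state space of 256 values can be enumerated) and counts what fraction of all starting pairs have merged after each round. The names `toy_hash` and `colliding_fractions` are made up for the example.

```python
import hashlib
from collections import Counter


def toy_hash(state: bytes) -> bytes:
    """Toy 8-bit hash: the first byte of SHA-1. Not secure, illustration only."""
    return hashlib.sha1(state).digest()[:1]


def colliding_fractions(max_rounds: int) -> list[float]:
    """Fraction of the 256*255/2 starting pairs that collide after 1..max_rounds rounds."""
    states = [bytes([b]) for b in range(256)]   # every one-byte starting value
    total_pairs = 256 * 255 // 2
    fractions = []
    for _ in range(max_rounds):
        states = [toy_hash(s) for s in states]
        counts = Counter(states)
        # A group of c equal states contributes c*(c-1)/2 colliding pairs.
        colliding = sum(c * (c - 1) // 2 for c in counts.values())
        fractions.append(colliding / total_pairs)
    return fractions


for rounds, frac in enumerate(colliding_fractions(40), start=1):
    print(rounds, round(frac, 4))
```

The fraction can only grow, since two values that have merged stay merged, and it stops growing once every surviving value sits on a cycle. For the real 160-bit function the same effect is expected to matter only around $2^{80}$ rounds, which is why it can be ignored for $10^9$.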


However, there are (theoretical) attacks to find collisions in SHA1 faster, so for non-random inputs the probabilities could be higher.

otus