
Suppose that messages $m_1$ and $m_2 \neq m_1$, both known to the attacker, are both MAC'd through Poly1305 with the same key. By chance, these two messages hash to the same value. The attacker knows that $\operatorname{Poly1305}(m_1) = \operatorname{Poly1305}(m_2)$ but not the values of the auth tags (e.g. if they were passed through Blake2 with a secret key).

What attacks can the attacker mount? Can the attacker recover the secret key?

What if Poly1305 is replaced by GHASH or VHASH?

CodesInChaos
Demi

1 Answer


Yes, the attacker can recover the key with high probability in the event of such a collision. For example, take two-block messages $m_1 \mathbin\| m_2$ and $m'_1 \mathbin\| m'_2$ (here $m_i$ and $m'_i$ denote blocks, not the whole messages of the question). A collision means that $$m_1 r^2 + m_2 r \equiv m'_1 r^2 + m'_2 r \pmod{2^{130} - 5},$$ so the key $r$ is a root of the quadratic polynomial $$(m_1 - m'_1) r^2 + (m_2 - m'_2) r,$$ which has at most two roots: one is necessarily zero, and the other, when $m_1 \neq m'_1$, is $r = -(m_2 - m'_2)/(m_1 - m'_1)$, which you can probably find in your head by this point.* The same goes for longer messages, with more work and more roots to sift through. Exactly the same analysis applies to GHASH, which like Poly1305 is a polynomial evaluation hash (over $\mathrm{GF}(2^{128})$ rather than the integers modulo $2^{130} - 5$); similar analysis will tend to apply to other universal hash families.
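To make the two-block case concrete, here is a minimal Python sketch. It uses a simplified polynomial hash modulo $2^{130} - 5$ and ignores Poly1305's key clamping, the padding bit on each block, and the final reduction mod $2^{128}$, so it illustrates the algebra rather than real Poly1305. The names `poly_hash`, `make_collision`, and `recover_key` are hypothetical, and the collision is manufactured using knowledge of $r$ purely so that there is something to attack.

```python
import secrets

P = 2**130 - 5  # the Poly1305 prime


def poly_hash(r, blocks):
    """Horner evaluation of blocks[0]*r^n + ... + blocks[n-1]*r mod P."""
    acc = 0
    for b in blocks:
        acc = (acc + b) * r % P
    return acc


def make_collision(r, msg_a, b1):
    """Demo helper (needs r): build a two-block message colliding with msg_a."""
    b2 = (msg_a[1] + (msg_a[0] - b1) * r) % P
    return [b1, b2]


def recover_key(msg_a, msg_b):
    """Nonzero root of (a1 - b1) r^2 + (a2 - b2) r mod P, given a collision."""
    d1 = (msg_a[0] - msg_b[0]) % P
    d2 = (msg_a[1] - msg_b[1]) % P
    if d1 == 0:
        raise ValueError("leading blocks agree; the only forced root is r = 0")
    # r * (d1 * r + d2) = 0 with r != 0 gives r = -d2 / d1 (mod P).
    return (-d2 * pow(d1, -1, P)) % P


# Assumed setup: a random nonzero key and random block values.
r = secrets.randbelow(P - 1) + 1
msg_a = [secrets.randbelow(P), secrets.randbelow(P)]
msg_b = make_collision(r, msg_a, (msg_a[0] + 1) % P)

assert poly_hash(r, msg_a) == poly_hash(r, msg_b)
print("key recovered:", recover_key(msg_a, msg_b) == r)
```

The attacker's part is only `recover_key`: it needs the two known messages and the fact that they collide, not the hash values themselves.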

But the nice thing about Poly1305—or any suitable choice of universal hash family—is that the probability of this event is negligible, well below $2^{-100}$, which is good news for the composition as a candidate PRF, and thus also as a candidate MAC.

In particular, suppose the PRF-distinguishing advantage of any adversary against $F$ is bounded by $\varepsilon_F$, and suppose the hash family $H$ (e.g., Poly1305, GHASH, etc.) has collision probability bounded by $\varepsilon_H$. Then the PRF-distinguishing advantage of any adversary making $q$ oracle queries against $m \mapsto F_k(H_r(m))$ is bounded by $\varepsilon_F + \binom{q}{2}\varepsilon_H$ (proof).
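As a rough feel for the numbers, the bound can be evaluated directly. The sketch below uses exact rational arithmetic; the function name `prf_bound` and the sample values ($\varepsilon_F = 2^{-128}$, $\varepsilon_H = 2^{-100}$, $q = 2^{32}$) are illustrative assumptions, not figures for any particular $F$ or $H$.

```python
from fractions import Fraction
from math import comb, log2


def prf_bound(eps_f, eps_h, q):
    """Evaluate eps_F + C(q, 2) * eps_H for q oracle queries."""
    return eps_f + comb(q, 2) * eps_h


# Assumed sample parameters, chosen only for illustration.
eps_f = Fraction(1, 2**128)
eps_h = Fraction(1, 2**100)
q = 2**32

bound = prf_bound(eps_f, eps_h, q)
print(f"advantage bound is about 2^{log2(float(bound)):.1f}")
```

With these assumed values the $\binom{q}{2}\varepsilon_H$ term dominates, since $\binom{2^{32}}{2} \approx 2^{63}$, so the bound comes out near $2^{-37}$.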

Thus this technique is a good way to build a long-input, short-output PRF out of a short-input, short-output PRF and a universal hash family. This structure is used by, for example, AES-GCM-SIV.
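The shape of the composition $m \mapsto F_k(H_r(m))$ can be sketched in a few lines of Python. This is a toy under stated assumptions: `toy_poly_hash` is a simplified Poly1305-style hash (no key clamping, no added nonce-dependent term), keyed BLAKE2b from the standard `hashlib` module stands in for $F$ as in the question's example, and the names `toy_poly_hash` and `hash_then_prf` are hypothetical. It is not the construction AES-GCM-SIV actually uses, and it is not meant for production.

```python
import hashlib

P = 2**130 - 5  # the Poly1305 prime


def toy_poly_hash(r: int, msg: bytes) -> bytes:
    """Simplified polynomial hash: 16-byte blocks, each with a padding bit."""
    acc = 0
    for i in range(0, len(msg), 16):
        chunk = msg[i:i + 16]
        # Little-endian block value with a 1 bit appended above the data.
        block = int.from_bytes(chunk, "little") + (1 << (8 * len(chunk)))
        acc = (acc + block) * r % P
    # Reduce the representative mod 2^128 to get a 16-byte hash value.
    return (acc % 2**128).to_bytes(16, "little")


def hash_then_prf(k: bytes, r: int, msg: bytes) -> bytes:
    """Long-input PRF candidate: F_k(H_r(msg)) with keyed BLAKE2b as F."""
    return hashlib.blake2b(toy_poly_hash(r, msg), key=k, digest_size=16).digest()


# Assumed demo keys; real keys would be uniformly random and secret.
k = bytes(range(32))
r = 0x0123456789ABCDEF0123456789ABCDEF
tag = hash_then_prf(k, r, b"attack at dawn")
print(tag.hex())
```

Because the output of `toy_poly_hash` never leaves `hash_then_prf` unencrypted, an observer sees only whether two tags are equal, which is exactly the setting of the question.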


* Strictly speaking, there may be more possible values of $r$ because the polynomial evaluation, which gives an integer representative in $\{0, 1, 2, \dots, 2^{130} - 6\}$, is further reduced mod $2^{128}$ to make the Poly1305 authenticator. But this only multiplies the possibilities by a small number, at most 4.

Squeamish Ossifrage