
After reading about universal hash functions (UHFs) in different sources, from algorithms books to crypto books, I am still thoroughly confused about them.

How is a UHF different from other cryptographically secure hashing functions? Is the only difference that it's a keyed hash function while other secure hash functions need not be keyed?

Also, Boneh defines a UHF as a keyed hash function that satisfies a very weak notion of collision resistance. He defines this notion by saying that the adversary must not be able to find a collision when given no information about the key at all. Other than the mention of the key, this seems similar to the adversary having to find a collision where he can pick both m1 & m2. But if the adversary can pick both messages, Christof Paar calls that strong collision resistance rather than weak collision resistance. Paar's definition of weak CR is that the adversary is given m1 & its tag and must find an m2 that collides with it.

So I am confused here about a couple of things:

  1. Is a UHF just a cryptographically secure hash function which is keyed?

  2. What Collision Resistance property does a UHF need to possess - Weak or Strong CR & how do you define that type of CR?

user93353

2 Answers


The usual definition of a hash function is as a pair of algorithms $(\operatorname{Gen},\operatorname{Hash})$. This avoids hard-coded collisions, i.e. trivial formal adversaries that have a collision as part of their description and just output it. In this definition, $\operatorname{Gen}$ typically picks a "key" for the hash function; for "keyless" hash functions, the key can usually be seen as the details of the description, like the exact structure and constants.

The advantage of an adversary $\mathcal A$ against the collision resistance of a scheme $H$ is then defined using the following experiment:

  1. $k\gets \operatorname{Gen}(1^\kappa)$
  2. $(m,m')\gets\mathcal A(1^\kappa,k)$
  3. $\mathbf{Adv}_H^{\text{CR}}(\mathcal A;\kappa)=\Pr[\operatorname{Hash}(k,m)=\operatorname{Hash}(k,m')\land m\neq m']$

A scheme is then called collision-resistant if for all polynomial-time adversaries $\mathcal A$ the advantage $\mathbf{Adv}_H^{\text{CR}}(\mathcal A;\kappa)$ is negligible in $\kappa$.
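To make the order of the steps concrete, here is a minimal Python sketch of this experiment. Everything in it is an assumption for illustration: `gen`, `toy_hash`, `birthday_adversary` and `cr_experiment` are hypothetical names, and the hash is SHA-256 truncated to 16 bits only so that the adversary finishes quickly. The point is just that the adversary receives $k$ before it has to output its pair.

```python
import hashlib
import secrets

def gen(key_bytes=16):
    """Gen: sample a uniformly random key (toy key length)."""
    return secrets.token_bytes(key_bytes)

def toy_hash(key, message):
    """Hash: SHA-256 of key || message, truncated to 16 bits (toy output size)."""
    return hashlib.sha256(key + message).digest()[:2]

def birthday_adversary(key):
    """Adversary that is handed the key and searches for a collision."""
    seen = {}  # digest -> first message that produced it
    counter = 0
    while True:
        message = counter.to_bytes(4, "big")
        digest = toy_hash(key, message)
        if digest in seen:
            return seen[digest], message
        seen[digest] = message
        counter += 1

def cr_experiment(adversary):
    key = gen()              # step 1: k <- Gen
    m1, m2 = adversary(key)  # step 2: the adversary sees k
    # step 3: the adversary wins if the messages differ but the hashes agree
    return m1 != m2 and toy_hash(key, m1) == toy_hash(key, m2)

print(cr_experiment(birthday_adversary))
```

With only $2^{16}$ possible outputs the search must hit a collision after at most $2^{16}+1$ messages, so this toy scheme is of course not collision-resistant; a real hash function has an output large enough that this search is infeasible.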

This definition essentially amounts to "give the adversary a full description of the scheme and then they win if they find a collision". The key models the fact that a concrete hash function has only existed for a limited time in the real world: handing the key to the adversary corresponds to the moment the hash function design was finalized.

For a UHF, on the other hand, the security definition is much weaker. The advantage of an adversary $\mathcal A$ against the universal hashing security of a scheme $H$ is defined using the following experiment:

  1. $(m,m')\gets\mathcal A(1^\kappa)$
  2. $k\gets \operatorname{Gen}(1^\kappa)$
  3. $\mathbf{Adv}_H^{\text{UHF}}(\mathcal A;\kappa)=\Pr[\operatorname{Hash}(k,m)=\operatorname{Hash}(k,m')\land m\neq m']$

A scheme is then called universal hashing secure if for all polynomial-time adversaries $\mathcal A$ the advantage $\mathbf{Adv}_H^{\text{UHF}}(\mathcal A;\kappa)$ is negligible in $\kappa$.
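As a rough illustration of this experiment, the sketch below uses a toy polynomial-evaluation hash over a tiny prime field so that all keys can be enumerated. The scheme, the prime `P = 101` and the names `poly_hash`, `committing_adversary` and `uhf_advantage` are my own assumptions, not something from Boneh's text. The adversary commits to its pair first, and the advantage is then computed exactly by counting the keys for which that pair collides.

```python
P = 101  # small prime so that every key can be enumerated (toy parameter)

def poly_hash(key, message):
    """Evaluate sum(a_i * key^i) mod P for message (a_1, ..., a_n) via Horner's rule."""
    acc = 0
    for coeff in reversed(message):
        acc = (acc + coeff) * key % P
    return acc

def committing_adversary():
    """Adversary that must output its pair before any key has been chosen."""
    return (1, 2), (3, 4)

def uhf_advantage(adversary):
    m1, m2 = adversary()  # step 1: no key is available yet
    assert m1 != m2
    # step 2, taken over every possible key instead of one random sample
    colliding = [k for k in range(P) if poly_hash(k, m1) == poly_hash(k, m2)]
    return colliding, len(colliding) / P

colliding_keys, advantage = uhf_advantage(committing_adversary)
print(colliding_keys, advantage)  # colliding keys are [0, 100], i.e. 2 out of 101
```

For this pair the hashes differ by $2k(k+1) \bmod 101$, which vanishes only for $k=0$ and $k=100$. More generally, two distinct messages of at most $n$ blocks collide for at most $n$ keys in this toy scheme, so the advantage is at most $n/P$, which becomes negligible once $P$ is large.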

As you can see, in the UHF experiment the adversary doesn't get a full description of the scheme. They don't even get oracle access, i.e. they can't query values to get their hashes. So they have to come up with a collision that works for a non-negligible fraction of the instantiations ("keys") of $H$. Obviously, this definition is much easier to satisfy. In cryptographic practice, dedicated UHF schemes can even lose all security if an adversary learns a single input-output pair, since that pair can allow the adversary to completely recover the key $k$ and produce collisions at will.
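Here is a small sketch of that key-recovery failure, assuming a toy polynomial-evaluation UHF over the prime field with $p = 2^{61}-1$ and a leaked pair whose message has a single block (both are assumptions made for illustration; `poly_hash` is a hypothetical helper). In that case the hash is just $a_1 \cdot k \bmod p$, so one modular inversion reveals $k$.

```python
import secrets

P = 2**61 - 1  # a Mersenne prime, used here as a toy field size

def poly_hash(key, message):
    """Evaluate sum(a_i * key^i) mod P for message (a_1, ..., a_n) via Horner's rule."""
    acc = 0
    for coeff in reversed(message):
        acc = (acc + coeff) * key % P
    return acc

key = secrets.randbelow(P)                 # secret UHF key
known_message = (5,)                       # single-block message
known_tag = poly_hash(key, known_message)  # the one leaked input-output pair

# tag = a_1 * key mod P, so key = tag * a_1^(-1) mod P.
# pow(x, -1, P) computes a modular inverse (Python 3.8+).
recovered = known_tag * pow(known_message[0], -1, P) % P

print(recovered == key)
```

With a longer leaked message the pair gives a polynomial equation in $k$ instead of a linear one, which still narrows the key down to a handful of candidates in this toy scheme.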

SEJPM

The main difference is as you noted: cryptographic hash functions need not be keyed. So, intuitively, the weak collision resistance guarantee for UHFs is indeed weaker than what we ask of cryptographic hash functions. Namely, for UHFs the attacker has to find a collision for a hash function that the attacker doesn't fully know, because the UHF key is kept secret. This is easier to satisfy than the case where the adversary knows the key. Indeed, for UHFs based on polynomial evaluation, finding a collision would be trivial once the key is known.
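To see why the polynomial case is trivial with a known key, take as an assumed toy definition $H_k(a_1,\dots,a_n)=\sum_{i=1}^{n} a_i k^i \bmod p$ for a prime $p$ (my notation, not from a specific book). For any key $k$:

$$H_k(0,0)=0,\qquad H_k(-k,1)=(-k)\cdot k+1\cdot k^2\equiv 0 \pmod p.$$

So whoever knows $k$ can immediately output the colliding pair $(0,0)$ and $(-k \bmod p,\,1)$. Without $k$, a fixed pair of distinct messages collides only when $k$ is a root of the nonzero difference polynomial, which has degree at most $n$, so the collision probability over a random key is at most $n/p$.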

Additionally, it is worth noting that, technically, a keyless hash function cannot be collision resistant in the sense that no efficient adversary is able to find a collision. The reason is that such an efficient adversary always exists: the one that has a collision hardcoded.

The way out is to define a parametrized family of hash functions $\{H_k\}_k$ and to sample the key $k$ at random. The attacker then cannot simply hardcode a collision anymore, since a fixed collision would have to work for the randomly chosen key.

But when it comes to collision resistance for such a family, in some books the key $k$ is also given to the adversary.

Marc Ilunga