
At university we were told that it is a bad idea to implement a MAC by simply concatenating a key with the data to sign and running the result through a hash function (e.g. $s = \mathrm{hash}(k||\mathrm{data})$ or $s = \mathrm{hash}(\mathrm{data}||k)$). The next ideas presented were HMAC and CBC-MAC, which are a lot more complex (but standardized).

Now I'm wondering what the security of the following "idea" would be (I'm sure that there are good reasons why it is not used, as it is simpler than HMAC or CBC-MAC):

  1. Compute the hash value $h = \mathrm{hash}(\mathrm{data})$.
  2. Compute the ciphertext $c = \mathrm{enc}_k(h) = \mathrm{enc}_k(\mathrm{hash}(\mathrm{data}))$.
  3. Set signature $s = c$.

Here $\mathrm{hash}$ is an arbitrary hash function and $\mathrm{enc}$ is an arbitrary symmetric block cipher using a key $k$.
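To make the three steps concrete, here is a minimal sketch in Python. The standard library has no block cipher, so `toy_enc` is a hypothetical stand-in: a 4-round Feistel network with a 256-bit block, built from HMAC-SHA256. It is an assumption made purely for illustration, not a vetted cipher, and the names `toy_enc`, `make_tag` and `check_tag` are made up for this example.

```python
import hashlib
import hmac


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Feistel round function: HMAC-SHA256 over (round index || half), cut to 16 bytes.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:16]


def toy_enc(key: bytes, block: bytes) -> bytes:
    """Toy keyed permutation on 32-byte blocks (4-round Feistel). Illustration only."""
    assert len(block) == 32
    left, right = block[:16], block[16:]
    for i in range(4):
        mixed = bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
        left, right = right, mixed
    return left + right


def make_tag(key: bytes, data: bytes) -> bytes:
    h = hashlib.sha256(data).digest()  # step 1: h = hash(data)
    c = toy_enc(key, h)                # step 2: c = enc_k(h)
    return c                           # step 3: s = c


def check_tag(key: bytes, data: bytes, s: bytes) -> bool:
    # Recompute the tag and compare in constant time.
    return hmac.compare_digest(make_tag(key, data), s)


key = b"example key, for illustration only"
s = make_tag(key, b"some data")
```

Verification just recomputes the tag from the data and the key, so the decryption direction of the cipher is never needed.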

I was not able to find any statements about such schemes, maybe I searched in the wrong place. However, it would be nice if you could help me with that question.

2 Answers


No, in general, this is not secure, unless you make additional assumptions on the encryption method beyond the standard assumption of privacy.

To simplify things a bit, the assumption of privacy means that given a ciphertext $C$, the attacker has no information about what the plaintext might be. However, in your case, we don't really care whether the attacker can figure out the plaintext that was given to the encryption function; we also give him the data, and he can compute $\mathrm{hash}(\mathrm{data})$ himself, should he care to.

What we are concerned with is (again, to simplify a bit) that an attacker, given a message M and a valid tag for that message, cannot come up with another message, and a valid tag for that message. Translating that into your proposal, if the attacker was given $M$, and $E(Hash(M))$, can he pick another message $M'$, and come up with $E(Hash(M'))$?

Well, for a lot of encryption methods, he can. For example, if we consider a block cipher in counter mode, flipping a bit in the ciphertext flips the corresponding bit in the plaintext. What that means is that if the attacker computes $E(Hash(M)) \oplus Hash(M) \oplus Hash(M')$, well, that turns out to be precisely $E(Hash(M'))$ (under the same keystream), and so the attacker has won.
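Here is a small sketch of that forgery in Python. Since the standard library has no block cipher, the keystream below is a hypothetical stand-in for counter mode, built from HMAC-SHA256 over a nonce and counter; the names `keystream` and `ctr_tag`, the key, the nonce, and the messages are all assumptions made for illustration. The attack only relies on the tag being "keystream XOR hash", which is what counter mode gives you.

```python
import hashlib
import hmac


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Stand-in for a block cipher in counter mode: one keyed block per counter value.
    out = b""
    counter = 0
    while len(out) < n:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def ctr_tag(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # The proposed MAC with a counter-mode "enc": E(Hash(M)) = keystream XOR Hash(M).
    h = hashlib.sha256(data).digest()
    return xor(h, keystream(key, nonce, len(h)))


# Honest party: the attacker never sees `key`.
key = b"secret key unknown to the attacker"
nonce = b"nonce-01"
m = b"pay Bob 10"
t = ctr_tag(key, nonce, m)

# Attacker: knows only m, t (and the public nonce), picks m_forged freely.
m_forged = b"pay Eve 10000"
t_forged = xor(xor(t, hashlib.sha256(m).digest()), hashlib.sha256(m_forged).digest())

# The verifier recomputes the tag for m_forged under the same nonce.
forgery_accepted = hmac.compare_digest(t_forged, ctr_tag(key, nonce, m_forged))
```

XORing out $Hash(M)$ leaves the bare keystream, and XORing in $Hash(M')$ yields a valid tag for $M'$ without ever touching the key.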

The additional property that we need to assume for the encryption method is nonmalleability; that is, given $M$ and the corresponding encryption $E(M)$, the attacker cannot modify the encryption so that it decrypts to any other specific message.

Of the standard encryption modes, well, ECB actually is nonmalleable, if (and this is a big if) the hash fits entirely within a single cipher block. Given that 128-bit hashes are vulnerable to collisions (and a hash collision would be another way of producing a forgery), this means using a nonstandard block cipher (for example, Rijndael with a 256-bit block size).

Authenticated encryption modes are also nonmalleable. However, this may be considered cheating; authenticated encryption modes work by effectively using a MAC internally; if the point of the exercise is to create a crypto primitive from other crypto primitives, well, this didn't do it.

poncho

It turns out that this is actually secure, up to the length of the block cipher, if $\mathrm{enc}(\cdot)$ is a secure block cipher (a pseudorandom permutation) and if $\mathrm{hash}(\cdot)$ is collision-resistant.

However, there is a catch. (You just knew there had to be one, didn't you?)

The catch is that typical block ciphers have too narrow a block width for this to be adequately secure. In other words, the catch arises when you try to work out the quantitative security level afforded by this construction.

For instance, let's say you use AES as your $\mathrm{enc}(\cdot)$ and SHA-1 truncated to 128 bits (to match AES's 128-bit block width) as your $\mathrm{hash}(\cdot)$. Well, then you only get 64-bit security: the construction is vulnerable to collision-finding attacks of complexity approximately $2^{64}$. After examining about $2^{64}$ messages, you expect to find a pair of distinct messages $m,m'$ such that $\mathrm{hash}(m) = \mathrm{hash}(m')$, through a simple birthday argument. Both messages then share the same tag, so a tag obtained for $m$ is a valid forgery for $m'$.
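The birthday effect is easy to see at a smaller scale. The sketch below truncates SHA-1 to 24 bits instead of 128 (an assumption chosen only so it finishes instantly) and looks for a collision. By the same argument you expect success after roughly $2^{12}$ messages, give or take a small constant factor. The helper names and the message format are made up for the example.

```python
import hashlib


def truncated_hash(data: bytes, n_bytes: int = 3) -> bytes:
    # SHA-1 truncated to n_bytes (3 bytes = 24 bits here, instead of 128 bits).
    return hashlib.sha1(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 3):
    # Hash distinct messages until two of them share a truncated digest.
    seen = {}
    i = 0
    while True:
        msg = b"message-%d" % i
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg
        i += 1


m1, m2, tries = find_collision()
print(m1, m2, tries)
```

Every extra bit of hash output doubles the hash space but only multiplies the attacker's expected work by about $\sqrt{2}$, which is why a 128-bit block gives only about 64 bits of security here.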

To be secure against such attacks, you'd need a hash function whose output is at least 160 bits and a block cipher whose block width is at least as wide. But few modern block ciphers support such a block width -- and this was especially the case when HMAC was defined.

Therefore, HMAC is a better fit for the common primitives typically available today.
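For comparison, HMAC needs nothing beyond a hash function, and Python's standard library ships it in the `hmac` module. A minimal sketch, where the key, the message, and the helper names `hmac_tag` and `hmac_verify` are placeholders made up for the example:

```python
import hashlib
import hmac


def hmac_tag(key: bytes, data: bytes) -> bytes:
    # HMAC-SHA256 over the data; no block cipher is involved.
    return hmac.new(key, data, hashlib.sha256).digest()


def hmac_verify(key: bytes, data: bytes, tag: bytes) -> bool:
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(hmac_tag(key, data), tag)


key = b"example key, for illustration only"
tag = hmac_tag(key, b"some data")
```

The tag is as wide as the hash output (32 bytes for SHA-256), so no wide-block cipher is needed.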


There is a second reason why HMAC was defined. HMAC was designed to be robust: to minimize the assumptions it makes about the hash function. In particular, the HMAC construction was designed so that HMAC would have a chance of remaining a secure MAC construction, even if someone happens to discover collision attacks on the hash function.

This turned out to be a prescient design strategy. For instance, MD5-HMAC was widely used -- and then folks discovered feasible collision attacks on MD5. Fortunately, despite the fact that the collision-resistance of MD5 is totally broken, MD5-HMAC still appears to be secure: no one knows a way to break it. So, the designers of HMAC were pretty successful in making the HMAC construction resilient to certain kinds of failures of the hash function.

In contrast, your construction does not have that kind of resilience. This is a second reason why one might prefer HMAC over your construction.

D.W.