RFC 5869 describes the HMAC-based Extract-and-Expand Key Derivation Function (HKDF). Section 4, "Applications of HKDF", states that one of the intended uses is:

derivation of symmetric keys from a hybrid public-key encryption scheme

However, it doesn't expand on the meaning of this. Can anyone explain how HKDF can be used in this manner?

Cocowalla

2 Answers

Fix an RSA modulus $n$. Here's a way you can send a message $m$ so that only someone who knows the secret factorization of $n$ can read it.

  1. Pick an integer $1 < x < n$ uniformly at random.
  2. Compute $y = x^3 \bmod n$.
  3. Compute $k = \operatorname{HKDF-Extract}(\underline x)$, where $\underline x$ is the little-endian bit encoding of the integer $x$. Note that $\underline x$ itself is not a uniform random bit string: if interpreted as an integer, it is always below $n$. Thus it is unfit to satisfy the security contracts requiring a uniform random key.
  4. Compute $k_\mathrm{enc} = \operatorname{HKDF-Expand}(k, 512, \text{‘Cocowalla's app: encryption key’})$. $k_\mathrm{enc}$ is conjectured to be indistinguishable from a uniform random bit string to an adversary who doesn't know $x$. As such it is fit to satisfy security contracts requiring a uniform random key.
  5. Compute $k_\mathrm{mac} = \operatorname{HKDF-Expand}(k, 256, \text{‘Cocowalla's app: authentication key’})$.
  6. Compute $c = m \oplus (\operatorname{Threefish512}_{\,k_\mathrm{enc}}(0) \mathbin\Vert \operatorname{Threefish512}_{\,k_\mathrm{enc}}(1) \mathbin\Vert \cdots)$, the encryption of $m$ with a one-time pad generated by Threefish-512 in CTR mode.*
  7. Compute $t = \operatorname{Poly1305}_{k_\mathrm{mac}}(c)$, the authentication tag for $c$.
  8. Transmit $(y, c, t)$.
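
Steps 1 to 5 above can be sketched in Python with only the standard library. This is a minimal sketch under several assumptions that are not in the answer: HKDF is instantiated with HMAC-SHA-256 and an empty salt, $\underline x$ is encoded as fixed-length little-endian bytes, and the names `hkdf_extract`, `hkdf_expand` and `encapsulate` are made up for illustration. RFC 5869 writes the call as HKDF-Expand(PRK, info, L) with L in octets, so the 512-bit and 256-bit outputs become 64 and 32 bytes here. Steps 6 and 7 are left out because the standard library ships neither Threefish nor Poly1305.

```python
import hashlib
import hmac
import secrets

HASH = hashlib.sha256              # assumption: the answer does not fix a hash
HASH_LEN = HASH().digest_size      # 32 bytes for SHA-256


def hkdf_extract(ikm: bytes, salt: bytes = b"") -> bytes:
    """HKDF-Extract (RFC 5869, section 2.2); empty salt means HashLen zero bytes."""
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869, section 2.3); length is in bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def encapsulate(n: int):
    """Sender side, steps 1 to 5: returns (y, k_enc, k_mac)."""
    n_len = (n.bit_length() + 7) // 8
    x = 2 + secrets.randbelow(n - 2)              # step 1: uniform in 2..n-1
    y = pow(x, 3, n)                              # step 2
    k = hkdf_extract(x.to_bytes(n_len, "little")) # step 3
    k_enc = hkdf_expand(k, b"Cocowalla's app: encryption key", 64)      # step 4
    k_mac = hkdf_expand(k, b"Cocowalla's app: authentication key", 32)  # step 5
    return y, k_enc, k_mac
```

The two info strings are what keeps $k_\mathrm{enc}$ and $k_\mathrm{mac}$ independent even though both come from the same $k$.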

The recipient who knows the factors $p$ and $q$ of $n$ can solve $3d \equiv 1 \pmod{(p - 1)(q - 1)}$ and recover $x = y^d \bmod n$, since $y^d \equiv (x^3)^d \equiv x^{3d} \equiv x^1 \equiv x \pmod n$; then they can derive $k$, etc., as before.
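
As a small worked instance of the recovery step, here is a sketch with toy primes. The values $p = 11$ and $q = 17$ are assumptions chosen only so the numbers are easy to check by hand (they give $(p-1)(q-1) = 160$, which is coprime to 3); a real modulus would be thousands of bits long.

```python
# Toy parameters for illustration only; never use primes this small.
p, q = 11, 17
n = p * q                    # 187
phi = (p - 1) * (q - 1)      # 160, coprime to 3

# Solve 3d = 1 (mod phi). Three-argument pow with exponent -1 needs Python 3.8+.
d = pow(3, -1, phi)          # 107, since 3 * 107 = 321 = 2 * 160 + 1


def recover_x(y: int) -> int:
    """Recipient side: invert x -> x^3 mod n using the private exponent d."""
    return pow(y, d, n)


# Round trip: the sender computes y from x, the recipient gets x back.
x = 42
y = pow(x, 3, n)
assert recover_x(y) == x
```

Once $x$ is recovered, the recipient feeds the same encoding of $x$ into HKDF-Extract and HKDF-Expand and obtains the same keys as the sender.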

In archaic crypto literature, this is called ‘hybrid’ because we don't use the RSA trapdoor permutation $x \mapsto x^3 \bmod n$ to encrypt the message; rather we use it as a public-key KEM to conceal a secret element $x$ of $\mathbb Z/n\mathbb Z$, and derive from $x$ keys for a symmetric-key cryptosystem—specifically, a DEM, a data encapsulation mechanism. Thus it is a ‘hybrid’ of public-key and symmetric-key cryptography.


* This is not actually how I would recommend doing it: I would rather recommend NaCl crypto_secretbox_xsalsa20poly1305 with a single key. But this illustrates using HKDF-Expand to generate multiple keys of different sizes with different info strings.

The designation of ‘hybrid’ today is archaic because modern cryptographers recognize it was foolish all along to shoehorn structured messages like ‘Help! I'm trapped in an RSA ring.’ or ‘The Magic Words are Squeamish Ossifrage’ into elements of $\mathbb Z/n\mathbb Z$. In fact, for hysterical raisins, we still do approximately this: pick $k \in \{0,1\}^{256}$ uniformly at random, and then shoehorn it with an array of hash functions into an approximately uniform random element of $\mathbb Z/n\mathbb Z$, with a byzantine construction called OAEP. For hysterical raisins most protocols deployed in practice use OAEP, or its even worse predecessors. The much simpler scheme above is called RSA-KEM.

Squeamish Ossifrage

Many key exchanges output secrets that are not uniformly distributed, or that are large but unsafe to split into pieces because the pieces are not uniform either.

The shared secret from Diffie-Hellman over a 4096-bit group or over a 256-bit elliptic curve is not a uniform bit string, so it is passed through a KDF:

$$k = \operatorname{kdf}(\operatorname{dh}(a, b))$$
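
The formula above can be sketched as follows. Everything concrete here is an assumption made for illustration: the toy group (the Mersenne prime $2^{127} - 1$ with generator 3, far too small for real use), HKDF with HMAC-SHA-256 and an empty salt as the KDF, and the helper names `hkdf_sha256`, `keypair` and `session_key`.

```python
import hashlib
import hmac
import secrets

P = 2**127 - 1   # toy prime modulus, for illustration only
G = 3            # toy generator


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand to `length` bytes."""
    prk = hmac.new(salt or bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def keypair():
    """Return (secret exponent, public value)."""
    a = 1 + secrets.randbelow(P - 2)   # uniform in 1..P-2
    return a, pow(G, a, P)


def session_key(my_secret: int, their_public: int) -> bytes:
    """k = kdf(dh(a, b)): hash the non-uniform group element into a 32-byte key."""
    shared = pow(their_public, my_secret, P)   # an integer below P, not a uniform string
    return hkdf_sha256(shared.to_bytes(16, "big"), b"example: session key", 32)


a, A = keypair()
b, B = keypair()
assert session_key(a, B) == session_key(b, A)
```

The raw value `shared` is always below `P`, so its byte encoding is biased; only the KDF output is meant to be used as a key.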

In RSA-KEM (RFC 5990), $r$ is a random value in $\{0,1,2,3,\dots,N-1\}$, where $N = pq$, and the ciphertext is also in this range. Note that the following is slightly modified to include the ciphertext; this eliminates any ciphertext malleability without having to analyse the malleability of each primitive you may pair it with.

$$k = \operatorname{kdf}(\operatorname{encrypt_{pub}}(r) \| r)$$

Critically, when using this key you must include an authenticator, not only to authenticate the message but to authenticate the key exchange itself. Otherwise you've derived a random key and cannot tell if either the key or the message is corrupt.
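
A minimal sketch of such an authenticator check, assuming HMAC-SHA-256 as the MAC and a separate MAC key `k_mac` derived from the KDF output. The function names and the choice to cover the key-exchange ciphertext in the tag are illustrative assumptions, not part of RFC 5990.

```python
import hashlib
import hmac


def make_tag(k_mac: bytes, kem_ciphertext: bytes, ciphertext: bytes) -> bytes:
    """Tag both the key-exchange ciphertext and the message ciphertext."""
    # Length-prefix the first field so the boundary between the two is unambiguous.
    data = len(kem_ciphertext).to_bytes(4, "big") + kem_ciphertext + ciphertext
    return hmac.new(k_mac, data, hashlib.sha256).digest()


def verify_tag(k_mac: bytes, kem_ciphertext: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Constant-time comparison; decrypt only if this returns True."""
    expected = make_tag(k_mac, kem_ciphertext, ciphertext)
    return hmac.compare_digest(expected, tag)
```

If the key exchange was corrupted, the recipient derives a different `k_mac`, so the check fails just as it would for a tampered message.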

cypherfox