
Key derivation functions, such as HKDF (standardized in RFC 5869), are meant to turn initial keying material that already has enough entropy, such as a Diffie-Hellman shared value, into one or more strong cryptographic secret keys.
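
For reference, HKDF's structure is small enough to sketch directly from RFC 5869; here is a minimal, illustrative Python implementation (the hash choice, salt, and labels are assumptions for the example):

```python
import hashlib
import hmac

HASH = hashlib.sha256  # illustrative hash choice (HKDF-SHA-256)
HASH_LEN = 32          # SHA-256 output size in octets

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract step: concentrate the entropy of ikm into a PRK."""
    if not salt:
        salt = b"\x00" * HASH_LEN  # RFC 5869 default salt: HashLen zero octets
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: derive `length` octets of output keying material."""
    assert length <= 255 * HASH_LEN  # RFC 5869 limit on output length
    okm, t, counter = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([counter]), HASH).digest()
        okm += t
        counter += 1
    return okm[:length]

# Example: two 256-bit keys from a placeholder Diffie-Hellman shared value.
shared_secret = bytes(32)  # stand-in for a real DH shared value
prk = hkdf_extract(b"example-salt", shared_secret)
enc_key = hkdf_expand(prk, b"encryption", 32)
mac_key = hkdf_expand(prk, b"authentication", 32)
```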

Password-hashing schemes, such as the PHC winner Argon2, are meant to hash typically low-entropy passwords, with the goal of making inversion of the digest as costly as possible for an adversary in terms of CPU time, memory consumption, and parallelization.
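
For concreteness, here is a minimal sketch of Argon2 password hashing, assuming the third-party argon2-cffi package; the cost parameters are illustrative, not a recommendation:

```python
from argon2 import PasswordHasher  # assumes argon2-cffi is installed

# Illustrative cost parameters: 3 passes, 64 MiB of memory, 4 lanes.
ph = PasswordHasher(time_cost=3, memory_cost=64 * 1024, parallelism=4)

encoded = ph.hash("correct horse battery staple")   # random salt chosen internally
ph.verify(encoded, "correct horse battery staple")  # raises on mismatch
```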

Is it accurate to consider password-hashing schemes as key derivation functions specialized for low-entropy inputs? Or is there some other essential theoretical difference between these two types of cryptographic schemes?

cryptopathe

2 Answers


Key-stretching key derivation functions must produce results that have certain randomness properties, and be very difficult to reverse. Password hashes only need to satisfy the property "difficult to reverse", without randomness requirements. This is why all key-stretching key derivation functions work as password hashes but not the other way around.

Note that there are also key derivation functions that are non-stretching. Stretching functions are inherently slow, and this slowness is necessary for password hashing. Fast key derivation functions such as HKDF are not suitable when the input has low entropy, for example a password, regardless of whether the goal is to derive key material or a password hash.

Vitaly Osipov

A key derivation function does a few things:

  1. Turn a random bit string with high min-entropy* (the initial key material) into an effectively uniform random bit string.
  2. Label the parts of the resulting uniform bit string by purpose for reproducible derivation.
  3. Use an optional salt to prevent multi-target attacks from saving a factor of $n$ in the cost of attacking one of $n$ targets.

Often, parts (1) and (3) are done separately from part (2), in an extract/expand form. For example, $\operatorname{HKDF-Extract}(\mathit{salt}, \mathit{ikm})$ turns high min-entropy initial keying material $\mathit{ikm}$ into an effectively uniform random master key $\mathit{prk}$, with an optional salt; $\operatorname{HKDF-Expand}(\mathit{prk}, \mathit{info}, \mathit{noctets})$ then derives effectively independent subkeys from the uniform random master key $\mathit{prk}$, labeled by the $\mathit{info}$ parameter. If you already have a uniform random master key to start, you can skip HKDF-Extract and pass it directly to HKDF-Expand.
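
Here is a short sketch of that split, assuming the pyca/cryptography package for HKDF-Expand and SHA-256 throughout (the salt and info labels are made up for the example):

```python
import hashlib
import hmac

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDFExpand

def expand(prk: bytes, info: bytes, n: int) -> bytes:
    # HKDFExpand objects are single-use, so build a fresh one per derivation.
    return HKDFExpand(algorithm=hashes.SHA256(), length=n, info=info).derive(prk)

ikm = bytes(32)  # placeholder for a high min-entropy DH shared value
prk = hmac.new(b"example-salt", ikm, hashlib.sha256).digest()  # HKDF-Extract

k_enc = expand(prk, b"example: encryption", 32)  # part (2): labeled subkeys
k_mac = expand(prk, b"example: mac", 32)

# Already have a uniform random master key? Skip Extract, expand directly.
master = bytes(32)  # placeholder uniform master key
k_app = expand(master, b"example: app", 32)
```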

A password hash serves one additional purpose:

  4. Cost a lot to evaluate, in time, memory, and parallelism.

This way, even if we can't control the expected number of guesses to find a password, we can control the cost of testing each guess to drive up the expected cost of finding a password.

Specifically, password hashes usually do parts (1), (3), and (4), leaving the reproducible labeled derivation of subkeys in (2) to functions like HKDF-Expand. For example, it can actually hurt to use PBKDF2 to generate more than a single block of output: each extra block multiplies the honest user's cost by the full iteration count, while an adversary testing guesses typically needs only one block, so you should absolutely use HKDF-Expand to turn a single master key from PBKDF2 into many subkeys. That said, this particular pathology is fixed in Argon2, although HKDF-Expand may still be more convenient for labeling the subkeys by purpose.
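
As a sketch of that division of labor, assuming PBKDF2-HMAC-SHA-256 with an illustrative iteration count (the labels and password are made up):

```python
import hashlib
import hmac
import os

password = b"hunter2"  # illustrative only
salt = os.urandom(16)

# Ask PBKDF2 for exactly one block: dklen equal to the PRF output size
# (32 octets for HMAC-SHA-256), so every output octet costs the attacker
# as much as it costs the defender.
master = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, dklen=32)

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 HKDF-Expand over HMAC-SHA-256."""
    okm, t, i = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([i]), hashlib.sha256).digest()
        okm += t
        i += 1
    return okm[:length]

enc_key = hkdf_expand(master, b"example: encryption", 32)
mac_key = hkdf_expand(master, b"example: mac", 32)
```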

Summary:

  • If you have a high min-entropy but nonuniform secret like a Diffie–Hellman shared secret, then use HKDF-Extract.
  • If you have a low min-entropy secret like a password, use Argon2.

Then pass the resulting effectively uniform master key through HKDF-Expand to derive subkeys for labeled purposes.
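
Putting the summary together, here is a sketch of both paths converging on HKDF-Expand, assuming argon2-cffi for the password path (all parameters, salts, and labels are illustrative):

```python
import hashlib
import hmac
import os

from argon2.low_level import Type, hash_secret_raw  # assumes argon2-cffi

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 HKDF-Expand over HMAC-SHA-256."""
    okm, t, i = b"", b"", 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([i]), hashlib.sha256).digest()
        okm += t
        i += 1
    return okm[:length]

# High min-entropy but nonuniform secret: HKDF-Extract.
dh_shared = bytes(32)  # placeholder for a real DH shared value
prk_dh = hmac.new(b"protocol-salt", dh_shared, hashlib.sha256).digest()

# Low min-entropy password: Argon2id with a random salt.
salt = os.urandom(16)
prk_pw = hash_secret_raw(b"correct horse", salt, time_cost=3,
                         memory_cost=64 * 1024, parallelism=4,
                         hash_len=32, type=Type.ID)

# Either way, derive labeled subkeys from the effectively uniform master key.
k_dh = hkdf_expand(prk_dh, b"example: encryption", 32)
k_pw = hkdf_expand(prk_pw, b"example: encryption", 32)
```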


* The min-entropy of a procedure for making a choice is a measure of the highest probability of any outcome; specifically, if, among a finite space of (say) passwords chosen by some procedure, the probability of the $i^{\mathit{th}}$ password is $p_i$, the min-entropy of the procedure is $-\log_2 \max_i p_i$ bits. If the procedure is to choose uniformly at random from $n$ options, its min-entropy is simply $\log_2 n$ bits. For example, the diceware procedure with ten words drawn from a list of 7776 has $\log_2 7776^{10} \approx 129.2$ bits of min-entropy.
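
The footnote's arithmetic can be checked in a couple of lines (the nonuniform example is added for contrast):

```python
import math

# Ten-word diceware phrase, uniform over a 7776-word list.
print(10 * math.log2(7776))    # ~129.25 bits of min-entropy

# Nonuniform choice: min-entropy is set by the most likely outcome.
probs = [0.5, 0.25, 0.25]
print(-math.log2(max(probs)))  # 1.0 bit, despite three outcomes
```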

Squeamish Ossifrage