Is the only difference between password hashing for deriving a key versus for verifying a password in how the algorithms are used? That the design requirements for an algorithm for either are identical? That PBKDF2, Argon, and other algorithms are usable for both without changing parameters?
1 Answer
In short, you should pick a password-based KDF and parameters that have appropriate memory hardness, side-channel resistance, and computational cost for your use case. Whichever one you pick, the output should be indistinguishable from uniformly random bytes and won't have any significant statistical weaknesses.
Is the only difference between password hashing for deriving a key versus for verifying a password in how the algorithms are used?
More or less. The output is derived from a pseudorandom function of some kind (usually another cryptographic construct), so the output from any KDF should always be close to uniformly distributed regardless of the distribution of the inputs.
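As a minimal sketch of "same algorithm, different usage", the snippet below runs one PBKDF2 call through Python's standard `hashlib` and uses the result two ways. The helper names `stretch` and `verify`, the passphrase, and the iteration count are illustrative assumptions, not recommendations.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor (assumption), tune for your hardware


def stretch(password: str, salt: bytes, length: int = 32) -> bytes:
    """Run the password-based KDF; the same call serves both use cases."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS, dklen=length)


# Use 1: key derivation. The output is used directly as a 256-bit key
# and is never stored.
key_salt = os.urandom(16)
encryption_key = stretch("correct horse battery staple", key_salt)

# Use 2: password verification. The output is stored next to its salt
# and later compared against the output for a login attempt.
auth_salt = os.urandom(16)
stored_hash = stretch("correct horse battery staple", auth_salt)


def verify(attempt: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the KDF output and compare it in constant time."""
    return hmac.compare_digest(stretch(attempt, salt), expected)
```

The KDF call is identical in both cases; only what happens to the output differs. The sketch uses a separate salt for each purpose so that the stored verifier is not also the encryption key.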
That the design requirements for an algorithm for either are identical? That PBKDF2, Argon, and other algorithms are usable for both without changing parameters?
Not all key derivation functions are equal; the one property that all of these provide is resistance to preimage attacks. That is, given the salt and output $(s, y)$ with $y = \mathrm{KDF}(x, s)$, we cannot recover the secret $x$ by any means better than guessing candidate values of $x$ and testing each one. However, they differ in how they resist brute-force preimage search and side-channel (timing) attacks.
We don't use SHA or HMAC or HKDF as password-based KDFs because they are cheap to compute; banks of custom-designed ICs can work their way through every possible password very quickly. Passwords are not usually uniformly distributed, and aren't that hard to guess: there are only $62^8 \approx 2^{48}$ possible 8-character alphanumeric passwords. Instead, we use deliberately expensive operations that require a certain amount of time and space to run, to make brute-force attacks significantly more expensive (and slower!).
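The size of that password space can be checked with a few lines of arithmetic. The guessing rate below is a purely hypothetical figure, chosen only to show how the estimate scales.

```python
import math

ALPHABET_SIZE = 26 + 26 + 10   # lowercase + uppercase + digits
PASSWORD_LENGTH = 8

candidates = ALPHABET_SIZE ** PASSWORD_LENGTH
bits = math.log2(candidates)   # roughly 47.6 bits

# Hypothetical attacker speed (assumption): 10 billion guesses per second
# against a fast, unsalted hash.
GUESSES_PER_SECOND = 10_000_000_000
hours_to_exhaust = candidates / GUESSES_PER_SECOND / 3600

print(f"{candidates} candidates, about {bits:.1f} bits")
print(f"about {hours_to_exhaust:.1f} hours to try them all at the assumed rate")
```

If each guess instead costs a fixed fraction of a second of KDF work, the same search takes proportionally longer, which is the whole point of a slow password-based KDF.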
For example, Argon2d (and, in part, Argon2id) is believed to be memory-hard. It requires a certain amount of memory for the entire operation to complete; if it does not have that memory, then it needs exorbitantly more compute power to produce the same output (a space-time tradeoff). The extra (and configurable!) memory required to mount an attack on Argon2d makes the hardware required more expensive, increasing the time and cost of a preimage attack. However, because it uses data-dependent memory accesses in its implementation, it's more vulnerable to side-channel attacks, which analyze timing behavior during the operation in order to reduce the complexity of a brute-force attack.
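Argon2 is not in Python's standard library, so the sketch below swaps in scrypt (`hashlib.scrypt`), a different memory-hard KDF, purely to illustrate how a memory cost parameter shows up in practice. The parameter values and passphrase are illustrative assumptions, and `hashlib.scrypt` requires a Python build linked against OpenSSL 1.1 or newer.

```python
import hashlib
import os

# scrypt cost parameters (illustrative values, not a recommendation)
N = 2**14   # CPU/memory cost; must be a power of two
R = 8       # block size
P = 1       # parallelism

# scrypt's working memory is roughly 128 * N * R bytes.
approx_memory_bytes = 128 * N * R   # 16 MiB for these values

salt = os.urandom(16)
key = hashlib.scrypt(
    b"correct horse battery staple",
    salt=salt,
    n=N,
    r=R,
    p=P,
    maxmem=64 * 1024 * 1024,  # allow up to 64 MiB so the call is not rejected
    dklen=32,
)

print(f"about {approx_memory_bytes // (1024 * 1024)} MiB per guess, {len(key)}-byte output")
```

Every password guess an attacker makes has to pay that memory cost as well, which is what makes dedicated cracking hardware more expensive to build.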
On the other hand, PBKDF2 doesn't provide any memory-hardness at all; it's basically just an iterated hash. As a result, chips that can compute PBKDF2 hashes are very small, very fast, and very cheap, which makes a preimage attack correspondingly fast and cheap. However, it doesn't have any data-dependent memory accesses, making it much more resistant to timing attacks.
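To make the structure concrete, here is a sketch of a single PBKDF2-HMAC-SHA256 output block written as a plain loop, checked against `hashlib.pbkdf2_hmac`. The function name `pbkdf2_one_block` and the sample inputs are hypothetical; the sketch only covers outputs of up to 32 bytes.

```python
import hashlib
import hmac


def pbkdf2_one_block(password: bytes, salt: bytes, iterations: int) -> bytes:
    """First (and here only) 32-byte block of PBKDF2-HMAC-SHA256."""
    # U_1 = HMAC(password, salt || INT(1)), with a 4-byte big-endian block index
    u = hmac.new(password, salt + (1).to_bytes(4, "big"), hashlib.sha256).digest()
    acc = int.from_bytes(u, "big")
    # U_j = HMAC(password, U_{j-1}); the block is the XOR of all U_j
    for _ in range(iterations - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        acc ^= int.from_bytes(u, "big")
    return acc.to_bytes(32, "big")


password, salt, iterations = b"hunter2", b"example-salt", 1000
manual = pbkdf2_one_block(password, salt, iterations)
reference = hashlib.pbkdf2_hmac("sha256", password, salt, iterations)
print(manual == reference)
```

The loop's state is a single 32-byte value plus an accumulator, which is why the computation fits in very little silicon and parallelizes so well for an attacker.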
If you are using the KDF for authentication, side-channel attacks should be considered significant, but at the end of the day memory hardness is what will prevent attackers from brute-forcing secrets.