Why do some key derivation functions (like PBKDF2) use a salt?

Question

Let me start by explaining my understanding of the various concepts involved in this question:

Salt: Random bytes of data used as secondary input for a password hashing function, like so:
```
hashfunc(<password>, <salt>) -> <hash>
```
Both the <hash> and the <salt> that was used are then stored (in a database, for example).

This prevents attacks using rainbow tables, making it more difficult for an attacker who has access to the output of the hash function (<salt> + <hash>) to deduce the original <password>.
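To make this concrete, here is a minimal Python sketch of that store-and-verify pattern. It assumes PBKDF2-HMAC-SHA256 from the standard library stands in for hashfunc; the names hash_password and verify_password, the 16-byte salt, and the iteration count are my own illustrative choices, not recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative value only (assumption)


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, hash); draw a fresh random salt when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


# The same password hashed twice gets two different salts.
salt_a, hash_a = hash_password("admin123")
salt_b, hash_b = hash_password("admin123")
```

Because each call draws a fresh salt, hash_a and hash_b differ even though the password is the same, so a single precomputed table cannot cover both stored entries.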

Key derivation function: A function that takes a password <password> and an integer <size> as input, and generates arbitrary bytes of the desired <size> as output, like so:
```
kdf(<password>, <size>) -> <bytes of length <size>>
```
This allows us to turn a plain-text password like "admin123" into a key of a certain size that can be used by other cryptographic functions (like AES-256 encryption, which requires a 32-byte key).

So if my understanding is correct, then key derivation functions have no use for a salt. After all, the derived key isn't intended to be stored - it'll be used as input for another cryptographic function, and then it'll simply be discarded. Unless someone steals the derived key from the computer's memory in this short time frame, there is no risk of a rainbow table attack.

And yet, according to Wikipedia, the PBKDF2 key derivation function takes a salt as input:

The PBKDF2 key derivation function has five input parameters:

DK = PBKDF2(PRF, Password, Salt, c, dkLen)
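For reference, this is how I understand those five parameters to map onto Python's hashlib.pbkdf2_hmac. It is only a sketch: the password, the fixed salt, and the iteration count are placeholder values chosen for illustration.

```python
import hashlib

# DK = PBKDF2(PRF, Password, Salt, c, dkLen)
dk = hashlib.pbkdf2_hmac(
    "sha256",         # PRF: HMAC built on this hash function
    b"admin123",      # Password
    b"example-salt",  # Salt (fixed placeholder so the call is reproducible)
    100_000,          # c: iteration count (illustrative value)
    dklen=32,         # dkLen: length of the derived key in bytes
)
```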

What purpose does this salt serve? Did I misunderstand the purpose of key derivation functions?

Related questions:

"PBKDF2 and salt" does not answer my question because the answers seem to assume that the derived key will be stored, which, according to my understanding of key derivation functions, should never happen.

fgrieu · Accepted Answer · 2018-10-02T15:14:57.427

PBKDF2, like most password-based key derivation functions, has a salt input because that is often useful. Two examples:

When using PBKDF2 as a key derivation function, the salt allows re-using the same master key for multiple derived keys, e.g. a confidentiality key and an integrity key, with a different salt per use. In the same vein, PBKDF2 could be used to generate per-site passwords from a master password, with the site name as salt.
When using PBKDF2 for storage of access-control password tokens, the salt makes it impossible to carry out a brute-force attack before the salt is known, which defeats rainbow tables. That's noted in the question.
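The first example can be sketched as follows in Python. This is a minimal illustration, assuming PBKDF2-HMAC-SHA256 and using the purpose label directly as the salt; the helper name derive_key, the labels, the master password, and the iteration count are all hypothetical.

```python
import hashlib

ITERATIONS = 100_000  # illustrative value only (assumption)


def derive_key(master_password: str, purpose: str, length: int = 32) -> bytes:
    """Derive a purpose-specific key; the purpose label plays the role of the salt."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        master_password.encode("utf-8"),
        purpose.encode("utf-8"),
        ITERATIONS,
        dklen=length,
    )


master = "correct horse battery staple"  # placeholder master password

enc_key = derive_key(master, "confidentiality")
mac_key = derive_key(master, "integrity")
site_secret = derive_key(master, "example.com", length=16)
```

The derivation is deterministic, so the same master password and label always give the same key, while different labels give unrelated keys.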

score 3 · Answer 2 · answered Oct 02 '18 at 14:13

It helps mitigate known-plaintext attacks (wiki article), i.e. if you have several documents that have been encrypted with the same derived key, and the plaintext of one or more of them is known*, then it may be possible to try to identify the key by comparing the unencrypted document with the encrypted one.

Two users encrypting the same document independently could use the same password (especially if weak passwords are in common use). If a unique salt is added for each user, this will not be apparent from simply comparing the encrypted documents.

Similarly, if a unique salt is added to each document, then even if it is possible to obtain the key for the known document, the keys for the other documents will still be secret, and a rainbow table approach to identify the password used to create the key will still not be possible.
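A small Python sketch of the two-user case, assuming PBKDF2-HMAC-SHA256 as the key derivation function; the helper name derive_key, the user names, the salt length, and the iteration count are illustrative assumptions only.

```python
import hashlib
import os

ITERATIONS = 100_000  # illustrative value only (assumption)


def derive_key(password: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key from a password and a salt."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, ITERATIONS, dklen=32)


password = b"admin123"  # both users happen to pick the same weak password

# With a shared (or absent) salt, both users end up with the same key.
shared_salt = b"\x00" * 16
key_alice_shared = derive_key(password, shared_salt)
key_bob_shared = derive_key(password, shared_salt)

# With a unique random salt per user, the keys are unrelated.
key_alice = derive_key(password, os.urandom(16))
key_bob = derive_key(password, os.urandom(16))
```

With the shared salt the two keys are identical, so anything learned about one user's key applies to the other; with per-user salts that link disappears.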

Note: "document" is used as a generic term for the item to be encrypted.

* or can be guessed/derived in some way
