8

libsodium builds its KDF on top of BLAKE2b:

BLAKE2B-subkeylen(key=key, message={}, salt=subkey_id || {0}, personal=ctx || {0})

Besides the key, the function has two additional arguments: the subkey ID (an 8-byte value which is zero-padded and becomes the salt) and a personalization context. The documentation describes those two values as follows:

subkey_id can be any value up to (2^64)-1

And:

Similar to a type, the context ctx is a 8 characters string describing what the key is going to be used for. Its purpose is to mitigate accidental bugs by separating domains. The same function used with the same key but in two distinct contexts is likely to generate two different outputs. Contexts don't have to be secret and can have a low entropy.

If I understand this correctly, both the 8-byte subkey ID (which becomes the 16-byte salt) and the 8-byte ctx (which becomes the 16-byte personalization) are used for "namespacing", don't have to be secret and may be of low entropy.
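To make the padding concrete, here is a minimal sketch of how the two 8-byte inputs could be expanded into the 16-byte BLAKE2b fields. The function names and constants are made up for illustration, and the little-endian encoding of the subkey ID is my reading of libsodium's source rather than something the quoted documentation states:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SALT_BYTES     16 /* BLAKE2b salt field size */
#define PERSONAL_BYTES 16 /* BLAKE2b personalization field size */
#define CONTEXT_BYTES   8 /* length of the KDF context string */

/* Hypothetical helper: write subkey_id as 8 little-endian bytes,
 * followed by 8 zero bytes (assumed encoding, see note above). */
static void
kdf_salt_from_subkey_id(uint8_t salt[SALT_BYTES], uint64_t subkey_id)
{
    size_t i;

    assert(salt != NULL);
    memset(salt, 0, SALT_BYTES);
    for (i = 0; i < 8; i++) {
        salt[i] = (uint8_t) (subkey_id >> (8 * i));
    }
}

/* Hypothetical helper: copy the 8-byte context, then pad with 8 zero bytes. */
static void
kdf_personal_from_ctx(uint8_t personal[PERSONAL_BYTES],
                      const char ctx[CONTEXT_BYTES])
{
    assert(personal != NULL && ctx != NULL);
    memset(personal, 0, PERSONAL_BYTES);
    memcpy(personal, ctx, CONTEXT_BYTES);
}
```

Both helpers produce a fixed 16-byte field whose upper half is always zero, which is why the two inputs look so similar from the caller's side.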

In BLAKE2b, what is the difference between the salt and the personalization context? Can the personalization context be considered the "second part of the salt", introduced only for semantic purposes? And if yes, wouldn't using a single 32-byte salt be simpler to understand and less error-prone to implement?

2 Answers

5

The context should be constant for one "keystore": it is the property of the KDF that namespaces the salt space and makes sure that two different applications (or keystores) will not reuse the same salt.

subkey_id, however, needs to be different for each key within a given keystore, and it is NOT meant to be used for namespacing. It can stay constant for a given subkey, though (even when the unhashed data changes).

The purpose is to make sure different applications can safely generate unique salt values without having to perform a full scan: all you need is for each app to have its own context, and then each can use a simple counter for the subkey ID.

Stephane
1

I know this is an old question, but I had the same question, and after a bit of searching I think I found the answer, so I wanted to share:

The difference between the salt and the personalization context can indeed be considered purely semantic.

More details

Looking at the implementation in libsodium, we simply XOR the whole parameter block into the initialization vector, which then forms the starting state for the hashing rounds.

int
blake2b_init_param(blake2b_state *S, const blake2b_param *P)
{
    size_t         i;
    const uint8_t *p;

    blake2b_init0(S);
    p = (const uint8_t *) (P);

    /* IV XOR ParamBlock */
    for (i = 0; i < 8; i++) {
        S->h[i] ^= LOAD64_LE(p + sizeof(S->h[i]) * i);  // <- here
    }
    return 0;
}

Looking also at the blake2b_param definition we have:

typedef struct blake2b_param_ {
    uint8_t digest_length;                   /*  1 */
    uint8_t key_length;                      /*  2 */
...
    uint8_t salt[BLAKE2B_SALTBYTES];         /* 48 */
    uint8_t personal[BLAKE2B_PERSONALBYTES]; /* 64 */
} blake2b_param;
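A small standalone sketch of that XOR step may help show where the two fields end up. It is not libsodium's code: the helper names are hypothetical, the parameter block is treated as a raw 64-byte array, and the offsets 32 and 48 are derived from the struct comments above (salt ends at byte 48, personalization at byte 64, 16 bytes each):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PARAM_BYTES     64
#define SALT_OFFSET     32 /* salt occupies bytes 32..47 */
#define PERSONAL_OFFSET 48 /* personalization occupies bytes 48..63 */

/* BLAKE2b initialization vector (the same constants as the SHA-512 IV). */
static const uint64_t BLAKE2B_IV[8] = {
    0x6a09e667f3bcc908ULL, 0xbb67ae8584caa73bULL,
    0x3c6ef372fe94f82bULL, 0xa54ff53a5f1d36f1ULL,
    0x510e527fade682d1ULL, 0x9b05688c2b3e6c1fULL,
    0x1f83d9abfb41bd6bULL, 0x5be0cd19137e2179ULL
};

/* Load 8 bytes as a little-endian 64-bit word. */
static uint64_t
load64_le(const uint8_t *src)
{
    uint64_t w = 0;
    int      i;

    for (i = 7; i >= 0; i--) {
        w = (w << 8) | src[i];
    }
    return w;
}

/* h = IV XOR parameter block, mirroring the loop quoted above. */
static void
init_state(uint64_t h[8], const uint8_t param[PARAM_BYTES])
{
    size_t i;

    assert(h != NULL && param != NULL);
    for (i = 0; i < 8; i++) {
        h[i] = BLAKE2B_IV[i] ^ load64_le(param + 8 * i);
    }
}

/* Return a bitmask of the state words that differ from the plain IV. */
static unsigned
changed_words(const uint64_t h[8])
{
    unsigned mask = 0;
    size_t   i;

    for (i = 0; i < 8; i++) {
        if (h[i] != BLAKE2B_IV[i]) {
            mask |= 1u << i;
        }
    }
    return mask;
}
```

With this layout, any non-zero salt byte only touches h[4] or h[5], and any non-zero personalization byte only touches h[6] or h[7]. The same 16 bytes placed in the other field therefore give a different starting state, but neither field is treated specially beyond its position in the block.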

Because this is a cryptographic hashing algorithm, if even a single bit changes in its initialization parameters, the whole output will change in a way that's essentially impossible to trace back.

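As such, in practice there is no difference between the salt and the personalization: both are just bytes of the parameter block that get XORed into the initial state. The key is handled a little differently (only its length is stored in the parameter block, while the key itself is hashed as a padded first block), but the end result is the same. Everything gets digested to a fine and untraceable binary paste :)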

Hope this helps!