In a previous question, I described a particular keyed "hash" that mapped a 5-digit input code to a 5-digit output code. It used an 8-bit key, which is very insecure: more than 99% of the time, you can infer the key from a single input/output code pair.

I put "hash" in inverted commas because, although it has something in common with a hash, it always has equal input and output lengths, and it is certainly not secure in any way.

The input code to output code mapping is not one-to-one. It is possible for an output code to come from several input codes. A given input code only maps to a single output code.

Let's ignore the specific algorithm in the previous question and assume something secure has been developed.

The two most important security properties are:

  1. The attacker should not be able to predict the output code when only given the input code.
  2. The attacker should not be able to derive the key given multiple pairs of input/output codes.

I suspect this means that there are $100000^{100000}$ potential mappings between the input and output codes, which is essentially limitless, so the key length cannot be limited by this constraint.

Does this mean that the key length should obey the normal "long enough to prevent brute forcing" rule that most current encryption follows, and should be 128 bits or longer?

Cybergibbons

1 Answer

What you want is a pseudo-random function family from the set of 5-digit numbers to itself. As you found out, the total number of such functions is $100000^{100000}$, which is quite a lot.

Contrary to what Thomas said, most of them are not "crap": a randomly selected function from this space is as secure as we can get. Unfortunately, this would mean a key of size 100000 · 5 decimal digits (i.e. around 1660964 bits, or roughly 200 KB), which is a bit impractical.
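As a rough check of that size, here is a small sketch of the arithmetic, assuming the key is simply the full lookup table of a randomly chosen function over the $10^5$ codes 00000 to 99999. The helper name `table_key_bits` is made up for this illustration.

```python
import math

def table_key_bits(num_inputs: int, digits_per_output: int) -> float:
    """Bits needed to write down one arbitrary function as a lookup table.

    Each of the num_inputs entries stores digits_per_output decimal digits,
    and one decimal digit carries log2(10) bits of information.
    """
    return num_inputs * digits_per_output * math.log2(10)

# All 5-digit codes 00000..99999 give 10**5 inputs with 5 output digits each.
bits = table_key_bits(10**5, 5)
print(f"{bits:.0f} bits, about {bits / 8 / 1000:.0f} kB")
```

This only counts the storage for one function chosen from the whole space; it says nothing about how such a table would be generated or distributed.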

So we need a smaller key that still selects securely from the whole space, in a way that just looks like a random selection. That is known as a pseudo-random function family (PRF), "indexed" by the key. The "looks like random" can be formalized with an indistinguishability criterion, and your two criteria follow from it: a violation of either one would give an easy distinguisher.

The key size should still be big enough to prevent brute-forcing – nowadays this practically means a 128-bit key.
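To get a feel for the scale, here is a back-of-the-envelope sketch. The guessing rate of $10^{12}$ keys per second is purely an assumed figure for illustration, and `brute_force_years` is a hypothetical helper, not part of any library.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def brute_force_years(key_bits: int, guesses_per_second: float) -> float:
    """Years needed to try every key of the given size (worst case)."""
    return 2**key_bits / guesses_per_second / SECONDS_PER_YEAR

# Assumed attacker speed: 10**12 key guesses per second (illustrative only).
for bits in (8, 64, 128):
    print(bits, "bits:", brute_force_years(bits, 1e12), "years")
```

On average an attacker finds the key after searching half the space, which halves these figures but does not change the picture: 8 bits falls instantly, while 128 bits stays far out of reach at this assumed rate.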

Smaller key sizes might be possible if you limit the number of queries an adversary can make, but I would not build my security on this – there shouldn't be a problem storing a 128-bit key per device (if one can store secrets there at all).

Of course, the construction of your PRF should be done in a way that doesn't have any inherent weaknesses, like the crap which was used in the example of your previous question.

Using $HMAC(key, input)$ with a secure hash function (like SHA-2), followed by an output transformation which produces five decimal digits, looks like a secure way to build this. As the input has a fixed length and actually fits together with the key in one input block of the hash function (for most crypto hashes), using a simple $H(key || input)$ should work, too.
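A minimal sketch of the HMAC variant, using Python's standard `hmac` and `hashlib` modules. The function name `code_prf`, the ASCII encoding of the input code, and the choice of "interpret the digest as an integer and reduce modulo $10^5$" as the output transformation are all assumptions made for this illustration; other output transformations would work as well.

```python
import hashlib
import hmac
import secrets

def code_prf(key: bytes, input_code: str) -> str:
    """Map a 5-digit input code to a 5-digit output code under a secret key."""
    if len(input_code) != 5 or not (input_code.isascii() and input_code.isdigit()):
        raise ValueError("input code must be exactly five decimal digits")
    # HMAC-SHA-256 over the ASCII form of the code (encoding is an assumption).
    digest = hmac.new(key, input_code.encode("ascii"), hashlib.sha256).digest()
    # Assumed output transformation: reduce the 256-bit value modulo 10**5.
    # The modulo bias is negligible because 2**256 is vastly larger than 10**5.
    return f"{int.from_bytes(digest, 'big') % 100_000:05d}"

# Example usage with a freshly generated 128-bit key.
key = secrets.token_bytes(16)
print(code_prf(key, "01234"))
```

As in the question, the mapping is not one-to-one: several input codes can land on the same output code, but a given key and input always yield the same output.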

Paŭlo Ebermann