Inspired by Magic "Nothing Up My Sleeve" Numbers - Computerphile - YouTube [5:31]. If you just need a constant to begin your algorithm, and the value of that constant isn't important, why not have a widely known convention to always use the digits of Pi or Phi or another well-known constant? It would seem that choosing arbitrary numbers is bad because it casts doubt on their origin, which may lead some to suspect they were cherry-picked for undisclosed weaknesses.
4 Answers
If you just need a constant to begin your algorithm, and the value of that constant isn't important, why not have a widely known convention to always use the digits of Pi or Phi or other well known constant?
Why not use all zeros or all ones, then? Why not use whatever the user provides?
There is a joke: the good thing about standards is that there are so many to choose from.
Often the constants are supposed to have specific properties, such as being prime. Then you have to run a CSPRNG seeded with that constant, and the same debate starts over: why this CSPRNG and not that one? One is simpler, another is more secure, another is newer, another is better established, and so on.
In short: because these constants are used in widely different settings, and the requirements are usually vague:
- Sometimes it has to be of arbitrary length and secure (DH safe-prime generation)
- Sometimes it has to be of arbitrary length and just a simple constant (HMAC)
- Sometimes it has to be of a specific length and have high entropy (AES)
- Sometimes it has to be fairly independent of another constant used somewhere else
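To make the HMAC case concrete: its two constants are nothing more than the repeated pad bytes 0x36 and 0x5c from RFC 2104. A minimal sketch of the key-padding step (real HMAC also hashes keys longer than one block first, which is omitted here):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HMAC_BLOCK 64 /* block size of MD5/SHA-1/SHA-256 */

/* Zero-pad the key to one block and XOR in HMAC's fixed pad byte:
 * 0x36 for the inner hash input, 0x5c for the outer one. The pad
 * values are plain public constants; they need no special entropy. */
static void hmac_pad(const uint8_t *key, size_t keylen,
                     uint8_t pad_byte, uint8_t out[HMAC_BLOCK]) {
    memset(out, 0, HMAC_BLOCK);
    memcpy(out, key, keylen < HMAC_BLOCK ? keylen : HMAC_BLOCK);
    for (size_t i = 0; i < HMAC_BLOCK; i++)
        out[i] ^= pad_byte;
}
```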
The last requirement is probably the worst. Two different algorithms that happen to use the same constant may interfere with each other, and a single standard constant makes that more likely.
Even if you managed to satisfy all of that somehow, some values are supposed to have high entropy while others can be anything at all. Forcing them to be meaningful can also slow the algorithm down.
After all that, what you gain is perhaps 1% extra assurance that nobody planted a bug in the algorithm. In my opinion that isn't worth much, and it rules out clever designs that are faster while remaining secure.
If you just need a constant to begin your algorithm, and the value of that constant isn't important, why not have a widely known convention to always use the digits of Pi or Phi or other well known constant?
This is a fantastic question that I feel should be directed towards every cipher/permutation proposal. I am not able to find a research paper which suggests exactly what convention should be used, and so people are probably still rolling their own ad-hoc methods for generating constants.
I am not aware of an official standard, but a widely accepted convention appears to be selecting some innocuous numbers (e.g. π, φ, square roots of small integers) and then running some kind of function on them to extract output. The function in question might be as simple as selecting certain bits of the supplied number. This appears to be more or less a de facto standard way of generating constants.
Assuming a symmetric primitive, I personally advocate applying the linear layer of the primitive to an incrementing counter to generate constants.
The reason is that this removes any degrees of freedom in generating constants: it is always done the same way, with no choice or input from the designer (i.e. me). There are no degrees of freedom, provided you can assume the linear layer was designed to produce highly different outputs rather than secretly weak or malicious constants (which would probably cripple the rest of the design anyway).
Additionally, it reuses resources that are already available: it needs less data (no tables of constants), which means lower storage requirements, fewer register swaps, and less code.
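A minimal sketch of this idea. The xor-rotate layer below is a hypothetical stand-in for whatever linear layer your primitive actually uses; only the pattern of feeding a counter through it matters:

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, int r) {
    return (x << r) | (x >> (32 - r));
}

/* Toy invertible linear layer (a stand-in, not any real cipher's).
 * In the scheme described above, you would reuse the primitive's
 * own linear layer here instead. */
static uint32_t linear_layer(uint32_t x) {
    x ^= rotl32(x, 7);
    x ^= rotl32(x, 18);
    return x;
}

/* Round constant i is just the linear layer applied to a counter,
 * leaving the designer no freedom to cherry-pick values. */
static uint32_t round_constant(uint32_t i) {
    return linear_layer(i + 1); /* +1 so the first constant is nonzero */
}
```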
Ostensibly, there is a convention for constants. It's vanity.
The criterion for such a constant is to disassociate the number from any possible influence it might have on the algorithm. Yes, high entropy is sometimes useful, but most of the time any number will do. No formal analysis has been performed, but the SHA-1 constants would probably work just as well if they were 12345... Yet no one would trust that such values weren't secretly selected to have some deleterious effect on security. So you pick square roots, or something related to primes as for SHA-2. Anything, really, as long as it is utterly apparent and beyond any possible doubt that it cannot be related to the operation of the algorithm.
If you've spent five years creating the next big cryptographic thing, you'll be tempted to show off with a novel method for constant creation, and it still has to meet the disassociation requirement above. So what do you do? All the simple roots have been used, as have (most of) the digits of π. The antithesis of secret collusion is publicly generated numbers, and so the Million Dollar Curve constants use lottery numbers. I'm not a professional cryptographer, so I cannot judge how this showmanship is received by the community. SHA-256 uses 256 bytes of round constants, and I'm sure the function would be just as secure if those bytes were the first 256 ASCII letters of A Midsummer Night's Dream: it is mathematically implausible that that text could somehow be related to SHA-256's bit manipulations, which satisfies disassociation.
You have to realise that cryptographers are real people, with all the usual human traits. We all show off sometimes.
After reading Bernstein et al.'s BADA55 paper and much head-beating trying to figure this out, it hit me:
The quick brown fox jumps over the lazy dog
This message is widely circulated verbatim (an uncapitalized "T" or a trailing period is very rare to see), and its length, 43, is prime, so you can repeat the message as many times as your algorithm needs without the repetitions ever lining up with a power-of-two word or block size.
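The arithmetic behind that claim, as a quick check: because 43 is coprime to every power of two, the offset of each 32-bit word within the tiled message cycles through all 43 phases before repeating. A small sketch (the helper name is mine):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Walk `words` word boundaries of the given stride through a tiled
 * message of length `period`, and report whether every boundary
 * lands at a distinct offset (phase) within the message. */
static bool all_phases_distinct(int period, int stride, int words) {
    bool seen[64] = {false}; /* enough for period <= 64 */
    for (int i = 0; i < words; i++) {
        int phase = (stride * i) % period;
        if (seen[phase]) return false;
        seen[phase] = true;
    }
    return true;
}
```

With period 43 and stride 4, all 43 phases are distinct; a composite period such as 44 starts reusing phases after only 11 words.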
For byte order, follow the precedent of Bernstein's Salsa20 and ChaCha and interpret the constant in little endian.
For example, my algorithm needs a 128-byte nothing-up-my-sleeve constant, so my constant is the message repeated three times (3 × 43 = 129 bytes) and truncated to 128 bytes:
The quick brown fox jumps over the lazy dogThe quick brown fox jumps over the lazy dogThe quick brown fox jumps over the lazy do
Interpreted as little-endian 32-bit integers, these 128 bytes become the C array:
const uint32_t myAlgorithmConstants[] = {
    0x20656854, 0x63697571, 0x7262206b, 0x206e776f,
    0x20786f66, 0x706d756a, 0x766f2073, 0x74207265,
    0x6c206568, 0x20797a61, 0x54676f64, 0x71206568,
    0x6b636975, 0x6f726220, 0x66206e77, 0x6a20786f,
    0x73706d75, 0x65766f20, 0x68742072, 0x616c2065,
    0x6420797a, 0x6854676f, 0x75712065, 0x206b6369,
    0x776f7262, 0x6f66206e, 0x756a2078, 0x2073706d,
    0x7265766f, 0x65687420, 0x7a616c20, 0x6f642079
};
Since nobody has yet proposed a better candidate for a standard nothing-up-my-sleeve constant, I am personally going with this one.