
For fun, I'm learning more about cryptography and hashing. I'm implementing the MD2 hash function following RFC 1319 (https://www.rfc-editor.org/rfc/rfc1319). I'll preface by saying I know there are libraries, I know this is an old hash, and I do not intend to use this in anything real-world, so don't anyone freak out.

In learning about S-tables I notice MD2 uses an S-table based on Pi. What I don't understand is how these numbers relate to Pi. Oddly, I can't find a description of it anywhere. The RFC just says:

This step uses a 256-byte "random" permutation constructed from the digits of pi.

In the reference code, there is the following comment:

Permutation of 0..255 constructed from the digits of pi. It gives a "random" nonlinear byte substitution operation.

Looking at the first few bytes:

41, 46, 67

I cannot see how you get these numbers from Pi. I've tried looking at the binary representation of Pi, and these numbers do not seem to match up with the first bytes of pi.
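For reference, here is a minimal sketch of that comparison. It takes the first four bytes of the binary expansion of pi's fractional part from Python's double-precision `math.pi` (pi in hexadecimal begins 3.243F6A88...) and sets them against the start of the table. The names `pi_bytes` and `table_start` are just illustrative.

```python
import math

# First 32 bits of the fractional part of pi, taken from the double-precision
# constant math.pi (a double carries more than enough precision for 32 bits).
frac_bits = int((math.pi - 3) * 2**32)
pi_bytes = list(frac_bits.to_bytes(4, "big"))

table_start = [41, 46, 67]  # first bytes of the MD2 S-table

print([hex(b) for b in pi_bytes])       # leading fractional bytes of pi
print(pi_bytes[:3] == table_start)      # do they match the table?
```

The bytes do not line up, so the table cannot simply be a slice of pi's binary expansion.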

My usual Googling just leads me back to the RFC or to other implementations, which seem to copy the above comment exactly. Nowhere can I find an explanation of how the table is constructed from Pi.

Keith

3 Answers


I sent an email to Ron Rivest and got an answer back.

The digits of $\pi$ are used as a sort of random number generator that drives the Durstenfeld shuffle (see also Knuth vol 2, sec 3.4.2).

Below is some pseudocode adapted from the description and code he sent me.


S = [0, 1, ..., 255]
digits_Pi = [3, 1, 4, 1, 5, 9, ...] # the digits of pi

def rand(n):
  x = next(digits_Pi)
  y = 10

  if n > 10:
    x = x*10 + next(digits_Pi)
    y = 100
  if n > 100:
    x  = x*10 + next(digits_Pi)
    y = 1000

  if x < (n*(y/n)): # division here is integer division
    return x % n
  else:
    # x value is too large, don't use it
    return rand(n)

for i in 2...256: #inclusive
  j = rand(i)
  tmp = S[j]
  S[j] = S[i-1]
  S[i-1] = tmp

The next function just steps through the digits of pi. You'll notice that rand does not use some of its possible outputs: any x at or above n*(y/n), the largest multiple of n that fits in y, is discarded and a fresh value is drawn. Dr. Rivest says that using these values would not give a uniform distribution as they are too large.
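To make the rejection step concrete, here is a small sketch that traces the first six iterations (i = 2 to 7) using only the leading digits of pi. The helper `rand_trace` and the 7-entry list are my own scaffolding for illustration; the printed list is an intermediate state, not the start of the final table, since later iterations keep swapping these positions.

```python
PI_DIGITS = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]  # leading digits only

def rand_trace(n, digits):
    """One call of rand(n); returns (value, list of rejected candidates)."""
    rejected = []
    while True:
        x, y = next(digits), 10
        if n > 10:
            x, y = x * 10 + next(digits), 100
        if n > 100:
            x, y = x * 10 + next(digits), 1000
        limit = n * (y // n)      # largest multiple of n that fits in y
        if x < limit:
            return x % n, rejected
        rejected.append(x)        # keeping x >= limit would favour small residues

digits = iter(PI_DIGITS)
S = list(range(7))                # only the entries touched by i = 2..7
trace = []
for i in range(2, 8):
    j, rejected = rand_trace(i, digits)
    S[j], S[i - 1] = S[i - 1], S[j]
    trace.append((i, j, rejected))
    print(f"i={i}: j={j}, rejected={rejected}, S={S}")
```

At i = 7 the digit 9 is thrown away: with one digit there are ten candidates 0..9, and reducing all of them modulo 7 would produce 0, 1 and 2 twice as often as 3..6. Dropping 7, 8 and 9 leaves exactly one candidate per residue.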

mikeazo

Based on mikeazo's pseudocode, here's a fully working program in Python 3 that will generate the S-table:

#!/usr/bin/env python3

import io

# We need 722 decimal digits of pi, including the integer part (3).

pi = io.StringIO('3' '1415926535897932384626433832795028841971693993751058209749445923078164062' '8620899862803482534211706798214808651328230664709384460955058223172535940' '8128481117450284102701938521105559644622948954930381964428810975665933446' '1284756482337867831652712019091456485669234603486104543266482133936072602' '4914127372458700660631558817488152092096282925409171536436789259036001133' '0530548820466521384146951941511609433057270365759591953092186117381932611' '7931051185480744623799627495673518857527248912279381830119491298336733624' '4065664308602139494639522473719070217986094370277053921717629317675238467' '4818467669405132000568127145263560827785771342757789609173637178721468440' '901224953430146549585371050792279689258923542019956112129021960864034418')

# Generate a pseudorandom integer in interval [0,n) using decimal
# digits of pi (including the leading 3) as a seed.
def pi_prng(n):
    while True:
        # based on n, decide how many digits to work with
        if n <= 10:
            x, y = int(pi.read(1)), 10
        elif n <= 100:
            x, y = int(pi.read(2)), 100
        elif n <= 1000:
            x, y = int(pi.read(3)), 1000
        else:
            raise ValueError(f'Given value of n ({n}) is too big!')

        # Compute the largest integer multiple of n not larger than y.
        # If x is smaller than that, we can safely return it modulo n,
        # otherwise we need to try again to avoid modulo bias.
        if x < (n * (y // n)):
            return x % n

# Fisher-Yates/Durstenfeld shuffling algorithm, except counting up
# XXX Does counting up bias the results?
S = list(range(256))
for i in range(1, 256):
    # generate pseudorandom j such that 0 ≤ j ≤ i
    j = pi_prng(i + 1)
    S[j], S[i] = S[i], S[j]

# Print the S-table as shown on Wikipedia.
for i in range(16):
    prefix = '{ ' if i == 0 else '  '
    suffix = ' }' if i == 15 else ','
    row = S[i*16:i*16+16]
    print(prefix + ', '.join(map(lambda s: '0x%02X' % s, row)) + suffix)
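As a cross-check that does not rely on a typed-in digit string, the following sketch streams the digits of pi from Gibbons' unbounded spigot algorithm and runs the same shuffle. The spigot is a substitution of mine, not something from Rivest's description, and the names `pi_digit_stream` and `build_s_table` are illustrative. The first bytes can then be compared with the 41, 46, 67 quoted in the question.

```python
def pi_digit_stream():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time,
    using Gibbons' unbounded spigot algorithm (exact integer arithmetic)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

def build_s_table():
    digits = pi_digit_stream()

    def rand(n):
        # 1, 2 or 3 digits per candidate, matching the pseudocode's thresholds
        width = 1 if n <= 10 else 2 if n <= 100 else 3
        y = 10 ** width
        while True:
            x = 0
            for _ in range(width):
                x = x * 10 + next(digits)
            if x < n * (y // n):      # reject candidates that would bias x % n
                return x % n

    S = list(range(256))
    for i in range(2, 257):           # i = 2..256 inclusive
        j = rand(i)
        S[j], S[i - 1] = S[i - 1], S[j]
    return S

S = build_s_table()
print(S[:3])
```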

ryanc

Here is ryanc's Python version ported to C:

#include <stdio.h>

// first 722 digits of pi ---> .rodata
static const char pi_digits[722] = {
3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4, 6, 2, 6, 4, 3, 3, 8, 3, 2, 7, 9, 5, 0, 2, 8, 8, 4, 1, 9, 7, 1, 6, 9, 3, 9, 9, 3, 7, 5, 1, 0, 5, 8, 2, 0, 9, 7, 4, 9, 4, 4, 5, 9, 2, 3, 0, 7, 8, 1, 6, 4, 0, 6, 2, 8, 6, 2, 0, 8, 9, 9, 8, 6, 2, 8, 0, 3, 4, 8, 2, 5, 3, 4, 2, 1, 1, 7, 0, 6, 7, 9, 8, 2, 1, 4, 8, 0, 8, 6, 5, 1, 3, 2, 8, 2, 3, 0, 6, 6, 4, 7, 0, 9, 3, 8, 4, 4, 6, 0, 9, 5, 5, 0, 5, 8, 2, 2, 3, 1, 7, 2, 5, 3, 5, 9, 4, 0, 8, 1, 2, 8, 4, 8, 1, 1, 1, 7, 4, 5, 0, 2, 8, 4, 1, 0, 2, 7, 0, 1, 9, 3, 8, 5, 2, 1, 1, 0, 5, 5, 5, 9, 6, 4, 4, 6, 2, 2, 9, 4, 8, 9, 5, 4, 9, 3, 0, 3, 8, 1, 9, 6, 4, 4, 2, 8, 8, 1, 0, 9, 7, 5, 6, 6, 5, 9, 3, 3, 4, 4, 6, 1, 2, 8, 4, 7, 5, 6, 4, 8, 2, 3, 3, 7, 8, 6, 7, 8, 3, 1, 6, 5, 2, 7, 1, 2, 0, 1, 9, 0, 9, 1, 4, 5, 6, 4, 8, 5, 6, 6, 9, 2, 3, 4, 6, 0, 3, 4, 8, 6, 1, 0, 4, 5, 4, 3, 2, 6, 6, 4, 8, 2, 1, 3, 3, 9, 3, 6, 0, 7, 2, 6, 0, 2, 4, 9, 1, 4, 1, 2, 7, 3, 7, 2, 4, 5, 8, 7, 0, 0, 6, 6, 0, 6, 3, 1, 5, 5, 8, 8, 1, 7, 4, 8, 8, 1, 5, 2, 0, 9, 2, 0, 9, 6, 2, 8, 2, 9, 2, 5, 4, 0, 9, 1, 7, 1, 5, 3, 6, 4, 3, 6, 7, 8, 9, 2, 5, 9, 0, 3, 6, 0, 0, 1, 1, 3, 3, 0, 5, 3, 0, 5, 4, 8, 8, 2, 0, 4, 6, 6, 5, 2, 1, 3, 8, 4, 1, 4, 6, 9, 5, 1, 9, 4, 1, 5, 1, 1, 6, 0, 9, 4, 3, 3, 0, 5, 7, 2, 7, 0, 3, 6, 5, 7, 5, 9, 5, 9, 1, 9, 5, 3, 0, 9, 2, 1, 8, 6, 1, 1, 7, 3, 8, 1, 9, 3, 2, 6, 1, 1, 7, 9, 3, 1, 0, 5, 1, 1, 8, 5, 4, 8, 0, 7, 4, 4, 6, 2, 3, 7, 9, 9, 6, 2, 7, 4, 9, 5, 6, 7, 3, 5, 1, 8, 8, 5, 7, 5, 2, 7, 2, 4, 8, 9, 1, 2, 2, 7, 9, 3, 8, 1, 8, 3, 0, 1, 1, 9, 4, 9, 1, 2, 9, 8, 3, 3, 6, 7, 3, 3, 6, 2, 4, 4, 0, 6, 5, 6, 6, 4, 3, 0, 8, 6, 0, 2, 1, 3, 9, 4, 9, 4, 6, 3, 9, 5, 2, 2, 4, 7, 3, 7, 1, 9, 0, 7, 0, 2, 1, 7, 9, 8, 6, 0, 9, 4, 3, 7, 0, 2, 7, 7, 0, 5, 3, 9, 2, 1, 7, 1, 7, 6, 2, 9, 3, 1, 7, 6, 7, 5, 2, 3, 8, 4, 6, 7, 4, 8, 1, 8, 4, 6, 7, 6, 6, 9, 4, 0, 5, 1, 3, 2, 0, 0, 0, 5, 6, 8, 1, 2, 7, 1, 4, 5, 2, 6, 3, 5, 6, 0, 8, 2, 7, 7, 8, 5, 7, 7, 1, 3, 4, 2, 7, 5, 7, 7, 8, 9, 6, 0, 9, 1, 7, 
3, 6, 3, 7, 1, 7, 8, 7, 2, 1, 4, 6, 8, 4, 4, 0, 9, 0, 1, 2, 2, 4, 9, 5, 3, 4, 3, 0, 1, 4, 6, 5, 4, 9, 5, 8, 5, 3, 7, 1, 0, 5, 0, 7, 9, 2, 2, 7, 9, 6, 8, 9, 2, 5, 8, 9, 2, 3, 5, 4, 2, 0, 1, 9, 9, 5, 6, 1, 1, 2, 1, 2, 9, 0, 2, 1, 9, 6, 0, 8 };

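// Return a pseudorandom integer in [0, n), reading digits of pi from *idx onward.
int pi_prng(int n, int *idx) {
    int x = pi_digits[(*idx)++];                    // first digit for this draw
    int y = 10;

    if (n > 10) {
        x = (x * 10) + pi_digits[(*idx)++];         // next digit of pi
        y = 100;
        }

    if (n > 100) {
        x = (x * 10) + pi_digits[(*idx)++];         // next digit of pi
        y = 1000;
        }

    // accept x only if it lies below the largest multiple of n that fits in y
    if (x < (n * (y / n))) {
        return x % n;
        }
    else {
        return pi_prng(n, idx);                     // x too large, draw again
        }
}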


int main() {
    static unsigned char my_s_table[256] = { 0 };   // ---> .BSS
    unsigned char swap_temp;
    int i, j, idx;

    // my_s_table [0, 1, 2, 3, ... , 255]
    for (i = 0; i < 256; i++) {
        my_s_table[i] = (unsigned char)i;
        }

    // Fisher-Yates/Durstenfeld shuffling algorithm, counting up
    idx = 0;
    for (i = 2; i < 257; i++) {
        j = pi_prng(i, &idx);
        swap_temp = my_s_table[i - 1];
        my_s_table[i - 1] = my_s_table[j];
        my_s_table[j] = swap_temp;
        }

    // printf my_s_table in hex
    for (i = 0; i < 256; i++) {
        printf("0x%02X, ", my_s_table[i]);
        if ((i + 1) % 16 == 0)
            printf("\b\n");                         // '\b' to remove trailing ' ' from "0x%02X, " before newline
        }

    return 0;
}