I wanted to create a pseudo-random binary sequence with this code:

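import random as rd

k = 10**6  # number of bits to generate
# One pseudo-random bit per list element
p = [rd.getrandbits(1) for _ in range(k)]

# Counts of zeros and ones, as a quick balance check
a = p.count(0)
b = p.count(1)

# Write the bits as ASCII '0'/'1' characters with no separators
s = ''.join(str(bit) for bit in p)
with open('s.txt', 'w') as f:
    f.write(s)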

When I got to testing it with the NIST tests, the question of choosing the right parameters came up. If I set the sequence length for this file to 100000 and the number of bitstreams to 1, does that mean that only the first 100000 bits of the file will be tested, as a single sequence? Should the data in the file be separated in some way?

I have also checked this question on crypto.stackexchange, but to be honest it didn't help me. Reading the official docs for the NIST package didn't give me any understanding of the bitstreams parameter either.

I would be very grateful if you could explain how to set the number of bitstreams parameter correctly, what it means in this situation, and how the input data should be supplied to the tests.

reogeo

1 Answer

Yes, the NIST tests are a little opaque. You'll find that the input file size should be (no. bitstreams) * (no. bits for testing). That means the input file will be automatically partitioned into (no. bitstreams) fragments, and the test suite is run that many times, once per fragment. You don't have to separate the data yourself. Those runs create the output histogram and the final P value assessments. It's typical to pick 10 streams.
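To make the partitioning concrete, here is a minimal Python sketch of the idea. It is not the suite's own code: `partition_bits` is a hypothetical helper, and it assumes the fragments are taken consecutively from the start of the file. Under that assumption, a sequence length of 100000 with 1 bitstream would use only the first 100000 bits, while 10 bitstreams would use the whole 1,000,000-bit file from the question.

```python
def partition_bits(bits, n, num_streams):
    """Split `bits` into `num_streams` consecutive fragments of `n` bits each.

    Hypothetical helper that mimics the partitioning described above;
    any bits beyond n * num_streams are simply left unused.
    """
    needed = n * num_streams
    if len(bits) < needed:
        raise ValueError(f"need {needed} bits, got {len(bits)}")
    return [bits[i * n:(i + 1) * n] for i in range(num_streams)]


# Placeholder 1,000,000-character sequence standing in for s.txt
bits = "01" * 500_000

one_stream = partition_bits(bits, 100_000, 1)    # only the first 100000 bits
ten_streams = partition_bits(bits, 100_000, 10)  # the whole file, in 10 pieces

print(len(one_stream), len(one_stream[0]))
print(len(ten_streams), len(ten_streams[0]))
```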

So an input file should be at least 1 MB in size. A good rule of thumb is 1 byte * (no. bits for testing). But you won't get any meaningful P values for RandomExcursions and RandomExcursionsVariant until you go to a 10 MB input, as in ./assess 1000000. Note, however, that these two excursion tests don't generate P values until the input is >100 MB.
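The size relation (no. bitstreams) * (no. bits for testing) can be checked with a few lines of Python before launching the suite. This is only a sketch: `required_bytes` and `max_streams` are hypothetical helper names, and it assumes one character per bit in an ASCII file and eight bits per byte in a binary file.

```python
def required_bytes(n_bits, num_streams, ascii_format):
    """Minimum file size in bytes for num_streams fragments of n_bits each."""
    total_bits = n_bits * num_streams
    if ascii_format:
        return total_bits          # assumption: one '0'/'1' character per bit
    return (total_bits + 7) // 8   # assumption: eight bits packed per byte


def max_streams(available_bits, n_bits):
    """Largest number of n_bits-long bitstreams a file can supply."""
    return available_bits // n_bits


# The case from the question: sequence length 100000, 1 bitstream
print(required_bytes(100_000, 1, ascii_format=True))
print(required_bytes(100_000, 1, ascii_format=False))

# The question's s.txt holds 10**6 ASCII bits
print(max_streams(10**6, 100_000))
```

With these assumptions, the question's file is large enough for 10 bitstreams of 100000 bits each.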


Note 1. The above input sizes are for binary files, as I find them easier to manipulate. You can adjust them accordingly for ASCII representations. The tests work the same way.
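If you would rather feed the tests a binary file than the ASCII one from the question, the conversion could look roughly like this. It is a sketch with assumptions: `pack_bits` is a hypothetical helper, and packing the most significant bit first is my assumption about the byte layout, so check it against the suite's documentation before relying on it.

```python
def pack_bits(bit_string):
    """Pack a string of '0'/'1' characters into bytes.

    Assumption: the first character of each group of eight becomes the
    most significant bit of the byte.
    """
    if len(bit_string) % 8:
        raise ValueError("length must be a multiple of 8")
    return bytes(
        int(bit_string[i:i + 8], 2)
        for i in range(0, len(bit_string), 8)
    )


# Sixteen ASCII bits become two bytes (0x41 and 0x42)
packed = pack_bits("0100000101000010")
print(packed)
```

The result can be written with `open(path, 'wb')`, where the path is whatever file you pass to the suite. The binary file is one eighth the size of the ASCII one, which is where the difference in the input sizes above comes from.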

Note 2. Given that you're using Python's random module, the generator is the Mersenne Twister. It will pass with flying colours, even though academically it has some weaknesses in very high dimensions. If it doesn't pass, it's a code problem.

Paul Uszak