In school we only learned the formula without the normalizing coefficient:

$$\mathrm{IC} = \frac{\sum_{i=1}^{c} n_i (n_i - 1)}{N (N - 1)}$$

But Wikipedia has one that says:

$$\mathrm{IC} = \frac{\sum_{i=1}^{c} n_i (n_i - 1)}{N (N - 1) / c}$$

I am using the index of coincidence (IC) in a cryptography class to find the key length: I check various key lengths to see which one results in an IC closest to 0.066. I have to automate the key-length search, but I don't know which formula I should use. What does the normalizing coefficient change about the value? Does it make the IC better for finding the key length? Which formula should I use in my case?

from freq_analysis import freqs
import string

def ioc(fname, normalize: bool = False):
    frq = list(freqs(string.ascii_uppercase, fname).values())
    n = sum(frq)
    for x in range(0, len(frq)):
        frq[x] = frq[x] * (frq[x] - 1)
    frq = sum(frq)
    if normalize:
        return frq * 26 / (n * (n - 1))
    else:
        return frq / (n * (n - 1))

Full context for the code above, including freq_analysis and freqs(), is here: https://github.com/DarkFireGuy/cracking-vigenere

1 Answer

The normalizing coefficient makes sure that for uniformly random text the IC will be close to 1.0, regardless of the size of the alphabet.
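To make that concrete: the two formulas differ only by the constant factor c (26 for A-Z), so the unnormalized IC of uniformly random text is about 1/26 ≈ 0.038, and the English target of 0.066 becomes roughly 0.066 × 26 ≈ 1.72 after normalization. Below is a minimal sketch of how this might look for the key-length search. It assumes an uppercase A-Z alphabet and the "closest to 0.066" criterion from the question; the names `index_of_coincidence`, `average_column_ic`, and `guess_key_length` are illustrative and are not taken from the linked repository.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase  # assumption: plain A-Z alphabet, c = 26


def index_of_coincidence(text, normalize=False):
    """IC of the letters in text; multiplied by the alphabet size if normalize."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    n = sum(counts.values())
    if n < 2:
        return 0.0  # IC is undefined for fewer than two letters
    ic = sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
    return ic * len(ALPHABET) if normalize else ic


def average_column_ic(ciphertext, key_length, normalize=False):
    """Split the letters into key_length columns and average their ICs."""
    letters = [ch for ch in ciphertext.upper() if ch in ALPHABET]
    columns = ["".join(letters[i::key_length]) for i in range(key_length)]
    return sum(index_of_coincidence(c, normalize) for c in columns) / key_length


def guess_key_length(ciphertext, max_length=10, normalize=False):
    """Pick the key length whose average column IC is closest to the target."""
    # The target must be scaled by the same factor as the measured value.
    target = 0.066 * (len(ALPHABET) if normalize else 1)
    return min(
        range(1, max_length + 1),
        key=lambda k: abs(average_column_ic(ciphertext, k, normalize) - target),
    )
```

Because the measured value and the target are scaled by the same factor, `guess_key_length(text)` and `guess_key_length(text, normalize=True)` should pick the same key length (apart from floating-point ties). For this use case either formula works, as long as the target matches it: 0.066 unnormalized, or about 1.72 normalized.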

Jigg