If I 'shuffle' the characters in a long text, is there some number of shuffles that will produce apparent randomness approaching that of a cryptographically secure random number generator? Would a chapter of a widely available online novel, shuffled this way, then produce a key suitable for a stream cipher, Vernam cipher, etc.? If not, could someone help explain why?
1 Answer
If by "shuffle" you mean a permutation of the characters of your alphabet, then no. The reason is that text in English (and any other language, for that matter) contains quite a bit of redundancy. If you simply permute the letters, anyone can do statistical analysis on the resulting text and figure out quite easily which symbols stand for the most commonly occurring letters, such as E, T, and A. Moreover, composing permutations on top of permutations just results in another permutation, which means shuffling the letters more times is useless, since you could have just used a different permutation to begin with.
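To make the "permutation of a permutation" point concrete, here is a minimal Python sketch. The helper names (`random_substitution`, `substitute`, `compose`) and the sample sentence are made up for illustration. It checks that applying two random letter permutations in a row gives the same result as applying one combined permutation, and that the shape of the letter-frequency histogram never changes.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def random_substitution(rng):
    """Return a random permutation of the alphabet as a letter -> letter dict."""
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def substitute(sub, text):
    """Apply a letter permutation to text; non-letters pass through unchanged."""
    return "".join(sub.get(ch, ch) for ch in text)

def compose(first, second):
    """Single permutation equivalent to applying `first`, then `second`."""
    return {ch: second[first[ch]] for ch in ALPHABET}

def sorted_counts(text):
    """Letter counts sorted high to low (the 'shape' of the histogram)."""
    return sorted(Counter(ch for ch in text if ch in ALPHABET).values(), reverse=True)

# Seeded non-cryptographic PRNG, used only so the demo is reproducible.
rng = random.Random(0)
text = "THE REASON IS THAT ENGLISH TEXT CONTAINS QUITE A BIT OF REDUNDANCY"

p1 = random_substitution(rng)
p2 = random_substitution(rng)

twice = substitute(p2, substitute(p1, text))   # shuffle, then shuffle again
once = substitute(compose(p1, p2), text)       # one combined shuffle

print(twice == once)                                # same ciphertext either way
print(sorted_counts(text) == sorted_counts(twice))  # same histogram shape
```

However many rounds you stack up, the attacker only ever faces one unknown permutation, and the frequency profile of the text survives intact.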
Let's say you use a pretty long chapter of a novel, permute it, and use the result as your key. Let's assume, best case, that you use it to encrypt a message exactly as long as your key, like a crummy one-time pad. The catch is that a one-time pad is secure because each letter of the key is generated independently and uniformly, with probability 1/26, whereas your key is a permutation of regular English text, whose letter distribution is far from uniform.
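One rough way to see the gap is to compare per-letter Shannon entropy. The sketch below is only illustrative: the sample sentence stands in for a real chapter, `entropy_bits` is a made-up helper name, and per-letter entropy ignores dependence between neighbouring letters, so it is at best an optimistic upper bound on how random the key looks. Here "shuffling" is taken to mean reordering the characters, which leaves the letter counts untouched.

```python
import math
import random
import string
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy (bits per symbol) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Tiny stand-in for "a chapter of a novel" (illustrative sample only).
chapter = (
    "IT WAS THE BEST OF TIMES IT WAS THE WORST OF TIMES "
    "IT WAS THE AGE OF WISDOM IT WAS THE AGE OF FOOLISHNESS"
)
letters = [ch for ch in chapter if ch.isalpha()]

# Seeded Mersenne Twister: NOT cryptographically secure, used here only
# so the statistics are reproducible.
rng = random.Random(1)

# "Shuffled" key: the same letters in a random order.
shuffled_key = letters[:]
rng.shuffle(shuffled_key)

# Reference: letters drawn independently and uniformly from A-Z.
uniform_key = [rng.choice(string.ascii_uppercase) for _ in range(100_000)]

max_entropy = math.log2(26)  # what a true one-time pad key achieves per letter

print(f"text letters : {entropy_bits(letters):.3f} bits/letter")
print(f"shuffled key : {entropy_bits(shuffled_key):.3f} bits/letter")
print(f"uniform key  : {entropy_bits(uniform_key):.3f} bits/letter")
print(f"maximum      : {max_entropy:.3f} bits/letter")
```

The shuffled key has exactly the same letter counts as the original text, so its entropy does not move, no matter how many times you reshuffle.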
This means some ciphertext values are going to occur more frequently than others, namely those of the form (A, E, or T) XOR (permutation of A, E, or T). You can still do statistical analysis on this ciphertext and, over a number of trials, figure out which permuted letters correspond to the most frequent ones. Then you plug those letters into the key and recover the rest of it by guessing the remaining words.
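The bias in the ciphertext can be computed directly. The sketch below makes several simplifying assumptions that are not part of the answer itself: letters are combined by addition mod 26 (the 26-letter analogue of XOR), key and plaintext letters are treated as independent draws, and a short sample sentence stands in for real English statistics. All function names are made up for illustration.

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase
N = len(ALPHABET)

def letter_distribution(text):
    """Empirical probability of each letter A-Z in text."""
    letters = [ch for ch in text if ch in ALPHABET]
    counts = Counter(letters)
    return [counts[ch] / len(letters) for ch in ALPHABET]

def ciphertext_distribution(plain_dist, key_dist):
    """Exact distribution of (p + k) mod 26 for independent letters p and k."""
    out = [0.0] * N
    for p, p_prob in enumerate(plain_dist):
        for k, k_prob in enumerate(key_dist):
            out[(p + k) % N] += p_prob * k_prob
    return out

def spread(dist):
    """Gap between the most and least likely value; 0 means perfectly flat."""
    return max(dist) - min(dist)

# Illustrative sample standing in for English plaintext statistics.
sample = (
    "IT WAS THE BEST OF TIMES IT WAS THE WORST OF TIMES "
    "IT WAS THE AGE OF WISDOM IT WAS THE AGE OF FOOLISHNESS"
)
plain_dist = letter_distribution(sample)

# Key built from permuted text: same probabilities, relabelled letters.
rng = random.Random(2)  # seeded, non-cryptographic, for reproducibility only
perm = list(range(N))
rng.shuffle(perm)
permuted_key_dist = [0.0] * N
for i, prob in enumerate(plain_dist):
    permuted_key_dist[perm[i]] = prob

# Proper one-time pad key: every letter has probability 1/26.
uniform_key_dist = [1 / N] * N

biased = ciphertext_distribution(plain_dist, permuted_key_dist)
flat = ciphertext_distribution(plain_dist, uniform_key_dist)

print(f"spread with permuted-text key: {spread(biased):.4f}")
print(f"spread with uniform key      : {spread(flat):.4f}")
```

With the uniform key, every ciphertext letter is equally likely regardless of the plaintext, which is what makes the one-time pad secure. With the permuted-text key, the ciphertext distribution stays lopsided, and that lopsidedness is what the frequency analysis exploits.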