
Let us say we have three different plaintexts (all uppercase letters, A-Z): $x$, $y$ and $z$, each of length $21$. Let the key, $a$, also be of length $21$.

Now, what we have is $x \oplus a$, $y \oplus a$ and $z \oplus a$. How can we find out $x$, $y$ and $z$ from this?

I have looked around the web and found that the usual way to break this is to do statistical analysis and a dictionary attack on values of the form $x \oplus y$, which we can get from ${(x \oplus a)} \oplus {(y \oplus a)}$. I used xortool for this, but the key it produced decrypted the ciphertexts to random garbage, so that didn't work.

I cannot help but think that since we have three different ciphertexts (as opposed to just two which are used in most dictionary attacks), we must have some extra constraint that we can impose on the possible set of keys but I'm coming up short. Any help is appreciated. If you can provide links to any tools, that would be great as well.

Maarten Bodewes

1 Answer


You're using bytewise XOR on three plaintexts $x,y,z$ drawn from a small alphabet (A-Z, which I assume means ASCII values 0x41-0x5a). Knowing $a \oplus x$ at a certain index, we know all possible values of $a$ at that same index, namely $\{0x41 \oplus (a \oplus x),\ldots, 0x5a \oplus (a \oplus x)\}$, which is 26 candidates out of 256. But we have two more constraints, from $a \oplus y$ and $a \oplus z$ at that index. The intersection of these three sets is the set of options for $a$ at that index. This will probably already reduce the options at each index.
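
The set intersection can be sketched in a few lines of Python. This is only an illustration, assuming the ciphertexts are available as `bytes` objects and that the alphabet really is ASCII 0x41-0x5a; the name `key_candidates` is made up for this example.

```python
# Assumption: plaintext bytes are uppercase ASCII letters, 0x41..0x5A.
ALPHABET = range(0x41, 0x5B)


def key_candidates(ciphertexts):
    """Return, for each index, the set of key bytes that map every
    ciphertext byte at that index back into the alphabet."""
    length = min(len(c) for c in ciphertexts)
    candidates = []
    for i in range(length):
        options = set(range(256))
        for c in ciphertexts:
            # Key bytes that decrypt c[i] to some letter A-Z.
            options &= {letter ^ c[i] for letter in ALPHABET}
        candidates.append(options)
    return candidates
```

Each returned set has at most 26 elements and always contains the true key byte, provided the alphabet assumption holds.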

Next, use probable plaintext, say 'THE' (if you have English text). Try 'THE' at every position as part of text $x$, compute $a$ at the corresponding positions, and then compute $y$ and $z$ there as well. Check whether the results look like (parts of) English words, and try to extend words in those other texts. Also try 'THE' in texts $y$ and $z$ at all positions. You can immediately abort any attempt where $a$ is not in the sets computed above. Search for "reading in depth". With three texts this should give results pretty fast. All of this is easily automated.
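
A rough sketch of that automation in Python, again assuming `bytes` ciphertexts and an A-Z alphabet. The names `crib_drag` and `is_upper` are invented for this example, and judging whether a fragment looks like English is left to the reader.

```python
def is_upper(fragment):
    """True if every byte is an ASCII uppercase letter (0x41..0x5A)."""
    return all(0x41 <= b <= 0x5A for b in fragment)


def crib_drag(ciphertexts, crib):
    """Try the crib at every position of every text and yield the
    placements that keep all texts inside the alphabet."""
    length = min(len(c) for c in ciphertexts)
    for text_index, ct in enumerate(ciphertexts):
        for pos in range(length - len(crib) + 1):
            window = slice(pos, pos + len(crib))
            # Key bytes implied by assuming the crib sits here in this text.
            key_fragment = bytes(c ^ p for c, p in zip(ct[window], crib))
            # Decrypt the same window of every text with that key fragment.
            fragments = [
                bytes(c ^ k for c, k in zip(other[window], key_fragment))
                for other in ciphertexts
            ]
            # Same test as "the key byte lies in the intersection of the
            # per-index candidate sets": every text must decrypt to A-Z.
            if all(is_upper(f) for f in fragments):
                yield text_index, pos, key_fragment, fragments
```

Looping over `crib_drag(ciphertexts, b"THE")` and printing the fragments gives a short list of placements to inspect by eye. Any placement that yields word-like pieces in the other two texts can then be extended one letter at a time.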

Statistics alone on only 21 characters will not get you all the way, I think. You do need some idea of the plaintext language, etc.

Henno Brandsma