20

Let's assume a text file that grows at its very end but is otherwise not edited. We now have 100 transmissions of this file, each OTP-encrypted (with a different pad each time, of course). The first 50% of the original file is identical in every transmission.

Of course nothing can be said about the other 50%. But can parts of the first 50%, which stay perfectly identical, be attacked?

(Please be light on the lingo and math, I'm a noob.)

Different scenario, same question: The text is edited in various places, but only individual sentences. This shifts the bytes around, but apart from that, most of it stays perfectly identical.

9 Answers

43

No. As long as each pad is completely random and independent, you can encrypt literally anything of the appropriate size (no larger than the pad) and retain information-theoretically secure confidentiality. The attack you describe is termed a known-plaintext attack, or KPA. The OTP encryption scheme is only vulnerable to it if you reuse pad material, which breaks the scheme. A proper OTP is not vulnerable to KPA.
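The scenario from the question can be sketched in a few lines of Python. This is only an illustration under assumptions: `secrets.token_bytes` stands in for a truly random pad source (it is an operating-system CSPRNG, not a hardware RNG), the growing file is made up, and the helper names `otp_encrypt` and `otp_decrypt` are hypothetical.

```python
import secrets
from typing import Tuple


def otp_encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Encrypt with a fresh pad of the same length; returns (pad, ciphertext)."""
    pad = secrets.token_bytes(len(plaintext))  # stand-in for a truly random pad
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, pad))
    return pad, ciphertext


def otp_decrypt(pad: bytes, ciphertext: bytes) -> bytes:
    """XOR the ciphertext with the same pad to get the plaintext back."""
    return bytes(c ^ k for c, k in zip(ciphertext, pad))


# Hypothetical growing file: every version starts with the previous one.
versions = [b"log line\n" * n for n in range(1, 4)]

# Each transmission gets its own pad, so the shared prefix is encrypted
# under unrelated random bytes every time.
transmissions = [otp_encrypt(v) for v in versions]

for pad, ciphertext in transmissions:
    print(len(ciphertext), ciphertext[:9].hex())
```

The only thing the printed ciphertexts have in common is that they grow in length; the bytes covering the identical prefix are unrelated from one transmission to the next.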

forest
27

The short answer: No

As long as the key is not reused, OTP has perfect secrecy. Even if the attacker learns the plaintext at some point, they will only recover a key that was used once. A problem may occur if the key generation algorithm is predictable; that is, the attacker may exploit the weakness in the generation algorithm to reproduce previous and upcoming key bits.

Patriot
kelalaka
17

Why is OTP perfectly secure?

Let's assume you would like to encrypt a plaintext $m$ using OTP. In order to do that, you would need to pick $m$ from a possible message space of $M$ with a given length. $M$ hereby represents all possible messages of this length.

Further, you choose a key $k$ from the given keyspace $K$. Note that $K$ and $M$ have the same size. In order to encrypt this message, you calculate $c = m \mathbin{\oplus} k$ and send $c$ to the recipient. $k$ must be distributed out-of-band, which means it is known both to you and the recipient, but not the attacker.

An attacker who intercepts $c$ can attempt to recover $m$ by iterating through $K$ and trying every possible $k$. Doing so yields every possible $m$ in $M$, a list they could have written down anyway without ever seeing $c$. They have no more information about your chosen $m$ than they did before.
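A toy version of this brute-force attempt, assuming a message space of single bytes so the whole keyspace can be enumerated (the function name `candidate_plaintexts` and the intercepted value are made up for illustration):

```python
def candidate_plaintexts(ciphertext_byte: int) -> set:
    """Decrypt one intercepted byte under every possible one-byte key."""
    return {ciphertext_byte ^ key for key in range(256)}


intercepted = 0x5A  # arbitrary example value
candidates = candidate_plaintexts(intercepted)

# Every byte value shows up as a candidate plaintext, so trying all keys
# rules nothing out.
print(len(candidates), candidates == set(range(256)))
```

The candidate set is the same whatever byte was intercepted, which is another way of saying the ciphertext carries no information about the plaintext.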

What if we use the same $m$ multiple times?

To answer this question, let us create $c_1$ and $c_2$, this time with $m$, $k_1$ and $k_2$, such that $c_1 = m \mathbin{\oplus} k_1$ and $c_2 = m \mathbin{\oplus} k_2$.

If an attacker were to intercept $c_1$ and $c_2$, they could calculate the following:

$c_1 \mathbin{\oplus} c_2 = m \mathbin{\oplus} k_1 \mathbin{\oplus} m \mathbin{\oplus} k_2$

Since $m \mathbin{\oplus} m = 0$ and $x \mathbin{\oplus} 0 = x$, we can write the result as:

$c_1 \mathbin{\oplus} c_2 = k_1 \mathbin{\oplus} k_2$

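This is not useful for the attacker: $k_1$ and $k_2$ are chosen randomly and independently and are never reused, so $k_1 \mathbin{\oplus} k_2$ is itself just a random string that says nothing about $m$.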
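The cancellation can be checked numerically. This is a sketch under the assumption that `secrets.token_bytes` is an acceptable stand-in for random keys; the message and the helper `xor` are illustrative.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


m = b"same message sent twice"
k1 = secrets.token_bytes(len(m))
k2 = secrets.token_bytes(len(m))

c1 = xor(m, k1)
c2 = xor(m, k2)

# The message cancels out; what remains is only the XOR of the two keys.
combined = xor(c1, c2)
print(combined == xor(k1, k2))
```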

Why is OTP not used everywhere then?

Even though this is a bit out-of-scope, it's a question that is often asked by beginning cryptographers when encountering a seemingly perfect encryption scheme. The problem is its usability.

Imagine you would like to encrypt a message and send it to me. How would you do that? I don't have your key, and if you want to negotiate an out-of-band key exchange (say, by meeting with me in person in a secure area), then you might as well tell me the message there (if the message is known at the time).

MechMK1
6

You appear to be asking if comparative analysis could be employed between the different encrypted iterations of the same text to decode it. The answer is "no": as long as the pads are produced using truly random data, the encrypted text will never come out of the sausage maker the same way twice. The only way comparative analysis could succeed is if keys were reused, as multiple posters to this thread have already noted.

The pads must also be transmitted securely. If there is ever a period where they were not accounted for or not in a courier's control during distribution, they must be deemed to be compromised. If a courier is used for distribution, they themselves must also be absolutely trustworthy; a known quantity.

Lastly, if either the sending or the receiving station (those using OTPs) is itself compromised, there are ways to tip off the opposite end of the channel to that fact. The other end can then decide either to cease communications and leave you to your fate, or to use the channel to distribute misinformation.

So if the following four strictures are observed, OTPs are UNBREAKABLE:

  1. Never reuse keys
  2. Don't encrypt messages larger than the key
  3. Produce pads using truly random data (e.g. a hardware RNG based on physical phenomena)
  4. Ensure pads cannot be compromised during distribution to the operators
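The first two strictures are bookkeeping rules that software can enforce; the last two are about the physical world and cannot be checked in code. A minimal sketch of the bookkeeping, assuming a hypothetical `PadBook` helper that is not part of any library:

```python
import secrets


class PadBook:
    """Hands out each byte of pad material at most once (hypothetical helper)."""

    def __init__(self, material: bytes):
        self._material = material
        self._offset = 0  # everything before this index has been used

    def remaining(self) -> int:
        return len(self._material) - self._offset

    def take(self, n: int) -> bytes:
        # Stricture 2: refuse messages longer than the unused pad.
        if n > self.remaining():
            raise ValueError("message is longer than the unused pad material")
        chunk = self._material[self._offset:self._offset + n]
        # Stricture 1: advance past the used bytes so they are never reused.
        self._offset += n
        return chunk

    def encrypt(self, plaintext: bytes) -> bytes:
        pad = self.take(len(plaintext))
        return bytes(p ^ k for p, k in zip(plaintext, pad))


# secrets.token_bytes is only a stand-in here; stricture 3 asks for a
# truly random source such as a hardware RNG.
book = PadBook(secrets.token_bytes(32))
first = book.encrypt(b"ATTACK AT DAWN")
print(book.remaining())
```

The recipient would need the same material and the same offset bookkeeping to decrypt, which is exactly the distribution problem the fourth stricture is about.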
F1Linux
4

Even if the attacker knew the exact plaintext of each message, the only thing they could derive is the pad used for that specific message. Assuming the OTP was used correctly, this pad is only used for that one message, and therefore doesn't give the attacker any knowledge they didn't already have.

It also isn't possible to check whether or not a message has a given plaintext, or even whether two messages have the same plaintext.

If the pad is used twice, then the two encrypted texts can be combined with XOR to yield the XOR of the two plaintexts. That is highly non-random, which is immediately apparent with a little frequency counting.

But if the plaintext is used twice, then combining the two encrypted texts only yields the XOR of the two pads, which is distributed just as evenly as the pads themselves, and as the XOR of two messages with different plaintexts would be. There's no way to tell.
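The pad-reuse failure is easy to reproduce. In this sketch the two messages and the helper `xor` are made up, and `secrets.token_bytes` is assumed as a stand-in for a random pad:

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"meet me at the dock"
m2 = b"bring the documents"  # same length as m1
pad = secrets.token_bytes(len(m1))

# Mistake: the same pad encrypts both messages.
c1 = xor(m1, pad)
c2 = xor(m2, pad)

# The pad cancels, leaving the XOR of the two plaintexts.
leak = xor(c1, c2)
print(leak == xor(m1, m2))

# Anyone who knows or guesses one plaintext can now read the other.
print(xor(leak, m1))
```

With two different pads, the same computation would leave only random-looking bytes.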

3

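The method is not vulnerable, provided that the implementation is right and, of course, that the key is not reused. If the data is longer than the pad, you can use a feedback mechanism to stretch the key material, though the result is then no longer a true one-time pad and loses the perfect-secrecy guarantee.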

3

An alternative explanation:

We could distinguish between two kinds of encryption methods:

The first kind of encryption methods has the property that exactly one plain text can result in a certain encrypted text (e.g. "sfd1!&&fd8[//zu").

Public-key-based encryption methods are such methods. (Assuming an attacker knows the public key, so the public key is given.)

It can be proven that such encryption methods always can be cracked if you have enough computation power.

The second kind of encryption methods has the property that totally different, valid plain texts can result in the same encrypted text when different "keys" (*) are used:

  • The text "Hello, how are you?" will result in "sfd1!&&fd8[//zux+$-" when using a certain key.
  • The text "Transfer EUR 25000." will result in "sfd1!&&fd8[//zux+$-" when another key is used.
  • Using a third key, "I order 5 licenses." will result in "sfd1!&&fd8[//zux+$-".

So even if you have some hypothetical technology that is able to crack encryption, you won't be able to find out if the original message was "Transfer EUR 25000." or "I order 5 licenses." unless you have any information about the "key".

In the best case, every text of 19 characters length can result in "sfd1!&&fd8[//zux+$-" when the "correct" key is used. Even with your hypothetical technology you would only be able to find out that the original message is 19 characters long.
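For a one-time pad this "best case" holds, and the matching key can be computed directly. A small sketch, where `key_for` and `xor` are illustrative helper names and the strings are taken from the examples above:

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR; both inputs must have the same length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def key_for(ciphertext: bytes, wanted_plaintext: bytes) -> bytes:
    """Return the key under which `ciphertext` decrypts to `wanted_plaintext`."""
    return xor(ciphertext, wanted_plaintext)


ciphertext = b"sfd1!&&fd8[//zux+$-"  # 19 bytes
messages = [b"Hello, how are you?", b"I order 5 licenses."]  # 19 bytes each

for message in messages:
    key = key_for(ciphertext, message)
    # Decrypting the same ciphertext with this key gives back the message.
    print(xor(ciphertext, key))
```

Since such a key exists for every 19-byte message, the ciphertext alone cannot favour one of them over another.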

One-time pad belongs to the second kind of encryption methods. And in addition, a different key is used each time the message is sent.

If the file is originally 100 bytes long and it grows by 50 bytes each time the message is transferred, you are only able to find out that the first message is 100 bytes long, the second one 150 bytes and so on...

Even if you have the information that the first part of the unencrypted message is equal to the previous message, this information will not help you much:

With the correct "key" any text which is originally 100 bytes long and grows by 50 bytes each time will result in the encrypted messages that you have received.

(*) The term "key" seems not to be correct here. In this answer any secret information only known to the sender and/or receiver of the message is meant.

3

The beautiful thing about one-time pads is that their strength and weakness are easy to demonstrate. Suppose that you wanted to send a binary value (true/false, guilty/innocent, buy/no buy, attack/wait).

Create a pad of two values; to keep it simple, use a pair of numbers starting with zero. First flip a coin for whether heads represents true or false, then alternate from there. My first flip was tails, so tails is true. The second flip was tails again, so 0=true, 1=false. This is the first pad. I get the message true (aka 0). Now create a new pad; since you require all of the previous messages to be resent, it has to double in size. This time heads will start as true. First flip: 0=true, 1=false (same as before, what a coincidence); second flip: 2=false, 3=true. Get the message 3,1 (true, false). Prepare the pad for the third message: 0=false, 1=true, 2=false, 3=true, 4=true, 5=false. Get the message true, false, false as 1,2,5.

Now, assuming that you know all of that, for the fourth message, without the pad, what does 1,2,4,6 mean? Since you have the full history and know that it is being repeated, you know that 1=true, 2=false, 4=false, so we only have one question left to answer: does 6 equal true or false? I don’t know, I haven’t generated the pad...

jmoreno
0

Whether anything is "secure" depends upon the threat model. Simple forms of OTP are absolutely secure against eavesdropping attacks, but may be essentially worthless against known-plaintext spoofing attacks, since each piece of key needs to be used twice--once by the person encrypting the message, and once by the person decrypting it. Consequently, someone who has access to the plaintext and ciphertext forms of a message, as well as the communications channel used to transmit it, will be able to ascertain the key before it is used by the recipient.

If (for purposes of the example) a message was encrypted using a Caesar cipher OTP (add 0-25 to each letter of plaintext to get the ciphertext, wrapping around after Z), someone who knew that a message was "ATTACK AT DAWN" and knew that the last four letters were encrypted as "FRED" would be able to determine that the last four letters were shifted by 2, 17, 8, 16. If the person replaced the last four characters of the transmitted message with "PFWD", the recipient would decode the resulting message as "ATTACK AT NOON".
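The spoofing step can be sketched as follows. The assumptions are: letters A-Z only, A counted as 0, encryption by adding the shift modulo 26, and a pad drawn with `secrets.randbelow` rather than the specific shifts of the example; all function names are hypothetical.

```python
import secrets

A = ord("A")


def shift_encrypt(plaintext: str, key: list) -> str:
    """Add one shift (0-25) per letter, wrapping around after Z."""
    return "".join(chr((ord(p) - A + k) % 26 + A) for p, k in zip(plaintext, key))


def shift_decrypt(ciphertext: str, key: list) -> str:
    """Subtract the same shifts to undo shift_encrypt."""
    return "".join(chr((ord(c) - A - k) % 26 + A) for c, k in zip(ciphertext, key))


def recover_key(plaintext: str, ciphertext: str) -> list:
    """Known plaintext plus ciphertext reveals the shifts that were used."""
    return [(ord(c) - ord(p)) % 26 for p, c in zip(plaintext, ciphertext)]


# Sender and recipient share a random pad: one shift per letter.
pad = [secrets.randbelow(26) for _ in range(4)]
sent = shift_encrypt("DAWN", pad)

# The attacker knows the plaintext and sees `sent` in transit.
stolen = recover_key("DAWN", sent)
forged = shift_encrypt("NOON", stolen)

# The recipient decrypts the substituted text with the genuine pad.
received = shift_decrypt(forged, pad)
print(received)
```

Confidentiality of other messages is untouched, since the recovered shifts are never used again; the damage is that this one message can be rewritten before the recipient consumes the pad.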

Overcoming this problem in cases where bandwidth or timing restrictions would preclude the use of a Message Authentication Code (MAC) would require generating more key material, so that multiple possible keys could yield each (plaintext-ciphertext) combination. If message characters must be encrypted entirely independently (such that corruption of one character at the source would corrupt only one character decoded by the recipient), there would be no way to avoid granting an adversary the ability to corrupt any single character. If each character represents a one-of-C choice, however, adding additional key material, chosen from one of (C-1) possibilities, would suffice to prevent the adversary from having any control over which of the (C-1) alternative characters would be decoded. In cases where a transmission error would be allowed to cause more severe disruption of the received message, other forms of encryption and authentication would likely be more appropriate.

Note that depending upon the usage and threat model, the ability of an attacker to guarantee that a byte gets changed somehow may or may not be a meaningful weakness. It's important to note, however, that sometimes seemingly harmless weaknesses (e.g. the fact that Enigma could only map each letter to one of 25 others) may sometimes be exploitable in ways that may be hard to anticipate.

supercat