3

This might not be a very specific question, but I was wondering this morning.

Suppose one were to encode a message by turning an entire book into one long string and then, for each letter of the message, picking a random occurrence of that letter in the "book-string" and replacing the letter with its string index. Additionally, you could make sure no number appears twice in the encrypted message, since there is normally more than one occurrence of each letter in a book. This way no "letter" in the encrypted message would repeat.
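To make the scheme concrete, here is a minimal sketch of it in Python. The function names (`build_index`, `encrypt`, `decrypt`) are made up for illustration, and the use of `secrets.choice` for the random pick is an assumption; the description above does not say how the occurrence is chosen.

```python
import secrets
from collections import defaultdict


def build_index(book: str) -> dict[str, list[int]]:
    """Map every character of the book to the list of positions where it occurs."""
    positions = defaultdict(list)
    for i, ch in enumerate(book):
        positions[ch].append(i)
    return positions


def encrypt(message: str, book: str) -> list[int]:
    """Replace each character by a random, not yet used, index of that character."""
    positions = build_index(book)
    used: set[int] = set()
    ciphertext = []
    for ch in message:
        free = [p for p in positions.get(ch, []) if p not in used]
        if not free:
            # The book has run out of fresh occurrences of this character.
            raise ValueError(f"no unused occurrence of {ch!r} left in the book")
        pick = secrets.choice(free)
        used.add(pick)
        ciphertext.append(pick)
    return ciphertext


def decrypt(ciphertext: list[int], book: str) -> str:
    """Look each index up in the book again."""
    return "".join(book[i] for i in ciphertext)
```

Note that the "no number appears twice" rule can only be kept while the book still has unused occurrences of each letter, which is why the sketch raises an error when it runs out.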

How safe is this, assuming you passed on the key safely, and how would you go about breaking it?

kelalaka

5 Answers

5

This isn't very secure. Generally, partial knowledge of the plaintext should not lead to leakage of other parts of the plaintext. In your book cipher it clearly does.

Say we guess the first part of the message. Then we can try and see which books would be correct for the given ciphertext. After the book (the key) is found we can then decrypt the rest of the message.
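The attack just described can be sketched as follows. This is only an illustration under stated assumptions: the attacker has a collection of candidate book texts (`candidates`, a hypothetical dict of title to text) and a guess for the start of the plaintext; the helper names are invented here.

```python
def consistent(book: str, indices: list[int], guess: str) -> bool:
    """True if the first len(guess) ciphertext indices spell out the guess in this book."""
    if len(indices) < len(guess):
        return False
    return all(i < len(book) and book[i] == ch for i, ch in zip(indices, guess))


def find_books(candidates: dict[str, str], ciphertext: list[int], guess: str) -> list[str]:
    """Return the titles of all candidate books that agree with the guessed prefix."""
    return [title for title, text in candidates.items()
            if consistent(text, ciphertext, guess)]


def decrypt_with(book: str, ciphertext: list[int]) -> str:
    """Once a book survives the filter, the whole message can be read off."""
    return "".join(book[i] for i in ciphertext)
```

Each guessed character rules out most wrong books, so even a short known prefix tends to leave very few candidates.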

Maarten Bodewes
2

What you are proposing is just a homophonic substitution cipher, and it is highly insecure by modern standards.

It doesn't satisfy common security definitions, such as security against chosen-plaintext attacks. It is not semantically secure, in the sense that an attacker can easily construct two different messages whose corresponding ciphertexts can be told apart.

For instance, the letter Z is not frequently used, so how many times does Z appear in a usual book?

Thus, the message

"When I got there, he was like ZZZZZZZZZZZZZZZZZZZZZZZZZZ sleeping hard"

is much more likely to have repeated numbers in its ciphertext than

"When I got there, he was reading an old book about computer science and art."

But even if you stick with weaker security definitions, like just requiring that someone with access to "some" ciphertexts is not able to recover the plaintext, it is still not very secure, because one can use all sorts of frequency analysis against it. For instance, which are the most common 3-letter words? Maybe "the", "one", "are"... So we could try substituting them into the ciphertexts and see if it works. If some of them work, then we have already discovered some information about the plaintext and the key...
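The ZZZ example earlier in this answer comes down to counting: if a message needs a letter more often than the book contains it, some index has to be reused. A small sketch of that check, with a made-up helper name `forced_repeats`:

```python
from collections import Counter


def forced_repeats(message: str, book: str) -> dict[str, int]:
    """For each character, how many of its uses exceed the book's supply.

    A positive value means at least that many ciphertext numbers must
    repeat an earlier one (or, if the book lacks the character entirely,
    that the character cannot be encoded at all).
    """
    need = Counter(message)
    have = Counter(book)
    return {ch: n - have[ch] for ch, n in need.items() if n > have[ch]}
```

An attacker who submits one message full of a rare letter and one ordinary message can tell the two ciphertexts apart just by looking for repeated numbers.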

2

If I am to believe the estimate given at http://mentalfloss.com/article/85305/how-many-books-have-ever-been-published , a book makes for a roughly 27-bit key. That does not sound very secure.
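The arithmetic behind that figure is just a base-2 logarithm of the number of possible keys. In the sketch below the count of 130 million books is an illustrative assumption, not a number quoted from the linked article.

```python
import math


def key_bits(n_keys: int) -> float:
    """Bits of key material if the key is one of n_keys equally likely choices."""
    return math.log2(n_keys)


# Assumed, illustrative figure for the number of published books.
ASSUMED_BOOK_COUNT = 130_000_000
bits = key_bits(ASSUMED_BOOK_COUNT)  # a little under 27, since 2**27 = 134,217,728
```

This treats every book as equally likely; in practice people pick well-known books, which makes the effective key even smaller.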

0

Personally, I consider this to be secure provided it is used for only one short message; it gets less secure each time you use it. Other people have explained why it is not secure when used for many messages.

As a system for a one-off "emergency message" it has a lot going for it: no equipment is needed, and an easily accessible book can be agreed on in advance. To make it a little more secure, add a pre-agreed offset to the "index".

(It can be thought of as a "one-time pad" without the risk of someone finding the pad on you, or the problem of having to access the pad.)
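One way to read the offset idea is as a fixed shift applied to every index, wrapping around at the end of the book. The wrap-around and the function names are assumptions made for this sketch.

```python
def add_offset(indices: list[int], offset: int, book_len: int) -> list[int]:
    """Shift every index by the shared offset, wrapping at the end of the book."""
    return [(i + offset) % book_len for i in indices]


def remove_offset(shifted: list[int], offset: int, book_len: int) -> list[int]:
    """Undo the shift before looking the indices up in the book."""
    return [(i - offset) % book_len for i in shifted]
```

The receiver calls `remove_offset` with the same offset and book length, then decodes the indices as usual.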

Ian Ringrose
0
  1. Suppose that the "book" is the widely available Linux dictionary called linux.words. And further suppose that the book is put into some random sequence prior to use. The number of possible random sequences is a decimal number with more than two million decimal digits.
  2. In addition, suppose that the enciphering process takes some approach to ensure that repeated plaintext items are NOT encoded with the same cipher item.

I'd like to suggest that this sort of approach to a book cipher would be much more secure than the discussion here would seem to indicate. It doesn't even matter if the randomisation scheme uses some academically "poor" random number generator. The attacker still has a huge number of possible versions of the book to try, and that's even if the book is disclosed in advance!
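The size of the number of orderings can be checked without computing the factorial itself, using the log-gamma function. The word count of 480,000 below is an assumed round figure for the word list, used only for illustration.

```python
import math


def factorial_digits(n: int) -> int:
    """Number of decimal digits of n!, via log10(n!) = lgamma(n + 1) / ln(10)."""
    return math.floor(math.lgamma(n + 1) / math.log(10)) + 1


# Assumed, approximate number of entries in the word list.
ASSUMED_WORD_COUNT = 480_000
digits = factorial_digits(ASSUMED_WORD_COUNT)  # comfortably above two million
```

This counts all possible orderings of the list; it says nothing about how many of those orderings a particular random number generator can actually produce.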
HowieB