4

I want to encrypt data and protect its integrity and confidentiality. However, I cannot increase the length of the data.

Are there any cipher modes of operation which provide confidentiality and integrity protection without increasing the length of the ciphertext?

Ilmari Karonen
Gev_sedrakyan

2 Answers

7

If the data to protect has no built-in redundancy at all (for example, each of its bits is determined by a fair coin toss), there is no way to protect integrity without expansion. (Proof sketch: there are as many possible valid plaintexts as there are possible enciphered-and-protected values, hence every possible enciphered-and-protected value must be accepted by the verify-and-decipher operation.)
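To make the counting argument concrete, here is a toy sketch on 8-bit messages. The "cipher" is just a key-seeded shuffle of all 256 values (the `make_cipher` helper is made up for illustration and is not a real cipher); the point is only that a length-preserving bijection leaves no ciphertext that could be rejected.

```python
import random

def make_cipher(key: int, n_bits: int = 8):
    """Toy length-preserving cipher: a key-seeded permutation of all n-bit values.
    Illustration only; random.Random is not a cryptographic generator."""
    rng = random.Random(key)
    enc = list(range(2 ** n_bits))
    rng.shuffle(enc)
    dec = [0] * len(enc)
    for plaintext, ciphertext in enumerate(enc):
        dec[ciphertext] = plaintext
    return enc, dec

def accepted_ciphertexts(dec, valid_plaintexts) -> int:
    """Count ciphertexts whose decryption lands in the set of valid plaintexts."""
    return sum(1 for c in range(len(dec)) if dec[c] in valid_plaintexts)

enc, dec = make_cipher(key=1234)

# No redundancy: every 8-bit value is a valid plaintext, so every forgery is accepted.
no_redundancy = accepted_ciphertexts(dec, set(range(256)))

# With redundancy (assumed here: the top 4 bits must be zero), only 16 of the
# 256 possible ciphertexts are accepted.
with_redundancy = accepted_ciphertexts(dec, set(range(16)))
```

Because `dec` is a bijection, the number of accepted ciphertexts always equals the number of valid plaintexts, whatever the key.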

If the data to protect has some built-in redundant part (like a fixed header of 16 octets), then we can remove it, use normal authenticated encryption, and restore the redundancy after the verify-and-decipher step. Variant: if we are certain that some lossless compression scheme will shrink the data enough, we can compress the data before applying the authenticated encryption, and expand it after the verify-and-decipher step. For example, if the plaintext is 2227 symbols restricted to 246 possible values, each coded as an octet, it can be encoded as 2211 arbitrary octets by mere conversion from base 246 to base 256, which makes room for 16 octets.
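The base-246 to base-256 conversion in that example can be sketched with Python's arbitrary-precision integers. The names `pack` and `unpack` are mine; the constants come from the example above. The 16 octets freed this way would then hold the authentication tag of whatever authenticated encryption is used.

```python
SYMBOLS = 2227      # number of plaintext symbols
BASE = 246          # each symbol is in range(246)
PACKED_LEN = 2211   # octets after conversion to base 256

# The conversion only works because 246**2227 fits in 2211 octets.
assert BASE ** SYMBOLS <= 256 ** PACKED_LEN

def pack(symbols: list[int]) -> bytes:
    """Re-encode 2227 base-246 symbols as 2211 octets."""
    if len(symbols) != SYMBOLS:
        raise ValueError("wrong number of symbols")
    n = 0
    for s in symbols:
        if not 0 <= s < BASE:
            raise ValueError("symbol out of range")
        n = n * BASE + s
    return n.to_bytes(PACKED_LEN, "big")

def unpack(packed: bytes) -> list[int]:
    """Inverse of pack(): recover the 2227 base-246 symbols."""
    if len(packed) != PACKED_LEN:
        raise ValueError("wrong packed length")
    n = int.from_bytes(packed, "big")
    out = []
    for _ in range(SYMBOLS):
        n, s = divmod(n, BASE)
        out.append(s)
    if n != 0:
        raise ValueError("value too large to be a packed plaintext")
    out.reverse()
    return out

example = [i % BASE for i in range(SYMBOLS)]
packed = pack(example)
```

The fit is tight: 2210 octets would not be enough, so 16 octets is all the room this particular plaintext format gives.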

If the data to protect has some built-in redundancy that we have no way to turn into compressibility but can nevertheless characterize (like, plaintext is the result of applying Bzip2 compression to about one page of sensible English text), we have the option to use format-preserving encryption (that is, in this context, a block cipher with the block size equal to the data size, which implements a key-dependent, random-like, reversible transformation of the whole data). Any alteration of the ciphertext (other than replacing it with a known-valid ciphertext) will make the whole deciphered plaintext random, and we can reliably detect that (with a Bzip2 decompression algorithm hardened against rogue input, followed by even the most basic validation that the output contains a large-enough proportion of English words). Notice however that we lose something on confidentiality: contrary to the previous solution, the same plaintext will always be enciphered to the same ciphertext, which is sometimes unacceptable, and does not match the modern goals of a cipher.
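As a rough illustration of that behaviour, here is a toy whole-message permutation: a 4-round unbalanced Feistel network with SHAKE-256 as round function. This is my own stand-in, not a vetted format-preserving encryption scheme, and it should not be used to protect anything. The `looks_valid` check (printable ASCII only) is a deliberately crude substitute for "decompresses to English text".

```python
import hashlib

ROUNDS = 4

def _round_output(key: bytes, i: int, data: bytes, out_len: int) -> bytes:
    # Toy keyed round function: SHAKE-256 over key, round index and input half.
    return hashlib.shake_256(key + bytes([i]) + data).digest(out_len)

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _feistel(key: bytes, data: bytes, round_order) -> bytes:
    half = len(data) // 2
    left, right = data[:half], data[half:]
    for i in round_order:
        if i % 2 == 0:
            left = _xor(left, _round_output(key, i, right, len(left)))
        else:
            right = _xor(right, _round_output(key, i, left, len(right)))
    return left + right

def encipher(key: bytes, data: bytes) -> bytes:
    """Length-preserving, deterministic, key-dependent permutation (toy)."""
    return _feistel(key, data, range(ROUNDS))

def decipher(key: bytes, data: bytes) -> bytes:
    # Each round is its own inverse, so deciphering replays them in reverse order.
    return _feistel(key, data, reversed(range(ROUNDS)))

def looks_valid(data: bytes) -> bool:
    # Stand-in redundancy check: printable ASCII only.
    return all(32 <= b < 127 for b in data)

key = b"an example key, not a real one!!"
message = b"The quick brown fox jumps over the lazy dog"
ciphertext = encipher(key, message)

# Flip one bit of the ciphertext and decipher the result.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]
garbage = decipher(key, tampered)
```

Here `decipher(key, ciphertext)` gives back `message`, while `garbage` is scrambled across its whole length and fails `looks_valid`. Enciphering `message` twice under the same key gives the same `ciphertext`, which is the confidentiality cost mentioned above.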

fgrieu
1

@fgrieu gives an excellent answer.

One more option. Suppose you know that the plaintext has redundancy (due to some message formatting or something), and you know how to verify that its redundancy is correct (e.g., to check the formatting), but you don't know how to compress it or how to remove the redundancy.

Then a reasonable solution is the following:

  • Encrypt the message with a length-preserving encryption scheme such as CBC encryption with all-zeros IV and ciphertext stealing (I would not recommend using counter mode or a stream cipher).

  • Then, apply a public random permutation or a length-preserving all-or-nothing transform.

  • Finally, encrypt the result with the same length-preserving encryption scheme, but with a different independent key.

On the recipient side, you can undo this (decrypt, undo the permutation or all-or-nothing transform, decrypt), then check the formatting/redundancy. If the formatting/redundancy is valid, accept the message as valid and not tampered with.
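The order of operations on both sides can be sketched as follows. This is only a skeleton: the `Layer` type and the `protect`/`recover` names are mine, and the three layers (CBC with ciphertext stealing under two independent keys, and the public permutation or all-or-nothing transform between them) are passed in as hypothetical callables rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Transform = Callable[[bytes], bytes]

@dataclass(frozen=True)
class Layer:
    """A length-preserving, invertible step, given as a (forward, inverse) pair."""
    forward: Transform
    inverse: Transform

def protect(message: bytes, inner: Layer, mixer: Layer, outer: Layer) -> bytes:
    """Sender: encrypt (key 1), apply the public mixing step, encrypt (key 2)."""
    out = outer.forward(mixer.forward(inner.forward(message)))
    if len(out) != len(message):
        raise ValueError("layers must be length-preserving")
    return out

def recover(
    ciphertext: bytes,
    inner: Layer,
    mixer: Layer,
    outer: Layer,
    is_well_formed: Callable[[bytes], bool],
) -> Optional[bytes]:
    """Recipient: undo the layers in reverse order, then check the redundancy.
    Returns None when the formatting check fails."""
    candidate = inner.inverse(mixer.inverse(outer.inverse(ciphertext)))
    return candidate if is_well_formed(candidate) else None
```

Everything that provides security lives in the layers and in how strict `is_well_formed` is; the skeleton only fixes the order in which they are applied and undone.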

(You might even be able to skip the first step, and just apply a public random permutation followed by encryption with CBC mode. But the above might be more robust.)

Even better would be to use a variable-length length-preserving block cipher (pseudorandom permutation), if you can find one. (For instance, some schemes for format-preserving encryption might be suitable.) But the above is a pragmatic engineering solution that should be good enough if you can't find a variable-length block cipher of the right length for your messages.

I'd expect this approach will provide as much integrity as is possible, given the assumptions I outlined. (I mean this only in an engineering/heuristic sense; it is not something I have proven mathematically.) It is also pretty easy to code up and apply. Therefore, this may be an attractive approach in practice.

D.W.