10

I was wondering whether there exist algorithms/enciphering procedures that both compress and encrypt the input data. That is, for starters, the output would be both smaller in size and difficult to decrypt, and if the compression algo is good, the bits will look almost random.

Also any twin-encryption algo-s around?: by which I mean, suppose I have 2 data strings (alphanumeric only, say for now) -- Using them both, and an algo, I produce the encrypted output - I take in a pair, and produce a pair. The procedure is algo-based and not key-based. Any comments on this? And how this could relate to the earlier crypto-compression problem? (One way it would relate is, if the data is not just strings of characters, but a large multimedia file, it can be compressed fast and securely apart from being encrypted)

The FAQ says "please ask questions that can be answered, not merely discussed" but I cannot be more specific, at least not now: if it had some coding, I might have posted it on stackoverflow!

All comments, with links to other Q&A are welcome. Other forum addresses are also welcome!

pratchit

6 Answers

13

Unlike some pairs of crypto tasks, such as encryption and authentication, compression and encryption have nothing in common and no synergies, so combining them into one algorithm offers no advantage.

In practice this means you first compress your data and then encrypt it, because encrypted data is incompressible. That way the two concerns stay cleanly separated, and you can vary them independently.
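To make the ordering concrete, here is a minimal sketch in Python. The `toy_keystream_xor` function is a hypothetical stand-in for a real cipher, built from SHA-256 only so the example needs no third-party library; it is not secure and is there purely to produce random-looking output.

```python
import hashlib
import os
import zlib


def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustration only, NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


plaintext = b"the quick brown fox jumps over the lazy dog. " * 200
key = os.urandom(32)

# Recommended order: compress first, then encrypt.
compress_then_encrypt = toy_keystream_xor(key, zlib.compress(plaintext))

# Reversed order: the ciphertext looks random, so zlib finds nothing to remove.
encrypt_then_compress = zlib.compress(toy_keystream_xor(key, plaintext))

print(len(plaintext), len(compress_then_encrypt), len(encrypt_then_compress))

# Decryption undoes the two steps in reverse order.
recovered = zlib.decompress(toy_keystream_xor(key, compress_then_encrypt))
assert recovered == plaintext
```

The first output is far smaller than the plaintext, while the second is no smaller than it.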

A good place to combine them is at the protocol/file-format level. For example, TLS supports compression and encryption, as do most archive formats (zip, rar, 7z, ...).

But compression can hurt security: typical encryption is designed to hide everything about the data except its length. Compression depends on the data itself and affects the length of the data. That means it can leak information about the data through the length, bypassing the encryption. This is particularly severe if you compress data chosen by the attacker and secret data within the same context. This led to the CRIME attack against TLS compression.
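A simplified sketch of the length leak, in Python with `zlib`. The secret value and the `observed_length` helper are made up for illustration; a real attack such as CRIME refines guesses piece by piece and has to deal with much smaller length differences.

```python
import zlib

# Hypothetical secret that shares a compression context with attacker-chosen data.
SECRET = b"sessionid=4f9a1c7e2b"


def observed_length(attacker_data: bytes) -> int:
    """Size an eavesdropper would see; encryption hides the bytes but not the length."""
    return len(zlib.compress(SECRET + b"\n" + attacker_data))


# Two guesses of equal length: one matches the secret, one does not.
right = observed_length(b"sessionid=4f9a1c7e2b")
wrong = observed_length(b"sessionid=Zq8Kx0WmTp")

print(right, wrong)
```

The matching guess compresses better because the compressor can replace it with a back-reference to the secret, so the shorter length confirms the guess.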

CodesInChaos
6

Also any twin-encryption algo-s around?: by which I mean, suppose I have 2 data strings (alphanumeric only, say for now) -- Using them both, and an algo, I produce the encrypted output - I take in a pair, and produce a pair. The procedure is algo-based and not key-based.

One fundamental fact (or perhaps I should say "assumption") in cryptography is that you cannot securely encrypt data without there being some unknown secret involved. This is closely related to Kerckhoffs's principle: a system should remain secure even if everything about it except the key is public. In a scheme such as the one you propose here, any adversary with access to the algorithm could reverse the process and decrypt the data trivially, because the algorithm is simply a map between two inputs and two outputs. We have to assume that the adversary knows the algorithm (perhaps it is publicly available, perhaps they craftily reverse-engineered it, etc.), and so they will be able to perform the reverse mapping.
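As a rough illustration, here is a made-up keyless pair-to-pair transform in Python. Both function names and the transform itself are hypothetical, not a scheme from the question; the point is only that whoever knows the encoder can write the decoder.

```python
from typing import Tuple


def keyless_twin_encode(a: bytes, b: bytes) -> Tuple[bytes, bytes]:
    """Hypothetical keyless transform: takes a pair, returns a scrambled pair."""
    assert len(a) == len(b), "this toy version needs equal-length inputs"
    mixed = bytes(x ^ y for x, y in zip(a, b))
    return mixed, b[::-1]


def keyless_twin_decode(c: bytes, d: bytes) -> Tuple[bytes, bytes]:
    """Anyone who knows the algorithm can invert it; no secret is required."""
    b = d[::-1]
    a = bytes(x ^ y for x, y in zip(c, b))
    return a, b


pair = (b"attack at dawn", b"meet me at six")
scrambled = keyless_twin_encode(*pair)
recovered = keyless_twin_decode(*scrambled)
assert recovered == pair
```

However elaborate the mixing step is made, the same reasoning applies as long as no secret input enters the transform.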

B-Con
2

Although I do not have a formal proof, I think a compressing encryption algorithm could not be secure. The reason is that we know there is no way to compress every input (information-theoretically, this is impossible). That means that a lossless compression algorithm can really only compress certain input strings. That, in turn, means that if our encryption algorithm manages to compress an input, then that fact reveals information about the input.
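The impossibility claim follows from a counting argument. There are $2^n$ bit strings of length exactly $n$, but the number of strictly shorter bit strings is only

$$\sum_{k=0}^{n-1} 2^k = 2^n - 1 < 2^n .$$

A lossless compressor must be injective (two inputs cannot share an output, or decompression would be ambiguous), so it cannot map all $2^n$ inputs of length $n$ to shorter outputs; at least one input of every length fails to shrink.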

Guut Boy
2

Yes! There exist some examples. But whether they are secure or not is another question. In my opinion, there should not be any problem with the idea of crypto-compression**.

Examples of crypto-compression schemes are Secure Arithmetic Coding, Secure Huffman Coding, Secure Adaptive Huffman Coding, and so on. However, as far as I know, most of these schemes either lack the properties of a good compressor or are not secure enough for practical purposes. Of course, there may be examples that I am not aware of.

These methods mostly aim to increase performance by combining the two blocks of "Source Coding" and "Encryption". It is worth mentioning that there exists a different but successful attempt at combining the blocks of "Encryption" and "Channel Coding". The best classic example is the McEliece cryptosystem.

[Figure: Block Diagram of a Communication System]

**I will try to come up with some "more" acceptable reasons later :)

Glorfindel
Habib
1

I think a combined algorithm might be less secure. Imagine compressing a long string of 111… with an LZW algorithm, which would reduce it to a very short message, and then encrypting it. The reduction in size would leak information about the nature of the plaintext.
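A minimal LZW encoder sketch in Python shows the effect. It assumes a textbook byte-oriented LZW with an unbounded dictionary and returns the list of output codes rather than a packed bit stream, so the code count stands in for the message size.

```python
from typing import List


def lzw_encode(data: bytes) -> List[int]:
    """Textbook LZW: emit one dictionary code per longest already-seen phrase."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new phrase
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


ones = b"1" * 1000
varied = bytes(i % 251 for i in range(1000))

ones_codes = lzw_encode(ones)
varied_codes = lzw_encode(varied)
print(len(ones_codes), len(varied_codes))
```

Both inputs are 1000 bytes long, yet the run of ones needs only a few dozen codes. An observer who sees only the ciphertext length can therefore tell a highly repetitive plaintext from a varied one.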

Mike Edward Moras
ddyer
0

Compressocrat

The Compressocrat cipher was described by SHMOO (Larry Loen) in the May-June 1983 issue of The Cryptogram magazine. An interesting feature of this cipher is that the ciphertext is usually shorter than the plaintext.

-- https://sites.google.com/site/bionspot/compressocrat_cipher

Like every practical combination of compression and encryption I've ever seen, Compressocrat first compresses (in this case, a trinary variant of static Huffman compression intended for English plaintext letter-frequencies and hand-encryption) and then encrypts (in this case, using a simple substitution cipher which is known to be fairly easy to implement by hand).

David Cary