Would a stream cipher gain any benefit from a more complicated function than XOR?

Question

In a stream cipher, the keystream is usually XORed with the plaintext, which is a 1-bit key-dependent bijective operation (ie, if the key bit is 0, 0->0 and 1->1, whereas if it is 1, 0->1 and 1->0). I have seen, eg, bytewise addition discussed here (Can I safely replace XOR with ADD in a stream cipher?) as having minimal security benefit ("If your stream cipher is so broken that this makes a difference, run." – CodesInChaos) and sounds like it would still be nearly as vulnerable to a malleability attack or keystream-reuse attack, but what about a more non-linear bijective function?

EDIT: This is a question about a hypothetical science-fiction encryption scheme and whether it could have advantages in principle, not about why this has not heretofore been done in real life. This is also not a question about whether XOR is "good enough" as long as the cryptosystem is always properly implemented and used (which I admit it is), but about whether this scheme fixes known vulnerabilities that XOR has when a cryptosystem is, as inevitably happens, sometimes not properly implemented and used. END EDIT

For example, multiplication mod 257 or 65537 (with the all-0s value taken as -1 as in IDEA), where 8/16 bits of plaintext is turned into 8/16 bits of ciphertext dependent on 8/16 bits of keystream? Would that get rid of malleability or would it replace it with multiplicative malleability? And would it still be trivial to break with a reused keystream?

EDIT: Okay, I think this one is a bust on further reflection. Instead of P1 XOR P2, you get P1*(P2^-1), which I assume is almost as easy to break (P1 and P2 being the two plaintexts encrypted with the same key), and the malleability becomes multiplying the ciphertext by Pa*(P1^-1) to get P1*K1*Pa*(P1^-1) = Pa*K1 (where P1 is the original plaintext and Pa is the attacker's desired decryption). And I think this applies to any commutative arithmetic function. END EDIT
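A minimal Python sketch of the multiplicative case may make this concrete. It assumes the mod-257 combiner with byte value 0 standing for 256 (the IDEA-style convention mentioned above). The function names (`mul_combine`, `forge`, `reuse_leak`) are made up for illustration, and `pow(x, -1, m)` needs Python 3.8 or later.

```python
# Sketch only: multiplication mod 257 as the plaintext/keystream combiner.
# Byte value 0 stands for 256 (i.e. -1 mod 257), so every byte is invertible.
M = 257

def to_unit(b):
    """Map a byte 0..255 to a nonzero residue 1..256 (0 represents 256)."""
    return 256 if b == 0 else b

def from_unit(u):
    """Map a nonzero residue 1..256 back to a byte 0..255."""
    return 0 if u == 256 else u

def mul_combine(p, k):
    """Encrypt one byte: c = p * k mod 257."""
    return from_unit(to_unit(p) * to_unit(k) % M)

def mul_uncombine(c, k):
    """Decrypt one byte: p = c * k^-1 mod 257."""
    return from_unit(to_unit(c) * pow(to_unit(k), -1, M) % M)

def forge(c, known_p, wanted_p):
    """Malleability: turn an encryption of known_p into one of wanted_p
    by multiplying with wanted_p * known_p^-1. No key is needed."""
    factor = to_unit(wanted_p) * pow(to_unit(known_p), -1, M) % M
    return from_unit(to_unit(c) * factor % M)

def reuse_leak(c1, c2):
    """Keystream reuse: c1 * c2^-1 equals p1 * p2^-1, the key cancels out."""
    return to_unit(c1) * pow(to_unit(c2), -1, M) % M
```

For example, `mul_uncombine(forge(mul_combine(p, k), p, pa), k)` gives back `pa` for any bytes `p`, `k`, `pa`, which is the multiplicative analogue of flipping bits under XOR.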

Or, more generally, an S-box-like construction where n bits of keystream select between 2^n bijective n-bit transforms (like the DES S-boxes where each row was required to be bijective, per criterion S-3 here)? Does that get rid of the malleability attack entirely, as long as the box is well-designed? And would it make the key-reuse attack harder? (I am assuming that the box contents are known to the attacker and all of the security is in the keystream bits)

And would either idea introduce any obvious vulnerabilities? (This question concerns science fiction I am writing, so you can assume it's implemented without side-channel attacks)

EDIT: Also assume that it can be implemented very cheaply in time (hardware/gate count is of secondary importance; envisage military applications by a nation that can mass-produce and custom-design VLSI chips). On purpose-built hardware, an entire AES round, including both of the arithmetical functions, can be done in 10 cycles (or possibly the whole AES encryption, it isn't clear), so assume 5 clock cycles at most, 1 cycle if it's implemented as a hardcoded lookup table (or pair of tables, 1 for encryption, 1 for decryption). The reference cipher I am using for performance is Trivium, which is parallelisable to 64 bits per clock cycle, meaning 16 copies of a four-bit transform unit, 8 copies of an 8-bit unit, etc if it costs 1 cycle per transform or the transform can be efficiently pipelined. END EDIT

Daniel S · Accepted Answer · 2024-09-29T12:34:42.253

I think that the most generic mathematical combiner of key and plaintext in a character-preserving manner would be a quasi-group. This is similar to a group, but does not necessarily have associativity or an identity. We'd need the Latin square (bijective) property in order to decrypt uniquely and to preserve character sets.

Now, groups allow malleability, for if a character were encoded $c_i=p_i*k_i$, we could swap for an encryption of $p_i'$ with the cipher character $c'_i=p'_i*(p_i^{-1}*c_i)=p'_i*k_i$. Attacks for exploiting key reuse would be similarly possible.

For quasi-groups, the non-associative property means the above does not necessarily work, because it is not necessarily true that $p_i^{-1}*(p_i*k_i)=(p_i^{-1}*p_i)*k_i$ (indeed the inverses are not defined in the absence of an identity). However, for small character sets, the quasi-group operation can be written as a lookup table; an attacker who knows $p_i$ can identify $k_i$ from $p_i*k_i$ and the lookup table, and then compute $p'_i*k_i$. For larger character set sizes, I'm not sure how one might implement decryption based on the non-associative structure.
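The table-lookup attack for small character sets can be sketched in Python as follows. The quasi-group here is an assumption chosen for illustration: a Latin square built by relabelling addition mod 16 with three seeded random permutations, which is in general not associative. The names `make_quasigroup`, `recover_key` and `forge` are hypothetical.

```python
import random

N = 16  # hypothetical 4-bit character set

def make_quasigroup(n, seed=0):
    """Build an n x n Latin square as table[p][k] = sigma[(alpha[p] + beta[k]) % n].
    Every row and every column is a permutation of 0..n-1."""
    rng = random.Random(seed)
    alpha, beta, sigma = (rng.sample(range(n), n) for _ in range(3))
    return [[sigma[(alpha[p] + beta[k]) % n] for k in range(n)] for p in range(n)]

TABLE = make_quasigroup(N)  # assumed public, as in the question

def encrypt(table, p, k):
    """c = p * k, read straight from the table."""
    return table[p][k]

def decrypt(table, c, k):
    """Column k is a permutation, so exactly one row p yields c."""
    return next(p for p in range(len(table)) if table[p][k] == c)

def recover_key(table, known_p, c):
    """Row known_p is a permutation, so c pins down the key symbol uniquely."""
    return table[known_p].index(c)

def forge(table, c, known_p, wanted_p):
    """Known-plaintext malleability: recover k from the table, re-encrypt wanted_p."""
    k = recover_key(table, known_p, c)
    return table[wanted_p][k]
```

Non-associativity does not help in this sketch: with a public table and a known plaintext symbol, `recover_key` reads the keystream symbol off one row, and `forge` produces a valid ciphertext for any chosen symbol.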

score 0 · Answer 2 · answered Oct 01 '24 at 18:53

In short, no.

Consider a stream cipher to be an analogue of a one-time pad. You're combining a signal (the plaintext) with noise, and using the information-theoretic value of the noise to blur the signal to indecipherability. The difference formally is that you're seeding a PRNG with a key, and you can even think of this as a form of compression. You're trading easy transport of your randomness for some loss of searching.

If we consider a character-level mix, addition works fine. Your plaintext is an A, the stream character is E; A is 1, E is 5, the result is 6 or F, and poof, Alice is your auntie.

Most stream ciphers actually work at the bit level -- LFSRs literally work that way, and it's trivial to see how counter-mode works at a bit level. We typically work on groups of eight bits, but that's not necessary. You can encrypt nine bits with a stream cipher. While we're at it, I'll note that most hash functions also work at the bit level. It was in fact one of the requirements of SHA-3. (Full disclosure: I was one of the Skein/Threefish authors.) This is the difference between a stream cipher and a block cipher: a stream cipher (usually) works on bits, and block ciphers work on chunks of bits we call blocks. You could have a block cipher of seventeen bits if you wanted. (And yes, further into the minutiae, that means that you can consider some stream ciphers like RC4 to be an eight-bit block cipher with a built-in chaining mode. It all depends on which lenses you like to view the world through.)

At the bit level, XOR is addition. Even in blocks of bits (like words), an XOR is addition without carry of columns of bits. You can view a generic XOR of a word size to be a whole bunch of bit additions done in parallel.
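A short Python sketch of that claim, assuming an 8-bit word (the helper name `add_without_carry` is made up for illustration): adding each bit column mod 2 and discarding the carries gives the same result as XOR on the whole word.

```python
def add_without_carry(a, b, width=8):
    """Add two words column by column, keeping each column's sum mod 2
    and throwing the carry away instead of propagating it."""
    result = 0
    for i in range(width):
        column_sum = ((a >> i) & 1) + ((b >> i) & 1)  # 0, 1 or 2
        result |= (column_sum % 2) << i               # drop the carry
    return result

def carries(a, b):
    """The carries that ordinary addition would add back in: a + b == (a ^ b) + carries(a, b)."""
    return 2 * (a & b)
```

So `add_without_carry(a, b)` agrees with `a ^ b` for every pair of bytes, and ordinary integer addition differs from XOR only by the carry term `2 * (a & b)`.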

You can make a more complex function than XOR that satisfies our requirement -- that it preserves the information-theoretic mixing of a signal and noise. You just aren't going to be better in the sense of more secure, and you're going to be slower. Or not faster, anyway. Typically, both add and xor take the same amount of time, a single clock (or microclock) cycle.

So there's no benefit because xor is good enough; on an information theoretic level you can't get better; xor works on single bits as well as groups; thus at the bit-level, xor and add are the same (otherwise it's an alternative); and you aren't going to find something else that's faster.
