Combining LFSRs for Stream Ciphers: Why do we need high non-linearity?

Question

Linear Feedback Shift Registers (LFSRs) can be excellent (efficient, fast, and with good statistical properties) pseudo-random generators. Many stream ciphers are based on LFSRs, and one possible design for such stream ciphers is to feed the outputs of $m$ LFSRs into a boolean function $f:GF(2)^m\rightarrow GF(2)$. This function has to be carefully selected.

My question is a rather elementary one. I understand that using one LFSR to produce the keystream is not appropriate, as one can reconstruct the whole keystream by knowing a tiny fraction of it: if the tap positions of a length $n$ LFSR are known, one needs $n$ bits to determine the entire keystream sequence, and if they are not known, one needs $2n$ bits (by using the Berlekamp-Massey algorithm to find out the tap positions). However, why do we need a non-linear combination of LFSRs (among all sorts of other requirements)? What would be the problem with taking a number of LFSRs with appropriate lengths and tap positions and XORing their outputs together to produce the keystream?

Dilip Sarwate · Answer 1 · 2013-05-11T19:39:57.140

The Berlekamp-Massey algorithm is an iterative method for finding the shortest LFSR that can generate a given sequence of bits. The given sequence might or might not be generated by an LFSR: the Berlekamp-Massey algorithm does not care. It just finds the shortest LFSR that can generate the given sequence, and if the sequence has been generated by an LFSR of length $n$, then the Berlekamp-Massey algorithm is guaranteed to find this LFSR after examining no more than $2n$ bits of the sequence. A simplistic description of what happens is as follows. After the algorithm has found the shortest LFSR that generates the first $k$ bits of the sequence, it examines the $(k+1)$-th bit of the sequence. If this $(k+1)$-th bit of the sequence matches the $(k+1)$-th bit of the output of the current LFSR, the LFSR is accepted as the one that generates the first $k+1$ bits. If not, the LFSR is updated so that the new, typically longer, LFSR generates the first $k+1$ bits. As stated earlier, if the sequence in question was in fact generated by an LFSR of length $n$, then the Berlekamp-Massey algorithm is guaranteed to find this LFSR by the time it has examined $2n$ bits of the sequence. How does the algorithm know that it is done? Well, it doesn't, but after the correct LFSR has been found, the $(2n+1)$-th, the $(2n+2)$-th, the $(2n+3)$-th, $\ldots$ bits of the given sequence match the corresponding outputs of the LFSR and so the Berlekamp-Massey algorithm does not update the $n$-bit LFSR it has found.
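
To make the iteration described above concrete, here is a minimal Python sketch of the Berlekamp-Massey algorithm over $GF(2)$. The function names `lfsr` and `berlekamp_massey`, and the convention that `taps` lists the delays $j$ in the recurrence $s_i = \bigoplus_j s_{i-j}$, are choices made for this illustration only. The example register (length $4$, recurrence $s_i = s_{i-1} \oplus s_{i-4}$) is likewise just an assumed test case.

```python
def lfsr(taps, state, nbits):
    """Generate nbits of the sequence s[i] = XOR of s[i - j] for j in taps,
    starting from the initial bits given in `state`."""
    s = list(state)
    while len(s) < nbits:
        bit = 0
        for j in taps:
            bit ^= s[len(s) - j]
        s.append(bit)
    return s[:nbits]


def berlekamp_massey(bits):
    """Return (L, c) for the shortest LFSR generating `bits` over GF(2).
    L is the LFSR length; c[0..L] are the coefficients of the connection
    polynomial, meaning s[i] = XOR of c[j] & s[i - j] for j = 1..L."""
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # polynomial saved at the last length change
    L, m = 0, -1        # current length, index of the last length change
    for i in range(n):
        # Discrepancy: does the current LFSR predict bit i correctly?
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            # Mismatch: correct c using the saved polynomial, shifted.
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                # The LFSR has to grow to explain the first i + 1 bits.
                L, m, b = i + 1 - L, i, t
    return L, c[:L + 1]


# A length-4 LFSR is recovered from 2 * 4 = 8 output bits.
stream = lfsr([1, 4], [1, 0, 0, 0], 8)
L, c = berlekamp_massey(stream)
print(L, c)  # 4 [1, 1, 0, 0, 1]
```

Feeding in more than $2n$ bits of the same sequence leaves the result unchanged, since no further discrepancies occur.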

What does all this have to do with the question asked? Well, the (bit-by-bit XOR) sum of the outputs of the various LFSRs is a sequence that is generated by a longer LFSR (typically, the length of the longer LFSR is the sum of the lengths of the LFSRs whose outputs were summed). So, the cryptographic security is not significantly larger. What is needed is some way of combining the constituent LFSR outputs so that the resulting sequence has linear complexity much larger than the sum of the LFSR lengths. The linear complexity of a sequence is defined as the length of the shortest LFSR that can generate the sequence. What we want is a sequence that has high linear complexity but which can be generated easily as a nonlinear function of the outputs of short LFSRs. The legitimate users of the system can encipher and decipher easily, but a cryptanalyst attempting to break the system via a known plaintext attack has to either figure out the nonlinear function (and the constituent LFSRs), which is not easy to do, or attempt a Berlekamp-Massey algorithm attack, which may fail because not enough bits of the sequence can be determined via a known plaintext attack to find the shortest LFSR that generates the sequence.
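
As a small illustration of the point about linear complexity, the following Python sketch compares the XOR of two short LFSR outputs with their bitwise AND. The two registers (lengths $3$ and $4$), the helper names `lfsr` and `linear_complexity`, and the use of AND as the nonlinear function are all assumptions made for this toy example. A bare AND is biased and would not be a sensible combiner in a real cipher; it is used here only because it is the simplest nonlinear function.

```python
def lfsr(taps, state, nbits):
    """Generate nbits of the sequence s[i] = XOR of s[i - j] for j in taps,
    starting from the initial bits given in `state`."""
    s = list(state)
    while len(s) < nbits:
        bit = 0
        for j in taps:
            bit ^= s[len(s) - j]
        s.append(bit)
    return s[:nbits]


def linear_complexity(bits):
    """Berlekamp-Massey over GF(2), returning only the length of the
    shortest LFSR that generates `bits`."""
    n = len(bits)
    c = [1] + [0] * n   # current connection polynomial
    b = [1] + [0] * n   # polynomial saved at the last length change
    L, m = 0, -1
    for i in range(n):
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            t = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:
                L, m, b = i + 1 - L, i, t
    return L


# Two short maximal-length LFSRs (assumed example registers).
s1 = lfsr([1, 3], [1, 0, 0], 60)     # length 3, period 7
s2 = lfsr([1, 4], [1, 0, 0, 0], 60)  # length 4, period 15

xor_stream = [x ^ y for x, y in zip(s1, s2)]
and_stream = [x & y for x, y in zip(s1, s2)]

print(linear_complexity(xor_stream))  # 7  = 3 + 4
print(linear_complexity(and_stream))  # 12 = 3 * 4
```

Even in this tiny case the nonlinear combination pushes the linear complexity from the sum of the lengths to their product, which is the effect one is after with much longer registers.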

D.W. · Accepted Answer · 2013-05-12T04:28:02.580

If there were no non-linearity, then every bit of keystream output would be a (known) linear function of the unknown key bits. Consequently, in a known-plaintext attack scenario, each bit of known keystream output would allow us to write a linear equation in the unknown key bits. If we have a 128-bit key, there are 128 boolean unknowns (variables), so once we have 128 bits of known keystream, we have 128 linear equations in 128 unknowns. At that point it becomes easy to solve for the original key bits using standard methods for solving a system of linear equations (e.g., Gaussian elimination). Thus, an attacker could recover the key from 128 bits of known output from the stream cipher, which is a total break of the stream cipher.
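
Here is a toy version of that attack in Python, offered as a sketch rather than a reference implementation. It assumes a 7-bit key that is split into the initial states of two LFSRs (lengths $3$ and $4$) with known taps, and a keystream that is the XOR of their outputs. The helper names `lfsr` and `solve_gf2` and all parameters are made up for the illustration. The trick is to run the same LFSRs on symbols: each cell holds a bitmask recording which key bits are XORed together, so every keystream bit yields one linear equation.

```python
def lfsr(taps, state, nbits):
    """Generate nbits values with s[i] = XOR of s[i - j] for j in taps.
    Works on plain bits and also on integer bitmasks (symbolic mode)."""
    s = list(state)
    while len(s) < nbits:
        v = 0
        for j in taps:
            v ^= s[len(s) - j]
        s.append(v)
    return s[:nbits]


def solve_gf2(masks, rhs, nvars):
    """Gauss-Jordan elimination over GF(2). masks[i] is a bitmask of the
    unknowns appearing in equation i, rhs[i] is its known right-hand side.
    Returns the unique solution as a list of bits, or None if the system
    does not have full rank."""
    eqs = [[m, v] for m, v in zip(masks, rhs)]
    row = 0
    for col in range(nvars):
        pivot = next((r for r in range(row, len(eqs))
                      if (eqs[r][0] >> col) & 1), None)
        if pivot is None:
            return None  # rank-deficient: more keystream would be needed
        eqs[row], eqs[pivot] = eqs[pivot], eqs[row]
        for r in range(len(eqs)):
            if r != row and (eqs[r][0] >> col) & 1:
                eqs[r][0] ^= eqs[row][0]
                eqs[r][1] ^= eqs[row][1]
        row += 1
    return [eqs[k][1] for k in range(nvars)]


# The secret key: 3 bits of state for LFSR A, 4 bits for LFSR B.
secret = [1, 0, 1, 1, 1, 0, 1]
keystream = [x ^ y for x, y in zip(lfsr([1, 3], secret[:3], 7),
                                   lfsr([1, 4], secret[3:], 7))]

# Attacker's side: the same registers run on symbols. Unknown k is
# represented by the bitmask 1 << k.
sym_a = lfsr([1, 3], [1 << k for k in range(3)], 7)
sym_b = lfsr([1, 4], [1 << k for k in range(3, 7)], 7)
equations = [ma ^ mb for ma, mb in zip(sym_a, sym_b)]

recovered = solve_gf2(equations, keystream, 7)
print(recovered == secret)  # True
```

With a 128-bit key the same procedure applies unchanged, just with 128 unknowns and 128 equations.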

The only way to prevent this kind of attack is to make sure that the cipher contains non-linear elements. To prevent other related but fancier attacks (e.g., linear cryptanalysis), one also needs sufficient non-linearity in the stream cipher.

Clarification: To keep it simple, my answer above assumes that the feedback polynomial for the LFSRs is known. The attack does generalize to the case where the feedback polynomials are not known (you need twice as much known keystream output); in that case, the attack gets a bit more complicated, but the basic idea still applies. I tried to keep it simple to help you understand the intuition without getting bogged down in mathematics, but if you want to see more details about the case where the feedback polynomials are not known, Dilip Sarwate has an excellent answer that explains that case more thoroughly.
