What is the size of a 1024-bit RSA cipher compared to its plaintext?

Question

Let's assume we encrypt a 50000 bit plaintext with 1024-bit RSA and public exponent $e$ = 3: How big will its cipher be?

When we increase the exponent to let's say $e$ = 2¹⁶ + 1 = 65537, how big will its cipher be then?

score 6 · Answer 1 · edited Oct 07 '21 at 07:59

RSA is almost always used in hybrid mode, where AES (or another symmetric cipher) is used to encrypt the data itself, and RSA is then used to encrypt the random data key. That way RSA has only a static overhead: the modulus size (which is also the key size) in bytes. So for RSA-1024 that would mean an overhead of 128 bytes + whatever overhead is required for the symmetric cipher (which can be zero bytes if a stream cipher or stream cipher mode such as counter mode is used). In that case you'd have $50000 / 8 + 1024 / 8 = 6250 + 128 = 6378$ bytes if I'm not mistaken.

Using unpadded / raw or textbook RSA (i.e. RSA using only modular exponentiation) is insecure. So you always need to pad the plaintext message within RSA. In that case you would for instance use RSA-OAEP as defined in the later PKCS#1 standards. This padding scheme adds quite a lot of overhead. Generally we don't care about that, because there is plenty of room left over to encrypt a symmetric key when using hybrid encryption. However, if you used multiple RSA encryptions in sequence, then each 128-byte block would carry an overhead of 42 bytes and a payload of only 86 bytes, assuming a SHA-1 hash within OAEP for minimum overhead. A single encrypted partial message would still be 128 bytes. So you would have $\big\lceil 6250 / 86 \big\rceil \cdot 128 = 73 \cdot 128 = 9344$ bytes taken, an increase of $2966$ bytes over the hybrid scheme (!)
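The two estimates above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not actual encryption: the function names are made up for illustration, the symmetric cipher is assumed to add zero bytes, and OAEP is assumed to use SHA-1 (20-byte hash), as in the text.

```python
import math

def hybrid_size(msg_bytes, modulus_bits, sym_overhead=0):
    """Hybrid scheme: symmetric ciphertext plus one RSA block for the data key."""
    return msg_bytes + sym_overhead + modulus_bits // 8

def chunked_oaep_size(msg_bytes, modulus_bits, hash_len=20):
    """Message split into OAEP-padded RSA blocks, each as large as the modulus."""
    block = modulus_bits // 8            # 128 bytes for RSA-1024
    payload = block - 2 * hash_len - 2   # OAEP overhead is 2 * hLen + 2 bytes
    return math.ceil(msg_bytes / payload) * block

msg_bytes = 50000 // 8                   # 6250 bytes
hybrid = hybrid_size(msg_bytes, 1024)
chunked = chunked_oaep_size(msg_bytes, 1024)
print(hybrid, chunked, chunked - hybrid)  # prints 6378 9344 2966
```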

A few notes to these calculations:

RSA as specified in PKCS#1 always sets the output of encryption to be the modulus size in bytes, even if the actual number is smaller. That way the size is always static and doesn't need to be indicated (unless the key size is not known in advance). If you would allow alternating sizes then you would need to indicate the output size of each encryption, or you would not be able to separate the resulting RSA ciphertext blocks.
RSA-PKCS#1 v1.5 padding can also be used. It is an older, less secure scheme, but it has somewhat less overhead (at least 11 bytes per block) in the non-hybrid scheme.
RSA-KEM is another scheme for hybrid encryption. It doesn't encrypt a symmetric key; it encrypts a master secret used to derive a symmetric key and is arguably more secure. It doesn't add any overhead to hybrid encryption. It cannot be used to encrypt the message directly or in parts.
Message integrity / authenticity is not taken into account in the above. If that's required, we generally use a sign-then-encrypt scheme, which increases the size of the plaintext message before encryption.
Generally other overhead is also present. For instance, CMS based encryption also indicates certificates and algorithms used to the receiver. So you cannot generally expect the plaintext message to expand only with regard to the RSA key size.
RSA-1024 has already been deprecated by NIST (for medium and long term encryption) and is generally considered too small. Note that increasing the key size actually has positive effects on the relative amount of overhead required (when concatenating RSA encryption of partial messages); a larger RSA key size would actually be beneficial w.r.t. size. Unfortunately, that comes at a price of ever increasing CPU requirements, especially for RSA decryption and (one time) key generation.
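To illustrate the last note, here is a rough sketch of how much of each RSA block is actual payload when partial messages are OAEP-encrypted one after another. The helper name is hypothetical, and SHA-1 (20-byte hash) is assumed as above; a larger hash would lower all three fractions.

```python
def oaep_payload_fraction(modulus_bits, hash_len=20):
    """Fraction of each RSA block that carries message bytes under OAEP."""
    block = modulus_bits // 8
    payload = block - 2 * hash_len - 2   # 2 * hLen + 2 bytes of padding
    return payload / block

fractions = {bits: oaep_payload_fraction(bits) for bits in (1024, 2048, 4096)}
for bits, frac in fractions.items():
    print(f"RSA-{bits}: {frac:.1%} of each block is payload")
```

The fraction grows from 86/128 for RSA-1024 to 214/256 for RSA-2048 and 470/512 for RSA-4096, which is the positive effect on relative overhead mentioned above.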

Marcus · Answer 2 · 2020-10-04T19:03:45.313

This depends on the exponent $e$ and the modulus $N$ which you are using.

In layman's terms "something to the power of 3 is always smaller than something to the power of 65537", for instance:

$x^3 < x^{2^{16}+1}$, for $x ∈ ℝ$ with $x > 1$

Or in general:

$x^e > x^{e-y}$, with $y > 0$ and $x > 1$

It gets more complicated with the modulus due to its cyclic nature; however, the maximum value of a remainder can be bigger if the modulus is bigger:

$\max(x \bmod N) > \max(x \bmod (N - n))$, with $n > 0$

That being said, a maximum size of the cipher can be calculated; for more information see Maarten's answer. In general, the ciphertext uses very nearly as many bits as there are in the public modulus $N$, provided RSA is used in a safe way.

This means that the cipher's length usually (with common RSA implementations) does not depend on the size of the exponent $e$, simply because $M^e$ is large enough to be reduced modulo $N$. However, without proper padding, a small $e$ applied to a small $M$ fails to produce a power $M^e$ large enough to be reduced modulo $N$. Let's have an example (calculated here):

Choose $e$ = 3
Choose an $N$ for which $ϕ(N)$ is coprime with $e$, which means an $N$ that is the product of two large primes $p$ and $q$ with gcd($p$ - 1, $e$) = 1 and gcd($q$ - 1, $e$) = 1, e.g. $p$ = 134113233377 and $q$ = 171012421319 (from this list) ==> $N$ = $p$ x $q$ = 22935028770720897164263 (75 bit)
Calculate that $d$ = 15290019180277181006379 (74 bit)
Enter the following numbers as plaintext $M$ and compare them to the resulting cipher $C$:

   M: 1             -> C: 1                        (1 bit)
   M: 13            -> C: 2197                     (12 bit)
   M: 134           -> C: 2406104                  (22 bit)
   M: 1349          -> C: 2454911549               (32 bit)
   M: 13497         -> C: 2458735114473            (42 bit)
   M: 134975        -> C: 2459008378109375         (52 bit)
   M: 1349752       -> C: 2459019309075947008      (62 bit)
   M: 13497527      -> C: 2459023134921900302183   (72 bit)
   M: 134975276     -> C: 4975384384602091248435   (73 bit)
   M: 1349752761    -> C: 21423635623920893065273  (75 bit)
   M: 13497527614   -> C: 4504951087215542921902   (72 bit)
   M: 134975276143  -> C: 13105173284468409708818  (74 bit)
   M: 1349752761432 -> C: 258234696569487676944    (68 bit)

As you can see, the size of the cipher tops out at 75 bit: $C$ is reduced modulo $N$, so it is always smaller than $N$ and can never be longer than $N$.
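The rows above can be regenerated with Python's built-in three-argument `pow` and `int.bit_length()`. This is a minimal sketch of textbook (unpadded) RSA using the toy $p$, $q$ and $e$ from the example; the variable names are my own.

```python
p = 134113233377
q = 171012421319
N = p * q                      # 22935028770720897164263
e = 3

# The plaintexts are the growing prefixes 1, 13, 134, ... of 1349752761432.
digits = "1349752761432"
messages = [int(digits[:i]) for i in range(1, len(digits) + 1)]

# Textbook RSA encryption: C = M^e mod N.
rows = [(M, pow(M, e, N)) for M in messages]

for M, C in rows:
    print(f"M: {M:<13} -> C: {C:<24} ({C.bit_length()} bit)")
print("N has", N.bit_length(), "bits")
```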

It's pretty obvious that this encryption is completely insufficient, not only because the cipher is too short for $M$ = 1349752 and below, but even more so because the ciphers for the $M$'s 134, 1349, 13497 up to 13497527 all start with the digits "24" (those for the $M$'s 134975, 1349752 and 13497527 even all start with "24590"). For these small $M$, $M^3$ is smaller than $N$, so no modular reduction takes place and $C$ is simply the cube of $M$.
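The weakness can be made concrete: as long as $M^3 < N$, anyone can recover $M$ from $C$ with an ordinary integer cube root, without knowing $d$. Below is a small sketch under the same toy parameters; `integer_cube_root` is a helper written for this illustration, not a library function.

```python
def integer_cube_root(n):
    """Largest integer r with r**3 <= n, found by binary search."""
    lo, hi = 0, 1
    while hi ** 3 <= n:        # grow hi until hi**3 > n
        hi *= 2
    while hi - lo > 1:         # invariant: lo**3 <= n < hi**3
        mid = (lo + hi) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid
    return lo

N = 134113233377 * 171012421319
e = 3

M_small = 13497527                     # M^3 < N: no modular reduction happens
recovered_small = integer_cube_root(pow(M_small, e, N))

M_large = 134975276                    # M^3 > N: the cube wraps around N
recovered_large = integer_cube_root(pow(M_large, e, N))

print(recovered_small == M_small)      # the cube root reveals the plaintext
print(recovered_large == M_large)      # a plain cube root no longer works
```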

Let's do the same with another $e$:

Choose $e$ = 65537 (instead of 3)
Calculate that $d$ = 4510925444415510242433 (72 bit)
Enter the following numbers as plaintext $M$ and compare them to the resulting cipher $C$:

   M: 1             -> C: 1                        (1 bit)
   M: 13            -> C: 17466161323880056389598  (74 bit)
   M: 134           -> C: 2107714247256743075865   (71 bit)
   M: 1349          -> C: 7477203662088274241639   (73 bit)
   M: 13497         -> C: 5132009836650541594940   (73 bit)
   M: 134975        -> C: 16541984621407927196414  (74 bit)
   M: 1349752       -> C: 20887420686729795448028  (75 bit)
   M: 13497527      -> C: 21682424773647631361120  (75 bit)
   M: 134975276     -> C: 3676623109854753818222   (72 bit)
   M: 1349752761    -> C: 22872817161688280222695  (75 bit)
   M: 13497527614   -> C: 18762631911648547002249  (74 bit)
   M: 134975276143  -> C: 21146132359162765255647  (75 bit)
   M: 1349752761432 -> C: 14030823333728076106071  (74 bit)

Again the size tops out at 75 bit: the cipher is always smaller than $N$, so it can never be longer than $N$.

In this example, apart from the trivial $M$ = 1, the cipher's size is always within a few bits of the modulus size and the encryption looks random as well, so $e$ is chosen sufficiently large. What it also shows is that $d$ is no measure for the maximal length of the cipher; only the modulus $N$ sets this maximal length.
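The contrast between the two exponents can be checked side by side for a few small plaintexts. Again this is only a sketch of unpadded RSA with the toy modulus from the example; the names are chosen for illustration.

```python
N = 134113233377 * 171012421319

comparison = []
for M in (13, 134, 1349, 13497):
    c_small = pow(M, 3, N)       # M^3 < N here, so this is just the plain cube
    c_large = pow(M, 65537, N)   # M^65537 is far larger than N and wraps around
    comparison.append((M, c_small, c_large))
    print(f"M: {M:<6} e=3: {c_small.bit_length():>2} bit   "
          f"e=65537: {c_large.bit_length():>2} bit")
```

With $e$ = 3 the bit length simply grows to about three times that of $M$, while with $e$ = 65537 every result has been reduced modulo $N$ and is bounded only by the size of $N$.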

For further explanations of the problems with a low $e$, have a look at this answer, which refers to this paper. Basically it says that a low $e$ allows the reconstruction of the private key $d$ if some bits of $d$ are leaked. So even with proper random padding, $e$ = 3 should probably be avoided in RSA implementations (there are further problems with a low $e$, e.g. when the same message is encrypted under three distinct public keys, explained here, which only reinforces my point).
