Information leak in ElGamal encryption with message in base group

Question

Assume a finite commutative base group $\mathbb B$, some $g$ in $\mathbb B$, and $\mathbb{G} = \langle g\rangle$ the subgroup that $g$ generates, with choice of $\mathbb B$ and $g$ such that ElGamal encryption would be secure for a random message in $\mathbb G$ (semantically/under CPA).

ElGamal encryption still allows decryption if we use a random message in $\mathbb B$ rather than in $\mathbb G$. How much information about the message can leak if we do this? If necessary to get a result, restrict to $\mathbb B=\mathbb Z^*_p$, perhaps with $p$ an odd prime, and/or to $\mathbb G$ of prime order.

Notation: order of $\mathbb G$ generated by $g\in \mathbb B$ (noted multiplicatively) is $q$; private key $x$ is uniformly random with $0<x<q$; public key is $h=g^x$. Encryption of $m$ random in the message space ($\mathbb G$ for the standard definition of ElGamal encryption, $\mathbb B$ in the question) generates one-time $y$ uniformly random with $0<y<q$, and computes $c_1=g^y$, $s=h^y=g^{x\,y}$, $c_2=m\,s$. Ciphertext is $(c_1,c_2)$. Decryption recomputes $m$ as $c_1^{q-x}\,c_2$.
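
To make the notation concrete, here is a minimal sketch of the scheme exactly as described above. The parameters $p=23$, $q=11$, $g=4$ are toy values chosen only for illustration (they are not from the question and are far too small to be secure), and the function names are hypothetical.

```python
import secrets

# Toy parameters (assumptions for illustration only, not secure):
# B = Z_p^* with p = 23, and g = 4 generates the subgroup G of order q = 11.
p, q, g = 23, 11, 4

def keygen():
    x = 1 + secrets.randbelow(q - 1)      # private key, 0 < x < q
    return x, pow(g, x, p)                # (x, h = g^x)

def encrypt(h, m):
    y = 1 + secrets.randbelow(q - 1)      # one-time exponent, 0 < y < q
    c1 = pow(g, y, p)
    s = pow(h, y, p)                      # mask s = h^y = g^(x*y)
    return c1, (m * s) % p                # (c1, c2 = m*s)

def decrypt(x, c1, c2):
    # c1 has order q, so c1^(q-x) is the inverse of s = c1^x.
    return (pow(c1, q - x, p) * c2) % p

x, h = keygen()
# Encrypt and decrypt every element of B, not only the elements of G.
recovered = [decrypt(x, *encrypt(h, m)) for m in range(1, p)]
```

The list `recovered` covers all of $\mathbb B$, which illustrates that decryption itself does not care whether $m$ lies in $\mathbb G$; the question is only about what the ciphertext reveals.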

It is known that if $\mathbb B=\mathbb Z^*_p$ with $p$ a large prime and $\mathbb G$ of prime order $(p-1)/2$, there's an information leak of one bit: we can determine from the ciphertext if the plaintext is a quadratic residue or not; but nothing else as far as we know. I'm unsure about the many other cases ($\mathbb G$ of smaller prime order, or not of prime order; other base groups).

My motivation for that problem came while answering this question.

user94293 · Answer 1 · 2018-01-25T19:24:34.900

Let $p = 2qs + 1$ be a prime. Consider the subgroup $\mathbb{G} = \langle g \rangle$ of order $q$. The public key is $h = g^x \bmod p$, while the private key is $x \in \mathbb{Z}_q$. The encryption of $m$ is given by $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = m \cdot h^r \bmod p$ for a random integer $r \gets \mathbb{Z}_q$. Semantic security requires that the message $m$ belong to $\mathbb{G}$.

Let $z$ be a generator of $\mathbb{F}_p^*$. So we can write $m = z^M \bmod p$ for some $M \in \mathbb{Z}_{p-1}$.

Since $h^r \in \mathbb{G}$ has order dividing $q$, raising $c_2 = m \cdot h^r$ to any multiple of $q$ cancels the mask $h^r$ and leaves only a power of $m$. Raising $c_2$ to the power of $qs$ yields $${c_2}^{qs} \equiv m^{qs} \equiv z^{Mqs \bmod (p-1)} \equiv (z^{qs})^{M\bmod 2} \equiv (-1)^{M\bmod 2} \pmod p$$ The value of $(M \bmod 2)$ indicates whether or not the message $m$ is a square in $\mathbb{F}_p^*$.
Raising $c_2$ to the power of $2q$ yields $${c_2}^{2q} \equiv m^{2q} \equiv z^{M2q \bmod (p-1)} \equiv (z^{2q})^{M\bmod s} \pmod p$$ If $s$ is small or smooth, an attacker can recover the value of $(M \bmod s)$ as the discrete logarithm in $\mathbb{F}_p^*$ of ${c_2}^{2q}$ with respect to the base $z^{2q}$, which has order $s$.
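
The two exponentiations above can be checked on a toy instance. The sketch below assumes $p = 67 = 2\cdot 11\cdot 3 + 1$ with $q = 11$, $s = 3$ and generator $z = 2$; these values and the function names are illustrative choices, not part of the answer. The discrete logarithm modulo $s$ is found by brute force, which is only reasonable because $s$ is tiny here.

```python
import secrets

# Toy parameters (assumptions, not secure): p = 2*q*s + 1.
p, q, s = 67, 11, 3
z = 2                      # a generator of F_p^* (order p - 1 = 66)
g = pow(z, 2 * s, p)       # g = z^(2s) has order q

def encrypt(h, m):
    r = secrets.randbelow(q)
    return pow(g, r, p), (m * pow(h, r, p)) % p

def leak(c2):
    """Recover (M mod 2, M mod s) from c2 alone, where m = z^M."""
    # c2^(q*s) is 1 if M is even and -1 (i.e. p - 1) if M is odd.
    parity = 0 if pow(c2, q * s, p) == 1 else 1
    # c2^(2q) = (z^(2q))^(M mod s); brute-force the log in the order-s subgroup.
    base, target = pow(z, 2 * q, p), pow(c2, 2 * q, p)
    m_mod_s = next(k for k in range(s) if pow(base, k, p) == target)
    return parity, m_mod_s
```

Note that `leak` uses neither the private key nor $c_1$: everything it learns comes from $c_2$ and the public parameters.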

Let $H \colon \mathbb{G} \to \mathbb{F}_p^*$ be a cryptographic hash function, viewed as a random oracle. Semantic security can be achieved with message space $\mathbb{F}_{p}^*$ by defining the ciphertext as the pair $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = [m \cdot H(h^r \bmod p)] \bmod p$ for a random integer $r \gets \mathbb{Z}_q$. Decryption of $(c_1, c_2)$ is obtained as $m = [c_2 /H({c_1}^x \bmod p)] \bmod p$.
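
A minimal sketch of this hashed variant follows. The function `H` below is a hypothetical stand-in for the random oracle, built from SHA-256 and reduced into $\{1,\dotsc,p-1\}$; the answer does not prescribe any particular instantiation, and the toy parameters ($p=67$, $q=11$, $s=3$) are assumptions for illustration only.

```python
import hashlib
import secrets

# Toy parameters (assumptions, not secure): p = 2*q*s + 1.
p, q, s = 67, 11, 3
g = pow(2, 2 * s, p)       # 2 generates F_p^*, so g = 2^(2s) has order q

def H(k):
    """Hypothetical stand-in for the random oracle H: G -> F_p^*."""
    digest = hashlib.sha256(k.to_bytes(8, "big")).digest()
    return 1 + int.from_bytes(digest, "big") % (p - 1)   # value in 1..p-1

def encrypt(h, m):
    r = secrets.randbelow(q)
    c1 = pow(g, r, p)
    return c1, (m * H(pow(h, r, p))) % p

def decrypt(x, c1, c2):
    mask = H(pow(c1, x, p))
    # Division modulo p via Fermat's little theorem: mask^(p-2) = mask^(-1).
    return (c2 * pow(mask, p - 2, p)) % p
```

Because the mask is now the hash of the shared group element rather than the element itself, raising $c_2$ to a multiple of $q$ no longer cancels it.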

Another option to get semantic security without random oracles is to take $s = 1$ (so $p = 2q+1$ is a safe prime), in which case $\mathbb{G}$ is exactly the group of quadratic residues modulo $p$ and every square $m^2$ lies in $\mathbb{G}$. The set of valid messages is restricted to $\mathcal{M} = \{1, \dotsc, (p-1)/2\}$. The encryption of a message $m \in \mathcal{M}$ is given by the pair $(c_1, c_2)$ with $c_1 = g^r \bmod p$ and $c_2 = m^2 \cdot h^r \bmod p$ for a random integer $r \gets \mathbb{Z}_q$. Decryption of $(c_1, c_2)$ is obtained in two steps: first $m^2 = c_2 /{c_1}^x \bmod p$, and then $m$ as the square root (modulo $p$) of $m^2$ that lies in the set $\mathcal{M}$. This root is unique: if $m \in \mathcal{M}$ then $-m \bmod p = p-m \notin \mathcal{M}$.
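
The squaring trick can be sketched as follows, assuming the toy safe prime $p = 23 = 2\cdot 11 + 1$ and $g = 4$ (illustrative values, not from the answer). A safe prime with odd $q$ satisfies $p \equiv 3 \pmod 4$, so a square root of a quadratic residue $a$ can be computed as $a^{(p+1)/4} \bmod p$; that shortcut is an implementation choice made here, not something the answer specifies.

```python
import secrets

# Toy parameters (assumptions, not secure): safe prime p = 2q + 1.
p, q, g = 23, 11, 4        # g = 4 is a quadratic residue of order q
half = (p - 1) // 2        # message set M = {1, ..., (p-1)/2}

def encrypt(h, m):
    assert 1 <= m <= half, "message must lie in M"
    r = secrets.randbelow(q)
    return pow(g, r, p), (m * m * pow(h, r, p)) % p

def decrypt(x, c1, c2):
    m2 = (c2 * pow(c1, q - x, p)) % p     # m^2 = c2 / c1^x
    root = pow(m2, (p + 1) // 4, p)       # one square root (uses p % 4 == 3)
    return root if root <= half else p - root   # pick the root inside M
```

The final line implements the closing remark: of the two roots $m$ and $p-m$, exactly one is at most $(p-1)/2$.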

Geoffroy Couteau · Answer 2 · 2018-01-26T08:51:56.620

In general, if you use $\mathbb{Z}_p^*$ with $p = q\cdot \prod_{i=1}^t p_i + 1$, where the $p_i$ are distinct small prime numbers, then you will have $O(t)$ bits of leakage. So, intuitively, a lot of information can leak if we do this: up to $O(\log p)$ bits with this approach, hence up to a constant fraction of all the bits of your message. As pointed out in the other answer, that's relatively easy to avoid in general, though.

EDIT: as pointed out by Vadym Fedyukovych in the comments, a more precise evaluation of the leakage for $p$ of the form above, including the low-order terms, is $O(t\log\log p)$ bits.
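
As a rough illustration of how the leakage grows with the number of small factors, the sketch below assumes the toy prime $p = 11\cdot(2\cdot 3\cdot 5) + 1 = 331$ and recovers the discrete logarithm $M$ of the message modulo each $p_i$ from $c_2$ alone. All parameters and names are hypothetical, and the per-prime logarithm is done by brute force since the $p_i$ are tiny.

```python
from math import log2, prod

# Toy parameters (assumptions, not secure): p = q * prod(small) + 1.
q, small = 11, [2, 3, 5]
n = prod(small)                    # 30
p = q * n + 1                      # 331, which is prime

def is_generator(c):
    return all(pow(c, (p - 1) // r, p) != 1 for r in small + [q])

z = next(c for c in range(2, p) if is_generator(c))   # generator of Z_p^*
g = pow(z, n, p)                                      # element of order q

def residues_from_c2(c2):
    """Return [M mod p_i for each small prime p_i], where m = z^M."""
    out = []
    for r in small:
        e = (p - 1) // r           # multiple of q, so the ElGamal mask vanishes
        base, target = pow(z, e, p), pow(c2, e, p)
        out.append(next(k for k in range(r) if pow(base, k, p) == target))
    return out

leaked_bits = log2(n)              # M mod 30 is revealed for a uniform M
```

By the Chinese remainder theorem the residues combine into $M \bmod \prod p_i$, so each extra small prime factor adds about $\log_2 p_i$ bits to what the ciphertext reveals.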
