I'm currently studying the secp256k1 elliptic curve implementation. As I understand it, the curve has an efficiently computable endomorphism: we can find a pair of numbers $\lambda$ and $\beta$ such that for any point $P_1=(x,y)$ on the curve, the point $P_2=[\lambda]P_1=(\beta x,y)$ is also on the curve. This follows from the special form of secp256k1, $y^2=x^3+7 \bmod p$ with $p \equiv 1 \bmod 3$. If $\beta^3=1 \bmod p$, then $(\beta x,y)$ is also on the curve, since $y^2=(\beta x)^3+7=\beta^3x^3+7=x^3+7 \bmod p$.
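As a quick sanity check of this argument, here is a minimal Python sketch. It assumes the standard published secp256k1 generator coordinates and uses one of the $\beta$ values quoted later in this post; `on_curve` is just a helper name chosen for illustration.

```python
# secp256k1 field prime and generator (standard published values)
p = 2**256 - 2**32 - 977
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# One of the two beta values quoted later in this post
beta = 0x7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee


def on_curve(x, y):
    """Return True if (x, y) satisfies y^2 = x^3 + 7 (mod p)."""
    return (y * y - (x * x * x + 7)) % p == 0


p_mod_3 = p % 3                                 # p must be 1 mod 3
beta_cubed = pow(beta, 3, p)                    # beta must be a cube root of 1
mapped_on_curve = on_curve(beta * Gx % p, Gy)   # (beta*x, y) stays on the curve

print(p_mod_3, beta_cubed, on_curve(Gx, Gy), mapped_on_curve)
```

This only checks that $(\beta x, y)$ is a curve point; it does not by itself show that the point equals $[\lambda]P_1$.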
If $k = k_1 + \lambda \cdot k_2 \bmod n$, where $n$ is the order of the curve, the scalar multiplication $[k]P_1$ can be rewritten as
$$[k_1 + \lambda \cdot k_2]P_1 = [k_1]P_1 + [\lambda \cdot k_2]P_1 = [k_1]P_1 + [k_2]P_2.$$
If $k_1$ and $k_2$ are about half the length of $k$ (in secp256k1, $k$ is 256 bits, so $k_1$ and $k_2$ should be about 128 bits), this speeds up the scalar multiplication: computing $[k]P_1$ directly takes about 256 rounds of "double and add", whereas $[k_1]P_1+[k_2]P_2$ can be computed simultaneously in about 128 rounds with the following algorithm:
$\text{Input: } k_1, k_2, P_1, P_2$
$\text{Output: } [k_1]P_1+[k_2]P_2$
$d\gets\max(\operatorname{length}(k_1),\operatorname{length}(k_2))$
$R_1\gets\operatorname{EC-add}(P_1,P_2)$
$\text{if } (k_1[d-1]=1 \text{ and } k_2[d-1]=1): R_2\gets R_1$
$\text{else if } (k_1[d-1]=1 \text{ and } k_2[d-1]=0): R_2\gets P_1$
$\text{else if } (k_1[d-1]=0 \text{ and } k_2[d-1]=1): R_2\gets P_2$
$k_1\gets k_1 \ll 1,\ k_2\gets k_2 \ll 1$
$\text{for }(i\gets 0;i<d-1;i \gets i+1)$
$\quad R_2\gets \operatorname{EC-double}(R_2)$
$\quad \text{if } (k_1[d-1]=1 \text{ and } k_2[d-1]=1): R_2\gets\operatorname{EC-add}(R_2,R_1)$
$\quad \text{else if } (k_1[d-1]=1 \text{ and } k_2[d-1]=0): R_2\gets\operatorname{EC-add}(R_2,P_1)$
$\quad \text{else if } (k_1[d-1]=0 \text{ and } k_2[d-1]=1): R_2\gets\operatorname{EC-add}(R_2,P_2)$
$\quad k_1\gets k_1 \ll 1,\ k_2\gets k_2 \ll 1$
$\text{return }R_2$
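The algorithm above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: it uses plain affine arithmetic with `None` as the point at infinity, reads bit $i$ directly instead of shifting $k_1$ and $k_2$, assumes non-negative scalars, and is not constant time. The helper names (`ec_add`, `ec_mul`, `simultaneous_mul`) are made up for this sketch, and the $(\beta,\lambda)$ pair is the first one quoted later in this post.

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
G = (Gx, Gy)
beta = 0x7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee
lam = 0x5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72


def ec_add(P, Q):
    """Affine addition on y^2 = x^3 + 7; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                      # P + (-P)
    if P == Q:
        s = 3 * x1 * x1 * pow(2 * y1, p - 2, p) % p      # tangent slope
    else:
        s = (y2 - y1) * pow(x2 - x1, p - 2, p) % p       # chord slope
    x3 = (s * s - x1 - x2) % p
    return (x3, (s * (x1 - x3) - y1) % p)


def ec_mul(k, P):
    """Plain left-to-right double-and-add, used here only as a reference."""
    R = None
    for bit in bin(k)[2:]:
        R = ec_add(R, R)
        if bit == "1":
            R = ec_add(R, P)
    return R


def simultaneous_mul(k1, k2, P1, P2):
    """[k1]P1 + [k2]P2 with one shared doubling per bit position."""
    table = {(1, 0): P1, (0, 1): P2, (1, 1): ec_add(P1, P2)}
    d = max(k1.bit_length(), k2.bit_length())
    R = None
    for i in reversed(range(d)):
        R = ec_add(R, R)
        bits = ((k1 >> i) & 1, (k2 >> i) & 1)
        if bits != (0, 0):
            R = ec_add(R, table[bits])
    return R


# Usage: compare against [k1 + lambda*k2]G computed the slow way.
P2 = (beta * Gx % p, Gy)
k1 = 0x0123456789abcdef0123456789abcdef
k2 = 0xfedcba9876543210fedcba9876543210
result = simultaneous_mul(k1, k2, G, P2)
print(result == ec_mul((k1 + lam * k2) % n, G))
```

The loop runs $d$ times with one doubling each, which is where the saving over a 256-round double-and-add comes from.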
I was able to find two pairs of $(\beta,\lambda)$ using Fermat's little theorem. They are $(\texttt{0x7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee}, \texttt{0x5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72})$ and $(\texttt{0x851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40}, \texttt{0xac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce}).$
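For reference, one way to get these values is sketched below. It assumes the modulus $m$ is prime with $m \equiv 1 \bmod 3$: by Fermat's little theorem $g^{m-1}=1 \bmod m$, so $g^{(m-1)/3}$ is a cube root of unity, and it is nontrivial whenever $g$ is not a cube. The function name `primitive_cube_roots` is made up for this sketch, and it says nothing about which $\beta$ pairs with which $\lambda$.

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def primitive_cube_roots(m):
    """Both nontrivial cube roots of unity modulo a prime m with m % 3 == 1."""
    assert m % 3 == 1
    g = 2
    # If g is a cube mod m the power below is 1; try the next base.
    while pow(g, (m - 1) // 3, m) == 1:
        g += 1
    r = pow(g, (m - 1) // 3, m)   # r^3 = g^(m-1) = 1 (mod m) by Fermat
    return {r, r * r % m}         # the other root is r^2


betas = primitive_cube_roots(p)    # candidates for beta (mod p)
lambdas = primitive_cube_roots(n)  # candidates for lambda (mod n)

for value in sorted(betas) + sorted(lambdas):
    print(hex(value))
```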
Then I tried to split $k$ into $k_1$ and $k_2$ of half the length. I implemented the algorithm following the textbook Guide to Elliptic Curve Cryptography and these discussions (Bitcoin Forum, Github):
- Apply the extended Euclidean algorithm to $\lambda$ and the order $n$ to find two pairs $(a,b)$ such that $a + b\lambda = c\cdot n = 0 \bmod n$ and $a$ and $b$ are each about 128 bits long.
I found:
$(a_1, b_1) =$ $(\texttt{0x3086d221a7d46bcde86c90e49284eb15},\texttt{-0xe4437ed6010e88286f547fa90abfe4c3})$
$(a_2, b_2) =$ $(\texttt{0x114ca50f7a8e2f3f657c1108d9d44cfd8},\texttt{0x3086d221a7d46bcde86c90e49284eb15})$
- Given $k$, calculate $c_1 \gets \lfloor b_2 \cdot k / n \rceil$ and $c_2 \gets \lfloor -b_1 \cdot k / n \rceil$, where $\lfloor\cdot\rceil$ denotes rounding to the nearest integer
- $k_1 \gets k - c_1 \cdot a_1 - c_2 \cdot a_2$ and $k_2 \gets -c_1 \cdot b_1 - c_2 \cdot b_2$
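The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the divisions are rounded to the nearest integer, it uses the first $\lambda$ quoted above, and `round_div` and `split_scalar` are names made up for this sketch. Note that $k_1$ and $k_2$ can come out negative, so the sketch tracks the bit length of their absolute values.

```python
import random

n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
lam = 0x5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72

# Basis vectors (a1, b1), (a2, b2) as quoted in the list above
a1 = 0x3086d221a7d46bcde86c90e49284eb15
b1 = -0xe4437ed6010e88286f547fa90abfe4c3
a2 = 0x114ca50f7a8e2f3f657c1108d9d44cfd8
b2 = 0x3086d221a7d46bcde86c90e49284eb15


def round_div(x, m):
    """Nearest integer to x / m for m > 0, using exact integer arithmetic."""
    return (2 * x + m) // (2 * m)


def split_scalar(k):
    """Return (k1, k2) with k1 + lam * k2 = k (mod n); either may be negative."""
    c1 = round_div(b2 * k, n)
    c2 = round_div(-b1 * k, n)
    k1 = k - c1 * a1 - c2 * a2
    k2 = -c1 * b1 - c2 * b2
    return k1, k2


# Usage: split random scalars and record the largest magnitude seen.
random.seed(1)
worst_bits = 0
for _ in range(1000):
    k = random.randrange(1, n)
    k1, k2 = split_scalar(k)
    assert (k1 + lam * k2 - k) % n == 0
    worst_bits = max(worst_bits, abs(k1).bit_length(), abs(k2).bit_length())

print(worst_bits)
```

Random sampling like this only gives an empirical picture of the sizes; it is not a proof of a worst-case bound.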
From page 7 of this paper, I understand that the goal of step 1 is to find two vectors $v_1=(a_1,b_1)$ and $v_2=(a_2,b_2)$ such that $a_1+b_1\cdot \lambda = 0 \bmod n$ and $a_2+b_2\cdot \lambda = 0 \bmod n$. In step 2 we then find $c_1$ and $c_2$ such that $c_1 \cdot v_1 + c_2 \cdot v_2$ is closest to $(k,0)$. Finally, the vector $(k_1,k_2) = (k,0) - (c_1 \cdot v_1 + c_2 \cdot v_2)$ is small and satisfies $k_1 + k_2 \cdot \lambda = k \bmod n$.
However, I have some questions here:
- I don't quite understand why the scalars of the vectors $c_1$ and $c_2$ can be calculated from $b_2\,k / n$ and $-b_1\,k / n$ for the closest vector from $(k,0)$. Why does that work?
- Sometimes $k_1$ or $k_2$ is longer than 128 bits. I am going to implement a constant-time hardware design as a side-channel countermeasure, so I need to know the largest possible size of $k_1$ and $k_2$. How can I estimate it?
- Is there an efficient multiplication algorithm for the prime field with $p = 2^{256}-2^{32}-977$? Curve25519 has the property $2^{255} \equiv 19 \pmod p$, which makes its prime field arithmetic efficient. What about secp256k1?