5

We are creating a device with a small microcontroller (20 MHz CPU, 16 KiB RAM).

We need some way to securely send signed files to the device (signature only, no encryption necessary). An external company has come up with an elliptic curve solution, but it needs about a minute to verify a signature, not counting the time to hash the file.

Could you suggest some other moderately secure algorithms that can run faster on such a small CPU? (A few seconds would be OK; hundreds of milliseconds would be fantastic.)

3 Answers

7

Given that the bottleneck on the embedded device is local, non-interactive public-key signature verification, the best industry standard for that is RSA with a standard signature padding, such as PKCS#1 RSASSA-PSS or PKCS#1 RSASSA-PKCS1-v1_5. RSA is usually significantly faster than ECDSA for signature verification, including for the common $e=65537$; for good implementations it is always faster when using $e=3$, which allows a speedup by a factor of about $8$ over $e=65537$. Rabin signature verification is nearly twice as fast as RSA with $e=3$, and is also standard if not common: it was in ANSI X9.31:1998 and is in ISO/IEC 9796-2:2010.

Note: the absolute fastest seems to be Daniel J. Bernstein's "A secure public-key signature system with extremely fast verification" (2000); this is essentially Rabin with an expanded signature allowing extremely fast verification, using an idea he first outlined there.

Both RSA and Rabin are based on arithmetic modulo a public integer $N$ whose factorization is secret. The time for signature verification is dominated by $17$ (RSA, $e=65537$), $2$ (RSA, $e=3$), or just $1$ (Rabin, $e=2$) multiplication(s) modulo $N$, where $N$ has $n$ bits. $n=2048$ is acceptably secure until 2030 according to NIST and the French ANSSI.
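The multiplication counts above follow from plain left-to-right square-and-multiply exponentiation: one squaring per bit after the leading one, plus one multiplication per additional set bit. The sketch below only illustrates that counting; the function names are made up for this example, and it is not a padded, constant-time, or otherwise production-ready verifier.

```python
def count_modmuls(e):
    """Modular multiplications used by left-to-right square-and-multiply."""
    bits = bin(e)[2:]                   # binary digits, most significant first
    squarings = len(bits) - 1           # one squaring per bit after the first
    multiplies = bits.count("1") - 1    # one multiply per extra set bit
    return squarings + multiplies


def modexp(x, e, n):
    """Compute x**e mod n with the same schedule that count_modmuls assumes."""
    y = x % n
    for bit in bin(e)[3:]:              # skip the '0b' prefix and the leading 1
        y = (y * y) % n                 # squaring
        if bit == "1":
            y = (y * x) % n             # multiplication by the base
    return y


for label, e in [("RSA, e=65537", 65537), ("RSA, e=3", 3), ("Rabin, e=2", 2)]:
    print(label, "->", count_modmuls(e), "modular multiplication(s)")
```

Since $65537 = 2^{16}+1$, its count is $16$ squarings plus $1$ multiplication, which is where the $17$ comes from.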

When appropriately implemented using standard (quadratic) algorithms working on $w$-bit words, the computation time for one multiplication modulo $N$ is dominated by $\approx(n/w)^2$ executions of an elementary operation consisting of

  • two multiplications of two $w$-bit words, each giving a $2w$-bit result
  • addition with carry of the corresponding two results into temporary values
  • three reads of a $w$-bit word
  • one write of a $w$-bit word
  • on register-starved CPUs only, some read-writes for temporaries
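To make the $\approx(n/w)^2$ count concrete, here is a minimal Python sketch of one word-serial modular multiplication whose inner step has the shape listed above: two word products, accumulation with carry, reads of one word each from the multiplicand, the modulus and the accumulator, and one write. As an assumption made purely for compactness, it uses Montgomery's interleaved reduction, so the result carries an extra factor $R^{-1}$ with $R=2^{ws}$; it is not claimed to be the fastest arrangement or the one this answer has in mind. The name `interleaved_modmul` and the step counter are illustrative only.

```python
def interleaved_modmul(a, b, n, w):
    """Return (a*b*R^-1 mod n, steps) with R = 2^(w*s), s = words in n.

    Assumes n is odd and 0 <= a, b < n. 'steps' counts elementary
    inner-loop operations. Needs Python 3.8+ for pow(x, -1, m).
    """
    mask = (1 << w) - 1
    s = (n.bit_length() + w - 1) // w            # number of w-bit words
    A = [(a >> (w * j)) & mask for j in range(s)]
    B = [(b >> (w * j)) & mask for j in range(s)]
    N = [(n >> (w * j)) & mask for j in range(s)]
    n0_inv = (-pow(n, -1, 1 << w)) & mask        # -n^-1 mod 2^w
    t = [0] * (s + 1)                            # accumulator, stays below 2n
    steps = 0
    for i in range(s):
        bi = B[i]
        u = t[0] + A[0] * bi
        m = ((u & mask) * n0_inv) & mask         # makes the low word vanish
        carry = (u + m * N[0]) >> w
        steps += 1
        for j in range(1, s):
            # two word products, three word reads, one word write
            u = t[j] + A[j] * bi + m * N[j] + carry
            t[j - 1] = u & mask
            carry = u >> w
            steps += 1
        u = t[s] + carry
        t[s - 1] = u & mask
        t[s] = u >> w
    result = sum(word << (w * j) for j, word in enumerate(t))
    if result >= n:                              # single final correction
        result -= n
    return result, steps


n = (1 << 255) - 19                              # an arbitrary odd 255-bit modulus
product, steps = interleaved_modmul(3**100, 7**80, n, 32)
print(steps)                                     # (256/32)^2 inner steps
```

With 256-bit operands, the step count is $64$ for $w=32$ and $1024$ for $w=8$, which is the quadratic dependence on $n/w$ described above.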

Notoriously, careful optimization of the core loop is essential (assembly language shines!); and using the wrong algorithm will impact speed (in particular: separating modular multiplication from modular reduction increases the memory accesses; Montgomery arithmetic at best does not help).

Actual execution time can be in seconds on a mere 8-bit CPU: for 2048-bit RSA with $e=3$, an implementation I wrote verifies a signature in $1.25$ s on an 8051 core running at 5M cycles/s, with a 4-cycle multiplication of two bytes giving a 16-bit result and no hardware 16-bit addition.

Execution time decreases about quadratically with the word size, allowing time in milliseconds for a modern 32-bit CPU (the question does not specify which core is used; ARM CPUs tend to be good at this, especially those with UMLAL and UMAAL).

Per eBACS benchmarks, on an ARM Cortex-A8, RSA-2048 ($e=3$) verification is timed at a median of 555418 cycles (28 ms when scaled to 20 MHz), versus 2594303 cycles (about 130 ms at 20 MHz) for one of the fastest elliptic-curve signature systems, ed25519.
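The scaling to 20 MHz is simple arithmetic, reproduced below as a quick check. It assumes, optimistically, that the question's unspecified core would need the same number of cycles as a Cortex-A8; a smaller core will generally need more. The helper name `cycles_to_ms` is made up for this example.

```python
CLOCK_HZ = 20_000_000                      # clock rate stated in the question


def cycles_to_ms(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to milliseconds at the given clock rate."""
    return cycles * 1000 / clock_hz


rsa_ms = cycles_to_ms(555418)              # RSA-2048, e=3, median from eBACS
ed_ms = cycles_to_ms(2594303)              # ed25519 verification, from eBACS
print(f"RSA-2048 (e=3): {rsa_ms:.1f} ms")
print(f"ed25519:        {ed_ms:.1f} ms")
print(f"ratio:          {ed_ms / rsa_ms:.2f}x")
```

On these figures, RSA-2048 verification with $e=3$ is roughly 4.7 times faster than ed25519 verification.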

fgrieu
2

One alternative to RSA that may bear looking at is hash-based signatures, perhaps as worked out in this IETF draft. Here, signature validation consists of evaluating a series of perhaps a few hundred hashes. I'm not certain how fast your CPU can evaluate a hash (or an AES encryption, which the draft allows in place of a hash), but I suspect it would be able to meet your performance goals.
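To give a feel for why verification is just a few hundred hash calls, here is a minimal sketch of the classic Lamport one-time signature using SHA-256. This is a simpler scheme than the construction in the draft, chosen only for illustration; all function names are made up for this example, and each key pair must sign at most one message.

```python
import hashlib
import os


def H(data):
    """SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()


def keygen():
    """One-time key pair: 256 pairs of secrets and their hashes."""
    sk = [[os.urandom(32), os.urandom(32)] for _ in range(256)]
    pk = [[H(s0), H(s1)] for s0, s1 in sk]
    return sk, pk


def message_bits(message):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(sk, message):
    """Reveal one secret per digest bit (runs on the signer, not the device)."""
    return [sk[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(pk, message, signature):
    """Device side: one message hash plus 256 short hashes and comparisons."""
    if len(signature) != 256:
        return False
    bits = message_bits(message)
    return all(H(signature[i]) == pk[i][bit] for i, bit in enumerate(bits))


sk, pk = keygen()
sig = sign(sk, b"firmware image")
print(verify(pk, b"firmware image", sig))
```

This toy variant has a 16 KiB public key and an 8 KiB signature, which would not fit comfortably next to anything else in 16 KiB of RAM; practical hash-based schemes shrink both with hash chains and Merkle trees, at the cost of somewhat more hashing.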

poncho
1

There are modular-root signature systems other than standard RSA.

Can the device [interact with a party that is allegedly the signer] during verification?
If no, can it interact with a more powerful computer that does not need to be the signer?
hardware - software