
I'm considering some non-cryptographic PRNG which uses multiplication of two 64-bit or 128-bit random numbers at some point.

__uint128_t a;   /* __uint128_t is a GCC/Clang extension */
__uint128_t b;   /* a and b hold values produced by the PRNG */

__uint128_t result;

result = a * b;

Is this constant time? I don't think so, especially since multiplying two small numbers takes less time than multiplying two large ones. Is there any way to implement this in constant time?

Here someone wrote that multiplication itself will be constant time on most common architectures:

https://stackoverflow.com/questions/17909343/is-multiplication-of-two-numbers-a-constant-time-algorithm

This does not match my experiments: if I multiply two random 128-bit numbers but one of them is smaller than 2^64, the multiplication is faster than when both numbers are close to 2^128.

Tom

3 Answers


Is this constant time?

The answer to that would be quite compiler- and CPU-dependent.

Is there any way to implement this in constant time?

Given a reasonable set of constant-time operations (such as additions, logical operations, and shifts by constant amounts), yes; however, such a construction would likely be much more expensive than what the compiler would give you.
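
To illustrate, here is a minimal sketch of such a construction: a schoolbook shift-and-add multiplier for the 64 x 64 -> 128 case that always runs exactly 64 iterations and replaces the data-dependent "add or skip" with an arithmetic mask. It assumes that 128-bit additions, constant-distance shifts, and masking are themselves constant time on the target, and that the compiler does not reintroduce branches; a real implementation would want the generated assembly checked.

#include <stdint.h>

typedef unsigned __int128 u128;

/* Constant-time 64 x 64 -> 128 multiply built only from additions, shifts
 * by constant amounts, and masking: a fixed 64 iterations, no data-dependent
 * branches. */
static u128 ct_mul64(uint64_t a, uint64_t b)
{
    u128 acc = 0;
    u128 shifted = a;

    for (int i = 0; i < 64; i++) {
        /* mask is all-ones if bit i of b is set, all-zeros otherwise */
        u128 mask = (u128)0 - (u128)((b >> i) & 1);
        acc += shifted & mask;  /* adds `shifted` or 0; same work either way */
        shifted <<= 1;          /* next partial product */
    }
    return acc;
}

Note that this performs on the order of 64 additions and shifts where the hardware multiplier needs a couple of instructions, which is the cost penalty mentioned above.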

On the other hand, why do you care? You explicitly said that these RNGs were not cryptographically secure; that is, we don't care whether an intelligent adversary monitoring the output could predict future outputs (and presumably that would include adversaries that can count cycles).

This is not in line with my experiments

As I said, that's compiler- and CPU-dependent.

poncho

A previous answer correctly says it's compiler- and CPU-dependent. But in practice this is not an operation I would worry about: on a modern computer with a modern compiler, I would be shocked to see a 128-bit multiplication be anything except constant time.

Modern CPUs implement at least 64-bit multiplication in constant time. Extending to 128 bits is most efficient without a loop; even an implementation built from 32-bit operations would do best not to loop and not to terminate early for smaller operands.

Also, note that your code multiplies two 128-bit numbers but puts the result into a 128-bit destination, so the upper 128 bits of the full 256-bit product are discarded (the result wraps modulo 2^128), which can be an issue.

A more common approach is to multiply two 64-bit integers and put the result in a 128-bit destination, which keeps the full product, as in the sketch below.
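
For instance (a sketch; rng_next() is a hypothetical stand-in for however the PRNG produces 64-bit words):

uint64_t a = rng_next();   /* rng_next() is hypothetical: a 64-bit PRNG output */
uint64_t b = rng_next();

/* Widen one operand so the multiplication is performed at 128-bit width;
 * the full 128-bit product of the two 64-bit inputs is kept. */
__uint128_t product = (__uint128_t)a * b;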

Here is an example and explanation of what it might look like in assembly: https://stackoverflow.com/questions/33789230/how-does-this-128-bit-integer-multiplication-work-in-assembly-x86-64

You can see it will be constant time, and you can estimate the cost from Agner Fog's instruction tables: https://www.agner.org/optimize/instruction_tables.pdf (though due to out-of-order execution, the actual runtime is not a simple sum of the listed latencies).

mdfst13
Meir Maor

There's a way to find out: measure it. Take x = the largest 64-bit number and y = the largest 128-bit number, then multiply each of 0, 1, x, and y by each of 0, 1, x, and y a billion times and measure the time, obviously making sure that the compiler cannot figure out ahead of time what the numbers are.
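
A minimal sketch of that measurement (iteration count scaled down to 10^8; warm-up, CPU pinning, and other methodology deliberately omitted; assumes POSIX clock_gettime). The volatile qualifiers keep the compiler from precomputing the products or deleting the loop:

#include <stdint.h>
#include <stdio.h>
#include <time.h>

typedef unsigned __int128 u128;

/* Time `iters` multiplications of x * y and return elapsed seconds. */
static double time_mul(u128 x, u128 y, long iters)
{
    volatile u128 a = x, b = y;  /* forces a fresh load of the operands */
    volatile u128 sink = 0;      /* keeps the products from being optimized out */
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++)
        sink += a * b;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
}

int main(void)
{
    /* 0, 1, largest 64-bit number, largest 128-bit number */
    u128 vals[4] = { 0, 1, UINT64_MAX, ~(u128)0 };

    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            printf("vals[%d] * vals[%d]: %.3f s\n", i, j,
                   time_mul(vals[i], vals[j], 100000000L));
    return 0;
}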

If your processor has a 64 x 64-bit product instruction with a 128-bit result (quite typical nowadays), chances are that the compiler's code performs three 64 x 64 multiplications, without checking whether either operand fits into 64 bits. Beyond that it depends on the hardware whether multiplying 0 * x or x * 0 is faster than x * x; very often it is not on a modern processor, and that includes 64-bit x86 and ARM processors.
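
To illustrate the decomposition (a sketch of the lowering, not the exact code any particular compiler emits): writing a = a_hi * 2^64 + a_lo and b = b_hi * 2^64 + b_lo, the low 128 bits of the product need only three 64 x 64 multiplications, because the a_hi * b_hi term lies entirely above bit 128:

#include <stdint.h>

typedef unsigned __int128 u128;

/* How a 128 x 128 -> 128 multiply decomposes into three 64 x 64 multiplies.
 * a_hi * b_hi contributes only above bit 128, so it is dropped entirely;
 * of the two cross products, only the low 64 bits survive the shift. */
static u128 mul128_lowered(u128 a, u128 b)
{
    uint64_t a_lo = (uint64_t)a, a_hi = (uint64_t)(a >> 64);
    uint64_t b_lo = (uint64_t)b, b_hi = (uint64_t)(b >> 64);

    u128 low = (u128)a_lo * b_lo;                /* full 128-bit product */
    uint64_t cross = a_lo * b_hi + a_hi * b_lo;  /* low 64 bits only */

    return low + ((u128)cross << 64);
}

There is no test on the operand values anywhere in this sequence, which is why the timing does not depend on whether an operand happens to fit in 64 bits.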

Note: 128-bit by 128-bit division will be different, because it is very hard to implement efficiently as one fixed sequence of operations.

gnasher729