9

I'm designing a function $f$ that should be moderately hard to invert and very fast to evaluate on a modern CPU. The function will be used in a proof-of-work scheme.

I've read that the middle-bits of multiplication are the harder bits to obtain, so I suspect that they are the hardest to invert.

Let $f(x) = ((x^2) >> 16)$ where $x$ is 32 bits and $f(x)$ is truncated to 32 bits, and the multiplication is carried out on a 64-bit architecture.

(Alternatively, one could use $f(x) = ((2^{32}-x-1) \cdot x) >> 16$.)
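For concreteness, here is a minimal Python sketch of both candidates. The names `f`, `f_alt` and `MASK32` are illustrative, and truncating the alternative to 32 bits in the same way as the first function is an assumption.

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits of the shifted product

def f(x):
    """Bits 16..47 of x*x, for a 32-bit input x."""
    return ((x * x) >> 16) & MASK32

def f_alt(x):
    """The alternative: bits 16..47 of (2**32 - x - 1) * x (truncation assumed)."""
    return ((((1 << 32) - x - 1) * x) >> 16) & MASK32
```

Python integers are arbitrary precision, so the explicit mask stands in for the truncation that a 64-bit multiply followed by a 32-bit store would perform.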

Suppose that, since $f$ may not be bijective, any pre-image (if one exists) is accepted. Suppose also that there is not enough memory or time to precompute $f^{-1}$ for all possible $f(x)$ values (although there might be some memory/time to precompute a much smaller table).

Again, this is not a strict crypto question. In this context "hard" does not mean cryptographically hard. I'm asking approximately how hard it is, measured in number of instructions of a standard computer (with a standard instruction set). A bound on the number of any operation would be great.

I'm posting here because the question does not fit well in the theoretical computer science or programming Stack Exchange sites.

Maybe there is a paper that describes this?

mikeazo
SDL

3 Answers

7

This is probably not secure enough for a proof of work. I'll outline some attacks, of increasing sophistication/complexity and increasing effectiveness (decreasing runtime).

Brute force

The obvious attack is brute force: enumerate all $2^{32}$ possible inputs and check to find the first that produces the desired output. This takes $2^{32}$ time. I'm sure you already knew about this attack, and based on your question, it sounds like this is acceptable in your application.

Time-space tradeoff

You can use Hellman's time-space tradeoffs (or rainbow tables, the hyped-up version of that) to solve preimages. You have to do a $2^{32}$-step precomputation to build up the table. The table is of size about $2^{22}$. After you've built up the table, you can find a preimage of $f(x)$ in about $2^{22}$ steps of computation.

Thus, after a one-time precomputation that probably takes a few minutes or at most hours, you can invert the function in a few seconds, using a few tens of megabytes of storage.

Guess the high bits and take a square root in the integers

There's a cleverer attack, which will find the preimage using at most $2^{16}$ simple steps of arithmetic (often quite a bit faster). This will probably run in much less than a second, maybe as little as milliseconds to find a preimage.

We can write any 64-bit integer in the form $\alpha \cdot 2^{48} + \beta \cdot 2^{16} + \gamma$, where $\alpha$ is a 16-bit integer, $\beta$ a 32-bit integer, and $\gamma$ a 16-bit integer (i.e., $0 \le \alpha,\gamma < 2^{16}$ and $0 \le \beta < 2^{32}$). Now we don't know the value of $x^2$, but $x^2$ is a 64-bit integer and we know its middle bits, so we can write it in the form

$$x^2 = \alpha \cdot 2^{48} + \beta \cdot 2^{16} + \gamma$$

where we know $\beta$ ($\beta$ is just the output of your hash function) but we don't know $\alpha,\gamma$.

Now iterate over all possible values of $\alpha$. For each guess at $\alpha$, form the value

$$y = \alpha \cdot 2^{48} + \beta \cdot 2^{16} + 2^{16}-1,$$

take the square root of $y$ in the integers, and round down to an integer. Let $x'$ denote the result, i.e., $x' = \lfloor \sqrt{y} \rfloor$. Then check whether $x'$ is the desired preimage, i.e., whether $f(x') = \beta$.

I claim that this attack requires at most $2^{16}$ steps. There are only $2^{16}$ possible values of $\alpha$, so we do at most $2^{16}$ iterations. Moreover, in the iteration where we've guessed the value of $\alpha$ correctly, I claim we will successfully recover the preimage $x$. Let me explain why. First, the 64-bit integer $y$ will be very close to the 64-bit integer $x^2$: $0 \le y - x^2 < 2^{16}$. Therefore, when you take the square root, $\sqrt{y}$ will be very close to $\sqrt{x^2}=x$. How close? Well, notice that $(x+1)^2 = x^2 + 2x + 1$, so for 32-bit values of $x$, consecutive squares will be about $2^{32}$ apart from each other. That's much larger than the gap between $y$ and $x^2$, so $y$ lies at or above $x^2$ and well below $(x+1)^2$. Thus, taking the square root of $y$ and rounding down will return $x$, not $x-1$ or $x+1$ or anything else (unless $x$ is extremely small, say $x < 2^{15}$, where $2x+1 < 2^{16}$; this has very low probability and thus can be ignored).

This means that this attack is guaranteed to succeed after at most $2^{16}$ iterations.

It turns out that not all values of $\alpha$ are equally likely; when $x$ is uniformly distributed, the upper 16 bits of $x^2$ are biased towards small values. Therefore, if you iterate over all values of $\alpha$ in the sequence $0,1,2,3,\dots,2^{16}-1$, you are unusually likely to succeed early. The average number of iterations until success is $2^{16}/3$, so the attack is about $3\times$ faster than you might expect based upon a worst-case analysis.
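As a rough sanity check on that figure (a back-of-the-envelope sketch, treating $x$ as $2^{32}u$ with $u$ uniform on $[0,1)$ and ignoring rounding): the correct guess is $\alpha = \lfloor x^2 / 2^{48} \rfloor \approx 2^{16} u^2$, so

$$\mathbb{E}[\alpha] \approx 2^{16} \int_0^1 u^2 \, du = \frac{2^{16}}{3}.$$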

D.W.
4

Say $m$ is the input and $h=f(m)$. It will be pretty easy to find $m'$ (not necessarily equal to $m$) such that $f(m')=h$ on a modern computer.

Brute Force
The output of $f(m)$ is 32 bits. The following Python function will do it:


import random

def find_collision(val):
  # Try random 32-bit inputs until one hashes to val.
  while True:
    test = random.getrandbits(32)
    target = ((test*test) >> 16) & 0xffffffff
    if target == val:
      return test

It will take around $2^{32}$ loop iterations to find a correct value. I'll leave taking that down to individual instructions to you. Random numbers should be cheap if the OS has a random number generator (on Linux it would be as easy as reading 4 bytes from /dev/urandom).

Remember this is simply a brute force attack. It is very possible that a better attack works too, in which case even if you scale up to 160-bit numbers it could still be easy to invert.

mikeazo
4

Someone can find a preimage (or prove that there is no such preimage) with about $2^{20}$ trial squares, and no precomputed storage. Actually, I believe that the procedure below will achieve $2^{18}$ trial squares, but showing that requires closer analysis than I feel like doing at the moment.

Here is the key observation that we can take advantage of to show this:

If we have a guess $X$ of $A \ge 16$ bits of the lsbits of the preimage, and that guess has precisely $B < A-1$ bits of zeros at the bottom (i.e. bit $B$ is the lowest '1' bit of $X$), then that completely determines the lower $A+B-15$ bits of the hash output; in addition, we can set or clear bit $A+B-15$ of the hash output by setting/clearing bit $A$ of the preimage.

It is easy to verify this by examining the binomial expansion of $(X+\Delta)^2$.
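Spelled out (a sketch, writing $\Delta = d \cdot 2^A$ for the unknown upper bits and $X = k \cdot 2^B$ with $k$ odd; $d$ and $k$ are names introduced here):

$$(X + \Delta)^2 = X^2 + d k \cdot 2^{A+B+1} + d^2 \cdot 2^{2A}.$$

Since $B < A-1$, we have $2A > A+B+1$, so both extra terms are multiples of $2^{A+B+1}$: the low $A+B+1$ bits of the square depend only on $X$, and after the shift by 16 those are the low $A+B-15$ bits of the hash. Flipping bit $A$ of the preimage changes $d$ by one, which adds an odd multiple of $2^{A+B+1}$ plus a multiple of $2^{2A}$, and therefore flips bit $A+B+1$ of the square, i.e. bit $A+B-15$ of the hash.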

Here is how we can use this observation: if we have a 16-bit guess $X$ of the preimage (other than 0x0000 or 0x8000; those violate the $B<A-1$ assumption), then we determine $B$, the number of least significant zero bits.

We first check the hash of $X$; if the lower $B+1$ bits of that output do not agree with the target image, then we know that $X$ cannot be the lower 16 bits of a preimage.

Assuming that the lower $B+1$ bits are correct, scan from bit 16 upwards: for $i = 0, 1, 2, \dots$, check whether bit $B+1+i$ of the hash is correct; if it is not, set bit $16+i$ of $X$ and recompute the hash.

This loop will end when either we run out of bits in the target hash, or $16+i = 32$ (and we have no more bits we can set). If we run out of bits in the target hash, then we have a preimage $X$. If we run out of bits in the preimage, check if the hash is precisely the target we're looking for; if not, then we've proven that the original $X$ cannot have been the lower 16 bits of the preimage.

The above loop will require at most $17$ hashes (the initial hash of $X$, and one hash for every one of the 16 upper bits of the preimage).

Perform this procedure for every 16-bit initial value of $X$ other than 0x0000 and 0x8000; that requires a maximum of $17 \times (2^{16}-2) \approx 2^{20}$ hashes.

To do an exhaustive search also requires the examination of the 0x0000 and 0x8000 preimages; those can be special cased.
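The procedure above might be sketched in Python as follows. This is an illustrative sketch, not a tuned implementation: the names `f`, `extend_low_bits` and `find_preimage` are mine, and handling the two excluded low halves by plain enumeration of the upper 16 bits is an assumption (the text only says they can be special-cased).

```python
MASK32 = 0xFFFFFFFF

def f(x):
    """The hash from the question: bits 16..47 of x*x."""
    return ((x * x) >> 16) & MASK32

def extend_low_bits(low, target):
    """Try to grow a 16-bit guess `low` into a full preimage of `target`."""
    b = (low & -low).bit_length() - 1   # index of the lowest set bit (B)
    x, h = low, f(low)
    # The low B+1 hash bits are already fixed by `low`; they must match.
    if (h ^ target) & ((1 << (b + 1)) - 1):
        return None
    for i in range(16):
        # Hash bit B+1+i can only be corrected by preimage bit 16+i.
        if ((h ^ target) >> (b + 1 + i)) & 1:
            x |= 1 << (16 + i)
            h = f(x)
    # Any remaining high hash bits are checked by a full comparison.
    return x if h == target else None

def find_preimage(target):
    """Return some x with f(x) == target, or None if none exists."""
    for low in range(1, 1 << 16):
        if low == 0x8000:
            continue
        x = extend_low_bits(low, target)
        if x is not None:
            return x
    # Assumption: the excluded low halves are handled by enumeration.
    for low in (0x0000, 0x8000):
        for high in range(1 << 16):
            x = (high << 16) | low
            if f(x) == target:
                return x
    return None
```

Each call to `extend_low_bits` computes at most 17 hashes, matching the count above; the enumeration fallback adds at most $2^{17}$ more.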

poncho