
As simple as it may sound, I was unable to find a collision-free, one-way(ish) function that takes 32 bits of input and produces 32 bits of output. I apologize if I just didn't know the right keywords to search for.

Of course, a space of $2^{32}$ inputs can easily be brute-forced within a couple of seconds on a modern mid- to high-end retail PC, but the goal is to keep attackers from finding the input in any way that is significantly faster than brute force.

The function should also be proven collision-free, which can be done by evaluating it on each of the $2^{32}$ inputs. I'm willing to take that task upon myself if nobody else has the time (or the inclination) to do so.

EDIT: I forgot to mention the attack scenario and already discarded ideas:

The function will run in a white-box environment, so I'd rather not use functions that rely on a key to be secure (like pseudorandom permutations).

I also tried truncating hash functions such as SHA-2, Keccak, or the underlying hash function of ChaCha. Sadly, all of them have collisions when truncated to 32 bits.

The function will be used to obfuscate if-statements by comparing the outputs of the function.

Ella Rose
VincBreaker

1 Answer


What is wanted is a permutation of the 32-bit integers that is sizably more difficult to compute in the reverse direction than in the forward direction. Obviously, the ratio can't be more than $2^{31}$, since that's the cost of reversing the function by brute force. It is also useful for the function to be parametrized; otherwise, its inverse could be tabulated, and even reduced to a rainbow table.


I propose a construction using the Discrete Logarithm Problem in $\Bbb Z_p^*$ and cycle walking.

Let $p$ be a prime with $q=(p-1)/2$ prime and $p\bmod8=7$ (which implies that $2^q\bmod p=1$, and $2$ is a generator of the subgroup of quadratic residues modulo $p$, of prime order $q$). The function $f_p$ defined for integer $x$ by $$x\mapsto f_p(x)=\min\big((2^x\bmod p),\,p-(2^x\bmod p)\big)$$ is a permutation of the integers in $[1,q]$ which is easy to compute in the forward direction, but requires solving the DLP in $\Bbb Z_p^*$ to reverse. Illustration for $p=47$:

   x =  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
fp(x)=  2  4  8 16 15 17 13 21  5 10 20  7 14 19  9 18 11 22  3  6 12 23  1
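
The table above can be reproduced with a few lines of Python. This is only a minimal sketch: the name `f_p` mirrors the notation of the answer, and the variable names `row` and `is_permutation` are introduced here purely for illustration.

```python
def f_p(x: int, p: int) -> int:
    """min(2^x mod p, p - (2^x mod p)), for a prime p chosen as described above."""
    y = pow(2, x, p)          # built-in modular exponentiation
    return min(y, p - y)

p = 47
q = (p - 1) // 2              # q = 23, also prime

# Evaluate f_p on every x in [1, q] and check that each value in [1, q]
# is hit exactly once, i.e. that f_p is a permutation of [1, q].
row = [f_p(x, p) for x in range(1, q + 1)]
is_permutation = sorted(row) == list(range(1, q + 1))

print(row)
print(is_permutation)
```

The same exhaustive check is what one would run (at a larger scale) to confirm that a candidate function has no collisions.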

This property also holds if we add two parameters $a$ and $b$ with $1\le a<q$ and $0\le b<q$, giving $$x\mapsto f_{(p,a,b)}(x)=\min\Big(\big({(2^a)}^{x+b}\bmod p\big),\,p-\big({(2^a)}^{x+b}\bmod p\big)\Big)$$

With $p>2^{33}$, the function ${\tilde f}_{(p,a,b)}$ obtained by iterating that $f_{(p,a,b)}$ while the result is larger than $2^{32}$ is a permutation of the integers in $[1,2^{32}]$.
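
Cycle walking is easier to see at a toy scale. The sketch below assumes $p=47$ (so $q=23$) and a target range $[1,16]$ standing in for $[1,2^{32}]$; the condition $p>2^{33}$ becomes $p>2\cdot16$ here. The names `cycle_walk`, `n` and `walked` are illustrative only.

```python
def f_p(x: int, p: int) -> int:
    """min(2^x mod p, p - (2^x mod p)): a permutation of [1, (p-1)/2]."""
    y = pow(2, x, p)
    return min(y, p - y)

def cycle_walk(x: int, p: int, n: int) -> int:
    """Apply f_p, and keep applying it while the result is larger than n."""
    y = f_p(x, p)
    while y > n:
        y = f_p(y, p)
    return y

p, n = 47, 16                 # toy stand-ins for a prime above 2^33 and for 2^32

# Exhaustively check that the cycle-walked function permutes [1, n].
walked = [cycle_walk(x, p, n) for x in range(1, n + 1)]
print(walked)
print(sorted(walked) == list(range(1, n + 1)))
```

The loop always terminates for inputs in $[1,n]$, because following the cycle of a permutation from a point in the range must eventually return to the range.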

And now $x\mapsto h_{(p,a,b)}(x)={\tilde f}_{(p,a,b)}(x+1)-1$ defines a permutation $h_{(p,a,b)}$ of the 32-bit integers, as desired, which is considerably easier to evaluate in the forward direction than it is to invert.

As an illustration, for $p=\lceil2^{32}\pi\rceil+3062=13493040767$, $a=1$, $b=42$, it is easy to compute $h_{(p,a,b)}(4)=2511267353$ (because $f_{(p,a,b)}(5)=5073155518>2^{32}$, $f_{(p,a,b)}(5073155518)=5723706427>2^{32}$, $f_{(p,a,b)}(5723706427)=2511267354\le2^{32}$). But it is sizably more difficult to solve $h_{(p,a,b)}(x)=2919107273$ for $x$. Try it!
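
A minimal sketch of the forward direction with the parameters of this illustration follows. The constant and function names (`P`, `A`, `B`, `LIMIT`, `f`, `h`) are chosen here for readability and are not part of the construction itself.

```python
P = 13493040767               # the prime p of the illustration
A = 1                         # parameter a
B = 42                        # parameter b
LIMIT = 2**32                 # upper end of the target range [1, 2^32]

def f(x: int) -> int:
    """f_(p,a,b)(x) = min(g^(x+b) mod p, p - (g^(x+b) mod p)) with g = 2^a mod p."""
    g = pow(2, A, P)
    y = pow(g, x + B, P)
    return min(y, P - y)

def h(x: int) -> int:
    """Cycle-walk f on x+1 until the result is at most 2^32, then subtract 1."""
    y = f(x + 1)
    while y > LIMIT:
        y = f(y)
    return y - 1

print(f(5))   # first step of the walk for x = 4: 5073155518, which is above 2^32
print(h(4))   # the text above reports 2511267353 for this input
```

Python's three-argument `pow` keeps every intermediate value below $p^2$, so the forward evaluation costs only a handful of modular exponentiations.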

I guesstimate that inversion for one output is several hundred times harder than direct computation even using sophisticated algorithms to solve the DLP (index calculus or NFS); thousands of times harder with simpler algorithms (BSGS or Pollard's ρ); and hundreds of thousands of times harder by brute force.
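
To make the inversion side concrete, here is a hedged sketch of baby-step giant-step (BSGS) applied to the toy function $f_p$ with $p=47$. It is not an attack on the 34-bit parameters and does not handle the cycle walking; the helper names `bsgs` and `invert_f_p` are hypothetical. It relies on the fact that exactly one of $v$ and $p-v$ is a quadratic residue, since $p\bmod4=3$.

```python
from math import isqrt

def f_p(x: int, p: int) -> int:
    y = pow(2, x, p)
    return min(y, p - y)

def bsgs(g: int, t: int, p: int, order: int) -> int:
    """Return e in [0, order) with g^e = t (mod p), where g has the given order."""
    m = isqrt(order) + 1
    baby = {}                         # baby steps: g^j -> j for j in [0, m)
    e = 1
    for j in range(m):
        baby.setdefault(e, j)
        e = e * g % p
    giant = pow(e, -1, p)             # e is g^m here; modular inverse needs Python 3.8+
    gamma = t % p
    for i in range(m):                # giant steps: compare t * g^(-i*m) with the table
        if gamma in baby:
            return (i * m + baby[gamma]) % order
        gamma = gamma * giant % p
    raise ValueError("target is not in the subgroup generated by g")

def invert_f_p(v: int, p: int) -> int:
    """Find x in [1, q] with f_p(x) = v, for q = (p-1)/2."""
    q = (p - 1) // 2
    # Euler's criterion selects whichever of v and p - v is a power of 2 mod p.
    t = v if pow(v, q, p) == 1 else p - v
    e = bsgs(2, t, p, q)
    return e if e != 0 else q         # exponent 0 corresponds to x = q, since 2^q = 1

p = 47
q = (p - 1) // 2
recovered = [invert_f_p(f_p(x, p), p) for x in range(1, q + 1)]
print(recovered == list(range(1, q + 1)))
```

BSGS needs on the order of $\sqrt q$ multiplications and table entries per inversion, against a few dozen modular multiplications for one forward evaluation.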

In a white-box context, it might be slightly better to have the implementation contain $2^a\bmod p=g$ rather than $a$, and implement ${(2^a)}^{x+b}\bmod p$ as $g^{x+b}\bmod p$ or $g^{(x+b)\bmod q}\bmod p$. On the other hand, in a black-box context, the forward implementation can implement ${(2^a)}^{x+b}\bmod p$ as $2^{(a\,(x+b))\bmod q}\bmod p$, which allows a slightly faster evaluation.

The number of iterations of $f_{(p,a,b)}$ is about $p/2^{33}$ on average, and for $p<2^{34}$ is more than $k$ with probability less than about $2^{-k}$. In order to slow down evaluation (and reversal), we can chain several $h_{(p,a,b)}$ using different $p$, which is better than an overlarge $p$ on two accounts: the evaluation time tends to depend less on $x$; and attacks making heavy pre-computations to later efficiently solve multiple discrete logarithm problems in the same $\Bbb Z_p^*$ are thwarted.


There seem to be better constructions (in the sense of making the ratio of reverse to forward cost higher) using the DLP on a supersingular elliptic curve; see this and its comments, and perhaps this for the curve.

fgrieu