
I want to understand Pollard's kangaroo attack on elliptic curves. I found the Q&A "Pollard's kangaroo attack on Elliptic Curve Groups" pretty helpful, but not complete. The post provides a pretty good algorithm for the attack:

def pollardKangaroo(P, Q, a, b, N):
    # Tame Kangaroo Iterations:
    xTame, yTame = 0, b * P
    for i in range(0,N):
        xTame += Hash(yTame)
        yTame += Hash(yTame) * P
    # yTame == (b + xTame) * P should be true
    # Wild Kangaroo Iterations:
    xWild, yWild = 0, Q
    wildLimit = b - a + xTame
    while xWild < wildLimit:
        xWild += Hash(yWild)
        yWild += Hash(yWild) * P
        if yWild == yTame: return b + xTame - xWild
    # No result was found:
    return None
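
As a short check of the invariant noted in the comments (a derivation sketch, writing $Q = n\cdot P$ for the unknown $n$): the tame kangaroo ends at $y_{\text{Tame}} = (b + x_{\text{Tame}})\cdot P$, and the wild kangaroo is always at $y_{\text{Wild}} = Q + x_{\text{Wild}}\cdot P = (n + x_{\text{Wild}})\cdot P$. A collision therefore gives

$$(n + x_{\text{Wild}})\cdot P = (b + x_{\text{Tame}})\cdot P \;\Longrightarrow\; n \equiv b + x_{\text{Tame}} - x_{\text{Wild}} \pmod{\operatorname{ord}(P)}.$$

The bound `wildLimit` follows the same way, ignoring wrap-around modulo the order of $P$: if $n \ge a$ and $x_{\text{Wild}} \ge b - a + x_{\text{Tame}}$, then $n + x_{\text{Wild}} \ge b + x_{\text{Tame}}$, so the wild kangaroo has already reached or passed the tame kangaroo's final position and further jumps cannot hit it.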

I ran the algorithm on paper and it worked. $P$ and $Q$ are the points in the ECDLP: $Q = n\cdot P$. $a$ and $b$ bound the interval in which the attack searches for $n$, so the algorithm can only succeed if $n \in [a,b]$. That leaves me with two problems: the hash function and the parameter $N$ are not explained or defined.

My questions:

  1. Is the hash-function just a semi-random generator and can be pretty simple (e.g. H(point) = x + y + 1)?
  2. How exactly is $N$ defined? What value should $N$ be? How does the value of $N$ affect the algorithm?
Titanlord

1 Answer


My First Attempts:

So I ran some tests on the curve $E: y^2 = x^3 + x^2 + x$ over $\mathbb{F}_{131}$ with the points $P = (42,69)$ and $Q = 42 \cdot P$. My results for different $N$:

[Image: plot of the results for different $N$]

My result for a different Hash function:

[Image: plot of the result for a different hash function]

This confused me, because I saw no difference between the results for different $N$, and I thought only the hash function mattered for optimization. But the real answer is more complex. My sources are Wikipedia, the Handbook of Elliptic and Hyperelliptic Curve Cryptography, and the original paper.

Answers:

  1. Yes, the hash function is a semi-random number generator, but it matters for the algorithm: both the runtime and the failure rate depend on it. If the result set is too small, the runtime gets pretty bad. If the result set is too big, the failure rate increases. From the handbook I took the result set $\{ 1,2,\dots, \sqrt{b-a}/2 \}$, and it works pretty well.

  2. I found the answer in the original paper: $N$ determines the failure rate. The lower $N$ is, the higher the failure rate. That is why I did not see significant changes in the plots. Note: I still have no idea whether I have to store all intermediate results of the tame kangaroo or not. (I will edit the post if I find the answer.)
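
To make the result set from answer 1 concrete, here is a minimal sketch of a jump function whose values lie in $\{1, \dots, m\}$ with $m = \lceil \sqrt{b-a}/2 \rceil$. The names `jump_bound` and `jump` are hypothetical, and taking `(x + y) % m + 1` is just one simple assumption for a semi-random map, not the handbook's definition.

```python
import math

def jump_bound(a, b):
    """Upper bound m of the jump set {1, ..., m}, with m = ceil(sqrt(b - a) / 2)."""
    return math.ceil(math.sqrt(b - a) / 2)

def jump(x, y, m):
    """Map the coordinates (x, y) of a point to a jump size in {1, ..., m}."""
    return (x + y) % m + 1

m = jump_bound(0, 100)  # sqrt(100) / 2 = 5
# Evaluate the map on every coordinate pair mod 131 (not only curve points),
# just to look at the range and the average jump size.
sizes = [jump(x, y, m) for x in range(131) for y in range(131)]
print(min(sizes), max(sizes))   # 1 5
print(sum(sizes) / len(sizes))  # average jump size
```

The average jump size is the quantity that trades runtime against failure rate: small jumps mean many steps, large jumps make it easier to leap over the tame kangaroo's endpoint. Note that a map of the form `x % m + y % m + 1` can return values up to `2*m - 1`, so its result set is wider than $\{1, \dots, m\}$.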

New Code:

The handbook is the main source for the code optimizations. This Python code is meant to be run in SageMath:

import math

hashValue = 0  # modulus used by Hash; set inside pollardKangaroo

def Hash(P):
    # Jump size derived from the point's coordinates; the point at infinity gets 1.
    if P == 0: return 1
    return int(P.xy()[0]) % hashValue + int(P.xy()[1]) % hashValue + 1

def pollardKangaroo(P, Q, a, b):
    global hashValue
    hashValue = math.ceil(math.sqrt(b - a) / 2)
    # Tame Kangaroo Iterations:
    xTame, yTame = 0, b * P
    for i in range(0, math.ceil(0.7 * math.sqrt(b - a))):
        xTame += Hash(yTame)
        yTame += Hash(yTame) * P
    # yTame == (b + xTame) * P should be true
    # Wild Kangaroo Iterations:
    xWild, yWild = 0, Q
    for i in range(0, math.ceil(2.7 * math.sqrt(b - a))):
        xWild += Hash(yWild)
        yWild += Hash(yWild) * P
        if yWild == yTame: return b + xTame - xWild
    # No result was found:
    return 0
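
For readers without SageMath, the same procedure can be sketched in plain Python (3.8+ for `pow(x, -1, p)`). This is an illustration under stated assumptions, not the handbook's implementation: the helper names `add`, `neg`, `mul`, `on_curve`, `jump` and `kangaroo` are hypothetical, the point arithmetic uses the standard affine formulas for curves of the form $y^2 = x^3 + a_2x^2 + a_4x$, and failure is reported as `None` instead of `0`.

```python
import math

# Toy curve from the post: y^2 = x^3 + x^2 + x over F_131 (a2 = 1, a4 = 1).
p, A2, A4 = 131, 1, 1
INF = None  # point at infinity

def on_curve(pt):
    if pt is INF:
        return True
    x, y = pt
    return (y * y - (x**3 + A2 * x * x + A4 * x)) % p == 0

def neg(pt):
    return INF if pt is INF else (pt[0], -pt[1] % p)

def add(P1, P2):
    """Affine point addition on y^2 = x^3 + A2*x^2 + A4*x (mod p)."""
    if P1 is INF:
        return P2
    if P2 is INF:
        return P1
    x1, y1 = P1
    x2, y2 = P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return INF  # P + (-P), including doubling a point with y = 0
    if P1 == P2:
        lam = (3 * x1 * x1 + 2 * A2 * x1 + A4) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - A2 - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def mul(k, pt):
    """Double-and-add scalar multiplication; negative k negates the point."""
    if k < 0:
        return mul(-k, neg(pt))
    result = INF
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def jump(pt, m):
    """Same jump rule as Hash in the post: x % m + y % m + 1."""
    if pt is INF:
        return 1
    return pt[0] % m + pt[1] % m + 1

def kangaroo(P, Q, a, b):
    """Search for n in [a, b] with Q = n*P; requires b > a. Returns None on failure."""
    w = b - a
    m = math.ceil(math.sqrt(w) / 2)
    # Tame kangaroo: start at b*P and set the trap at its final position.
    x_tame, y_tame = 0, mul(b, P)
    for _ in range(math.ceil(0.7 * math.sqrt(w))):
        s = jump(y_tame, m)
        x_tame += s
        y_tame = add(y_tame, mul(s, P))
    # Wild kangaroo: start at Q and hope to land on the trap.
    x_wild, y_wild = 0, Q
    for _ in range(math.ceil(2.7 * math.sqrt(w))):
        s = jump(y_wild, m)
        x_wild += s
        y_wild = add(y_wild, mul(s, P))
        if y_wild == y_tame:
            return b + x_tame - x_wild
    return None

P = (42, 69)
Q = mul(42, P)
n = kangaroo(P, Q, 0, 100)
print(n)  # either None (failure) or a value with n*P == Q
```

A returned value is only determined modulo the order of $P$, so checking `mul(n, P) == Q` is a more reliable success test than comparing `n` with 42 directly.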

This now always generates a pretty reasonable plot for the wild kangaroo (same curve and base point):

[Image: plot for the wild kangaroo]

Reminder:

There are a lot of improvements to the algorithm, and my version is not perfect. My main goal was to understand how the hash function and the number of iterations affect the algorithm. I will edit this post if I find more important information.

Titanlord