The generalized perceptron convergence theorem is stated for a defined threshold $T$. When you work through the maths, the proof comes down to combining an upper bound and a lower bound.
The lower bound looks like this:
Therefore
$$ (x^{\ast})^T x(k) > k \delta. \qquad\qquad (4.67)$$
We can see that if the threshold is equal to zero, this still holds.
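For reference, this is how I understand the lower bound being built up (if I am reading the proof correctly: the weights start at zero, each mistake adds a training vector $z'(k-1)$ to $x(k-1)$, and $\delta$ is the smallest value of $(x^{\ast})^T z'(q)$ over the training set):

$$ (x^{\ast})^T x(k) = (x^{\ast})^T x(k-1) + (x^{\ast})^T z'(k-1) \geq (x^{\ast})^T x(k-1) + \delta, $$

and applying this $k$ times starting from $x(0) = 0$ gives $(x^{\ast})^T x(k) \geq k\delta$, which is (4.67) up to the strictness of the inequality.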
But what happens with the upper bound?
The generalized theorem looks like this:
We now have an upper bound (Eq. (4.75)) and a lower bound (Eq. (4.70)) on the squared length of the weight vector at iteration $k$. If we combine the two inequalities we find
$$ k\Pi \geq {\lVert x(k) \rVert}^2 > \frac{(k \delta)^2}{{\lVert x^\ast \rVert }^2} \quad \text{or} \quad k < \frac{\Pi {\lVert x^\ast\rVert}^2}{\delta^2}. \qquad \qquad (4.76)$$
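Spelling out the step I follow between those two inequalities (dividing through by $k > 0$ and then by $\delta^2$, which already requires $\delta \neq 0$):

$$ k\Pi > \frac{(k\delta)^2}{{\lVert x^\ast \rVert}^2} \;\Longrightarrow\; \Pi > \frac{k\delta^2}{{\lVert x^\ast \rVert}^2} \;\Longrightarrow\; k < \frac{\Pi {\lVert x^\ast\rVert}^2}{\delta^2}. $$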
Because $k$ has an upper bound, the weights will only be changed a finite number of times. Therefore, the perceptron learning rule will converge in a finite number of iterations.
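To make the "finite number of weight changes" statement concrete, here is a minimal sketch of the perceptron rule with a zero threshold on a toy, linearly separable data set (my own example data and variable names, not the book's):

```python
import numpy as np

# Toy, linearly separable data (separable through the origin, so no bias needed).
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -1.0], [-2.0, 0.0]])
t = np.array([1, 1, 0, 0])            # target classes

w = np.zeros(2)                       # weights start at zero
updates = 0
changed = True
while changed:                        # repeat until a full pass makes no change
    changed = False
    for x, target in zip(X, t):
        y = 1 if w @ x >= 0 else 0    # hard limit with zero threshold
        if y != target:
            w += x if target == 1 else -x   # perceptron learning rule
            updates += 1
            changed = True

print(f"stopped after {updates} weight change(s), w = {w}")
```

On this data the loop exits after a single weight change, i.e. the weights are only modified a finite number of times, which is the behaviour the theorem describes.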
I am struggling to see how this holds for a zero threshold as well, since you can't divide by zero. Can someone help me understand this special case?