In "Discovering faster matrix multiplication algorithms with reinforcement learning" (Nature, 2022; lightweight intro), the authors used reinforcement learning (an artificial intelligence technique) to devise a new, slightly faster matrix multiplication algorithm.
Could a similar technique lead to a better multiple-precision modular multiplication algorithm, the operation at the core of RSA and of ECC over prime fields, for practical parameters (say, a 256- to 4096-bit modulus)?
The question is focused on computing $a\times b\bmod n$ (or, as a special case, $a\times a\bmod n$) for arbitrary $a,b\in[0,n)$ with $n$ of $\ell$ bits (possibly restricted to odd $n$), using standard CPU instructions operating on $w$-bit words. I'm looking for a practical competitor to Karatsuba, Toom-Cook, or perhaps Schönhage-Strassen multiplication (but AFAIK the latter is not competitive until much larger $\ell/w$ than used in RSA).
Optimizing modular exponentiation or ECC arithmetic might also be possible, but the question is not about that.
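To make the baseline concrete, here is a minimal Python sketch of the kind of computation I have in mind: Karatsuba multiplication split on $w$-bit word boundaries, followed by a reduction modulo $n$. The names `karatsuba` and `mulmod` are only illustrative, Python's big integers stand in for arrays of $w$-bit words, and the built-in `%` is a placeholder for whatever reduction step a real implementation would use. It is meant to pin down the problem, not to be fast.

```python
def karatsuba(x: int, y: int, w: int = 64) -> int:
    """Multiply non-negative integers x and y, recursing on w-bit word boundaries."""
    # Base case: at least one operand fits in a single w-bit word.
    if x >> w == 0 or y >> w == 0:
        return x * y
    # Split both operands at a word-aligned position near the middle.
    words = (max(x.bit_length(), y.bit_length()) + w - 1) // w
    half = (words // 2) * w
    mask = (1 << half) - 1
    x0, x1 = x & mask, x >> half
    y0, y1 = y & mask, y >> half
    # Three recursive products instead of the four of the schoolbook method.
    lo = karatsuba(x0, y0, w)
    hi = karatsuba(x1, y1, w)
    mid = karatsuba(x0 + x1, y0 + y1, w) - lo - hi
    return (hi << (2 * half)) + (mid << half) + lo


def mulmod(a: int, b: int, n: int, w: int = 64) -> int:
    """Compute a*b mod n for a, b in [0, n).

    Assumption: the built-in % is a stand-in for the reduction step.
    """
    return karatsuba(a, b, w) % n
```

A better algorithm in the sense of the question would compute the same result as `mulmod` with fewer word-level operations for $\ell$ in the 256 to 4096-bit range.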