I'm trying to understand the formulation for 3D Zernike descriptors used in "A Comparative Study of Object Classification Methods Using 3D Zernike Moment on 3D Point Cloud". For context, I'm looking for a translation-invariant method for comparing point cloud representations of molecular surfaces. In researching this, the subject of 3D Zernike moments comes up a lot (for example). I'm having trouble finding a clean algorithm to actually compute these values.
The "3D Zernike Moment on 3D Point Cloud" paper promises a fast computation of these moments, but it doesn't quite give enough to go on. Here's what the paper says:
\begin{align} Z_{l,m,n}(X) = \sum_{v=0}^{k} Q_{k,m,n} |X|^{2v}e_{m,n}(X) \end{align}
\begin{align} Q_{k,m,n} = \frac{(-1)^{k}}{2^{2k}} \sqrt{\frac{2m+4k+3}{3}} \binom{2k}{k} (-1)^{v} \frac{\binom{k}{v}\binom{2(k+m+v)+1}{2k}}{\binom{k+m+v}{k}} \end{align}
Where
\begin{align} l \in [0, \text{Max}], \quad m\in[0,l], \quad n\in[-m,m], \quad k=\frac{l-m}{2} \end{align}
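To make my reading of the coefficient concrete, here is a minimal Python sketch of $Q_{k,m,n}$ exactly as written above. It rests on two assumptions of mine that the paper does not state: the coefficient depends on the summation index $v$ (the subscript omits it, but $v$ appears on the right-hand side), and $l - m$ must be even so that $k$ is an integer. The names `q_coeff` and `valid_indices` are my own, not the paper's.

```python
from math import comb, sqrt

def q_coeff(k, m, v):
    """Coefficient Q as written in the paper, for one summation index v.

    Assumption: Q depends on v even though the subscript only lists k, m, n.
    """
    sign = (-1) ** (k + v)  # (-1)^k * (-1)^v
    norm = sqrt((2 * m + 4 * k + 3) / 3) / 2 ** (2 * k)
    ratio = comb(k, v) * comb(2 * (k + m + v) + 1, 2 * k) / comb(k + m + v, k)
    return sign * norm * comb(2 * k, k) * ratio

def valid_indices(max_order):
    """Yield (l, m, n, k) for l in [0, max_order], m in [0, l], n in [-m, m].

    Assumption: only pairs with l - m even are used, so k = (l - m) / 2
    is an integer.
    """
    for l in range(max_order + 1):
        for m in range(l + 1):
            if (l - m) % 2:
                continue
            k = (l - m) // 2
            for n in range(-m, m + 1):
                yield l, m, n, k

if __name__ == "__main__":
    for l, m, n, k in valid_indices(2):
        coeffs = [q_coeff(k, m, v) for v in range(k + 1)]
        print(l, m, n, coeffs)
```

This only covers the scalar coefficients. It says nothing about how $|X|^{2v}e_{m,n}(X)$ should be evaluated on the points.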
The notation of $|X|^{2v}e_{m,n}(X)$ is not defined in the paper.
The paper references this algorithm for computing the moments, but I can't seem to match the notation between papers.
I found a Python implementation here, but it works on meshes (point clouds with triangular faces) rather than on bare point clouds as described in the first paper. It also requires iterating over all faces, which is very slow, whereas the paper reports computation times of milliseconds to seconds.
Is anyone familiar with how to compute these values?