Gradient of a function with a double sum

Question

The following function is the total energy $U$ of $n$ charges $q_1, \dots, q_n$ on a spherical surface of radius $R$, where the $i$-th charge has spherical coordinates $(R, \theta_i, \phi_i)$

$$ U (\phi_1, \phi_2, \ldots, \phi_n, \theta_1, \theta_2, \ldots, \theta_n) := \sum_{i=1}^n \sum_{\substack{j=1 \\ j \neq i}}^n \frac{k q_i q_j}{\ell(\phi_i, \phi_j, \theta_i, \theta_j)} $$

with the distance between $q_i$ and $q_j$ being

$$ \ell(\phi_i, \phi_j, \theta_i, \theta_j) = \sqrt{2} R \sqrt{1 - \sin\theta_i \sin\theta_j \cos(\phi_i - \phi_j) - \cos\theta_i \cos\theta_j} $$

I'm trying to analytically find the gradient of $U$:

$$ \nabla U = \left( \frac{\partial U}{\partial \phi_1}, \frac{\partial U}{\partial \phi_2}, \ldots, \frac{\partial U}{\partial \phi_n}, \frac{\partial U}{\partial \theta_1}, \frac{\partial U}{\partial \theta_2}, \ldots, \frac{\partial U}{\partial \theta_n} \right)^T $$

For this I only need the derivatives of $\ell$:

\begin{align} \frac{\partial \ell}{\partial \phi} &= \frac{\partial}{\partial \phi} \sqrt{2} R \sqrt{1 - \sin\theta \sin\theta' \cos(\phi - \phi') - \cos\theta \cos\theta'}= \frac{R^2\sin\theta \sin\theta' \sin(\phi - \phi')}{\ell(\phi, \phi', \theta, \theta')} \\\\ \frac{\partial \ell}{\partial \theta}&= \frac{R^2(\sin\theta \cos\theta'-\cos\theta \sin\theta' \cos(\phi - \phi'))}{\ell(\phi, \phi', \theta, \theta')} \end{align}

Now the derivatives of $1/\ell$: \begin{align} &\frac{\partial}{\partial \phi} \frac{1}{\ell} = - \frac{1}{\ell^2} \frac{\partial l}{\partial \phi} = -\frac{R^2\sin\theta \sin\theta' \sin(\phi - \phi')}{\ell(\phi, \phi', \theta, \theta')^3} \\\\ &\frac{\partial}{\partial \theta} \frac{1}{\ell} = - \frac{1}{\ell^2} \frac{\partial l}{\partial \theta} = - \frac{R^2(\sin\theta \cos\theta'-\cos\theta \sin\theta' \cos(\phi - \phi'))}{\ell(\phi, \phi', \theta, \theta')^3} \end{align} Thanks to the derivative we only have one sum now: \begin{align} \frac{\partial U}{\partial \theta_m} =& \frac{\partial}{\partial \theta_m}\sum_{i=1}^n \sum_{\substack{j=1 \\ j \neq i}}^n \frac{k q_i q_j}{\ell(\phi_i, \phi_j, \theta_i, \theta_j)} \\\\ =& \sum_{i=1, i \neq m}^n k q_i q_m \frac{\partial}{\partial \theta_m} \frac{1}{\ell(\phi_i, \phi_m, \theta_i, \theta_m)} \\\\ =& -R^2 k q_m \sum_{i=1, i \neq m}^n q_i \frac{\sin\theta_i \cos\theta_m - \cos\theta_i \sin\theta_m \cos(\phi_i - \phi_m)}{\ell(\phi_i, \phi_m, \theta_i, \theta_m)^3} \end{align} Similarly: \begin{align} \frac{\partial U}{\partial \phi_m} =& -R^2 k \sin\theta_m q_m \sum_{i=1, i \neq m}^n q_i \frac{\sin\theta_i \sin(\phi_i - \phi_m)}{\ell(\phi_i, \phi_m, \theta_i, \theta_m)^3} \end{align}

Now I have everything for the gradient, however the $\frac{\partial U}{\partial \theta_m}$ terms aren't right. The $\frac{\partial U}{\partial \phi_m}$ terms are correct on the other hand.

I know this because I implemented a numerical gradient (which I tested to be right), and compared it with the results I get from my analytical Implementation. I've added the complete code here:

% Parameters
global n q R;
%Test 1 
n = 2; % number of particles 
phi = [-1; 4]; % azimuthal angle
the = [3; 1]; % polar angle
q = [1; 1]; % charges set to 1 
R = 0.5; % radius of all charges
% The first halfs of the vectors are identical, meaning that the method for
% finding dU/dphi is correct
% The other half is not correct:
disp("Gradient numerically: (right)")
gradient_central(@U, [phi; the], 1e-8)
disp("Gradient analytically:")
analytical_gradient(phi, the)
function g=analytical_gradient(phi, the)
    global n R q
    g = zeros(2n, 1); % Set g to the right dimensions
    k = 1; % Set k = 1
    for m = 1:n
        pm = phi(m); % phi_m
        tm = the(m); % theta_m
        for i = 1:n
            pi = phi(i); % phi_i
            ti = the(i); % theta_i
            l = Rsqrt(2)sqrt(...
                                1-sin(ti)sin(tm)cos(pi-pm) - cos(ti)cos(tm)...
                                ); % length between the two points
            if i ~= m
                % Gradient for phi:
                g(m) = g(m) + sin(tm)q(m)q(i)sin(ti)sin(pi-pm) / l^3; 
                % Gradient for theta:
                % My old solution (not working):
                %g(m+n) = g(m+n) + q(m)q(i)(   sin(ti)cos(tm)-cos(ti)sin(tm)cos(pi-pm)   ) / l^3;
                % Simplified solution sugested by @Tensor Wes (also not working)
                g(m+n) = g(m+n) + q(m)q(i)(   sin(tm+ti)(1-cos(pi-pm))   ) / l^3;
            end
        end
    end
    g = 2R^2k*g;
end
function U = U(all_inputs) % calculate the total energy of the system
    global n R q;
phi = all_inputs(1:n);
the = all_inputs(n+1:end);

U = 0; % initialize energy
k = 1; % Set k = 1
for i = 1:n
    ti = the(i);
    pi = phi(i);
    for j = 1:n
        if j ~= i
            tj = the(j);
            pj = phi(j);
            l = R*sqrt(2)*sqrt(...
                            1-sin(ti)*sin(tj)*cos(pi-pj)- cos(ti)*cos(tj)...
                            );
            U = U + q(i)*q(j)*k / l;   
        end
    end
end

end
% calculates the numerical CENTRAL gradient of func at the place inp_vec (with step size h)
function g = gradient_central(func, inp_vec, h) 
    n = length(inp_vec);
    g = nan(n,1);
    ide = eye(n);
    for i = 1:n
        g(i) = ( func(inp_vec+hide(:,i))-func(inp_vec-hide(:,i)) ) / (2*h);
    end
end

Sorry if this question is a little too specific, but for the life of me I can't figure out where the mistake is. Thanks for any help.

Do you agree with my edits? Why not use symbolic methods? Say, in SymPy? — Rodrigo de Azevedo, Jan 20 '25 at 10:35
Thanks for the edits. I'd prefer not too, I use the gradient for a gradient descent algorithm. But for the sake of accuracy I want to move away from the numerical method to the analytical one (which is only partially working). — haifisch123, Jan 21 '25 at 14:05
If you are performing gradient descent, then why not skip the analytical gradient calculation altogether and use PyTorch instead? PyTorch can’t provide a closed-form expression for the gradient function, but it can be used to evaluate the gradient function for arbitrary inputs. It also has well-documented functions that implement gradient descent and other optimization algorithms. — Mahmoud, Jan 21 '25 at 15:00

Steph · Answer 1 · 2025-01-23T19:00:55.167

Start with the differential of the energy function \begin{eqnarray} dU %&=&k \mathbf{q}\mathbf{q}^T : \mathbf{D} &=& 2k q_i \sum_{j\neq i} q_j d\left\lbrace \frac{1}{\| \mathbf{x}_i-\mathbf{x}_j \|} \right\rbrace \\ dU &=& 2k q_i \sum_{j\neq i} \frac{-q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^2} d\left\lbrace \| \mathbf{x}_i-\mathbf{x}_j \| \right\rbrace \\ &=& 2k q_i \sum_{j\neq i} \frac{+q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \left(\mathbf{x}_j-\mathbf{x}_i\right)^T d\mathbf{x}_i \end{eqnarray}

Moreover it holds $d\mathbf{x}= \mathbf{J} d\mathbf{u}$ where we use the Jacobian

\begin{equation} \mathbf{J} = R \begin{pmatrix} -\sin \phi \sin \theta & \cos \phi \cos \theta \\ \cos \phi \sin \theta & \sin \phi \cos \theta \\ 0 & -\sin \theta \end{pmatrix}, d\mathbf{u} = \begin{pmatrix} d\phi \\ d\theta \end{pmatrix}, \end{equation}

We have the gradient as \begin{equation} \frac{\partial U}{\partial \mathbf{u}_i} = 2k q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \mathbf{J}_i^T \left(\mathbf{x}_j-\mathbf{x}_i \right) \end{equation}

We deduce for instance that \begin{eqnarray} \frac{\partial U}{\partial \phi_i} &=& 2kR q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \begin{bmatrix} -\sin \phi_i \sin \theta_i & \cos \phi_i \sin \theta_i & 0 \end{bmatrix} \left(\mathbf{x}_j-\mathbf{x}_i\right) \\ &=& 2kR^2 q_i \sin \theta_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \begin{bmatrix} -\sin \phi_i & \cos \phi_i & 0 \end{bmatrix} \left(\mathbf{x}_j-\mathbf{x}_i\right) \\ &=& -2kR^2 q_i \sin \theta_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \begin{bmatrix} \sin \phi_i & -\cos \phi_i & 0 \end{bmatrix} \mathbf{x}_j \\ &=& -2kR^2 q_i \sin \theta_i \sum_{j\neq i} \frac{q_j \sin (\phi_i - \phi_j) \sin \theta_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \end{eqnarray}

It follows \begin{eqnarray} \frac{\partial U}{\partial \theta_i} &=& 2kR q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \begin{bmatrix} \cos \phi_i \cos \theta_i & \sin \phi_i \cos \theta_i & -\sin \theta_i \end{bmatrix} \left(\mathbf{x}_j-\mathbf{x}_i\right) \\ &=& 2kR^2 q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \left \lbrace \cos \theta_i \begin{bmatrix} \cos \phi_i & \sin \phi_i & 0 \end{bmatrix} \left(\mathbf{x}_j-\mathbf{x}_i\right) + \sin \theta_i (\cos \theta_i-\cos \theta_j) \right\rbrace \\ &=& 2kR^2 q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \left \lbrace \cos \theta_i [\cos (\phi_i-\phi_j) \sin \theta_j-\sin \theta_i] + \sin \theta_i (\cos \theta_i-\cos \theta_j) \right\rbrace \\ &=& 2kR^2 q_i \sum_{j\neq i} \frac{q_j}{\| \mathbf{x}_i-\mathbf{x}_j \|^3} \left[ \cos \theta_i \sin \theta_j \cos (\phi_i-\phi_j) - \sin \theta_i \cos \theta_j \right] \end{eqnarray}

I have not completely checked it numerically but I am pretty sure of the vector form. I let you check if this is correct now.

Gradient of a function with a double sum

1 Answers1