Mean distance between matrix entries

Question

Given a $4$ by $4$ matrix, or in general an $n$ by $n$ square matrix, can we determine the mean euclidean distance (i.e. $\sqrt{\Delta x ^2 + \Delta y^2}$) between entries that are not neighbours?

Two matrix entries $(x_1,y_1)$ and $(x_2,y_2)$ are neighbours if $\Delta x=\Delta y=1$ or $\Delta x=1$ and $\Delta y=0$; or lastly $\Delta x=0$ and $\Delta y=1.$ This means for example that the middle entry $(2,2)$ in a $3$ by $3$ matrix has $8$ neighbours, which means that this entry cannot be paired with any other matrix entry to create a non-neighbouring pair. This also means that all entries in a 2 by 2 matrix (below the entry coordinates in the matrix are written)

\begin{matrix} (1,1) & (1,2) \\ (2,1) & (2,2) \end{matrix}

are considered as neighbours.

I understand the first part to solve is: for a given matrix size, how many non-neighbouring pairs of entries are there? For example in a $3$ by $3$ matrix, we have $16$ combinations of entries that are not neighbours. Note we care about combinations of entries because for example the pair $(1,3) \& (3,3)$ and the pair $(3,3) \& (1,3)$ are considered as indistinguishable, i.e., swapping the position of an entry in a pair does not count as a new pair of entries. Once we know the count $C$ of non-neighbouring pairs, then we can safely assign each pair with a probability $1/C,$ in order to solve the mean distance problem.

If I understand correctly the matrix setting is not completely necessary. Wouldn't you just be dealing with points on a regular integer $n\times n$ grid and look for the average distance between points that are more than $\sqrt 2$ appart? — A.G., Jun 15 '17 at 19:46

score 3 · Accepted Answer · answered Jun 15 '17 at 19:39

Disclaimer. This is by no means a complete and fully rigorous answer, but should be a pretty good starting point.

First, unless I misunderstand what you're trying to capture with your definition of neighbour, shouldn't it be that $(x_1,y_1)\neq(x_2,y_2)$ are neighbours if and only if one of the following occurs

$\Delta x=\Delta y=1$;
$\Delta x=1$ and $\Delta y=0$; or
$\Delta x=0$ and $\Delta y=1$.

Otherwise $\Delta x+\Delta y=2$ could allow $(1,1)$ to be a neighbour of $(1,3)$, which would mean that there are actually $10$ non-neighbour combinations in a $3\times 3$ matrix.

If my comment above is actually what you're interested in, then the number $C_n$ of possible combinations of non-neighbouring matrix entries in an $n\times n$ matrix is the number of ways to place 2 nonattacking kings on an $n\times n$ board, which is apparently given by the explicit formula $$C_n=\frac{(n - 1)(n - 2)(n^2 + 3n - 2)}2,\qquad n\in\mathbb N.$$ Note that this means that $C_3=16$ instead of $18$. This formula is confirmed by the following MatLab code (you can play around with the parameter $N$ to change the dimension, and the output $C$ is the number of pairs we're looking for).

N=3; % Dimension of matrix

X=zeros(2,N^2); % 2 x N^2 array whose columns represent matrix entries

% The loop below generates all matrix entries
ctr=1;
for(i=1:N)
    for(j=1:N)
        X(1,ctr)=i;
        X(2,ctr)=j;
        ctr=ctr+1;
    end
end

% All possible combinations of two entries
C=nchoosek(N^2,2);

% The loop below goes through all combinations.
% If a combination contains neighbouring entries, we remove one from C.
% After going through all combinations, the number C left is precisely
% the number of non-neighbouring combinations.
for(i=1:N^2)
    for(j=i+1:N^2)
        dx=abs(X(1,i)-X(1,j));
        dy=abs(X(2,i)-X(2,j));
        if((dx==1 && dy==0) || (dx==0 && dy==1) || (dx==1 && dy==1))
            C=C-1;
        end
    end
end
C

I suspect that a nice exact formula for the average Euclidean distance of a uniform non-neighbouring pair for arbitrary $n$ may not exist. However, it could still be possible to get a tractable asymptotic answer. In this line of thought, my suspicion is that your random variable can be thought of as a finite-dimensional approximation of the average distance between two random points in a square, which is apparently equal to $$\gamma:=\frac{2+\sqrt{2}+5\log(1+\sqrt{2})}{15}.$$

More precisely, we can represent an $n\times n$ matrix as a square grid inside the unit square (see here for an example with $n=13$), where the sides of the smaller squares in the grid are of size $n^{-1}$, and each smaller square represents a matrix entry.

Given two points $z_1=(x_1,y_1)$ and $z_2:=(x_2,y_2)$ sampled uniformly from the unit square in $\mathbb R^2$, we can obtain two randomly selected matrix entries by looking at which of the smaller squares $z_1$ and $z_2$ land into. Of course, this does not take into account your requirement that the matrix entries be non-neighbours, but for very large $n$, the probability that two random points $z_1$ and $z_2$ in $[0,1]^2$ land in neighbouring squares converges to zero, so I expect that this approximation should work out regardless.

If this approximation does indeed work, then we can expect that the average euclidean distance you're looking for, at least for large $n$, is approximately equal to $$n\cdot\gamma= n\cdot \frac{2+\sqrt{2}+5\log(1+\sqrt{2})}{15}.$$

After doing some MatLab simulations, it appears that this is a pretty good approximation, as shown in the picture below (the purple line is the asymptotic value $n\cdot \gamma$, and the yellow $+$'s are the actual averages computed using a very similar code to the one I used above to compute the values of $C_n$).

Brilliant, I like this approach a lot, haha nice mapping of the problem to different known results. You are right in your 3 given conditions, indeed my single Eq was actually not exactly correct, I will fix this in the post too. Maybe we can find a way to use our knowledge of $C_n$ and come up with a correction for our last estimate. — user929304, Jun 15 '17 at 20:05

Servaes · Answer 2 · 2017-06-24T14:43:28.897

Disclaimer: This approach is a work in progress, and it is not yet clear whether it leads to a nice solution. But perhaps it inspires someone else to a solution.

The mean distance between a pair of points, including neighbours, is $$\frac12\cdot\frac{1}{\binom{n^2}{2}}\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2}=\frac{1}{n^2(n^2-1)}\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2},$$ where all indices run from $1$ to $n$. Here we count every pair twice, and hence divide the whole thing by $2$. The contribution from the pairs of neighbours is as follows: \begin{align*} \text{corners}:&&4\times(2\times1+1\times\sqrt{2}) =&&8+4\sqrt{2},\\ \text{sides}:&&4(n-2)(3\times1+2\times\sqrt{2}) =&&4(3+2\sqrt{2})n-8(3+2\sqrt{2}),\\ \text{center}:&&(n-2)^2(4\times1+4\times\sqrt{2}) =&&4(1+\sqrt{2})n^2-16(1+\sqrt{2})n+16(1+\sqrt{2}). \end{align*} So the total number of pairs we overcounted is $$4\times3+(n-2)\times5+(n-2)^2\times8=8n^2-54n+78,$$ and the sum of their distances is $$4(1+\sqrt{2})n^2-4(1+2\sqrt{2})n+4\sqrt{2}.$$ Now it remains to determine the value of the sum $$S(n):=\sum_{i,j,k,l}\sqrt{(i-j)^2+(k-l)^2},$$ where the indices all range from $1$ to $n$. This seems nice enough that someone might have written down some thoughts on it before...

Mean distance between matrix entries

2 Answers2