
Problem:

Given $\mathbf{A}_D\in [0,1]^{N\times N}$ ($D,N\in\mathbb{Z}^+$ and $D\ge N$) converging to the identity matrix $\mathbf{I}_N$ in probability, i.e., for any $\epsilon>0$ and any choice of norm $\|\cdot\|$, $$ \mathbb{P}[\|\mathbf{A}_D-\mathbf{I}_N\|\geq\epsilon]\to0 \quad (D\rightarrow \infty). $$

Can we say that $\mathbb{E}[\ln(\det(\mathbf{A}_D))] \rightarrow 0$? How to prove/disprove this?

Can we directly calculate the value of $\mathbb{E}[\ln(\det(\mathbf{A}_D))]$?

(Please see the Update part for more details about how $\mathbf{A}_D$ is generated in my task.)


Background:

I posted a previous problem here, which was resolved by @JacobManaker's answer.

Now I am confused about how to show whether the convergence of the expectation holds. I first tried to learn something from here, but the problem above is still too difficult for me.

Intuitively, I guess that since $\mathbf{A}_D\rightarrow \mathbf{I}_N$, we have $\det(\mathbf{A}_D)\rightarrow 1$ and hence $\ln(\det(\mathbf{A}_D))\rightarrow 0$.

One key thing is that all elements of $\mathbf{A}_D$ are bounded in $[0,1]$.

But how exactly can I analyse this?


Update 1 (The Generation Method of $\mathbf{A}_D$):

Here I add more details about how the matrix $\mathbf{A}_D$ is generated (see the previous problem):

Given $\alpha\in\mathbb{R}^+$, $N\in \mathbb{Z}^+$ and $D\in \{N, N+1, N+2, \cdots\}$, a random matrix $\mathbf{A}_D$ is generated by the following steps:

$(1)$ Randomly select $N$ numbers from $\{1,2,\cdots,D\}$ to form a sequence $p=\{p_i\}_{i=1}^N$.

$(2)$ Then calculate $\mathbf{A}_D=[a_{ij}]_{N\times N}$, where $a_{ij}=e^{-\alpha |p_i - p_j|}$.
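
For concreteness, here is a minimal NumPy sketch of one draw of $\mathbf{A}_D$ (the $N$ numbers in step $(1)$ are selected without replacement, as clarified in the comments below; the helper name generate_A is just for illustration):

import numpy as np

rng = np.random.default_rng()

def generate_A(alpha, N, D):  # illustrative helper name
    # Step (1): select N distinct numbers uniformly from {1, ..., D}
    p = rng.choice(np.arange(1, D + 1), size=N, replace=False)
    # Step (2): a_ij = exp(-alpha * |p_i - p_j|); the diagonal is all ones
    return np.exp(-alpha * np.abs(p[:, None] - p[None, :]))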


Update 2 (Some of My Efforts):

I am confused about how to start.

I know that the diagonal elements of $\mathbf{A}_D$ are all ones, since $|p_i-p_i|=0$.

I also know that all elements of $\mathbf{A}_D$ lie in $[0,1]$ and that $\mathbf{A}_D$ is symmetric.

Intuitively, I guess that as $D$ increases, the pairwise distances $|p_i-p_j|$ tend to grow, so the off-diagonal entries $a_{ij}$ are expected to shrink.

I also wrote the following Python program for numerical validation:

import numpy as np
import random
from scipy import spatial

alpha = 1
N = 10
I = np.eye(N)
for D in range(N, 10000):
    MSE = 0.0
    for i in range(100):
        # sample N distinct integers uniformly from {1, ..., D}
        p = np.array(random.sample(range(1, D + 1), N)).reshape(N, 1)
        # A[i, j] = exp(-alpha * |p_i - p_j|)
        A = np.exp(-alpha * spatial.distance.cdist(p, p))
        MSE += np.sum((A - I) ** 2.0)
    MSE /= (100 * N * N)
    print(MSE)

I can see that as $D$ increases, the mean squared error between $\mathbf{A}_D$ and $\mathbf{I}_N$ converges to zero.

0.027683220252563596
0.02508590350202309
0.02317795057344325
...
0.0001934704436327538
0.00032059290537374806
0.0003270223508894337
...
5.786435956425624e-05
1.1065792791574203e-05
5.786469182583059e-05

2 Answers


Edit: This answer was posted before the question was updated, so it concerns the general case where the generation method of $\mathbf{A}_D$ is not specified.

It is false. Take, for example, the following sequence of random matrices: $$\mathbf{A}_D=\begin{cases} \mathbf{I}_N &\text{with probability } 1-1/D, \\ \mathbf{0} &\text{with probability } 1/D. \end{cases}$$ Then $\mathbf{A}_D \rightarrow \mathbf{I}_N$ in probability, but $$\mathbb{E}(\ln \det \mathbf{A}_D)=-\infty$$ for every $D \in \mathbb{N}$.
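Explicitly, splitting the expectation over the two outcomes gives $$\mathbb{E}(\ln \det \mathbf{A}_D)=\left(1-\frac{1}{D}\right)\ln \det \mathbf{I}_N+\frac{1}{D}\ln \det \mathbf{0}=\left(1-\frac{1}{D}\right)\cdot 0+\frac{1}{D}\cdot(-\infty)=-\infty.$$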

If you want to avoid dealing with infinities, you can slightly modify this example. Let $\mathbf{B}_D$ be the diagonal matrix with diagonal entries $1,1,\ldots,1,e^{-D}$, and put $$\mathbf{A}_D=\begin{cases} \mathbf{I}_N &\text{with probability } 1-1/D, \\ \mathbf{B}_D &\text{with probability } 1/D. \end{cases}$$ In this case $$\mathbb{E}(\ln \det \mathbf{A}_D)=-1.$$
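Indeed, $\det \mathbf{B}_D=e^{-D}$, so $\ln \det \mathbf{B}_D=-D$, and $$\mathbb{E}(\ln \det \mathbf{A}_D)=\left(1-\frac{1}{D}\right)\cdot 0+\frac{1}{D}\cdot(-D)=-1.$$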

  • Hi @DarkMagician, thank you very much! I now understand that the expectation is affected by how $\mathbf{A}_D$ is defined. I have added more details about that (please see the supplemented Update part). However, the generation method is complicated, and I still have no idea how to analyse it. – BinChen Sep 12 '22 at 15:42
  • Well, this definitely makes the question very different and more difficult. I don't see how to approach the problem :-( – Dark Magician Sep 12 '22 at 16:01
  • I accept this answer since it already resolves the original problem. I particularly thank @DarkMagician for his/her efforts and for providing such a long and detailed answer with two impressive examples. – BinChen Sep 13 '22 at 03:41

By the Continuous Mapping Theorem, you know that if $\ln(\det(\mathbf{A}_D))$ is almost surely well defined (i.e. $\det(\mathbf{A}_D)>0$ with probability $1$ for all $D$), then $\ln(\det(\mathbf{A}_D))\rightarrow 0$ in probability.
It then follows from the result in this post that the sequence $\ln(\det(\mathbf{A}_D))$ converges to $0$ in expectation as well.

The assumption of "well-definedness" of $\ln(\det(\mathbf{A}_D))$, referred to in the Wiki article as the zero measure of the set of discontinuity points of $\ln\circ \det$, is crucial, as shown by @Dark Magician's counterexample. I don't know how the sequence $(\mathbf{A}_D)$ is defined, but you have to check whether that hypothesis holds in order to conclude.


Edit: What I wrote above is not correct. Indeed, although the continuous mapping theorem can be applied here and we can conclude that $Y_D:=\ln(\det(\mathbf{A}_D))\rightarrow 0$ in probability, we need the sequence to be bounded almost surely (and not just in probability) in order to conclude that $Y_D\to0$ in expectation (as done in this proof).
Again, thanks to @Dark Magician for the comments and the great counterexample.

So, what can be said about the sequence $Y_D$? Does it converge in expectation or not? The above condition of almost sure boundedness is sufficient but not necessary, so we cannot conclude whether $Y_D$ converges in $L^1$ or not.

Thankfully, there is a necessary and sufficient condition that guarantees convergence in expectation: it is called uniform integrability (u.i.).
See the Wiki for the exact definition, but essentially, a sequence of random variables being u.i. means that all of its elements have most of their mass on the same bounded set.
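For reference, the standard definition is: a family of random variables $(X_n)$ is uniformly integrable if $$\lim_{K\to\infty}\ \sup_{n}\ \mathbb{E}\big[|X_n|\,\mathbf{1}_{\{|X_n|>K\}}\big]=0.$$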

We have the following theorem (see page 1 of these lecture notes for a proof):

A sequence of random variables $(X_n)$ converges in $L^1$ to $X$ if and only if $(X_n)$ is u.i. and converges to $X$ in probability.

So all that is left to do is to check whether $(Y_D)$ is u.i.: if it is, then we know that it converges to $0$ in $L^1$; if it is not, then we know that it does not converge to $0$ in $L^1$.

However, it is easy to see that $(Y_D)$ is not u.i.; in fact, it is not even integrable! Here is a quick proof:
We have, for all $D$, $$\begin{align}\mathbb P(|Y_D|=\infty) &=\mathbb P(\det \mathbf A_D=0) \\ &\ge \mathbb P(p_1=p_2=\ldots=p_N)\\ &= \frac{1}{D^{N-1}}>0. \end{align}$$ It therefore follows that for all $D$, $$\mathbb E[|Y_D|] \ge \mathbb E\big[|Y_D|\mathbf1_{\{|Y_D|=\infty\}}\big] =\infty\cdot \mathbb P(|Y_D|=\infty)\ge \infty\cdot\frac{1}{D^{N-1}}=\infty. $$ Hence the sequence $(Y_D)$ is not integrable, which implies that it is not u.i., and we can thus conclude by the above theorem that $(Y_D)$ does not converge to $0$ in expectation.
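
As a quick numerical sanity check of this collision argument, here is a small simulation; it is a sketch under the same assumption as the proof above, namely that the $p_i$ are i.i.d. uniform on $\{1,\ldots,D\}$ (i.e. sampled with replacement):

import numpy as np

rng = np.random.default_rng(0)
N, D, trials = 10, 100, 100_000

collisions = 0
for _ in range(trials):
    p = rng.integers(1, D + 1, size=N)  # i.i.d. uniform on {1, ..., D}
    if len(np.unique(p)) < N:           # some p_i == p_j: two rows of A_D coincide, so det(A_D) = 0
        collisions += 1
print(collisions / trials)  # strictly positive, so P(|Y_D| = inf) > 0 under this model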

  • Well, in my second counterexample the well-definedness is ok, but it still works. – Dark Magician Sep 12 '22 at 14:46
  • I think the problem with your argument is that $\ln(\det(\mathbf{A}_D))$ may be unbounded (so convergence in probability does not imply convergence of the expectations). You have to assume $\det(\mathbf{A}_D) \geq c>0$ (almost surely). – Dark Magician Sep 12 '22 at 14:49
  • I don't think that the possible unboundedness is an issue: for sufficiently large $D$, $\mathbf A_D$ remains in an arbitrarily small ball centered around $\mathbf I_N$, and assuming that everything is well defined, the continuity of $\det$ and $\ln$ guarantees that $\ln(\det(\mathbf{A}_D))$ remains bounded as well. The OP didn't specify the matrix norm, though, so maybe there is a pathological case that I am overlooking...

    Also, in your second example, the sequence you wrote doesn't converge to $\mathbf I_N$ but rather to $\text{diag}(1,\ldots,1,0)$.

    – Stratos supports the strike Sep 12 '22 at 15:00
  • Maybe I did not write the second example clearly. I mean that $\mathbf{A}_D$ is still equal to $\mathbf{I}_N$ with probability $1-1/D$; I just changed its value on the event with probability $1/D$, which does not compromise convergence in probability, since $$\mathbb{P}(\mathbf{A}_D \neq \mathbf{I}_N)=\frac{1}{D} \rightarrow 0.$$ – Dark Magician Sep 12 '22 at 15:06
  • I edited my answer to make it clear. – Dark Magician Sep 12 '22 at 15:15
  • Oh yeah right, I misunderstood your example, sorry. You definitely have a point; it seems that well-definedness is not enough (already upvoted your answer). I will look into it in more detail later and edit my post accordingly. The easiest way to fix the problem would be to require uniform integrability instead, I guess. – Stratos supports the strike Sep 12 '22 at 15:16
  • Yeah, I think so; you just need to pass to the limit under the integral sign. – Dark Magician Sep 12 '22 at 15:18
  • Hi @StratosFair, after reading your answer and the comments above, I now understand that the generation method of $\mathbf{A}_D$ is quite critical, and I think I understand the examples given by Dark Magician. I have added more details about how $\mathbf{A}_D$ is generated in my task. However, I think it is very complicated and difficult to analyse. – BinChen Sep 12 '22 at 15:56
  • @BinChen thanks for the comment and the additional details on $\mathbf A_D$; I have edited my answer to take your edit (and the above discussion) into account. – Stratos supports the strike Sep 13 '22 at 08:12
  • Hi @StratosFair, thanks for providing more details! However, the $p_i$'s $(i=1,2,\cdots,N)$ are distinct numbers in $\{1,2,\cdots,D\}~(D\ge N)$, so $\mathbb{P}(p_1=p_2=\cdots=p_N)=0$. There are $\binom{D}{N}$ possible selections. – BinChen Sep 13 '22 at 09:15
  • I'm not sure I understand what you mean. I thought that the $p_i$'s were i.i.d. uniform on $\{1,2,\cdots,D\}$; do you mean the $N$ numbers are chosen without replacement? – Stratos supports the strike Sep 13 '22 at 09:29