Let $v \in \mathbb{R}^n$ with $v \sim \mathcal{N}(0, \sigma^2 I)$. That is, $v$ is a Gaussian random vector whose entries are i.i.d. $\mathcal{N}(0, \sigma^2)$, so $\sigma$ is the standard deviation of each entry.

From the book "C. Giraud. Introduction to high-dimensional statistics", it can be concluded that $$ P\left(\frac{\sigma}{\sqrt{2}} \le \frac{\|v\|}{\sqrt{n}} \le \left( 2 - \frac{1}{\sqrt{2}} \right) \sigma \right) \ge 1 - (1 + e^2) e^{-n/24} $$

Now, $W \in \mathbb{R}^{m \times n}$ is a deterministic matrix whose rows have unit Euclidean norm. My goal is to lower-bound the probability $$ P\left(\frac{\sigma}{\sqrt{2}} \le \frac{\|W v\|}{\sqrt{m}} \le 2 \sigma \right) $$ preferably with a bound similar to the one above. Note that $W v \in \mathbb{R}^m$ and $W v \sim \mathcal{N}(0, \sigma^2 W W^T)$.

  1. My first idea was to treat $\|Wv\|^2$ as a generalized chi-squared random variable. That seems like overkill, since this case is much simpler than the general one, and the generalized chi-squared distribution has no closed form.
  2. Second, I tried writing $\|Wv\|^2$ as a weighted sum of independent chi-squared variables, where the weights are $\sigma^2$ times the eigenvalues of $W W^T$ (the squared singular values of $W$). This could work, but I'd rather have the bound in terms of $W$ itself, not its eigenvalues. Is there a better way?
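To make the second idea concrete, here is a minimal numerical sketch, assuming a randomly generated `W` with normalized rows (the sizes, seed and `sigma` are illustrative choices, not part of the question). It compares $\|Wv\|^2$ with the weighted sum $\sigma^2 \sum_i \lambda_i z_i^2$, where $\lambda_i$ are the eigenvalues of $W W^T$ and $z_i$ are i.i.d. standard normal. The first two moments can be written without eigenvalues: $\sum_i \lambda_i = \operatorname{Tr}(W W^T) = m$ and $\sum_i \lambda_i^2 = \|W W^T\|_F^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma = 20, 60, 0.7  # illustrative sizes, not from the question

# Random matrix with unit-norm rows (a hypothetical example of W).
W = rng.standard_normal((m, n))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# Eigenvalues of W W^T, i.e. the squared singular values of W.
eigvals = np.linalg.eigvalsh(W @ W.T)

num_samples = 50_000

# Direct simulation of ||W v||^2 with v ~ N(0, sigma^2 I).
v = sigma * rng.standard_normal((num_samples, n))
direct = np.sum((v @ W.T) ** 2, axis=1)

# Weighted chi-squared representation: sigma^2 * sum_i lambda_i * z_i^2.
z = rng.standard_normal((num_samples, m))
weighted = sigma**2 * (z**2) @ eigvals

# Moments expressed through W: trace and Frobenius norm of W W^T.
mean_theory = sigma**2 * np.trace(W @ W.T)
var_theory = 2 * sigma**4 * np.linalg.norm(W @ W.T, "fro") ** 2

print("mean:", direct.mean(), weighted.mean(), mean_theory)
print("var: ", direct.var(), weighted.var(), var_theory)
```

Both simulated samples should agree with the theoretical mean and variance up to Monte Carlo error, which suggests that at least the low-order moments only need $\operatorname{Tr}(W W^T)$ and $\|W W^T\|_F$ rather than the individual eigenvalues.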

I would very much like to hear about new, interesting approaches.

Thank you! I really appreciate the help!

1 Answer

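Write $v = \sigma g$ with $g \sim \mathcal N(0, I)$. Since the Euclidean norm is 1-Lipschitz and $\|Wx - Wy\| \leq \|W\|_{op} \|x - y\|$, the map $g \mapsto \|\sigma W g\|$ is Lipschitz with constant $\sigma \|W\|_{op}$. By Gaussian concentration, $\|Wv\| - \mathbb E\|Wv\|$ is therefore subgaussian with parameter at most $\sigma \|W\|_{op}$.

We also know, by Jensen's inequality and the normalization of the rows, that $\mathbb E\|Wv\| \leq \sqrt{\mathbb E\|Wv\|^2} = \sigma \sqrt{Tr WW^T} = \sigma \sqrt m$

and by Cryme's comment here we have the lower bound: $\mathbb E\|Wv\| \geq \sigma \sqrt {m\, 2/\pi}$. (Each coordinate of $Wv$ is $\mathcal N(0, \sigma^2)$, so $\mathbb E|(Wv)_i| = \sigma\sqrt{2/\pi}$, and the norm of the vector of these expectations is at most $\mathbb E\|Wv\|$.)

From subgaussianity we have the upper deviation bound:

$$\mathbb P\left( \frac{\|Wv\|}{\sqrt m} - \mathbb E \frac{\|Wv\|}{\sqrt m} \geq t \right ) \leq \exp\left(-\frac{m t^2}{2 \sigma^2 \|W\|_{op}^2}\right)$$

so that taking $t = \sigma$ and using $\mathbb E\|Wv\|/\sqrt m \leq \sigma$ gives:

$$\mathbb P\left( \frac{\|Wv\|}{\sqrt m} \geq 2\sigma \right ) \leq \exp\left(-\frac{m}{2 \|W\|_{op}^2}\right)$$

Since the rows are normalized, $1 \leq \|W\|_{op}^2 \leq m$, so the exponent ranges from $-m/2$ (orthonormal rows) to $-1/2$ (all rows identical, where no concentration can be expected).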

with a similar bound for the lower deviation.
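As a sanity check, here is a small Monte Carlo sketch under assumed values: a random `W` with normalized rows, with sizes, seed and `sigma` chosen only for illustration. It estimates $\mathbb E\|Wv\|$ and compares it with the two bounds $\sigma\sqrt{2m/\pi}$ and $\sigma\sqrt m$. It also compares the empirical frequency of $\|Wv\|/\sqrt m \geq 2\sigma$ with a Gaussian concentration bound. That bound takes the Lipschitz constant of $g \mapsto \|\sigma W g\|$ to be $\sigma\|W\|_{op}$, which is an assumption of this sketch; `simulate_norms` is a hypothetical helper name.

```python
import numpy as np

def simulate_norms(W, sigma, num_samples, rng):
    """Draw v ~ N(0, sigma^2 I) num_samples times and return each ||W v||."""
    n = W.shape[1]
    v = sigma * rng.standard_normal((num_samples, n))
    return np.linalg.norm(v @ W.T, axis=1)

rng = np.random.default_rng(0)
m, n, sigma = 50, 200, 1.5  # illustrative values only

# Hypothetical W: Gaussian matrix rescaled so every row has unit norm.
W = rng.standard_normal((m, n))
W /= np.linalg.norm(W, axis=1, keepdims=True)

norms = simulate_norms(W, sigma, 20_000, rng)
mean_norm = norms.mean()

# Bounds on E||Wv|| for a matrix with normalized rows.
lower = sigma * np.sqrt(2 * m / np.pi)
upper = sigma * np.sqrt(m)

# Upper-tail bound from Gaussian concentration with Lipschitz
# constant sigma * ||W||_op (assumption of this sketch).
op_norm = np.linalg.norm(W, 2)  # largest singular value of W
tail_bound = np.exp(-m / (2 * op_norm**2))
empirical_tail = np.mean(norms / np.sqrt(m) >= 2 * sigma)

print(f"{lower:.3f} <= {mean_norm:.3f} <= {upper:.3f}")
print(f"empirical tail {empirical_tail:.2e}, bound {tail_bound:.2e}")
```

For a matrix like this one the tail bound is expected to be very loose, and it degrades as $\|W\|_{op}$ grows, for example when many rows of $W$ are nearly parallel.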

dmh