7

Let $x$ be a random vector uniformly distributed on the unit sphere $\mathbb{S}^{n-1}$. Let $V$ be a linear subspace of $\mathbb{R}^n$ of dimension $k$ and let $P_V(x)$ be the orthogonal projection of $x$ onto $V$. I have seen it quoted in the literature that \begin{align} \mathbb{P}\left[\,\left| \left\| P_V(x)\right\|_2 - \sqrt{k/n} \right| \le \epsilon\,\right] \ge 1 -2\exp(-n\epsilon^2/2). \tag{1} \end{align} However, I still cannot find a concrete proof. What I do understand is that for a $1$-Lipschitz function $f:\mathbb{S}^{n-1} \rightarrow \mathbb{R}$, such as $x \mapsto \left\| P_V(x)\right\|_2$, we have \begin{align} \mathbb{P}\left[\,\left|f(x) - M_f \right| \le \epsilon\,\right] \ge 1 -2\exp(-n\epsilon^2/2), \tag{2} \end{align} where $M_f$ is the median of $f$; (2) follows essentially from the isoperimetric inequality on the sphere. The issue with (1), though, is that $\sqrt{k/n}$ does not seem to be the median of $x \mapsto \left\| P_V(x)\right\|_2$. Is anyone able to provide a clean argument for (1), or a self-contained reference in the literature? Many thanks.
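I can at least see where $\sqrt{k/n}$ comes from (a standard computation, if I am not mistaken). By rotation invariance we may take $V = \operatorname{span}(e_1,\dots,e_k)$; the coordinates of $x$ are exchangeable and satisfy $\sum_{i=1}^n x_i^2 = 1$, so $\mathbb{E}\, x_i^2 = 1/n$ for each $i$, and \begin{align} \mathbb{E}\left\| P_V(x)\right\|_2^2 = \sum_{i=1}^{k} \mathbb{E}\, x_i^2 = \frac{k}{n}. \end{align} So $\sqrt{k/n}$ is the square root of the second moment of $\left\| P_V(x)\right\|_2$ (by Jensen, an upper bound on its mean), but not obviously its median.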

Manos
  • Although the median is not $\sqrt{k/n}$, they are close enough that this does not affect the tail estimate much. So the inequality may be true, but I haven't seen such neat tail estimates in the literature. The most precise bounds I've seen are in P. Frankl and H. Maehara, Some geometric applications of the beta distribution, Ann. Inst. Stat. Math. 42(3) (1990), 463–474. Another place to look is "Concentration Inequalities: A Nonasymptotic Theory of Independence" by Boucheron, Lugosi, and Massart, but I don't have this book. I guess your source had some confusion. What is the source, anyway? –  Dec 10 '17 at 23:24
  • @if.... Thank you very much for the very interesting references. My source is an arXiv preprint of a machine learning paper (I would rather not say which), which quotes this result with a reference to Keith Ball's "Convex Geometry". But I checked the latter and it does not seem to have this result anywhere. – Manos Jan 09 '18 at 13:15

2 Answers

1

Have you checked Artstein's Proportional concentration phenomena on the sphere? That paper addresses the question of such estimates (in Section 6), although I have not been able to find this exact one.

Or Theorem 7.5 of this paper. It is a version of your estimate in which the median is replaced by the average. Not exactly what you are looking for either, but it feels like we are getting close; maybe a combination of these methods would work?
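In the meantime, a quick Monte Carlo sanity check of (1) is easy to run. Below is a minimal sketch (it takes $V$ to be the span of the first $k$ coordinate vectors, which loses nothing by rotation invariance; the values of $n$, $k$, $\epsilon$ and the trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, eps, trials = 200, 50, 0.1, 100_000

# Uniform samples on S^{n-1}: normalize standard Gaussian vectors.
g = rng.standard_normal((trials, n))
x = g / np.linalg.norm(g, axis=1, keepdims=True)

# With V = span(e_1, ..., e_k), ||P_V(x)||_2 is the norm of the first k coordinates.
proj_norm = np.linalg.norm(x[:, :k], axis=1)

empirical = np.mean(np.abs(proj_norm - np.sqrt(k / n)) <= eps)
bound = 1 - 2 * np.exp(-n * eps**2 / 2)
print(f"empirical probability: {empirical:.4f}, claimed lower bound: {bound:.4f}")
```

In my runs the empirical probability comfortably exceeds the claimed bound, which is consistent with (1) being true but is of course no substitute for a proof.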

0

I believe this result is a version of the so-called Johnson-Lindenstrauss lemma, which was initially proved by analytic methods. The probabilistic version was proved in the paper referenced below; see below for the link.

Dasgupta and Gupta, An elementary proof of a theorem of Johnson and Lindenstrauss, Random Structures and Algorithms, 22(1):60–65, 2003.

http://cseweb.ucsd.edu/~dasgupta/papers/jl.pdf

This seems to be a key observation in the paper:

Hence the aim is to estimate the length of a unit vector in $\mathbb{R}^d$ when it is projected onto a random $k$-dimensional subspace. However, this length has the same distribution as the length of a random unit vector projected down onto a fixed $k$-dimensional subspace. Here we take this subspace to be the space spanned by the first $k$ coordinate vectors, for simplicity.
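That equivalence is also easy to check numerically. Below is a minimal sketch (using NumPy and SciPy; the dimensions and trial count are arbitrary choices) comparing the two distributions: the norm of a fixed unit vector projected onto a Haar-random $k$-dimensional subspace versus the norm of the first $k$ coordinates of a uniformly random unit vector:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
d, k, trials = 50, 10, 20_000

# (a) Fixed unit vector e_1 projected onto a random k-dimensional subspace.
# QR of a Gaussian matrix gives an orthonormal basis Q (Haar up to column
# signs, which do not affect norms); ||P_V(e_1)||_2 = ||Q^T e_1||_2 is the
# norm of the first row of Q.
a = np.empty(trials)
for t in range(trials):
    q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    a[t] = np.linalg.norm(q[0, :])

# (b) Uniformly random unit vector projected onto the fixed span of e_1, ..., e_k.
g = rng.standard_normal((trials, d))
x = g / np.linalg.norm(g, axis=1, keepdims=True)
b = np.linalg.norm(x[:, :k], axis=1)

print(ks_2samp(a, b))  # a large p-value: the samples look like one distribution
```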

kodlu
  • Yes, it's a form of the Johnson-Lindenstrauss lemma. The question is: where is a proof of the tail estimate (1) in the question above? –  Jan 15 '18 at 19:59
  • Do you know a similar result for the following case: if I draw two orthogonal vectors uniformly at random from the suitable Haar measure on $O(n)$ and then keep their first $k$ coordinates, what is the probability that the inner product between these two "shorter" vectors is larger than some $\epsilon$? – user3350919 Nov 09 '22 at 07:36