Expected number of random binary vectors to make matrix of order n

Question

I have the following problem:

I pick random vectors from $\mathrm{F}_2^n$. The chance that position $i$ is $1$ equals $p_i$, $0$ otherwise (each position is picked independently). Let $X$ be a random variable - a minimal number of vectors needed so that they span the whole space (in other words, number of vectors required to create a matrix of order $n$). What is the expected value of $X$? Or its variance? (the answer depends on $p_1$, $p_2$, $\ldots$).

The answer for a special case of $p_1 = p_2 = \ldots = p_n = 0.5$ is:

$E(X)$ is asymptotically $n + E$ where E is Erdős-Borwein Constant ($\approx 1.6$)
$Var(X)$ is something close to $2.74403388876$ (asymptotically).

I leave it as an exercise to warm up before the main question :). I see a blurry connection to Markov Chains, but I somewhat cannot make it work. Does anybody has any idea how to attack this?

Cheers, Tomasz

PS. It seems related to: Expected rank of a random binary matrix?

Gareth · Answer 1 · 2013-12-07T23:31:44.793

I can do your warm up, but the more general problem will be harder because you can't use the same state space reduction idea.

Write for our state space the number of dimensions spanned so far, $\{0,1,2,3,\ldots,n\}$. One we have $k$ dimensions spanned, $2^k$ of the $2^n$ possible vectors will not give us any new dimensions, while $2^n-2^k$ of them will give us one new dimension. Therefore, on our state space evolution is according to the following stochastic matrix $$P=\begin{pmatrix} \frac{1}{2^n} & 1- \frac{1}{2^n} \\ & \frac{2}{2^n} & 1- \frac{2}{2^n} \\ && \frac{4}{2^n} & 1- \frac{4}{2^n} \\ &&& \ddots\\ &&&& \frac{2^{n-1}}{2^n} & 1- \frac{2^{n-1}}{2^n} \\ &&&&& 1\\ \end{pmatrix}.$$

This defines a discrete discrete phase-type distribution, which we partition the matrix as $$ P = \begin{pmatrix} T & T^0 \\ 0 & 1 \end{pmatrix}$$ and can compute moments using results in Lecture notes on phase-type distributions for 02407 Stochastic Processes by Bo Friis Nielsen (page 9) to be

$$\begin{align} \mathbb E(X) &= \begin{pmatrix}1 & 0 & \cdots & 0\end{pmatrix} (I-T)^{-1} \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}\\ &= \begin{pmatrix}1 & 0 & \cdots & 0\end{pmatrix} \begin{pmatrix} \frac{2^n}{2^n-1} & \frac{2^{n-1}}{2^{n-1}-1} & \cdots\\ 0 & \frac{2^{n-1}}{2^{n-1}-1} & \cdots\\ 0 & 0 & \cdots\\ \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}\\ &=\sum_{i=1}^n \frac{2^i}{2^i-1} =n+\sum_{i=1}^n \frac{1}{2^i-1} \end{align} $$

which converges to $n+E$, as you state with $E$ the Erdős–Borwein constant.

Using the formula in the notes for second factorial moments we can compute $$\begin{align}\mathbb E(X(X-1)) &= 2! \begin{pmatrix}1 & 0 & \cdots & 0\end{pmatrix} (I-T)^{-2} T \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}\\ &=\frac{2^n}{(2^n-1)^2} + \sum_{i=1}^{n-2} \left( \left(1-\frac{2^i}{2^n}\right) \left(\sum_{j=0}^j \frac{2^{n-j}}{2^{n-j}-1} \right) \frac{2^{n-i}}{2^{n-i}-1} + \frac{2^{i+1}}{2^n}\left( \sum_{j=0}^{i+1} \frac{2^{n-j}}{2^{n-j}-1}\right) \frac{2^{n-i-1}}{2^{n-i-1}-1} \right) \end{align}$$

from which can can compute the variance. Numerically we see the convergence you describe:

n   Var(X)
1   2.00000
2   2.44444
3   2.60771
4   2.67882
5   2.71212
6   2.72824
7   2.73618
8   2.74012
9   2.74208
10  2.74306
11  2.74355
12  2.74379
13  2.74391
14  2.74397
15  2.74400
16  2.74402
17  2.74403
18  2.74403
19  2.74403
20  2.74403

Thanks, I didn't know about discrete-phase type distribution (I did it more "directly"). I've learned something! Btw., the simplest form of series for the variance I know is: $\sum_{k=1}^{\infty} \frac{1}{2^k + 2^{-k} - 2}$. — tombuc, Dec 13 '13 at 11:58

score 2 · Answer 2 · answered Jun 18 '16 at 18:03

The existing answer is massive overkill.

Each new vector either raises the dimension by one, or it doesn't. If you've already spanned a $k$-dimensional space, $2^k$ vectors don't raise the dimension, so (in the warm-up scenario) the probability of gaining a dimension is $\left(2^n-2^k\right)/\,2^n$. Thus the expected number of steps is

$$ \sum_{k=0}^{n-1}\frac{2^n}{2^n-2^k} $$

and the variance is

\begin{align} \sum_{k=0}^{n-1}\frac{1-\frac{2^n-2^k}{2^n}}{\left(\frac{2^n-2^k}{2^n}\right)^2} &=\sum_{k=0}^{n-1}\frac1{2^{n-k}+2^{k-n}-2}\\ &=\sum_{k=1}^n\frac1{2^k+2^{-k}-2}\;. \end{align}

Expected number of random binary vectors to make matrix of order n

2 Answers2

Linked