30

The problem: I'm trying to find the probability of a random undirected graph being connected. I'm using the model $G(n,p)$, where there are at most $n(n-1) \over 2$ edges (no self-loops or duplicate edges) and each edge has a probability $p$ of existing. I found a simple formula online where $f(n)$ is the probability of $G(n,p)$ being connected. But apparently it's too trivial for the writer to explain the formula (it was just stated briefly).

The desired formula: $f(n) = 1-\sum\limits_{i=1}^{n-1}f(i){n-1 \choose i-1}(1-p)^{i(n-i)}$

My method is: Consider any vertex $v$. Then there's a probability ${n-1 \choose i}p^i(1-p)^{n-1-i}$ that it will have $i$ neighbours. After connecting $v$ to these $i$ neighbours, we contract them ($v$ and its neighbours) into a single connected component, so we are left with the problem of $n-i$ vertices (the connected component plus $n-i-1$ other "normal" vertices.

Except that now the probability of the vertex representing connected component being connected to any other vertex is $1-(1-p)^i$. So I introduced another parameter $s$ into the formula, giving us:

$g(n,s)=\sum\limits_{i=1}^{n-1}g(n-i,i){n-1 \choose i}q^i(1-q)^{n-1-i}$,

where $q=1-(1-p)^s$. Then $f(n)=g(n,1)$. But this is nowhere as simple as the mentioned formula, as it has an additional parameter...

Can someone explain how the formula $f(n)$ is obtained? Thanks!

Wakaka
  • 1,373
  • 1
    So the subtracted term in $f(n)$ is the complement (so $f(n) = 1 - P(G$ is disconnected$)$). The way I read it is that the probability of being disconnected is the prob. of, for each i, having one connected component $C$ with $i$ vertices, and all other vertices not adjacent to any vertex of $C$. What I can't figure out is why it's ${n - 1 \choose i - 1}$ and not ${n\choose i}$. – Manuel Lafond Nov 29 '13 at 04:39
  • @manuellafond Wow that's a very interesting observation! I was always wondering how is it even possible to have a product $i(n-i)$ as a power of $(1-p)$, but your comment enlightened me :) The reason why it's not ${n \choose i}$ is we cannot pick just any component of $i$ vertices to be disconnected from the rest, otherwise we will overcount. For eg, consider a graph with two components, then it will be counted twice. So we fix a vertex $v$, and consider the component containing $v$. Then there are ${n-1 \choose i-1}$ ways to pick the component if its size is $i$. – Wakaka Nov 29 '13 at 05:28
  • I am curious. Can one invert the relation and find a formula for f(n) using this? – Isomorphism Jul 14 '14 at 06:48

2 Answers2

18

Here is one way to view the formula. First, note that it suffices to show that $$ \sum_{i=1}^n {n-1 \choose i-1} f(i) (1-p)^{i(n-i)} = 1, $$ since ${n-1 \choose n-1}f(n)(1-p)^{n(n-n)}=f(n).$ Let $v_1, v_2, \ldots, v_n$ be $n$ vertices. Trivially, $$ 1= \sum_{i=1}^n P(v_1 \text{ is in a component of size }i). $$ Now let's consider why $$P(v_1 \text{ is in a component of size }i)={n-1\choose i-1}P(G(i,p) \text{ is connected}) (1-p)^{i(n-i)}.$$ If $\mathcal{A}$ is the set of all $i-1$ subsets of $\{v_2, v_3, \ldots, v_n\}$, then \begin{align*} P(v_1 \text{ in comp of size }i)&=\sum_{A \in \mathcal{A}}P(\text{the comp of }v_1\text{ is }\{v_1\}\cup A)\\&=\sum_{A \in \mathcal{A}}P(\{v_1\}\cup A \text{ is conn'd})P(\text{no edge btwn }A\cup\{v_1\} \& \ (A\cup \{v_1\})^c). \end{align*} This last equality is due to the fact that the edges within $\{v_1\}\cup A$ are independent of the edges from $A$ to $A^c$. There are precisely ${n-1 \choose i-1}$ elements in $\mathcal{A}$. The probability that $\{v_1\}\cup A$ is connected is equal to the probability that $G(1+|A|,p)$ is connected, which is $f(i)$. There are $i(n-i)$ missing edges from $\{v_1\}\cup A$ to $(\{v_1\}\cup A)^c$.

D Poole
  • 2,810
5

What happens is that he/she sums over all cuts of the graph. To avoid counting anything twice, you focus specifically on some specific vertex, calling it $1$.

If the graph is disconnected, there is going to be a connected component of some size, which contains $1$.

So you sum over the size this component might have, and with multiplicity $n-1 \choose i-1$ which is the number of ways to choose $1$'s fellow set members. You want the component containing $1$ to be connected, so you multiply by $f(i)$, but nothing in the component can be connected to something not there, so you multiply by $(1-p)^{i(n-i)}$. The $(n-i)$ part of the graph can be connected or not, it doesn't matter.

A slight problem with the formula is, that it is not stable for small $p$s and large $n$s.

Henry
  • 169,616
Thomas Ahle
  • 5,629
  • Thanks for the explanation! Why is the formula not stable for small p's and large n's? I don't see how the reasoning breaks down in that case. – Alice Schwarze Jul 20 '23 at 19:47