
I want to generate a random connected simple labeled graph with $n$ vertices and $m$ edges, selected uniformly over all connected graphs with such $n$ and $m$. I found this approach. It says: build a random spanning tree using a loop-erased random walk (also called Wilson's algorithm); then add the remaining $m-n+1$ edges between random pairs of vertices.

I implemented and investigated this algorithm and have several doubts.

First, what do they mean by a random spanning tree in the first part? A random spanning tree of $K_n$? If so, there are $n^{n-2}$ of them, so we can directly reconstruct the tree from a (random) Prüfer sequence. Why do we need the loop-erased random walk here? In my experiments it gives the same distribution as the Prüfer-based generation.
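For reference, here is a minimal sketch of the Prüfer-based generation mentioned above, assuming vertices are labeled $0, \dots, n-1$ and $n \ge 2$. The function names `prufer_to_tree` and `random_labeled_tree` are illustrative, not taken from the linked approach.

```python
import heapq
import random


def prufer_to_tree(seq):
    """Decode a Prüfer sequence over labels 0..n-1 into the n-1 edges of a tree."""
    n = len(seq) + 2
    # Each vertex has degree 1 + (number of times it appears in the sequence).
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    # Min-heap of current leaves, so the smallest leaf is always removed first.
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    # Exactly two leaves remain; they form the last edge.
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


def random_labeled_tree(n, rng=random):
    """Uniform random labeled tree on n >= 2 vertices via a random Prüfer sequence."""
    seq = [rng.randrange(n) for _ in range(n - 2)]
    return prufer_to_tree(seq)
```

Since decoding is a bijection between the $n^{n-2}$ sequences and the labeled trees, a uniform sequence gives a uniform tree.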

Second, if I'm right about the first part and each tree is equiprobable, then different graphs may have different probabilities. For example, if $n=4$ and $m=4$, there are two graphs up to isomorphism (I omit the labels as they are irrelevant here):

o--o    o---o
|  |    |\ /
o--o    o o

The first one can be produced from some spanning tree in 4 distinct ways, while the second one only in 3 ways. If all trees are equiprobable, this obviously introduces bias in graphs.
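The counts 4 and 3 can be checked by brute force. The sketch below (with an assumed vertex labeling for the two pictures, and helper names of my own choosing) tries every set of $n-1$ edges and keeps those that form a spanning tree.

```python
from itertools import combinations


def is_spanning_tree(n, edges):
    """True if the given n-1 edges are acyclic, hence a spanning tree on n vertices."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge would close a cycle
        parent[ru] = rv
    return len(edges) == n - 1


def spanning_tree_count(n, edges):
    """Count spanning trees by testing every subset of n-1 edges."""
    return sum(is_spanning_tree(n, subset) for subset in combinations(edges, n - 1))


# Assumed labelings of the two pictures: a 4-cycle, and a triangle with a pendant vertex.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
triangle_with_pendant = [(0, 1), (0, 2), (0, 3), (1, 2)]
print(spanning_tree_count(4, cycle), spanning_tree_count(4, triangle_with_pendant))
```

This is only practical for tiny graphs, since the number of subsets grows quickly.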

Where is my mistake? Do I misunderstand Wilson's algorithm, so that the distribution on trees is not uniform? Do I misunderstand the second part? Or is this algorithm in fact incorrect?

Finally, if it is incorrect, how does one generate a random connected graph? The approach of generating a random graph and checking its connectivity fails if $m = \Theta(n)$.

P.S. When I talk about randomness, I assume that I have an oracle which returns a uniform random integer in the range $[0, n)$ for any reasonable $n$.

Ivan Smirnov

1 Answer


Here are two alternative algorithms. There are probably better ones.

Large $m$. When $m \gg \frac{1}{2} n\log n$, a $G(n,m)$ random graph is very likely to be connected. Generate $G(n,m)$ graphs until one of them is connected; conditioned on being connected, the result is uniform over connected graphs.
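A minimal sketch of this rejection loop, assuming $n-1 \le m \le \binom{n}{2}$. The names `is_connected` and `random_connected_gnm` and the `max_tries` cap are illustrative choices, not part of any library.

```python
import random
from itertools import combinations


def is_connected(n, edges):
    """Depth-first search from vertex 0; connected iff every vertex is reached."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def random_connected_gnm(n, m, rng=random, max_tries=10_000):
    """Draw G(n, m) graphs until one is connected (efficient only for large m)."""
    all_pairs = list(combinations(range(n), 2))
    for _ in range(max_tries):
        edges = rng.sample(all_pairs, m)  # m distinct edges, uniformly at random
        if is_connected(n, edges):
            return edges
    raise RuntimeError("no connected graph found; m is probably too small")
```

Listing all $\binom{n}{2}$ pairs keeps the sketch short but costs $\Theta(n^2)$ memory, so a different edge sampler would be needed for large sparse graphs.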

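Small $m$. When $m$ is small, you can try the following algorithm, which might be slow (perhaps very slow). Generate a random graph according to the Stack Overflow procedure, compute the number $T$ of spanning trees it has, and accept it with probability $1/T$; otherwise start over. You can check that this produces a uniform sample: assuming the extra edges are chosen as a uniformly random set of $m-n+1$ distinct non-tree edges, the procedure generates each connected graph with probability proportional to its number of spanning trees, and the factor $1/T$ cancels this. The expected number of attempts is $n^{n-2}\binom{(n-1)(n-2)/2}{m-n+1}/G_{n,m}$, where $G_{n,m}$ is the number of connected graphs on $n$ vertices and $m$ edges.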
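Here is a rough sketch of this accept/reject scheme under some assumptions of mine: the tree comes from Wilson's algorithm on $K_n$, the extra edges are a uniform random set of distinct non-tree pairs, and $T$ is computed exactly with the matrix-tree theorem (a cofactor of the Laplacian). All function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import combinations


def wilson_tree(n, rng):
    """Uniform spanning tree of K_n via Wilson's algorithm (loop-erased walks)."""
    in_tree = [False] * n
    nxt = [None] * n
    in_tree[0] = True  # root the tree at vertex 0
    for start in range(1, n):
        u = start
        while not in_tree[u]:
            v = rng.randrange(n - 1)  # uniform neighbour of u in K_n
            nxt[u] = v + 1 if v >= u else v
            u = nxt[u]
        # Overwriting nxt[] has erased the loops; retrace the surviving path.
        u = start
        while not in_tree[u]:
            in_tree[u] = True
            u = nxt[u]
    return [tuple(sorted((u, nxt[u]))) for u in range(1, n)]


def count_spanning_trees(n, edges):
    """Matrix-tree theorem: determinant of the Laplacian minus one row and column."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    a = [row[:-1] for row in lap[:-1]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):  # exact Gaussian elimination over the rationals
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return int(det)


def propose(n, m, rng):
    """Random tree plus m-n+1 distinct non-tree edges (assumed reading of the procedure)."""
    tree = wilson_tree(n, rng)
    tree_set = set(tree)
    spare = [e for e in combinations(range(n), 2) if e not in tree_set]
    return tree + rng.sample(spare, m - (n - 1))


def uniform_connected_graph(n, m, rng=random):
    """Repeat the proposal until it is accepted with probability 1/T."""
    while True:
        edges = propose(n, m, rng)
        t = count_spanning_trees(n, edges)
        if rng.randrange(t) == 0:  # succeeds with probability exactly 1/t
            return sorted(edges)
```

For $n = m = 4$ there are 15 connected labeled graphs, 3 of which are 4-cycles, so about one fifth of the accepted samples should be cycles; without the rejection step the share would be $12/48 = 1/4$.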

You can speed up the algorithm by sacrificing the accuracy of the sample. Find a number $M$ such that "most" graphs generated by the stackoverflow algorithm have at least $M$ spanning trees, and accept with probability $\min(M/T,1)$. You can estimate $M$ by sampling.
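One possible way to pick $M$ and apply the relaxed acceptance rule is sketched below. It assumes you have already collected spanning-tree counts from a batch of proposals; the `quantile` parameter and the function names are illustrative choices of mine.

```python
import random


def estimate_threshold(tree_counts, quantile=0.1):
    """Pick M so that roughly a (1 - quantile) share of sampled graphs have T >= M."""
    ordered = sorted(tree_counts)
    return ordered[int(quantile * len(ordered))]


def acceptance_probability(t, threshold):
    """Relaxed rule: accept with probability min(M / T, 1)."""
    return min(threshold / t, 1.0)


def accept(t, threshold, rng=random):
    """Flip a coin with the relaxed acceptance probability."""
    return rng.random() < acceptance_probability(t, threshold)
```

Graphs with $T < M$ are always accepted and are therefore underweighted relative to the uniform distribution, which is the accuracy being traded for speed.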

Yuval Filmus