
Suppose we generate a random regular expression $R$ in the following way:

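We start with a single meta-symbol $S$. On each turn, every $S$ in the current word is replaced, independently of the others, by $\{0\}$, $\{1\}$, $(S \cup S)$, $SS$ or $S^*$, each with probability $\frac{1}{5}$. This process terminates with probability $1$ by the extinction criterion for Galton-Watson branching processes: the expected number of new $S$ per replacement is $\frac{0 + 0 + 2 + 2 + 1}{5} = 1$, so the process is critical.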
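To make the process concrete, here is a minimal Python sketch of the turn-by-turn replacement. It rests on a few assumptions that are not part of the question: concatenation and star are written fully parenthesized as `(SS)` and `(S)*` so the resulting string stays unambiguous, and `max_len` is a hypothetical safeguard that gives up on very large words (the process is critical, so sizes are heavy-tailed).

```python
import random

# The five equally likely replacements for the meta-symbol S.
# Parentheses around concatenation and star are an added convention
# so that the output string can be parsed unambiguously.
RULES = ["{0}", "{1}", "(S∪S)", "(SS)", "(S)*"]

def sample_regex(rng, max_len=10_000):
    """Run the replacement process; return None if the word outgrows max_len."""
    word = "S"
    while "S" in word:
        # One turn: replace every S in the current word independently.
        word = "".join(rng.choice(RULES) if ch == "S" else ch for ch in word)
        if len(word) > max_len:
            return None  # hypothetical cut-off, not part of the original process
    return word

rng = random.Random(0)
for _ in range(5):
    print(sample_regex(rng))
```

Since every replacement makes the word strictly longer, the loop either finishes or hits the cut-off after at most `max_len` turns.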

What is the probability $p$ that $R$ defines the language of all binary strings $\{0, 1\}^*$?

All I know about this number is that it lies in $[\frac{1}{5 \sqrt{5}}, \frac{1}{2}]$.

Here $\frac{1}{5 \sqrt{5}}$ is the probability of $R$ containing $(\{0\} \cup \{1\})^*$ or $(\{1\} \cup \{0\})^*$ as a subexpression, derived from the following equation:

$x = \frac{2}{5^4} + \frac{1}{5}(5x - 2x^2)$

$2x^2 = \frac{2}{5^3}$

$x = \frac{1}{5 \sqrt{5}}$
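One way to read the first equation is as a case split on the first replacement of $S$. This is only a sketch; $q$ is a symbol introduced here for the probability that $S$ becomes exactly $(\{0\} \cup \{1\})$ or $(\{1\} \cup \{0\})$:

$q = 2 \cdot \frac{1}{5} \cdot \frac{1}{5} \cdot \frac{1}{5} = \frac{2}{5^3}$

$x = \frac{1}{5}\left(2(2x - x^2) + x + q\right) = \frac{2}{5^4} + \frac{1}{5}(5x - 2x^2)$

Here $2x - x^2 = 1 - (1 - x)^2$ is the probability that at least one of two independent subexpressions contains the pattern (the cases $(S \cup S)$ and $SS$). The term $x + q$ covers the case $S^*$: either the starred subexpression already contains the pattern, or it is exactly one of the two unions, and these two events are disjoint.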

And $\frac{1}{2}$ is the probability of $R$ defining an infinite language (which happens iff $R$ contains a Kleene star operator), derived from the following equation:

$5y = 1 + 4y - 2y^2$

$2y^2 + y - 1 = 0$

$y = \frac{1}{2}$
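The equation for $y$ comes from the same case split, $y = \frac{1}{5}\left(1 + 2(2y - y^2)\right)$, and the value $\frac{1}{2}$ can be sanity-checked by simulation. The sketch below is an assumption-laden shortcut rather than a full sampler: it only tracks how many $S$ are still pending, expands them one at a time (the order does not affect whether a star ever appears), and stops at the first star.

```python
import random

def contains_star(rng):
    """Expand pending S symbols one at a time; stop at the first star."""
    pending = 1  # number of S symbols still to be replaced
    while pending > 0:
        rule = rng.randrange(5)  # 0, 1: letters; 2: union; 3: concatenation; 4: star
        if rule == 4:
            return True          # a star appeared, so the answer is settled
        if rule in (2, 3):
            pending += 1         # one S replaced by two new ones
        else:
            pending -= 1         # one S replaced by a letter
    return False

rng = random.Random(1)
trials = 100_000
estimate = sum(contains_star(rng) for _ in range(trials)) / trials
print(f"estimated P(R contains a star) = {estimate:.3f}")  # compare with y = 1/2
```

Stopping at the first star keeps each trial short, because every step has a $\frac{1}{5}$ chance of ending it, which avoids the heavy-tailed tree sizes of the full process.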

However, I have no idea how to find the exact value of $p$.

Chain Markov
    Containing $(0\cup 1)^*$ as a subexpression is not a sufficient condition to define $\{0,1\}^*$, but perhaps I misunderstand your argument. – Hendrik Jan May 28 '21 at 04:49
