6

Currently I have a graph (basically, a state graph) in Scala which is similar to an NFA

  • some nodes have multiple in/outgoing edges

  • single start state

  • there might be multiple final states

  • a state can contain self loops where a symbol is consumed
  • no epsilon transitions

Example

What i'm trying to do is generate a regular expression from this graph. I've found the following topic using Brzozowski algebraic method: https://cs.stackexchange.com/q/2392

However, it seems to be for DFA's, so my question:

can I use Brzozowski algebraic method OR Transitive closure method with my NFA, or do i need conversion first?

I need an algorithm that can be done by a computer to automatically generate a regex from a given graph.

1 Answers1

5

Brzozowski's algebraic method is safe as long as you don't have epsilon transitions. It even works if your transitions are labelled by languages not containing the empty word. It may also work if some transitions contain the empty word, as long as you can eliminate them, as in the example below: enter image description here

the system can be written as \begin{align*} X_1 &= (a^*b + a)X_2 + b^*X_3 \\ X_2 &= (a+b)X_1 + bX_3 + 1 \\ X_3 &= aX_1 + aX_2 + 1 \end{align*} Replacing $X_3$ by $aX_1 + aX_2 + 1$, and observing that $a + b^*a = b^*a$, we obtain the equivalent system \begin{align*} X_1 &= (a^*b + a)X_2 + b^*(aX_1 + aX_2 + 1) = b^*aX_1 + (a^*b + b^*a)X_2 + b^* \\ X_2 &= (a+b)X_1 + b(aX_1 + aX_2 + 1) + 1 = (a + b + ba)X_1 + baX_2 + b + 1\\ X_3 &= aX_1 + aX_2 + 1 \end{align*} We deduce from the second equation $$ X_2 = (ba)^*((a + b + ba)X_1 + b + 1) $$ and replacing $X_2$ by its value in the first equation, we obtain \begin{align*} X_1 &= b^*aX_1 + (a^*b + b^*a)(ba)^*((a + b + ba)X_1 + b + 1) + b^*\\ &= (b^*a + (a^*b + b^*a)(ba)^*(a + b + ba))X_1 + (a^*b + b^*a)(ba)^*(b + 1) + b^* \end{align*} Finally, the language recognised by the automaton is $$ X_1 = \bigl(b^*a + (a^*b + b^*a)(ba)^*(a + b + ba)\bigr)^*[(a^*b + b^*a)(ba)^*(b + 1) + b^*] $$ since $1$ is the unique initial state.

Now, if you want to implement this algorithm, you may end up with more complicated regular expressions. For instance, the sentence "observing that $a + b^*a = b^*a$" might be difficult to implement. Other useful simplifications like $a^* + 1 = a^*$ or even $K + K = K$ or $K + L = L + K$, where $K$ and $L$ are regular expressions, are also not easy to implement and require more sophisticated rewriting systems.

J.-E. Pin
  • 6,219
  • 21
  • 39