12

So, I know that testing if a regular language $R$ is a subset of regular language $S$ is decidable, since we can convert them both to DFAs, compute $R \cap \bar{S}$, and then test if this language is empty.
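For concreteness, here is a minimal sketch of that DFA-based test. It assumes both DFAs are complete (every state has a transition on every symbol) and uses a made-up tuple representation `(start, accepting, delta)`; the function name `dfa_subset` is just for illustration. Rather than building $R \cap \bar{S}$ explicitly, it explores the product automaton and looks for a reachable pair that is accepting in $R$ and rejecting in $S$.

```python
from collections import deque

def dfa_subset(dfa_r, dfa_s, alphabet):
    """Return True iff L(dfa_r) is a subset of L(dfa_s).

    Each DFA is (start, accepting, delta) with a complete transition
    table delta[state][symbol] -> state (an assumed representation).
    """
    start_r, acc_r, delta_r = dfa_r
    start_s, acc_s, delta_s = dfa_s
    start = (start_r, start_s)
    seen = {start}
    queue = deque([start])
    while queue:
        r, s = queue.popleft()
        # Accepting in R and rejecting in S: some word is in R but not in S.
        if r in acc_r and s not in acc_s:
            return False
        for a in alphabet:
            nxt = (delta_r[r][a], delta_s[s][a])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# R: words over {a, b} ending in "ab".  S: words containing at least one "a".
ends_ab = (0, {2}, {0: {"a": 1, "b": 0}, 1: {"a": 1, "b": 2}, 2: {"a": 1, "b": 0}})
has_a = (0, {1}, {0: {"a": 1, "b": 0}, 1: {"a": 1, "b": 1}})

print(dfa_subset(ends_ab, has_a, "ab"))  # True
print(dfa_subset(has_a, ends_ab, "ab"))  # False ("a" is a witness)
```

The search visits at most $|Q_R| \cdot |Q_S|$ pairs, so it is polynomial in the size of the DFAs; the cost I am worried about comes from building those DFAs in the first place.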

However, since this requires converting to DFAs, it's possible that the DFAs, and thus the testing algorithm, will be exponential in terms of the number of states in the input NFAs.

Is there a known way to do this in polynomial time? Or has the problem been proved co-NP-complete?

Note that the problem is in co-NP, since a word accepted by $R$ but not by $S$ would be a polynomial-size certificate that $R \not\subseteq S$.

EDIT: this is incorrect, as there is no guarantee that the length of such a word is polynomial in the number of states.

Joey Eremondi

3 Answers

15

The problem of deciding language containment for NFAs is $PSPACE$-complete. Hardness follows by an easy reduction from the universality problem for NFAs (testing whether $L(A)=\Sigma^*$), since $L(A)=\Sigma^*$ holds iff $\Sigma^* \subseteq L(A)$. So, in a way, you have to determinize, but you may do so on-the-fly.

Your observation about co-NP is wrong (but nice). Such a witness can indeed be checked in time polynomial in the length of the witness, but the shortest witness may itself be exponential in the size of the input. Since $PSPACE=\text{co-}PSPACE$, deciding non-containment is also $PSPACE$-complete.
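A small sketch may help show why the shortest witness can be long. The construction is an illustrative assumption, not something taken from the question: take $A$ to accept every word over a one-letter alphabet, and let $B$ be a union of cycles of prime lengths $p$, where position $i$ on a cycle is accepting unless $i = p - 1$. For brevity the sketch allows several initial states (one per cycle). Then $B$ rejects $a^n$ only when $n \equiv -1$ modulo every prime, so the shortest witness has length $\prod p - 1$, while $B$ has only $\sum p$ states. The helper name `shortest_rejected_length` is hypothetical.

```python
def shortest_rejected_length(primes):
    """Length of the shortest unary word rejected by the cycle NFA B."""
    # Current superstate of B: one position per cycle, all starting at 0.
    positions = [0] * len(primes)
    n = 0
    # B accepts the current word iff some cycle is not at position p - 1.
    while any(pos != p - 1 for pos, p in zip(positions, primes)):
        positions = [(pos + 1) % p for pos, p in zip(positions, primes)]
        n += 1
    return n

for primes in ([2, 3, 5], [2, 3, 5, 7], [2, 3, 5, 7, 11]):
    print(sum(primes), "states, shortest witness length",
          shortest_rejected_length(primes))
# 10 states, shortest witness length 29
# 17 states, shortest witness length 209
# 28 states, shortest witness length 2309
```

With the first $k$ primes, the witness length grows faster than any polynomial in the number of states, so a witness word cannot serve as a polynomial-size certificate in general.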

To state things more carefully, deciding whether $L(A)\subseteq L(B)$ is in $PSPACE$ in the size of $B$ (since only $B$ needs to be complemented), and in $NLOGSPACE$ in the size of $A$.
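Here is a rough sketch of what determinizing on-the-fly can look like. The representation is assumed for illustration: an NFA is `(initial, accepting, delta)` with `delta` mapping `(state, symbol)` to a set of successor states, and no epsilon transitions. The search pairs a single state of $A$ with a set of states of $B$, so only $B$ is determinized, and only for the subsets that are actually reached.

```python
from collections import deque

def nfa_subset(nfa_a, nfa_b, alphabet):
    """Return True iff L(nfa_a) is a subset of L(nfa_b)."""
    init_a, acc_a, delta_a = nfa_a
    init_b, acc_b, delta_b = nfa_b
    start_b = frozenset(init_b)
    seen = {(p, start_b) for p in init_a}
    queue = deque(seen)
    while queue:
        p, s = queue.popleft()
        # Some run of A accepts here while no run of B does: a witness exists.
        if p in acc_a and not (s & acc_b):
            return False
        for a in alphabet:
            # One subset-construction step for B, computed only when needed.
            s_next = frozenset(q for x in s for q in delta_b.get((x, a), ()))
            for p_next in delta_a.get((p, a), ()):
                pair = (p_next, s_next)
                if pair not in seen:
                    seen.add(pair)
                    queue.append(pair)
    return True

# A: words ending in "ab".  B: words containing at least one "a".
nfa_ends_ab = ({0}, {2}, {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}})
nfa_has_a = ({0}, {1}, {(0, "a"): {0, 1}, (0, "b"): {0},
                        (1, "a"): {1}, (1, "b"): {1}})

print(nfa_subset(nfa_ends_ab, nfa_has_a, "ab"))  # True
print(nfa_subset(nfa_has_a, nfa_ends_ab, "ab"))  # False
```

Note that this deterministic search remembers every visited pair, so it can still use exponential memory in the size of $B$. The $PSPACE$ bound comes from guessing the witness one symbol at a time and keeping only the current pair.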

Shaull

4

You should have a look at Jean-François Raskin's paper Antichain Algorithms for Finite Automata.

In our experiments, the antichain-based inclusion test performed one or two orders of magnitude better than the "traditional" approaches.
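To give a flavour of the idea, here is a simplified sketch of antichain-style pruning for the inclusion test. It is not the paper's exact algorithm, and the representation `(initial, accepting, delta)` with `delta[(state, symbol)]` a set of successors is assumed for illustration. The search explores pairs $(p, S)$ of a state of $A$ and a set of states of $B$, and drops a pair whenever an already known pair $(p, T)$ has $T \subseteq S$: any word that leads from $(p, S)$ to a counterexample also does so from $(p, T)$.

```python
from collections import deque

def nfa_subset_antichain(nfa_a, nfa_b, alphabet):
    """Return True iff L(nfa_a) is a subset of L(nfa_b), with subset pruning."""
    init_a, acc_a, delta_a = nfa_a
    init_b, acc_b, delta_b = nfa_b
    antichain = {}  # state of A -> list of subset-minimal sets of B states
    queue = deque()

    def add(p, s):
        kept = antichain.setdefault(p, [])
        if any(t <= s for t in kept):
            return  # (p, s) is subsumed by a smaller known set
        kept[:] = [t for t in kept if not s <= t]  # drop sets that s subsumes
        kept.append(s)
        queue.append((p, s))

    start_b = frozenset(init_b)
    for p in init_a:
        add(p, start_b)
    while queue:
        p, s = queue.popleft()
        if p in acc_a and not (s & acc_b):
            return False  # counterexample: A accepts, B cannot
        for a in alphabet:
            s_next = frozenset(q for x in s for q in delta_b.get((x, a), ()))
            for p_next in delta_a.get((p, a), ()):
                add(p_next, s_next)
    return True

# A: words ending in "ab".  B: words containing at least one "a".
nfa_ends_ab = ({0}, {2}, {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}})
nfa_has_a = ({0}, {1}, {(0, "a"): {0, 1}, (0, "b"): {0},
                        (1, "a"): {1}, (1, "b"): {1}})

print(nfa_subset_antichain(nfa_ends_ab, nfa_has_a, "ab"))  # True
print(nfa_subset_antichain(nfa_has_a, nfa_ends_ab, "ab"))  # False
```

The worst case is still exponential, but only subset-minimal sets are kept per state of $A$, which is where the practical savings would come from.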

If I remember correctly, this algorithm is implemented in the libAMoRE++ library.

Juho
Dan

3

One of the most thorough, state-of-the-art, and highly optimized free FSM libraries available online is the AT&T FSM library. It implements "fsmdifference" exactly as you describe, requiring a determinized, epsilon-free FSM to compute the difference. One idea is to minimize one or both of the FSMs before taking the difference, which may help in some cases (determinizing is not the same as minimizing). The package also has an "approximate" or "greedy" minimization that is designed to be possibly faster than a full minimization.

However, having studied similar problems, I believe there is some generalization or construction of FSMs that does not appear in the literature and that can help with this problem by avoiding the determinization step, i.e. basically inverting (complementing) an NFA without creating an additional determinized FSM. The idea is to traverse the NFA edges "in parallel" and keep track of the set of nodes that make up the current "superstate" (set of states), just as in the standard determinizing algorithm. Then the NFA complement accepts if and only if the nodes of the current superstate are "all nonaccepting" (in contrast to the determinizing construction, which accepts iff "any accepting").
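A minimal sketch of that "all nonaccepting" rule for a single input word, under an assumed representation `(initial, accepting, delta)` with `delta[(state, symbol)]` a set of successors and no epsilon transitions. The function name `complement_accepts` is made up for this example. It tracks the same superstates as the subset construction, but only those reached by the given word.

```python
def complement_accepts(nfa, word):
    """Return True iff word is in the complement of L(nfa)."""
    init, acc, delta = nfa
    current = set(init)  # current superstate
    for symbol in word:
        # Follow all edges labelled `symbol` in parallel.
        current = {q for x in current for q in delta.get((x, symbol), ())}
    # "All nonaccepting": no state in the superstate is accepting.
    return not (current & acc)

# NFA for words over {a, b} ending in "ab".
nfa_ends_ab = ({0}, {2}, {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}})

print(complement_accepts(nfa_ends_ab, "aab"))  # False, "aab" ends in "ab"
print(complement_accepts(nfa_ends_ab, "aba"))  # True
print(complement_accepts(nfa_ends_ab, ""))     # True
```

An empty superstate counts as "all nonaccepting", so words on which every run of the NFA dies are accepted by the complement.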

However, I have not seen this written up before and don't see it via a quick online search. There are many references that suggest or imply that the only way to work with the complement of an NFA is to determinize it.

Here are two "nearby" references that might be useful for some ideas. I would be interested to hear of any others that are "closer". You mention you are working on program verification, which may be a field that has more direct research on the problem.

[1] Nazir Ahmad Zafar, Nabeel Sabir, and Amir Ali. Construction of Intersection of Nondeterministic Finite Automata using Z Notation.

[2] Orna Kupferman and Moshe Vardi. Complementation Constructions for Nondeterministic Automata on Infinite Words.

Juho
vzn