3

I've come across this problem in my studies, and I've abstracted it to the more general case here.

Given a finite alphabet, what is a regular expression that matches all strings over the alphabet, except one particular finite substring?

As an example:

Given $\Sigma = \{a, b, c\}$

What is a regular expression that matches all of $\Sigma$ except the substring $ ba$?

What I really want is simply $\Sigma^* - ba$.

Raphael
  • 73,212
  • 30
  • 182
  • 400
user2666425
  • 145
  • 1
  • 2

2 Answers2

4

Just draw the minimal complete DFA accepting your string, change the final states to get the complement and now convert this new DFA to a regular expression.

In your case, you will get $\mathcal{A} = (Q, A, \cdot, 1, F)$ with $Q = \{0, 1, 2, 3\}$, $A = \{a, b, c \}$, $F = \{3\}$ and $1 \cdot b = 2$, $2 \cdot a = 3$ and $q \cdot x = 0$ for all other transitions. Thus the automaton for the complement is $\mathcal{A}' = (Q, A, \cdot, 1, F')$ with $F' = \{0, 1, 2\}$.

Converting $\mathcal{A}'$ to a regular expression gives $$ 1 + b + (c + bc + baA)A^* $$

J.-E. Pin
  • 6,219
  • 21
  • 39
1

One way of finding such a regular expression is $$ \epsilon+a+b+c+aa+ab+ac+bb+bc+ca+cb+cc+(a+b+c)(a+b+c)(a+b+c)(a+b+c)^*. $$ There might be more succinct solutions, but this always works (for all finite languages).

Yuval Filmus
  • 280,205
  • 27
  • 317
  • 514