
Can someone show, in a few simple steps, how to construct a regular expression over the binary digits that excludes every string containing 111? What I am having trouble with is how to start this sort of problem and how to validate that a solution is correct. Is there some methodology or strategy for tackling these sorts of problems?

I'm also unsure whether a string like "1011" counts as containing "111", or whether only three 1s directly next to each other, as in "111", count. I assume it means the latter.

What I came up with is:

0*(10)(ϵ ∪ 1)(10 ∪ 01* ∪ 0*)

but I have no idea if this is right or how I can validate the solution other than plugging random strings into it to test.

The way I understand the expression I wrote is: "any combination of zeros or 10s (e.g. 101010), ending in zero, one, or epsilon, followed by 10s, 01 digits, or strings of zeroes." This was my thought process, but I am not sure whether there are other valid strings it misses or whether I have overlooked some part of the logic.

J.-E. Pin
  • About your second question, 1011 is not 111, so it can appear as a substring. Your regular expression doesn't include 101011, which is valid – Zanzag Apr 09 '22 at 23:12
  • Does this answer your question? – anankElpis Apr 12 '22 at 10:46
  • @StefanAlbrecht Not really, because I want to understand the steps to solve it, how someone approaches it, and how to verify that the solution is correct. Trial and error seems a hit-and-miss methodology. – Dario Brogento Apr 13 '22 at 16:48

1 Answer


It seems as though the goal is to create a regular expression for an infinite language over the alphabet {0, 1}, namely the set of all binary strings in which the substring '111' does not appear at any point.

The only stipulation for the language is that the sequence '111' does not appear as a substring of any of its strings. This means that any number of leading zeroes may appear at the start of a string in the language, which can be written as the regular expression $0^*$.

It can also be shown that ones may appear in a string so long as fewer than 3 of them are consecutive. A single run of ones can be written as $(1|11)$. Repeating it as $(1|11)^*$, however, would allow arbitrarily long runs of consecutive ones (for example, $1$ followed by $11$ gives $111$), so something must follow each run, and since the alphabet is {0, 1}, each run of ones must be followed by one or more zeroes. Noting that "one or more" can be written in a regular expression using $^+$, we can now denote the next phase of the regular expression as $(10^+|110^+)^*$.
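To make the difference between the two sub-expressions concrete, here is a minimal sketch using Python's `re` module. The names `runs_only`, `runs_with_zeros`, and `accepts` are illustrative and not part of the original answer; `|` is written inside non-capturing groups `(?:...)`, which is just Python's spelling of the same alternation.

```python
import re

# Runs of one or two ones with nothing separating them.
runs_only = re.compile(r"(?:1|11)*")

# Each run of one or two ones is closed by at least one zero.
runs_with_zeros = re.compile(r"(?:10+|110+)*")


def accepts(pattern, s):
    """Return True if the whole string s is matched by pattern."""
    return pattern.fullmatch(s) is not None


# "111" can be split as "1" + "11", so the unseparated version lets it through.
print(accepts(runs_only, "111"))
# With a mandatory zero after every run, "111" can no longer be built.
print(accepts(runs_with_zeros, "111"))
# A string made of properly separated runs is still accepted.
print(accepts(runs_with_zeros, "110100"))
```

The first call accepts `111` and the second rejects it, which is why the zero separator is needed.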

In order to cover all strings of the language, we must also add an expression that makes it possible for a string to end in one or two ones with no zero after them. This can be accomplished simply by the expression $(ϵ|1|11)$.

Given the above regular expressions, applying the rules of concatenation of expressions, we are left with $0^*(10^+|110^+)^*(ϵ|1|11)$.
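As a quick sanity check, the combined expression can be translated into Python's `re` syntax and tried on a few strings, including the ones raised in the question and comments. This is only a sketch: the names `NO_111` and `matches` are made up for illustration, and the optional group `(?:1|11)?` stands in for $(ϵ|1|11)$.

```python
import re

# Python spelling of 0*(10+|110+)*(ϵ|1|11).
# The trailing "?" makes the last group optional, which plays the role of ϵ.
NO_111 = re.compile(r"0*(?:10+|110+)*(?:1|11)?")


def matches(s: str) -> bool:
    """Return True if the entire string s is generated by the expression."""
    return NO_111.fullmatch(s) is not None


# Spot checks: strings without "111" should match, strings with it should not.
for s in ["", "0", "11", "1011", "101011", "11011", "111", "0111"]:
    print(repr(s), matches(s))
```

Both `1011` and `101011` are accepted, while `111` and `0111` are rejected. `fullmatch` is used rather than `search` because the expression has to describe the whole string, not just a piece of it.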

This is a regular expression that should describe exactly the language over the alphabet {0, 1} whose strings never contain the substring '111'. I am unsure of ways to verify the final solution algorithmically, given the infinite nature of the resulting language, but the process by which the individual expressions are created and concatenated should act to show that the expression is valid.
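One practical way to gain confidence, short of a proof, is to compare the expression against the plain definition ("does not contain 111") on every binary string up to some length. The sketch below does that. The function name `mismatches` and the bound of 12 are arbitrary choices for illustration. A bounded check like this can only find counterexamples, it cannot prove correctness. A full proof would go through the construction argument above, or through converting both descriptions to finite automata and comparing them, since equivalence of regular languages is decidable.

```python
import re
from itertools import product

# Candidate expression under test: 0*(10+|110+)*(ϵ|1|11) in Python syntax.
CANDIDATE = re.compile(r"0*(?:10+|110+)*(?:1|11)?")


def mismatches(pattern, max_len):
    """List every binary string of length <= max_len on which the pattern
    disagrees with the direct definition '"111" is not a substring'."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            expected = "111" not in s                    # the specification
            actual = pattern.fullmatch(s) is not None    # the candidate
            if expected != actual:
                bad.append(s)
    return bad


# An empty list means no counterexample exists up to the chosen length.
print(mismatches(CANDIDATE, 12))
```

The same function can be pointed at any other candidate expression. The shortest string in the returned list is then a concrete counterexample showing where the candidate accepts too much or too little.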