Setting:
- regular expressions with backreferences
- unary language (1-symbol alphabet)
Is the following problem decidable in this setting:
- Given a regular expression with backreferences, does it define a regular language?
For example, (aa+)\1 defines a regular language, while (aa+)\1+ doesn't. Can we decide which one is the case?
For concreteness, "regular expressions with backreferences" here refer to e.g. the following subset of the usual Perl-compatible regular expressions:
amatches charactera(the only character in the alphabet)X*matches 0 or more occurrences ofXX|YmatchesXorY- parentheses can be used for grouping and capturing
\1.\2, etc. match the same string as the 1st, 2nd, etc. pair of parentheses
We can also use the normal shorthands e.g. X+ = XX*.