Here are a few mostly “artificial” regular expression problems.
Strings of z, e and d of length divisible by three. ⇒
([zed][zed][zed])*
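A quick way to try the pattern above is Python's re module. This is only a sketch: the helper name is_div3 and the sample strings are made up for illustration, and fullmatch is used so that the whole string must match rather than just a prefix.

```python
import re

# Pattern from the text: zero or more groups of exactly three letters.
DIV3 = re.compile(r"([zed][zed][zed])*")

def is_div3(s):
    """True if s consists only of z, e, d and its length is a multiple of 3."""
    return DIV3.fullmatch(s) is not None

for s in ["", "zed", "dezzed", "ze", "zedd", "zex"]:
    print(repr(s), is_div3(s))
```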
Strings of a and b that do not start with b. (That allows the
empty string, so a[ab]* won't cut it.) ⇒
(ab*)*
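One way to gain confidence in an answer like this is to compare it with a direct test on every short string. The sketch below does that in Python. The function name and the length bound are arbitrary choices, and (a[ab]*)? is an alternative spelling added here for comparison, not taken from the text.

```python
import re
from itertools import product

ANSWER = re.compile(r"(ab*)*")      # pattern from the text
ALT = re.compile(r"(a[ab]*)?")      # alternative spelling (an assumption)

def mismatches(pattern, max_len=8):
    """Strings over {a, b} where the pattern disagrees with 'does not start with b'."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            s = "".join(chars)
            want = not s.startswith("b")
            got = pattern.fullmatch(s) is not None
            if want != got:
                bad.append(s)
    return bad

# An empty list means the pattern agrees with the direct test on every string tried.
print(mismatches(ANSWER), mismatches(ALT))
```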
Decimal integers which are divisible by two. Allow the string 0, but
don't allow leading zeros. ⇒
[1-9][0-9]*[02468]|[02468]
The no-leading-zeros rule complicates this a bit.
A ? could be used to shorten it slightly.
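One plausible reading of the ? remark is ([1-9][0-9]*)?[02468], which makes the leading part optional instead of writing two alternatives. That spelling is an assumption, so the Python sketch below checks both forms against ordinary integer arithmetic and against a few strings with leading zeros.

```python
import re

LONG = re.compile(r"[1-9][0-9]*[02468]|[02468]")   # pattern from the text
SHORT = re.compile(r"([1-9][0-9]*)?[02468]")       # assumed shortened form

def disagreements(pattern, limit=10000):
    """Inputs where the pattern's verdict differs from the intended one."""
    bad = []
    # str(n) never has leading zeros, so these should match exactly when n is even.
    for n in range(limit):
        if (pattern.fullmatch(str(n)) is not None) != (n % 2 == 0):
            bad.append(str(n))
    # The empty string and strings with leading zeros must be rejected.
    for s in ["", "00", "02", "0124"]:
        if pattern.fullmatch(s) is not None:
            bad.append(s)
    return bad

print(disagreements(LONG), disagreements(SHORT))
```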
Strings of p and q that do not contain more than two p's in a row. ⇒
q*((p|pp)q+)*(p|pp)?
Insert p's one or two at a time, following each group with at least one q.
An optional unit of p's is needed at the end so the string can end with them.
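The property itself is easy to state directly: the string must not contain ppp. A rough Python sketch can therefore compare the pattern with that substring test on all short strings of p and q. The function name and the length bound of 10 are arbitrary choices.

```python
import re
from itertools import product

PQ = re.compile(r"q*((p|pp)q+)*(p|pp)?")   # pattern from the text

def disagreements(max_len=10):
    """Strings over {p, q} where the pattern and the 'no ppp' test differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("pq", repeat=n):
            s = "".join(chars)
            if (PQ.fullmatch(s) is not None) != ("ppp" not in s):
                bad.append(s)
    return bad

# An empty list means the two tests agree on every string tried.
print(disagreements())
```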
Strings of m and w which contain an even number of w's. ⇒
(wm*w|m)*
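The pattern pairs up the w's: each wm*w unit consumes exactly two of them, and stray m's are absorbed by the other alternative. As a sanity check, the Python sketch below compares it with a hand-written parity tracker on a few sample strings. The helper even_ws and the samples are invented for illustration.

```python
import re

EVEN_W = re.compile(r"(wm*w|m)*")   # pattern from the text

def even_ws(s):
    """Direct check: track the parity of the w's seen so far."""
    odd = False
    for ch in s:
        if ch == "w":
            odd = not odd
        elif ch != "m":
            return False  # only m and w are allowed
    return not odd

samples = ["", "m", "w", "ww", "mwmwm", "wmw", "wwmw", "wmmwwmmmw", "wwww"]
for s in samples:
    print(repr(s), EVEN_W.fullmatch(s) is not None, even_ws(s))
```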
Strings of a, b and c, such that the first c comes after the first b,
and the first b comes after the first a. (Of course, if this string
has a c, it must also have a b, etc.) ⇒
(a+(b[ab]*(c[abc]*)?)?)?
Once the leading a's end, the next letter must be a b, and a c may
appear only after that b.
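This problem is easy to get subtly wrong, so a brute-force comparison is worth the few lines it takes. The Python sketch below checks the pattern (a+(b[ab]*(c[abc]*)?)?)? against a direct test built on str.find. It also includes a looser pattern, named LOOSE here, to show what goes wrong when the b is not required explicitly: a c can slip in with no b before it, as in aac. The helper names and the length bound are assumptions made for this example.

```python
import re
from itertools import product

STRICT = re.compile(r"(a+(b[ab]*(c[abc]*)?)?)?")
LOOSE = re.compile(r"(a+([ab]+[abc]*)?)?")   # does not force a b before the first c

def ordered(s):
    """Direct test: first a before first b, first b before first c."""
    ia, ib, ic = s.find("a"), s.find("b"), s.find("c")
    if ic != -1 and (ib == -1 or ib > ic):
        return False
    if ib != -1 and (ia == -1 or ia > ib):
        return False
    return True

def disagreements(pattern, max_len=7):
    """Strings over {a, b, c} where the pattern and the direct test differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product("abc", repeat=n):
            s = "".join(chars)
            if (pattern.fullmatch(s) is not None) != ordered(s):
                bad.append(s)
    return bad

print(disagreements(STRICT))            # empty if the pattern is right
print("aac" in disagreements(LOOSE))    # the looser pattern wrongly accepts aac
```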
The Pascal language uses double quotes to surround string constants,
just as C, C++ and Java do. But instead of using a backslash to
include a double quote inside the string, you double the quote:
"""Like this,"" he said.". How about an RE for those constants?
Allow the quotes to contain any printable ASCII character. ⇒
"([ -!#-~]|"")*"
The set [ -!#-~], space through bang and hash through tilde,
is all the printable ASCII characters except the double quote.
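To see the pattern at work, here is a small Python sketch that recognizes such constants and recovers the text they denote. The helper name unquote is hypothetical. It strips the outer quotes and collapses each doubled quote to a single one, which is how the doubling convention described above is read here.

```python
import re

# Pattern from the text; a single-quoted raw string avoids escaping the double quotes.
QUOTED = re.compile(r'"([ -!#-~]|"")*"')

def unquote(s):
    """Return the text of a quoted constant, or None if s is not one."""
    if QUOTED.fullmatch(s) is None:
        return None
    return s[1:-1].replace('""', '"')

for s in ['""', '"abc"', '"""Like this,"" he said."', '"a"b"', '"abc', '"""']:
    print(s, "->", unquote(s))
```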