Regular Expression Problems
Here are a few mostly “artificial” regular expression problems.
 
  1. Strings of z, e and d of length divisible by three.
    1. ([zed][zed][zed])*
  2. Strings of a and b that do not start with b. (That allows the empty string, so a[ab]* won't cut it.)
    1. (ab*)*
  3. Decimal integers which are divisible by two. Allow the string 0, but don't allow leading zeros.
    1. [1-9][0-9]*[02468]|[02468]
    The no-leading-zeros rule complicates this a bit. A ? shortens it: ([1-9][0-9]*)?[02468]
  4. Strings of p and q that do not contain more than two p's in a row.
    1. q*((p|pp)q+)*(p|pp)?
    Insert p's one or two at a time, then follow with q's. Need an optional unit of p's at the end so the string can end with them.
  5. Strings of m and w which contain an even number of w's.
    1. (wm*w|m)*
  6. Strings of a, b and c, such that the first c comes after the first b, and the first b comes after the first a. (Of course, if this string has a c, it must also have a b, etc.)
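    1. (a+(b[ab]*(c[abc]*)?)?)?
    Each optional group nests inside the previous one: no b until there has been an a, and no c until there has been a b. (A looser pattern such as (a+([ab]+[abc]*)?)? accepts aac, which has a c but no b.)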
  7. The Pascal language uses double quotes to surround string constants, just as C, C++ and Java do. But instead of using a backslash to include a double quote inside the string, you double the quote: """Like this,"" he said.". How about an RE for those constants? Allow the string to contain any printable ASCII character.
    1. "([ -!#-~]|"")*"
    The set [ -!#-~] (space through bang, then hash through tilde) is all the printable ASCII characters except the double quote.
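
Answers like these are easy to get subtly wrong, so it can help to check them by brute force. The following is a minimal sketch in Python, not part of the original problem set: it enumerates every short string over a small alphabet and compares re.fullmatch against a plain predicate written from the problem statement. The helper names (strings, mismatches, run_all), the length limits, and the predicates are all assumptions made for illustration. The problem 6 pattern used here is (a+(b[ab]*(c[abc]*)?)?)?, which forces a b before any c, and problem 7 is tested over the reduced alphabet of double quote, a and bang.

```python
import re
from itertools import product


def strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0 through max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def mismatches(pattern, alphabet, max_len, predicate):
    """Return the strings on which the regex and the predicate disagree."""
    compiled = re.compile(pattern)
    return [s for s in strings(alphabet, max_len)
            if bool(compiled.fullmatch(s)) != bool(predicate(s))]


def even_decimal(s):
    # Problem 3: nonempty, no leading zero unless the string is "0", even.
    if not s or (len(s) > 1 and s[0] == "0"):
        return False
    return s[-1] in "02468"


def ordered_abc(s):
    # Problem 6: any b needs an earlier a, and any c needs an earlier b.
    def first(ch):
        i = s.find(ch)
        return len(s) if i < 0 else i  # len(s) stands for "absent"
    a, b, c = first("a"), first("b"), first("c")
    return (b == len(s) or a < b) and (c == len(s) or b < c)


def doubled_quote_string(s):
    # Problem 7: quoted on both ends; inner quotes only in doubled pairs.
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        return False
    return '"' not in s[1:-1].replace('""', "")


# (label, pattern, alphabet, longest string tried, predicate)
CASES = [
    ("1", r"([zed][zed][zed])*", "zed", 6, lambda s: len(s) % 3 == 0),
    ("2", r"(ab*)*", "ab", 8, lambda s: not s.startswith("b")),
    ("3", r"[1-9][0-9]*[02468]|[02468]", "0123456789", 4, even_decimal),
    ("4", r"q*((p|pp)q+)*(p|pp)?", "pq", 8, lambda s: "ppp" not in s),
    ("5", r"(wm*w|m)*", "mw", 8, lambda s: s.count("w") % 2 == 0),
    ("6", r"(a+(b[ab]*(c[abc]*)?)?)?", "abc", 6, ordered_abc),
    ("7", r'"([ -!#-~]|"")*"', '"a!', 6, doubled_quote_string),
]


def run_all():
    """Map each problem label to the list of strings where it disagrees."""
    return {label: mismatches(pat, alpha, n, pred)
            for label, pat, alpha, n, pred in CASES}


if __name__ == "__main__":
    for label, bad in run_all().items():
        status = "OK" if not bad else f"disagrees on {bad[:5]}"
        print(f"Problem {label}: {status}")
```

A check like this only covers strings up to the chosen length, so it can expose a wrong answer but cannot prove a right one.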