Regular Expressions in Java

"A regular expression is a pattern denoted by a sequence of symbols representing a state-machine or mini-program that is capable of matching particular sequences of characters."

Basically, a regular expression is a very powerful way of describing a series of characters, using symbols rather than the vagaries of spoken language. Instead of 'a word of any length made up of lowercase letters of the English alphabet', the following is used: '[a-z]*'
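As a rough illustration of the pattern above, the following sketch checks whether a whole string is described by '[a-z]*'. It assumes the java.util.regex package that ships with JDK 1.4 and later; the class name LowercaseWord and the method isLowercaseWord are made up for this example.

```java
import java.util.regex.Pattern;

class LowercaseWord {
    // Zero or more lowercase ASCII letters. Because '*' means "zero or more",
    // the empty string is also accepted.
    private static final Pattern LOWERCASE = Pattern.compile("[a-z]*");

    // Hypothetical helper: true only when the entire input is made of a-z characters.
    static boolean isLowercaseWord(String input) {
        return LOWERCASE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLowercaseWord("regex")); // true
        System.out.println(isLowercaseWord("Regex")); // false, 'R' is uppercase
        System.out.println(isLowercaseWord(""));      // true, zero letters is allowed
    }
}
```

If an empty word should be rejected, '[a-z]+' (one or more) is the usual alternative.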

Sadly, Sun, the originators of the Java language, did not include a regular expression engine (the mini-program) with Java before JDK 1.4. However, several implementations of regular expressions have been written for Java:

*IBM's
*GNU Regex
*javaregex.com's
*Apache Jakarta ORO
*GNU Rex
*Apache Regexp

With the release of JDK 1.4, Sun have started to include a regular expression engine. At first glance, however, it appears to lack some of the power of the IBM, GNU and ORO classes.

*java.util.regex
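For a feel of the java.util.regex API, here is a minimal sketch that pulls every run of lowercase letters out of a piece of text using Pattern and Matcher. The class name WordFinder and the method findLowercaseWords are invented for illustration, and the generics syntax assumes a Java version newer than JDK 1.4.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordFinder {
    // One or more lowercase ASCII letters, so empty matches are never produced.
    private static final Pattern WORD = Pattern.compile("[a-z]+");

    // Hypothetical helper: collects each lowercase run found in the text, in order.
    static List<String> findLowercaseWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        // find() advances to the next match; group() returns the matched text.
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(findLowercaseWords("GNU regex and ORO")); // [regex, and]
    }
}
```

The pattern is compiled once and reused, which is the usual arrangement when the same expression is applied to many inputs.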

For a long time the GNU regex engine seemed to be the most common; recently, however, with the submission of OROMatcher to Apache, ORO has pushed ahead. Java's new java.util.regex seems to lack the complexity of its major competitors, so it will be interesting to see whether that complexity is needed or whether the simpler API will dominate.