GenJavaCore's StringW.java

StringW is an extensive String library from generationjava.com. It exists because the Java language's String and StringBuffer combination is quite limited.

The W in StringW stands for 'Wrapper', indicating that it is a wrapper class for a standard Java class. There are many other 'W' classes available from generationjava.com.

StringW was created by looking at the Ruby, C# and PHP languages, and adding any string methods contained within. It was created initially due to a request on a newsgroup for PHP-string functions in JSP.

A slight variant on StringW is available from Apache Jakarta's Commons project. In particular it is in the Lang sub-project in a component named org.apache.commons.lang.Strings. The major differences are that a couple of the methods have been refactored out of the class and into the Codec sub-project, and a few methods have been renamed.

Examples

java.lang.String methods

All the methods in the StringW class are static and stateless. Indeed, I like to refer to this type of method as a function rather than a method, as it is rather an old-school non-OO thing to do.

The simplest functions are ones that map directly to java.lang.String. These exist in StringW so as to allow StringW to provide additional functionality such as null-checking, or bug-fixing. These methods are:

  • String trim(String)
  • String lowerCase(String)
  • String upperCase(String)
  • String substring(String)
  • String substring(String, int)
  • String substring(String, int, int)
Their functionalities should be known to all Java programmers already.
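As a rough illustration of what such a null-checking wrapper might look like, here is a minimal sketch. It is not the StringW source: the class name NullSafeSketch is made up, and passing null straight through (rather than returning an empty String) is an assumption.

```java
// Hypothetical sketch, not the StringW source.
final class NullSafeSketch {
    private NullSafeSketch() {}

    // Assumption: a null input is returned as null instead of throwing.
    static String trim(String text) {
        return text == null ? null : text.trim();
    }

    static String upperCase(String text) {
        return text == null ? null : text.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(trim("  hello  ")); // prints "hello"
        System.out.println(trim(null));        // prints "null", no NullPointerException
    }
}
```

The point of the wrapper is only that the caller no longer has to guard every call with its own null test.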

Capitalisation

The first set of real functions are the capitalisation functions. These are usually the first functions that anyone puts in their StringUtil class and there's no difference here. Two things are of note. Firstly, I use the English spelling and not the American spelling. Secondly, I implement capitalise 'correctly', that is, I use toTitleCase and not toUpperCase. This makes StringW.capitalise internationalised. I hope.

  • String capitalise(String)
  • String uncapitalise(String)
  • String capitaliseAllWords(String)
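The toTitleCase point can be sketched as follows. This is an illustration only, under the assumption that null and empty Strings are returned unchanged; the class name CapitaliseSketch is hypothetical.

```java
// Hypothetical sketch of title-case capitalisation, not the StringW source.
final class CapitaliseSketch {
    private CapitaliseSketch() {}

    static String capitalise(String text) {
        if (text == null || text.isEmpty()) {
            return text; // assumption: nothing to do for null or ""
        }
        // Character.toTitleCase differs from toUpperCase for digraph characters.
        return Character.toTitleCase(text.charAt(0)) + text.substring(1);
    }

    static String uncapitalise(String text) {
        if (text == null || text.isEmpty()) {
            return text;
        }
        return Character.toLowerCase(text.charAt(0)) + text.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalise("hello"));   // prints "Hello"
        System.out.println(uncapitalise("Hello")); // prints "hello"
    }
}
```

The difference only shows up for characters such as the digraph U+01C6, whose title-case form (U+01C5) is distinct from its upper-case form (U+01C4); for plain ASCII the two calls agree.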

Replacement

The next function that people write for their StringUtil class is a replace function, and yet again there are no great surprises here. There are three replace functions of note.

  1. String replaceStringOnce(String text, String replace, String with)
  2. String replaceString(String text, String replace, String with)
  3. String replaceString(String text, String replace, String with, int n)
The first will replace the String 'replace' with 'with' in the text at most once. The second replaces it as many times as necessary while the last replaces it 'n' times.
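One way the three variants could share a single loop is sketched below. The class name ReplaceSketch, the use of a negative count to mean "no limit", and the pass-through of null or empty arguments are all assumptions, not documented StringW behaviour.

```java
// Hypothetical sketch of bounded replacement, not the StringW source.
final class ReplaceSketch {
    private ReplaceSketch() {}

    static String replaceStringOnce(String text, String replace, String with) {
        return replaceString(text, replace, with, 1);
    }

    static String replaceString(String text, String replace, String with) {
        return replaceString(text, replace, with, -1); // assumption: negative means unlimited
    }

    static String replaceString(String text, String replace, String with, int n) {
        if (text == null || replace == null || with == null || replace.isEmpty()) {
            return text;
        }
        StringBuilder out = new StringBuilder(text.length());
        int start = 0;
        int end;
        // Stop when the budget n is used up or no further match exists.
        while (n != 0 && (end = text.indexOf(replace, start)) != -1) {
            out.append(text, start, end).append(with);
            start = end + replace.length();
            n--;
        }
        out.append(text.substring(start));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceStringOnce("a-b-c", "-", "+")); // prints "a+b-c"
        System.out.println(replaceString("a-b-c", "-", "+"));     // prints "a+b+c"
    }
}
```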

*ploding

The third and final common function to find in a StringUtil is the explode or split function. The job of this function is to take a String with a common delimiter, such as "one,two,three", and split it up into an array of Strings. The naming here can lead to religious debates. A lot of people (especially those with Perl backgrounds) like to use split/join as their function names. These suffer from a problem in that 'join' is already a method on java.lang.Thread, and although the arguments are different it seems poor form to reuse the name. Split and unsplit are an option, although semantically odd. I chose explode/implode due to having used those in previous languages. For the Apache project I agreed to rename them to split/join, so the versions in StringW may change in the future.

The exploding/imploding functions available are:

    1. String[] explode(String)
    2. String[] explode(String text, String delimiter)
    3. String[] explode(String text, String delimiter, int n)
    1. String implode(Object[], String delimiter)
    2. String implode(Object[], String delimiter, String pre, String post)
    3. String implode(Iterator, String delimiter)
    4. String implode(Enumeration, String delimiter)
Most are quite obvious; the only weird one is the second implode method. It will prepend a pre-tag and append a post-tag only if the object array is not empty. Quite probably a version of it is needed for Iterator and Enumeration too. The explode without a delimiter uses a default delimiter of a space character. Possibly this will change to include tabs. There will also be an option someday to let text such as "one two three" be split on delimiters of varying size.
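The explode/implode pair, including the pre/post variant, could be sketched like this. It is not the StringW source: the class name PlodeSketch is made up, and treating the delimiter as one whole String (rather than as a set of delimiter characters) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of explode/implode, not the StringW source.
final class PlodeSketch {
    private PlodeSketch() {}

    // Assumption: the delimiter is matched as a whole String.
    static String[] explode(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = text.indexOf(delimiter, start)) != -1) {
            parts.add(text.substring(start, end));
            start = end + delimiter.length();
        }
        parts.add(text.substring(start));
        return parts.toArray(new String[0]);
    }

    // pre and post are only emitted when there is at least one item.
    static String implode(Object[] items, String delimiter, String pre, String post) {
        if (items.length == 0) {
            return "";
        }
        StringBuilder out = new StringBuilder(pre);
        for (int i = 0; i < items.length; i++) {
            if (i > 0) {
                out.append(delimiter);
            }
            out.append(items[i]);
        }
        return out.append(post).toString();
    }

    public static void main(String[] args) {
        String[] parts = explode("one,two,three", ",");
        System.out.println(implode(parts, " | ", "[", "]")); // prints "[one | two | three]"
    }
}
```

The pre/post form is handy for things like building a bracketed list or an SQL IN clause, where an empty input should produce no brackets at all.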

Chomping

A commonly used function in Perl is the chomp function. Its most common use is to remove newlines from lines of text, but it's in fact far more powerful. The best place for its use is to handle easy indexOf/substring things. For example: String subtxt = StringW.chomp("name:login@email.com", ":"); where subtxt will be "name". A slight modification gives: String subtxt = StringW.chomp(StringW.prechomp("name:login@email.com", ":"), "@"); Here we first prechomp to remove the "name:", then we chomp to remove the "@email.com", leaving us with "login". While the names are a touch odd, chomp, prechomp and their friends are immensely useful.

  • String chomp(String)
  • String chomp(String text, String delimiter)
  • String chompLast(String)
  • String chompLast(String text, String delimiter)
  • String chop(String)
  • String chopNewline(String)
  • String prechomp(String text, String delimiter)
  • String getChomp(String text, String delimiter)
  • String getPrechomp(String text, String delimiter)
Chomp and prechomp are both described above; getChomp and getPrechomp are the inverses of these methods. Note however that getChomp will include the delimiter, so it's a complete inverse. The same goes for getPrechomp. So: StringW.getPrechomp("foo:bar",":")+StringW.prechomp("foo:bar",":") is in fact "foo:bar", i.e. the string doesn't change. If not given a parameter, chomp will remove the last newline and everything after it, in the same fashion as Perl's chomp.

The chop method is slightly different. It removes the last character, although if it finds a \r\n it will remove both of these. The chopNewline function does the same thing, except it will only remove the \r\n or \n values, so anything else stays safe. Lastly there is chompLast. This is pretty similar to chop except that it can handle more than one character. It is a chomp function that only works if the passed-in delimiter (or automatic newline) is at the end of the line. It may seem as though there are lots of redundant functions here, and in fact that is true. It is an example of the 'There's More Than One Way To Do It' philosophy that Perl holds to heart.
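The relationship between the four delimiter-based functions can be sketched as below. This is an illustration, not the StringW source: the class name ChompSketch is hypothetical, and the sketch assumes chomp cuts at the last occurrence of the delimiter, prechomp at the first, and that a missing delimiter leaves the text untouched.

```java
// Hypothetical sketch of the chomp family, not the StringW source.
final class ChompSketch {
    private ChompSketch() {}

    // Drops the last delimiter and everything after it.
    static String chomp(String text, String delimiter) {
        int idx = text.lastIndexOf(delimiter);
        return idx == -1 ? text : text.substring(0, idx);
    }

    // Returns what chomp dropped, delimiter included.
    static String getChomp(String text, String delimiter) {
        int idx = text.lastIndexOf(delimiter);
        return idx == -1 ? "" : text.substring(idx);
    }

    // Drops everything up to and including the first delimiter.
    static String prechomp(String text, String delimiter) {
        int idx = text.indexOf(delimiter);
        return idx == -1 ? text : text.substring(idx + delimiter.length());
    }

    // Returns what prechomp dropped, delimiter included.
    static String getPrechomp(String text, String delimiter) {
        int idx = text.indexOf(delimiter);
        return idx == -1 ? "" : text.substring(0, idx + delimiter.length());
    }

    public static void main(String[] args) {
        String address = "name:login@email.com";
        System.out.println(chomp(address, ":"));                // prints "name"
        System.out.println(chomp(prechomp(address, ":"), "@")); // prints "login"
    }
}
```

Because each get* function returns exactly the piece its partner removes, concatenating the two halves always rebuilds the original text.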

Trimming

This section does not include the eponymous trim function, but does include lots of trim-like functions. They are:

  • String strip(String)
  • String strip(String text, String delimiter)
  • String stripEnd(String text, String delimiter)
  • String stripStart(String text, String delimiter)
These are basically trim functions that let you specify the actual type of character to strip. The default is to strip whitespace, much like trim itself.
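A minimal sketch of the strip idea follows. The class name StripSketch is made up, and two behaviours are assumptions rather than documented StringW facts: the delimiter argument is read as a set of characters to strip, and null means "strip whitespace".

```java
// Hypothetical sketch of the strip functions, not the StringW source.
final class StripSketch {
    private StripSketch() {}

    static String strip(String text, String chars) {
        return stripEnd(stripStart(text, chars), chars);
    }

    static String stripStart(String text, String chars) {
        int start = 0;
        while (start < text.length() && shouldStrip(text.charAt(start), chars)) {
            start++;
        }
        return text.substring(start);
    }

    static String stripEnd(String text, String chars) {
        int end = text.length();
        while (end > 0 && shouldStrip(text.charAt(end - 1), chars)) {
            end--;
        }
        return text.substring(0, end);
    }

    // Assumption: a null character set falls back to whitespace.
    private static boolean shouldStrip(char c, String chars) {
        return chars == null ? Character.isWhitespace(c) : chars.indexOf(c) != -1;
    }

    public static void main(String[] args) {
        System.out.println(strip("xxhixx", "x"));    // prints "hi"
        System.out.println(stripEnd("xxhixx", "x")); // prints "xxhi"
    }
}
```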

Padding/Aligning

A lot of people create pad functions in their StringUtils, and yet again, StringW is no different. There are two functions available to center a String inside a larger String, two leftPad functions and two rightPad functions. Lastly there is a repeat function and a function for overlaying a String on top of another String.

  • String center(String text, int n)
  • String center(String text, int n, String delimiter)
  • String leftPad(String text, int n)
  • String leftPad(String text, int n, String delimiter)
  • String rightPad(String text, int n)
  • String rightPad(String text, int n, String delimiter)
  • String repeat(String text, int n)
  • String overlayString(String text, String overlay, int start, int end)
Mainly they're quite obvious. Center/left/right position a piece of text within either a bunch of whitespace characters or within repeats of the delimiter. The repeat function just repeats the passed in String n times. OverlayString is the nice one. Given a piece of text, an overlay text and a pair of indices to overlay at, it will place the overlay text on top of the text. So: StringW.overlayString("onelongpieceoftext", "short", 3, 7) will result in "oneshortpieceoftext". Effectively the overlayString function is the indexOf equivalent of a replace function. It's weaker, but allows more precision.
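The overlay example above, together with repeat and a left pad, might be sketched as follows. This is not the StringW source: the class name PadSketch is hypothetical, leftPad is assumed to add the padding on the left-hand side, and the pad String is assumed to be non-empty.

```java
// Hypothetical sketch of repeat, leftPad and overlayString, not the StringW source.
final class PadSketch {
    private PadSketch() {}

    static String repeat(String text, int n) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < n; i++) {
            out.append(text);
        }
        return out.toString();
    }

    // Assumption: pad is non-empty and is truncated to fit the width exactly.
    static String leftPad(String text, int width, String pad) {
        int missing = width - text.length();
        if (missing <= 0) {
            return text;
        }
        String filler = repeat(pad, missing / pad.length() + 1);
        return filler.substring(0, missing) + text;
    }

    // Replaces the characters in [start, end) with the overlay text.
    static String overlayString(String text, String overlay, int start, int end) {
        return text.substring(0, start) + overlay + text.substring(end);
    }

    public static void main(String[] args) {
        System.out.println(leftPad("7", 3, "0"));                                // prints "007"
        System.out.println(overlayString("onelongpieceoftext", "short", 3, 7)); // prints "oneshortpieceoftext"
    }
}
```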

Random text creation

StringW has a powerful set of functions to output random bits of text. It can output a random string, of a specified length, made up of numbers, ASCII characters, letters or alphanumerics. It can also take a set of characters to randomise over, or randomise over a range of Unicode characters, making such things as random Chinese passwords possible.

  • String random(int)
  • String random(int, boolean letters, boolean numbers)
  • String random(int, char[])
  • String random(int count, int start, int end, boolean letters, boolean numbers)
  • String random(int count, int start, int end, boolean letters, boolean numbers, char[] chars)
  • String random(int count, String set)
  • String randomAlphabetic(int)
  • String randomAlphanumeric(int)
  • String randomNumeric(int)
  • String randomAscii(int)
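The simplest of these, picking characters at random from a given set, could be sketched like this. The class name RandomSketch is hypothetical and the sketch is not the StringW source. It uses java.util.Random for brevity; for real passwords java.security.SecureRandom would be the safer choice.

```java
import java.util.Random;

// Hypothetical sketch of random text creation, not the StringW source.
final class RandomSketch {
    private static final Random RANDOM = new Random();

    private RandomSketch() {}

    // Builds a String of 'count' characters, each drawn uniformly from 'set'.
    static String random(int count, String set) {
        StringBuilder out = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            out.append(set.charAt(RANDOM.nextInt(set.length())));
        }
        return out.toString();
    }

    static String randomNumeric(int count) {
        return random(count, "0123456789");
    }

    public static void main(String[] args) {
        System.out.println(randomNumeric(6));    // six random digits
        System.out.println(random(8, "abcdef")); // eight characters from a-f
    }
}
```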

CharSet manipulations

The StringW class provides a set of features linked to a CharSet class. This class allows a set of characters to be described as: { "a-z", "d", "dqw" }. The available functions handle deleting, counting, translating and squeezing characters.

  • int count(String text, String set)
  • String delete(String text, String set)
  • String squeeze(String text, String set)
  • String translate(String text, String replaceSet, String withSet)
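To give a feel for the set syntax, here is a minimal sketch of count, delete and squeeze over a single set String. It does not use the real CharSet class: the class name CharSetSketch is made up, and the parsing rule (x-y is a range, anything else is a literal character) is an assumption based on the "a-z" example.

```java
// Hypothetical sketch of set-based functions, not the StringW or CharSet source.
final class CharSetSketch {
    private CharSetSketch() {}

    // Assumption: "a-z" denotes a range; any other character stands for itself.
    static boolean contains(String set, char c) {
        for (int i = 0; i < set.length(); i++) {
            if (i + 2 < set.length() && set.charAt(i + 1) == '-') {
                if (c >= set.charAt(i) && c <= set.charAt(i + 2)) {
                    return true;
                }
                i += 2; // skip the rest of the range
            } else if (set.charAt(i) == c) {
                return true;
            }
        }
        return false;
    }

    static int count(String text, String set) {
        int total = 0;
        for (char c : text.toCharArray()) {
            if (contains(set, c)) {
                total++;
            }
        }
        return total;
    }

    static String delete(String text, String set) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (!contains(set, c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Collapses runs of the same character, but only for characters in the set.
    static String squeeze(String text, String set) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean repeated = i > 0 && c == text.charAt(i - 1) && contains(set, c);
            if (!repeated) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(count("hello", "a-l"));             // prints 4
        System.out.println(squeeze("hellooo wooorld", "o"));   // prints "hello world"
    }
}
```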

Soundex

One feature that is not quite String-based, but kind of, is the soundex algorithm which produces a hash-code type String based on the sound of the word. The metaphone algorithm is a more advanced form of the soundex algorithm. StringW provides both:

  • String soundex(String)
  • String metaphone(String)
  • boolean isMetaphoneEqual(String txt1, String txt2)
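For readers who have not met soundex before, the classic American Soundex rules can be sketched as below. This is the textbook algorithm, which may differ in detail from whatever StringW implements; the class name SoundexSketch is hypothetical.

```java
import java.util.Locale;

// Hypothetical sketch of classic American Soundex, not the StringW source.
final class SoundexSketch {
    // Codes for A..Z: '0' marks vowels and Y (not coded), '-' marks H and W (ignored).
    private static final String CODES = "0123012-02245501262301-202";

    private SoundexSketch() {}

    static String soundex(String word) {
        String upper = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (int i = 0; i < upper.length() && out.length() < 4; i++) {
            char c = upper.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // skip anything that is not a plain letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = code;
                continue;
            }
            if (code == '-') {
                continue; // H and W do not separate equal codes
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // a vowel resets 'last', so a repeated code is kept
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0'); // pad to letter plus three digits
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Robert")); // prints "R163"
        System.out.println(soundex("Rupert")); // prints "R163"
    }
}
```

Two names that sound alike, such as Robert and Rupert, end up with the same code, which is what makes the result usable as a fuzzy lookup key.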

Reverses

Three simple functions which reverse a String. The first, reverse, is just a character-by-character reverse. reverseDelimitedString reverses a String based on a delimiter, so: com/genjava/test becomes test/genjava/com. reverseDottedName is a dedicated method for reversing dotted names, like a Java classname, IP address or domain name.

  • String reverse(String)
  • String reverseDottedName(String)
  • String reverseDelimitedString(String text, String delimiter)
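The delimiter-based reverse could be sketched as follows. The class name ReverseSketch is made up, and this is not the StringW source; the delimiter is assumed to be a literal String rather than a regular expression.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the reverse functions, not the StringW source.
final class ReverseSketch {
    private ReverseSketch() {}

    static String reverse(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    static String reverseDelimitedString(String text, String delimiter) {
        // Pattern.quote makes split treat the delimiter literally ("." included).
        String[] parts = text.split(Pattern.quote(delimiter), -1);
        StringBuilder out = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            out.append(parts[i]);
            if (i > 0) {
                out.append(delimiter);
            }
        }
        return out.toString();
    }

    static String reverseDottedName(String text) {
        return reverseDelimitedString(text, ".");
    }

    public static void main(String[] args) {
        System.out.println(reverseDelimitedString("com/genjava/test", "/")); // prints "test/genjava/com"
        System.out.println(reverseDottedName("www.generationjava.com"));     // prints "com.generationjava.www"
    }
}
```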

Booleans

StringW provides a bunch of boolean functions to interrogate a String. They're quite obvious I hope.

  • boolean isAlphanumeric(String)
  • boolean isLine(String)
  • boolean isWord(String)
  • boolean isNumeric(String)

Index Ofs

These guys come from .NET. Basically they find the first index of any of a set of possible values. So you can ask: what is the index of the first vowel? These are a good candidate to become CharSet methods [see above].

  • int indexOfAny(String, String[])
  • int lastIndexOfAny(String, String[])
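The "first vowel" question might be answered with a sketch like this one. The class name IndexOfAnySketch is hypothetical, and returning -1 when nothing matches is an assumption borrowed from String.indexOf.

```java
// Hypothetical sketch of indexOfAny/lastIndexOfAny, not the StringW source.
final class IndexOfAnySketch {
    private IndexOfAnySketch() {}

    // Smallest index at which any candidate occurs, or -1 if none does.
    static int indexOfAny(String text, String[] candidates) {
        int best = -1;
        for (String candidate : candidates) {
            int idx = text.indexOf(candidate);
            if (idx != -1 && (best == -1 || idx < best)) {
                best = idx;
            }
        }
        return best;
    }

    // Largest index at which any candidate starts, or -1 if none does.
    static int lastIndexOfAny(String text, String[] candidates) {
        int best = -1;
        for (String candidate : candidates) {
            best = Math.max(best, text.lastIndexOf(candidate));
        }
        return best;
    }

    public static void main(String[] args) {
        String[] vowels = {"a", "e", "i", "o", "u"};
        System.out.println(indexOfAny("rhythm and blues", vowels));     // prints 7
        System.out.println(lastIndexOfAny("rhythm and blues", vowels)); // prints 14
    }
}
```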

Escaping

A couple of functions to handle escaping either the Java escapes, i.e. \t etc., or regular expression characters. XML/HTML escaping is handled by XmlW.

  • String escape(String)
  • String quoteRegularExpression(String)
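Both kinds of escaping can be sketched in a few lines. This is not the StringW source: the class name EscapeSketch is made up, the set of characters escaped is an assumption, and the regular expression quoting simply delegates to the JDK's java.util.regex.Pattern.quote.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the escaping functions, not the StringW source.
final class EscapeSketch {
    private EscapeSketch() {}

    // Turns control characters into their Java source escapes.
    // Assumption: only tab, newline, carriage return, quote and backslash are handled.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '\t': out.append("\\t"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Makes every character in 'text' match literally inside a regex.
    static String quoteRegularExpression(String text) {
        return Pattern.quote(text);
    }

    public static void main(String[] args) {
        System.out.println(escape("a\tb")); // prints a\tb (backslash and t, not a tab)
        System.out.println("axb".matches(quoteRegularExpression("a.b"))); // prints false
    }
}
```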

Miscellaneous

Here are the, still very useful, methods which lack a good grouping. Firstly we have defaultString. This allows a developer to handle nulls. StringW.defaultString(vari) will return "" if vari is null. StringW.defaultString(vari, "none") returns "none" if vari is null.

Next we have getNestedString. This is pretty useful, especially for very quick XML grabbing (though XmlW provides higher-level functions built on this). Given a start String and an end String, it will get the text between them. An overloaded method exists which assumes the start and end String are the same.

Penultimately we have the interpolate method. It allows you to take a String like "The name is ${name}", and a Map with name="Bond" and interpolate them so ${..} variables in the String are replaced with values from the Map.

Lastly there is wordWrap. Quite a complex method, it lets you specify a width for a piece of text, the String to use as a delimiter between lines (usually a newline), and the String to insert when a word has to be split (usually a '-' character). With these you can force the piece of text into fixed-width text.

  • String defaultString(String)
  • String defaultString(String text, String default)
  • String getNestedString(String text, String start, String end)
  • String interpolate(String, Map)
  • String wordWrap(String text, int width, String delimiter, String split)
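The first three of these could be sketched as follows. This is an illustration, not the StringW source: the class name MiscSketch is hypothetical, and two behaviours are assumptions, namely that getNestedString returns null when either marker is missing and that interpolate leaves unknown ${..} variables untouched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of defaultString, getNestedString and interpolate.
final class MiscSketch {
    private MiscSketch() {}

    static String defaultString(String text, String fallback) {
        return text == null ? fallback : text;
    }

    // Assumption: null is returned when start or end cannot be found.
    static String getNestedString(String text, String start, String end) {
        int from = text.indexOf(start);
        if (from == -1) {
            return null;
        }
        from += start.length();
        int to = text.indexOf(end, from);
        return to == -1 ? null : text.substring(from, to);
    }

    // Assumption: variables with no entry in the Map are left as-is.
    static String interpolate(String text, Map<String, ?> values) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        int open;
        while ((open = text.indexOf("${", pos)) != -1) {
            int close = text.indexOf('}', open + 2);
            if (close == -1) {
                break; // unterminated variable: copy the rest verbatim
            }
            out.append(text, pos, open);
            Object value = values.get(text.substring(open + 2, close));
            if (value != null) {
                out.append(value);
            } else {
                out.append(text, open, close + 1);
            }
            pos = close + 1;
        }
        return out.append(text.substring(pos)).toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("name", "Bond");
        System.out.println(interpolate("The name is ${name}", values));   // prints "The name is Bond"
        System.out.println(getNestedString("<b>bold</b>", "<b>", "</b>")); // prints "bold"
    }
}
```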