2.4. Splitting a String

Problem


You want to split a string on a number of different character delimiters.

Solution


Use StringUtils.split(), and supply a series of characters to split upon. The following example demonstrates splitting strings on a comma and a space:

import org.apache.commons.lang.ArrayUtils;
import org.apache.commons.lang.StringUtils;

String input = "Frantically oblong";
String input2 = "Pharmacy, basketball,funky";

// Split on a space or a comma, returning at most two elements
String[] array1 = StringUtils.split( input, " ,", 2 );
String[] array2 = StringUtils.split( input2, " ,", 2 );

System.out.println( ArrayUtils.toString( array1 ) );
System.out.println( ArrayUtils.toString( array2 ) );

The arrays hold the following elements (quotes are added here to mark element boundaries; ArrayUtils.toString( ) prints the elements bare, separated by commas):

{ "Frantically", "oblong" }
{ "Pharmacy", "basketball,funky" }


Discussion

StringUtils.split( ) does not return empty strings for adjacent delimiters. Each character in the second argument is treated as a separate delimiter, so passing " ," splits on both spaces and commas. The third argument limits the number of elements in the returned array. The input2 variable contains three possible tokens, but split( ) returns an array of only two elements; once the limit is reached, the last element holds the remainder of the string, delimiters included.

J2SE 1.4 added a String.split() method, but the lack of split( ) in earlier versions was an annoyance. To split a string in the old days, you had to instantiate a StringTokenizer and iterate through it as an Enumeration to get the components of a delimited string. Anyone who has programmed in Perl and then had to use the StringTokenizer class will tell you that programming without split( ) is time-consuming and frustrating. If you are stuck with an older Java Development Kit (JDK), StringUtils adds a split function that returns a String array. Keep this in mind when you question the need for StringUtils.split(); there are still applications and platforms that do not have a stable 1.4 virtual machine.
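
For comparison, the StringTokenizer approach described above might look something like the following sketch. The class name TokenizerSplitDemo and the tokenize( ) helper are made up for illustration, and the sketch uses modern collection classes for brevity, so treat it as an outline rather than code that would compile unchanged on an old JDK:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerSplitDemo {

    // Hypothetical helper: collects the tokens of a StringTokenizer into an array.
    static String[] tokenize(String input, String delimiters) {
        // Every character in 'delimiters' is treated as a separate delimiter.
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        List<String> tokens = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tokens = tokenize("Pharmacy, basketball,funky", " ,");
        // prints [Pharmacy, basketball, funky]
        System.out.println(Arrays.toString(tokens));
    }
}
```

Like StringUtils.split( ), StringTokenizer skips runs of adjacent delimiters, so no empty tokens come back.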

The J2SE 1.4 String class has a split() method, but it takes a regular expression. Regular expressions are exceedingly powerful tools, but for some tasks they are needlessly complex. One regular expression that matches either a space character or a comma character is [' '',']. (The single quotes are not required; inside a character class they simply add the apostrophe as a third delimiter, and [ ,] alone would do.) I’m sure there are a thousand other ways to match a space or a comma in a regular expression, but, in this example, you simply want to split a string on one of two characters:

String test = "One, Two Three, Four Five";
String[] tokens = test.split( "[' '',']" );

System.out.println( ArrayUtils.toString( tokens ) );

After this example runs, the tokens array contains the following elements:

{ "One", "", "Two", "Three", "", "Four", "Five" }

The array returned by the previous example contains blanks; String.split( ) returns an empty string between adjacent delimiters. This example also uses a rather ugly regular expression involving brackets and single quotes. Don’t get me wrong, regular expressions are a welcome addition in Java 1.4, but the same requirement can be satisfied with StringUtils.split( test, " ," ), a simpler way to split a piece of text.
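
If you do stay with String.split( ), one way to avoid the empty strings is to let the regular expression match a run of delimiters rather than a single one. The following is a minimal sketch using only the JDK; the class and method names are hypothetical:

```java
import java.util.Arrays;

class RegexSplitDemo {

    // Each space or comma is its own separator, so adjacent
    // delimiters leave an empty string between them.
    static String[] splitKeepingBlanks(String text) {
        return text.split("[ ,]");
    }

    // The + quantifier treats a run of spaces and commas as one
    // separator, so no empty strings appear between tokens.
    static String[] splitCollapsingRuns(String text) {
        return text.split("[ ,]+");
    }

    public static void main(String[] args) {
        String test = "One, Two Three, Four Five";
        // prints [One, , Two, Three, , Four, Five]
        System.out.println(Arrays.toString(splitKeepingBlanks(test)));
        // prints [One, Two, Three, Four, Five]
        System.out.println(Arrays.toString(splitCollapsingRuns(test)));
    }
}
```

Note that a string beginning with a delimiter still yields a leading empty string with either expression.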

See Also

Note the use of ArrayUtils.toString( ) in the solution section. See Chapter 1 for more information about ArrayUtils in Commons Lang.
