Use variations of substring( ) from StringUtils. The next example parses a string that contains six numbers delimited by an asterisk, parentheses, brackets, and a pipe symbol (N0 * (N1,N2) [N3,N4] | N5):
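import java.io.PrintStream;
import org.apache.commons.lang.StringUtils;

String formatted = " 25 * (30,40) [50,60] | 30";

PrintStream out = System.out;
// trim( ) drops the spaces that surround N0 and N5
out.print("N0: " + StringUtils.trim( StringUtils.substringBeforeLast( formatted, "*" ) ) );
out.print(", N1: " + StringUtils.substringBetween( formatted, "(", "," ) );
out.print(", N2: " + StringUtils.substringBetween( formatted, ",", ")" ) );
out.print(", N3: " + StringUtils.substringBetween( formatted, "[", "," ) );
// Skip past "[" first; otherwise the "," matched is the one inside the parentheses
out.print(", N4: " + StringUtils.substringBetween( StringUtils.substringAfter( formatted, "[" ), ",", "]" ) );
out.print(", N5: " + StringUtils.trim( StringUtils.substringAfterLast( formatted, "|" ) ) );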
This parses the formatted text and prints the following output:
N0: 25, N1: 30, N2: 40, N3: 50, N4: 60, N5: 30
The following public static methods come in handy when trying to extract information from a formatted string:
StringUtils.substringBetween( )
Captures content between two strings
StringUtils.substringAfter( )
Captures content that occurs after the specified string
StringUtils.substringBefore( )
Captures content that occurs before a specified string
StringUtils.substringBeforeLast( )
Captures content before the last occurrence of a specified string
StringUtils.substringAfterLast( )
Captures content after the last occurrence of a specified string
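The difference between the first-occurrence and last-occurrence variants can be sketched with plain JDK calls. The class SubstringSketch and its method names are hypothetical stand-ins written for illustration; they are not the Commons Lang source, and they only approximate how the library treats missing delimiters and null input.

```java
// Hypothetical stand-ins for illustration; not the Commons Lang source.
final class SubstringSketch {

    // Text before the FIRST occurrence of sep; the whole string if sep is absent.
    static String before(String str, String sep) {
        if (str == null || sep == null) return str;
        int pos = str.indexOf(sep);
        return pos < 0 ? str : str.substring(0, pos);
    }

    // Text before the LAST occurrence of sep; the whole string if sep is absent.
    static String beforeLast(String str, String sep) {
        if (str == null || sep == null) return str;
        int pos = str.lastIndexOf(sep);
        return pos < 0 ? str : str.substring(0, pos);
    }

    // Text after the LAST occurrence of sep; empty string if sep is absent.
    static String afterLast(String str, String sep) {
        if (str == null) return null;
        if (sep == null) return "";
        int pos = str.lastIndexOf(sep);
        return pos < 0 ? "" : str.substring(pos + sep.length());
    }

    // Text between the first open and the first close that follows it;
    // null if either delimiter is missing.
    static String between(String str, String open, String close) {
        if (str == null || open == null || close == null) return null;
        int start = str.indexOf(open);
        if (start < 0) return null;
        int end = str.indexOf(close, start + open.length());
        return end < 0 ? null : str.substring(start + open.length(), end);
    }

    public static void main(String[] args) {
        String path = "a.b.c";
        System.out.println(before(path, "."));            // a
        System.out.println(beforeLast(path, "."));        // a.b
        System.out.println(afterLast(path, "."));         // c
        System.out.println(between("x[1,2]", "[", ",")); // 1
    }
}
```

Because between( ) in this sketch anchors on the first occurrence of the opening delimiter, a delimiter that repeats earlier in the string can widen the match: between("(30,40) [50,60]", ",", "]") yields "40) [50,60" rather than "60". Narrowing the input first, for example by taking the text after "[", avoids this.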
To illustrate the use of these methods, here is an example of a feed of sports scores. Each record in the feed has a defined format, which resembles this feed description:
(SOT)<sport>[<team1>,<team2>] (<score1>,<score2>)(ETX)

Notes:
  (SOT) is ASCII character 2 "Start of Text",
  (ETX) is ASCII character 4 "End of Transmission".

Example:
  (SOT)Baseball[BOS,SEA] (24,22)(ETX)
  (SOT)Basketball[CHI,NYC] (29,5)(ETX)
The following example parses this feed using the StringUtils methods trim( ), substringBefore( ), substringAfter( ), and substringBetween( ). The boxScore variable holds a test string to parse, and, once parsed, this code prints out the game score:
import java.io.PrintStream;
import org.apache.commons.lang.StringUtils;

// Create a formatted string to parse - get this from a feed
char SOT = '\u0002';
char ETX = '\u0004';
String boxScore = SOT + "Basketball[CHI,BOS](69,75) " + ETX;

// Get rid of the archaic control characters
boxScore = StringUtils.trim( boxScore );

// Parse the score into component parts
String sport = StringUtils.substringBefore( boxScore, "[" );
String team1 = StringUtils.substringBetween( boxScore, "[", "," );
String team2 = StringUtils.substringBetween( boxScore, ",", "]" );
String score1 = StringUtils.substringBetween( boxScore, "(", "," );
// Skip past "(" so that the "," matched is the one between the scores
String score2 = StringUtils.substringBetween( StringUtils.substringAfter( boxScore, "(" ), ",", ")" );

PrintStream out = System.out;
out.println( "**** " + sport + " Score" );
out.println( " " + team1 + " " + score1 );
out.println( " " + team2 + " " + score2 );
This code parses a score, and prints the following output:
**** Basketball Score
 CHI 69
 BOS 75
In the previous example, StringUtils.trim( ) rids the text of the SOT and ETX control characters. StringUtils.substringBefore( ) then reads the sport name, "Basketball", and substringBetween( ) is used to retrieve the teams and scores.
At first glance, the value of these substring( ) variations is not obvious. The previous example parsed this simple formatted string using a handful of static methods on StringUtils, but how difficult would it have been to implement this parsing without the aid of Commons Lang? The following example parses the same string using only methods available in the Java 1.4 J2SE:
// Find the sport name without using StringUtils
boxScore = boxScore.trim( );

int firstBracket = boxScore.indexOf( "[" );
String sport = boxScore.substring( 0, firstBracket );

int firstComma = boxScore.indexOf( "," );
String team1 = boxScore.substring( firstBracket + 1, firstComma );

int secondBracket = boxScore.indexOf( "]" );
String team2 = boxScore.substring( firstComma + 1, secondBracket );

int firstParen = boxScore.indexOf( "(" );
int secondComma = boxScore.indexOf( ",", firstParen );
String score1 = boxScore.substring( firstParen + 1, secondComma );

int secondParen = boxScore.indexOf( ")" );
String score2 = boxScore.substring( secondComma + 1, secondParen );
This parses the string in a similar number of lines, but the code is less straightforward and much more difficult to maintain. Instead of simply calling a substringBetween( ) method, the previous example calls String.indexOf( ) and performs arithmetic with an index while calling String.substring( ). Additionally, the substring( ) methods on StringUtils are null-safe; the Java 1.4 example would throw a NullPointerException if boxScore were null.
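The null-safety point can be illustrated with a minimal sketch that uses only the JDK. The class NullSafetySketch and both of its methods are hypothetical names introduced here; the guarded version mimics the general idea of returning null for null input and is not the library's code.

```java
// Hypothetical helpers for illustration; not part of Commons Lang.
final class NullSafetySketch {

    // Guarded variant: returns null for null input instead of throwing,
    // and returns the whole string when no "[" is present.
    static String sportOf(String boxScore) {
        if (boxScore == null) return null;
        int bracket = boxScore.indexOf('[');
        return bracket < 0 ? boxScore : boxScore.substring(0, bracket);
    }

    // Unguarded indexOf/substring idiom: dereferences boxScore directly.
    static String sportOfUnguarded(String boxScore) {
        return boxScore.substring(0, boxScore.indexOf('['));
    }

    public static void main(String[] args) {
        System.out.println(sportOf("Basketball[CHI,BOS](69,75)")); // Basketball
        System.out.println(sportOf(null));                          // null
        try {
            sportOfUnguarded(null);
        } catch (NullPointerException e) {
            System.out.println("unguarded version threw NullPointerException");
        }
    }
}
```

With the unguarded idiom, every caller has to remember the null check; folding the check into the helper keeps the parsing code free of it.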
String.trim( ) has the same behavior as StringUtils.trim( ), stripping leading and trailing whitespace and ASCII control characters from the string. StringUtils.trim( ) is simply a wrapper for the String.trim( ) method, but it handles a null input gracefully: if a null value is passed to StringUtils.trim( ), a null value is returned.
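A short sketch of this trimming behavior, using only String.trim( ) from the JDK. The wrapper trimOrNull( ) and the class TrimSketch are hypothetical names chosen for illustration; the wrapper shows one way such a null-safe method could be written and is not copied from Commons Lang.

```java
// Hypothetical wrapper for illustration; not the Commons Lang source.
final class TrimSketch {

    // Null-safe wrapper around String.trim(): null in, null out.
    static String trimOrNull(String s) {
        return s == null ? null : s.trim();
    }

    public static void main(String[] args) {
        // Control characters (code points up to U+0020) at either end are removed.
        String raw = '\u0002' + "Basketball[CHI,BOS](69,75) " + '\u0004';
        System.out.println(trimOrNull(raw)); // Basketball[CHI,BOS](69,75)

        // Characters in the interior of the string are left untouched.
        System.out.println(trimOrNull(" a b ").length()); // 3

        // A null input yields null rather than a NullPointerException.
        System.out.println(trimOrNull(null)); // null
    }
}
```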