Efficient manipulation of strings

The use of strings is an important component of most applications and can contribute to poor performance if not managed correctly. This recipe examines techniques used to improve the use of strings.

Getting ready

String manipulation in Java is supported through three java.lang classes:

  • String: An immutable object
  • StringBuilder: Performs string manipulation but does not use synchronized methods
  • StringBuffer: Performs string manipulation using synchronized methods

Each of these classes has its place. For simple strings that are not changed, the String class is a good choice. If strings are manipulated using operations such as concatenation, StringBuilder and StringBuffer are better choices. However, since StringBuffer synchronizes most of its methods to achieve thread safety, use of this class can be more expensive than using StringBuilder. If the same mutable string is modified by more than one thread, then StringBuffer should be used; a string that is only ever touched by a single thread does not need it.
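The thread-safety point can be illustrated with a minimal sketch. The class name SharedBufferDemo and the method appendFromTwoThreads are hypothetical names chosen for this example, not part of the recipe. Two threads append to one shared StringBuffer, and because each append call is synchronized, no character is lost.

```java
class SharedBufferDemo {
    // Two threads append to one shared StringBuffer; returns the final length.
    static int appendFromTwoThreads(final int appendsPerThread) {
        final StringBuffer shared = new StringBuffer();
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < appendsPerThread; i++) {
                    shared.append('x'); // each append is synchronized on the buffer
                }
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            // Wait for both threads so the length read below is final
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(10000)); // prints 20000
    }
}
```

Replacing StringBuffer with StringBuilder in this sketch would remove that guarantee: the final length could come up short, or an append could fail outright.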

Tip

A few general string guidelines:

  • Do not use the String class when significant string manipulation is needed
  • When the string length is known, set the initial capacity of a StringBuilder or StringBuffer object using its constructor
  • Make sure you understand how testing for string equality works

String concatenation is expensive when performed using the String class. This is because a String object is immutable, so every concatenation produces a new object rather than modifying the existing one. For many situations this is fine. However, if it is necessary to repeatedly change the string, then using the String class means new String objects will be created, which introduces the expense of object creation and, potentially, garbage collection.
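A small sketch makes the immutability visible. The class and method names here are hypothetical, used only for illustration. A second reference keeps pointing at the original object after the concatenation, which shows that += built a new String instead of changing the old one.

```java
class ImmutabilityDemo {
    // Returns true if concatenation left the original String object untouched.
    static boolean originalUnchanged() {
        String original = "Bill";
        String alias = original;   // second reference to the same object
        original += ", Sue";       // creates a new String; "Bill" itself is not modified
        return alias.equals("Bill")
                && original.equals("Bill, Sue")
                && alias != original; // the two variables now reference different objects
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged()); // prints true
    }
}
```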

How to do it...

Consider the following getList method, which returns a comma-delimited string based on an array of names. The array, names, is initialized with four names and the String variable, list, is initialized to an empty string. Within the for loop, each name is concatenated to list, with a comma and a space inserted between adjacent names. Each time list is modified, a new String object is created and the old one is discarded.

public String getList() {
    String[] names = {"Bill", "Sue", "Mary", "Fred"};
    String list = "";
    for (int i = 0; i < names.length; i++) {
        list += names[i];
        if (i < names.length - 1) {
            list += ", ";
        }
    }
    return list;
}

A more efficient version of this method follows and uses a StringBuilder object instead. Notice that list is created with an initial capacity of 100 characters. This capacity is more than adequate for the data used here. Concatenation is achieved using the append method, which adds its argument to the end of list. The toString method converts the StringBuilder instance to a String object.

public String getList() {
    String[] names = {"Bill", "Sue", "Mary", "Fred"};
    StringBuilder list = new StringBuilder(100);
    for (String name : names) {
        if (list.length() > 0) {
            list.append(", ");
        }
        list.append(name);
    }
    return list.toString();
}

Only one StringBuilder object and one String object have been created. This avoids the overhead of the repeated object creation required in the first version of the method.

The initial capacity given to the StringBuilder is larger than needed. It is often possible to calculate the required capacity beforehand, which can save space but at the expense of an additional calculation.
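One way to calculate the capacity beforehand is sketched below. This is an illustrative variant of getList rather than the recipe's own code: the class name CapacityDemo, the names parameter, and the requiredCapacity helper are assumptions introduced for the example.

```java
class CapacityDemo {
    // Exact number of characters needed: every name plus ", " between adjacent names.
    static int requiredCapacity(String[] names) {
        int capacity = 0;
        for (String name : names) {
            capacity += name.length();
        }
        if (names.length > 1) {
            capacity += 2 * (names.length - 1); // ", " is two characters
        }
        return capacity;
    }

    static String getList(String[] names) {
        // Size the builder exactly so it never has to grow its internal array
        StringBuilder list = new StringBuilder(requiredCapacity(names));
        for (int i = 0; i < names.length; i++) {
            if (i > 0) {
                list.append(", ");
            }
            list.append(names[i]);
        }
        return list.toString();
    }

    public static void main(String[] args) {
        String[] names = {"Bill", "Sue", "Mary", "Fred"};
        System.out.println(requiredCapacity(names)); // prints 21
        System.out.println(getList(names));          // prints Bill, Sue, Mary, Fred
    }
}
```

The extra pass over the array is the "additional calculation" mentioned above; whether it pays off depends on how large the strings are and how often the method runs.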

Testing for string equality can be performed using one of several techniques. The first approach uses the equality operator.

if(name == "Peter") ...

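This approach checks whether the variable name references the same object as the string literal "Peter". Unless name was itself assigned from an identical literal or an interned string, this is most likely not the case, even when the characters match. Remember, the equality operator in this situation tests for equality of references, not whether the two referenced objects have the same contents.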
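The difference between reference equality and content equality can be seen in a short sketch. The class and method names are hypothetical. A string built at run time holds the same characters as the literal but is a different object, so == reports false while a content comparison with equals reports true.

```java
class ReferenceEqualityDemo {
    // Builds "Peter" at run time, so the result is a new object, not the literal.
    static String buildPeter() {
        return new StringBuilder("Pet").append("er").toString();
    }

    public static void main(String[] args) {
        String name = buildPeter();
        System.out.println(name == "Peter");          // prints false: different objects
        System.out.println(name.equals("Peter"));     // prints true: same characters
        System.out.println(name.intern() == "Peter"); // prints true: intern returns the pooled literal
    }
}
```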

The next approach uses the compareTo method. While it works, it is more complicated than it needs to be. The compareTo method returns a negative value if name precedes "Peter" lexicographically, 0 if the two strings are equal, and a positive value if name follows "Peter".

if(name.compareTo("Peter") == 0) ...
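As a quick illustration of the three possible outcomes, the following sketch reduces the result of compareTo to its sign. The class name CompareToDemo and the sign helper are hypothetical; only the sign is shown because the exact magnitude returned by compareTo is not something callers should rely on.

```java
class CompareToDemo {
    // Reduces compareTo's result to -1, 0, or 1 so that only the sign is reported.
    static int sign(String left, String right) {
        return Integer.signum(left.compareTo(right));
    }

    public static void main(String[] args) {
        System.out.println(sign("Mary", "Peter"));  // prints -1: "Mary" precedes "Peter"
        System.out.println(sign("Peter", "Peter")); // prints 0: the strings are equal
        System.out.println(sign("Sue", "Peter"));   // prints 1: "Sue" follows "Peter"
    }
}
```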

A better approach is to use the equals method. Alternatively, the equalsIgnoreCase method can be used if the case of the strings is not important.

if(name.equals("Peter")) ...
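The case-insensitive alternative can be sketched as follows. The class name and the isPeter helper are hypothetical, introduced only to show how equals and equalsIgnoreCase differ on the same input.

```java
class IgnoreCaseDemo {
    // True when name matches "Peter" regardless of letter case.
    static boolean isPeter(String name) {
        return name.equalsIgnoreCase("Peter");
    }

    public static void main(String[] args) {
        System.out.println("PETER".equals("Peter")); // prints false: equals is case-sensitive
        System.out.println(isPeter("PETER"));        // prints true
        System.out.println(isPeter("Pete"));         // prints false: different characters
    }
}
```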

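When dealing with null or empty strings, there are two considerations which deserve attention. If we need to test for an empty string, it is better to use the length method (or the equivalent isEmpty method, available since Java 6) than to compare against an empty literal. Note that this check still throws a NullPointerException if name is null.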

if(name.length() == 0) ...

Also, using the following statement will avoid a NullPointerException should name contain a null value.

if("".equals(name)) ...
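The two checks behave differently when name is null, which the following sketch tries to make explicit. The class name and the helper methods isEmptyLiteralFirst and isNullOrEmpty are hypothetical. Calling equals on the literal never dereferences name, while an explicit null test is needed before calling length.

```java
class EmptyCheckDemo {
    // Null-safe: equals is invoked on the literal, so a null argument simply yields false.
    static boolean isEmptyLiteralFirst(String name) {
        return "".equals(name);
    }

    // Treats both null and "" as "nothing to process"; the null test must come first.
    static boolean isNullOrEmpty(String name) {
        return name == null || name.length() == 0;
    }

    public static void main(String[] args) {
        System.out.println(isEmptyLiteralFirst(""));   // prints true
        System.out.println(isEmptyLiteralFirst(null)); // prints false, no exception
        System.out.println(isNullOrEmpty(null));       // prints true
        System.out.println(isNullOrEmpty("Peter"));    // prints false
    }
}
```

Which helper is appropriate depends on whether a null value should count as empty in the calling code.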

How it works...

The string list examples illustrated the efficiency gained through the use of the StringBuilder class. Fewer object creations were required. Several approaches were demonstrated for comparing strings. These illustrate either a more valid, convenient, or efficient technique for the comparison of two strings.

Bear in mind that most compilers perform optimization on source code. Any compiler-level optimization can render source-level optimizations moot, or at least more of a style issue. Optimization should always begin with making sure the application implements the correct functionality and uses a sound architecture and the most efficient algorithms, before too much effort is devoted to source-level optimizations.
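One concrete case of compiler-level optimization, shown here as an illustrative sketch with hypothetical class and method names, is the folding of constant string expressions. The Java language rules require a concatenation of literals to be evaluated at compile time and treated as a single literal, whereas a concatenation involving a non-constant variable creates a new String at run time. The == comparisons below are used only to reveal which object is produced, not as a recommended way to compare strings.

```java
class ConstantFoldingDemo {
    // "Pe" + "ter" is a constant expression, so it is folded into the literal "Peter".
    static boolean foldedAtCompileTime() {
        String folded = "Pe" + "ter";
        return folded == "Peter"; // same pooled object
    }

    // part is not final, so the concatenation happens at run time and yields a new object.
    static boolean concatenatedAtRunTime() {
        String part = "Pe";
        String joined = part + "ter";
        return joined == "Peter"; // different object, although the characters match
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());   // prints true
        System.out.println(concatenatedAtRunTime()); // prints false
    }
}
```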
