Chapter 2

Applying Data Types in Java Programming

In the examples in Chapter 1, you saw casual use of data types to help you understand Java class structure. Now we dig in a bit deeper to understand the two basic types of data we have in Java, objects and primitives. As you learn about them, you’ll get a better understanding of the rules that support them and the cases that drive their use.

We’ll also take our first look at two key Java classes, String and StringBuilder, which you’ll need quite frequently in everyday programming.

In this chapter, we’ll cover the following topics:

  • Creating, using, and destroying objects
  • Distinguishing between object references and primitives
  • Declaring and initializing variables
  • Default initialization of variables
  • Reading and writing object fields
  • Using the StringBuilder class and its methods
  • Creating and manipulating strings

Creating, Using, and Destroying Objects

Creating an object of a Java class occurs through a process called construction. While the programmer can exert some control over how construction takes place, the process itself is so important that it will be inserted into every class at compilation time if the programmer does nothing with it.

You don’t have to include a constructor in a class, but you can, even if you just want to write what the compiler would produce. The default constructor, also known as the no-arg constructor, looks like the one shown in Listing 2-1.


I will declare all Java classes final until further notice.

Listing 2-1: NewThing class with default constructor

$ cat NewThing.java
public final class NewThing {
   public NewThing() { }
}

As you can see, a default constructor takes no parameters and includes no code. All the necessary steps to construct a NewThing object in memory are built into the JVM, so a programmer is never directly responsible for object construction. Rather, the constructor form gives the programmer a way to alter or elaborate upon the construction process. You’ll learn how to do both with the default constructor in Chapter 8, “Using Java Constructors.”

Before we explore further with objects and data types, we need the help of two new terms, referent and reference. A referent is an actual object in Java memory. A reference is a variable. It has a declared type and a name, and it can point to the location of one referent with the same type. In a casual discussion, it’s common for a speaker to point to a variable in some code and call it “such-and-such an object” instead of calling it the thing that represents an object.


Primitives are discussed in the next section, “Distinguishing between Object References and Primitives.”

Normally it’s not important to be more precise because the speaker expects that their listeners know the difference and don’t mind the shortcut in conversation. If you’re new to such presentations, however, and you never hear it any other way, you might think the two are more or less the same thing—but they really aren’t. Distinguishing between an object reference and the thing it points to—the referent—will not only help you better understand the life cycle of Java objects, it will also help you understand how they are very different creatures from what Java calls a primitive.


We’ll stick to simple Unix commands for listing, compiling, and running code. To follow along on Windows, you’ll need to translate the commands as needed.

You can observe this relationship, in a sense, by printing a java.lang.Object referent, as shown in Listing 2-2.

Listing 2-2: PrintObject class with output

$ cat PrintObject.java
public final class PrintObject {
   public static void main(String args[]) {
      Object obj = new Object();
      System.out.println(obj);
   }
}
$ javac PrintObject.java
$ java PrintObject
java.lang.Object@12b6651

When printed, obj reveals the contents of the referent: the object’s type and a value that, for the sake of simplicity, we’ll say is a location where the object resides. Notice that the reference obj doesn’t appear to point to the content of the referent, just its location. Even this location doesn’t mean much; it’s simply an address the JVM assigned to the object. The value you see will probably differ from the one shown here. In fact, you may not even see the same value if you run this program more than once. It depends in part on what system you use to run the code.
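The printed text can also be rebuilt by hand, which may help show where the value after the @ sign comes from: the default toString() method of java.lang.Object joins the class name with the object's hash code written in hexadecimal. The sketch below is only an illustration; the class name ShowIdentity and the helper describe() are made up for this example.

```java
// Hypothetical demo class; not one of the chapter's listings.
final class ShowIdentity {
    // Rebuilds the text Object.toString() produces by default:
    // the class name, an @ sign, and the hash code in hexadecimal.
    static String describe(Object obj) {
        return obj.getClass().getName() + "@" + Integer.toHexString(obj.hashCode());
    }

    public static void main(String args[]) {
        Object obj = new Object();
        System.out.println(obj);            // default toString() text
        System.out.println(describe(obj));  // the same text, built by hand
    }
}
```

Both lines print the same text, and that text will usually differ from one run or one system to the next.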

This relationship between reference and referent is an indirect but powerful one that has several benefits. First, since a reference points only to a memory address and not actual content, it only has to have room to store the address. Referent addresses have one size, so it doesn’t matter what type of object you want to refer to. All object references have a common size. The referent is of course as big as it needs to be, up to the limit the JVM allows. The reference only has to get you to it.

Second, references are a way of expressing interest in the referent. When we remove all references to a referent—either by assigning them to other referents or the keyword null—the referent becomes expendable. The memory it consumes can be released and made available for future objects. The lifetime of any referent, therefore, depends on how many references (variables) point to it. In particular, it remains active for as long as its longest-living reference refers to it.


The process of finding expendable objects in memory and removing them is called garbage collection.

To better appreciate this relationship, let’s write a Java class called FreeCopy with a method that will return a reference to itself to the caller. Listing 2-3 shows the FreeCopy class with example output.

Listing 2-3: The FreeCopy class with output

$ cat FreeCopy.java
public final class FreeCopy {
   FreeCopy local;

   FreeCopy getOne() {
      // "this" lets the called method
      // refer to its own object
      return this;
   }

   public static void main(String args[]) {
      FreeCopy fc1 = new FreeCopy();
      FreeCopy fc2 = new FreeCopy();
      FreeCopy fc3 = fc2.getOne();
      fc3.local = fc1.getOne();
      System.out.println(fc1);
      System.out.println(fc2);
      System.out.println(fc3);
      System.out.println(fc3.local);
   }
}

$ javac FreeCopy.java
$ java FreeCopy
FreeCopy@12b6651
FreeCopy@4a5ab2
FreeCopy@4a5ab2
FreeCopy@12b6651
$

Study the output first. The addresses printed show that the first and fourth references point to one referent. The second and third references point to another. Thus there are four different references but only two FreeCopy objects among them.

Next read each statement in this program carefully and make sure you get what’s happening. (If it doesn’t turn your brain to jelly the first time through, you’re going to outgrow this book quickly.) First, note that the FreeCopy class has a member variable of type FreeCopy. The variables fc1 and fc2 each get assigned their own FreeCopy objects. fc3 is assigned the same referent as fc2. The member variable fc3.local is assigned to the same referent as fc1. The program prints out each referent’s type and location.
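If comparing printed addresses by eye feels error-prone, the == operator can make the same point: applied to two references, it tells you whether they point to one referent. The following sketch repeats the FreeCopy assignments under a made-up class name, SameReferent, so it can stand alone.

```java
// Hypothetical demo class that mirrors the FreeCopy assignments.
final class SameReferent {
    SameReferent local;

    SameReferent getOne() {
        return this;
    }

    public static void main(String args[]) {
        SameReferent a = new SameReferent();
        SameReferent b = new SameReferent();
        SameReferent c = b.getOne();
        c.local = a.getOne();
        // == on references asks: do these point to the same referent?
        System.out.println(b == c);        // true: one referent, two references
        System.out.println(a == c.local);  // true
        System.out.println(a == b);        // false: two separate referents
    }
}
```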


Thinking of Java Classes as Containers
Let’s take a moment to consider how an object can contain an object of its own type. If it doesn’t look odd to you on first sight, congratulations! It sure did to me, and it took me longer than a moment to think through it.
On one hand, you can think about it as one box that holds another box. One box has to be big enough to hold the other, of course. And while boxes are conceptually one thing, each instance is a different thing with its own properties. We don’t let the concept of a box distract us from the material reality of boxes.
In Java, we can think of a reference as the conceptual thing and a referent as the material thing. When we declare a FreeCopy variable inside a FreeCopy class, we’re saying that the referent can contain a reference of the same type. The reference is a referent location holder. It doesn’t become “real” until we assign it a referent, and it can’t get a referent until after the object that holds it is created. There’s no real potential for confusion once you think it through.

There are two ways to reduce the reference count we have on our FreeCopy referents. One, we could assign a variable to a different FreeCopy instance. Two, we could change its value to the keyword null, which reassigns the variable so it refers to no referent. What we can’t do is release or destroy the variable. Variables go away only when the scope in which we declared them expires. Two good object conservation practices derive from this lesson: One, you should declare object references (variables) in the narrowest scope where they still do something useful for you, and two, you should assign variables to null as soon as you no longer need the referent.

Distinguishing between Object References and Primitives

You’ve now learned an important subtlety of creating objects in Java. How they consume and release memory is removed from our direct view as a matter of language design. And, as you will no doubt hear, programmers who assume this memory management takes no effort to support often have painful stories to tell about bloated server programs that regularly have to be restarted. The two guidelines previously given can go a long way to avoiding that outcome: Keep your object references in as narrow a scope as you can and assign them to null as soon as you can. When you assign an object reference to null, you allow the JVM to determine whether the original referent may be cleared from memory, making the space available for a future object.

Still, there are many cases where using an object provides no benefit for the overhead required. For example, if you just need to store a simple value—an integer, floating-point number, or Boolean value—it’s easier and cheaper to use a variable that actually contains that value, not a reference to an object’s type and location. For such cases, Java provides a set of types called primitives.


Boolean values have one of two possible states: true or false.

Each primitive is a type. It just isn’t defined by a class. Instead, Java represents them with keywords that are enforced by the compiler and virtual machine. Keywords have special meaning in the language and cannot be used as variables or names. Primitives contain a literal, type-legal value only, no methods or members. Table 2-1 lists each primitive keyword and its meaning.

Table 2-1: Java primitives

Keyword   Type
boolean   true or false value
byte      8-bit integral value
short     16-bit integral value
int       32-bit integral value
long      64-bit integral value
char      16-bit Unicode value
float     32-bit floating-point value
double    64-bit floating-point value

There are several things to notice here. A boolean variable can store only the value true or false. These words are also reserved in the Java language. Many other languages allow zero or nonzero values to represent false or true. Java insists on type-specific values.

The types byte, short, int, and long all store integer values. The range of each type is defined by how many bits it gets for storage. An 8-bit range allows for 2^8, or 256, possible values. To allow for negative values, the highest bit holds the sign, leaving room for 2^7 nonnegative and 2^7 negative values. Zero is included in the nonnegative range. A byte variable, therefore, can store any value from -128 to 127. A short variable can store any value from -32768 to 32767, and so on.
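The arithmetic behind these ranges can be checked with a few lines of code. This is a sketch under the assumption that the sign takes one bit and the remaining bits hold the magnitude, as described above; the class IntegerRanges and its min() and max() helpers are invented names. The long type follows the same pattern with 64 bits.

```java
// Hypothetical demo class; the names are not from the chapter.
final class IntegerRanges {
    // Smallest value of a signed type with the given number of bits
    static long min(int bits) {
        return -(1L << (bits - 1));
    }

    // Largest value of the same type
    static long max(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    public static void main(String args[]) {
        System.out.println("byte:  " + min(8) + " to " + max(8));    // -128 to 127
        System.out.println("short: " + min(16) + " to " + max(16));  // -32768 to 32767
        System.out.println("int:   " + min(32) + " to " + max(32));  // -2147483648 to 2147483647
    }
}
```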


Number storage is defined at the bit level. Java integers follow a storage scheme known as signed two’s complement.

The char type is a funny case. Its 16-bit value range is the same as an unsigned integer’s, allowing for 2^16, or 65,536, values. Instead of mapping to a numeric range, however, the char type maps to Unicode values, a collection of character sets that spans the major written languages of the world. The characters are still stored as numbers, so you can perform numeric calculations with char values; the result is treated as a character again only if you store it back in a char variable.


Unicode characters are expressed in the form '\u####'. The four pound signs are replaced by hexadecimal digits. '\u0020', for example, is a space.
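A short experiment can show how char values take part in arithmetic. In the sketch below, adding 1 to a char produces an int, and a cast is needed to turn the result back into a character. The class name CharMath and the helper next() are hypothetical.

```java
// Hypothetical demo class for char arithmetic.
final class CharMath {
    // c + 1 has type int; the cast narrows it back to a char
    static char next(char c) {
        return (char) (c + 1);
    }

    public static void main(String args[]) {
        char letter = 'a';
        System.out.println(letter + 1);       // 98: the char is promoted to int
        System.out.println(next(letter));     // b
        System.out.println('\u0020' == ' ');  // true: the escape is a space
    }
}
```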

Floating-point values are trickier things to understand. How, for example, do you store 1/3, or a transcendental number like π, in a fixed amount of space? And how do you represent the decimal point in bit form? The short answer is that however you do it, you end up sacrificing some precision of your value in the process.

There are other issues as well, such as how to deal with values very close to zero. The definition for a floating-point primitive therefore has to lay out rules and conditions the numbers-focused programmer can rely on for storing and retrieving these values. The types float and double follow the same rules but offer different levels of precision based on their bit ranges.
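To get a feel for the precision that is given up, consider how 1/3 comes out in each floating-point type, and what happens when two decimal fractions that have no exact binary form are added. This is only a sketch; the class and method names are made up.

```java
// Hypothetical demo class for floating-point precision.
final class Precision {
    static float thirdAsFloat() {
        return 1.0f / 3.0f;
    }

    static double thirdAsDouble() {
        return 1.0 / 3.0;
    }

    public static void main(String args[]) {
        System.out.println(thirdAsFloat());    // 0.33333334
        System.out.println(thirdAsDouble());   // 0.3333333333333333
        System.out.println(0.1 + 0.2);         // 0.30000000000000004
        System.out.println(0.1 + 0.2 == 0.3);  // false
    }
}
```

Neither type stores 1/3 exactly; double simply carries the approximation further.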


The definition of Java’s floating-point types is based on a document called the “IEEE Standard for Floating-Point Arithmetic,” or IEEE 754 for short.

Just because we call these types primitive doesn’t mean they are simple. In fact, many Java programming puzzles merely test the reader’s intimate knowledge of primitive types and their behavior in uncommon cases. For example, what happens when you add 1 to a byte variable that currently stores 127? What happens when you try to divide a number by zero and store it in a floating-point variable? Both questions have answers that have nothing to do with intellect and everything to do with knowing the rules for representing numbers in machine storage. If the kind of code you write relies on intimate knowledge of these types, you’ll need to know them inside and out. Even if you don’t, not knowing the rules means, up to a point, leaving the outcomes of your code to chance.
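Both questions can be answered by running a small program. The sketch below uses invented names (EdgeCases, addOne(), divide()) and shows one way to pose each question in code. Note that the byte addition needs a cast, because adding 1 to a byte produces an int.

```java
// Hypothetical demo class for the two questions posed above.
final class EdgeCases {
    // b + 1 is an int; the cast keeps only the low 8 bits
    static byte addOne(byte b) {
        return (byte) (b + 1);
    }

    static float divide(float a, float b) {
        return a / b;
    }

    public static void main(String args[]) {
        byte b = 127;
        System.out.println(addOne(b));           // -128: the value wraps around
        System.out.println(divide(1.0f, 0.0f));  // Infinity
        System.out.println(divide(0.0f, 0.0f));  // NaN (not a number)
    }
}
```

Dividing an integer type by zero behaves differently: it stops the program with an ArithmeticException instead of producing a value.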

Primitive types behave like registers. That is, they store literal values and have no concept of pointing to a memory location. Even if you declare several primitives of one type and assign them all the same value, they will each get independent storage locations.

With that in mind, each primitive variable is said to hold a copy of some literal value. An object variable holds a copy of a referent address. In the same way that three int variables, all holding the value 7, are distinct elements of storage, so are three object references all pointing to the same referent.

This distinction causes no small amount of misunderstanding for beginning programmers. A variable that is just another name for the value 7 is different from a variable that knows where to find a 7 object in memory. Still, each variable only holds a copy of its content. No variable in Java ever has an exclusive copy of any value. We’ll study the implications of this meaning in detail in Chapter 7, “Using Java Methods to Communicate.”
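The difference shows up as soon as one variable is copied to another and the copy is changed. In this sketch (class and method names are hypothetical), copying an int copies the value itself, while copying a reference copies only the address, so both references see a change made to the referent.

```java
// Hypothetical demo class contrasting the two kinds of copy.
final class CopyDemo {
    static int independentCopy() {
        int a = 7;
        int b = a;    // b receives its own copy of the value 7
        b = b + 1;    // changes b only
        return a;     // still 7
    }

    static String sharedReferent() {
        StringBuilder s1 = new StringBuilder("seven");
        StringBuilder s2 = s1;  // s2 receives a copy of the address
        s2.append("!");         // changes the one referent both point to
        return s1.toString();   // seven!
    }

    public static void main(String args[]) {
        System.out.println(independentCopy());  // 7
        System.out.println(sharedReferent());   // seven!
    }
}
```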


In the C language, a pointer variable refers to the type and location of data in memory. Object references are not as open to manipulation as pointers are.

Declaring and Initializing Variables

I’ve defined several aspects of variables already. For the sake of review, and to collect those points under an easy-to-find heading, I’ll repeat those points and elaborate on them.

When you declare a variable in Java, you fix the type and name of one storage element in your code. These two properties endure for as long as the scope in which the declaration was made.

When you declare an instance variable in a class (one that is not defined as static), that variable is part of every object made from that class. It expires only when the object referent has no more references to maintain it.


There are other scopes in Java we won’t discuss in this guide, but these rules will apply for them as well.

If you declare a variable inside a method, it expires once the method returns control to its caller. These variables are sometimes called automatic or local variables because their lifetime is limited to the enclosing method. This immediate context allows a method body to name variables without concern for names in the enclosing class or in other methods.

We initialize variables by assignment with an expression. In the statements

int y = 25;
Object obj = new Object();

the value 25 is a numeric literal; the result of the expression new Object() is a referent. We call both of them expressions because we want to establish, even for the simplest case, that they must resolve to a value the compiler can match to the variable’s declared type.

The individual operands of an expression, however, do not have to observe this rule. That is, you could subtract one large int value from another and assign their difference to a short, assuming the difference falls within short range. Consider the class in Listing 2-4.


An expression is sometimes referred to as the right-hand side (RHS) of an assignment statement.

Listing 2-4: The Short class with a variable assignment

public final class Short {
   public static void main(String args[]) {
      short stack = 64000 - 60000;
      System.out.println(stack);
   }
}

The variable stack is a short, a 16-bit signed integer type that cannot store values greater than 32767. Thus the numbers used in the expression are too large for a short, but the difference between them is not. By the same token, if we assign the sum of two values that each fit in a short, say 32000 and 8000, but whose total exceeds what a short can contain, the compiler will reject it. For an expression made of literal values like these, it is therefore only the result you have to worry about. That said, you will encounter more than one expression in your programming career that will oblige you to work out on paper why the compiler won’t accept it.
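The rule can be tried out directly. In the sketch below, which uses made-up names, the compiler accepts a constant expression whose result fits in a short. When the operands are variables, the sum has type int, and the compiler asks for an explicit cast before it will store the result in a short.

```java
// Hypothetical demo class for assigning expressions to a short.
final class Narrowing {
    // a + b has type int, so a cast is required to return a short
    static short add(short a, short b) {
        return (short) (a + b);
    }

    public static void main(String args[]) {
        short ok = 64000 - 60000;     // constant result 4000 fits: accepted
        // short bad = 32000 + 8000;  // constant result 40000 does not fit: rejected
        short a = 32000;
        short b = 8000;
        System.out.println(ok);         // 4000
        System.out.println(add(a, b));  // -25536: the cast wraps the value
    }
}
```

The cast satisfies the compiler but does not make 40000 fit; the stored value wraps around, which is worth knowing before relying on a cast.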


It’s common in test questions to assign an expression to a char or byte variable and test the programmer’s understanding of range and type.

There are several ways to mislead a programmer with expressions that seem inappropriate to a variable in code of this sort, but if you remember that the compiler cares only about the resulting type of the expression, you’ll catch most attempts to fool you, if not all of them.

Declaring and initializing a variable in one statement makes code easy to read, but it’s not always practical. When and where you initialize a variable is influenced by factors that are sometimes at odds with each other. In methods, for example, you will often use the parameters received from a caller to modify the value of a class variable. Consider the example in Listing 2-5.

Listing 2-5: EasyMath class with addTwo() method

public final class EasyMath {
   int sum;

   public int addTwo(int x, int y) {
      sum = x + y;
      return sum;
   }
}

The variable sum is a class member. The value we want to assign it is the sum of two method parameters. If we declared sum inside the method, it would still receive the assignment, but then we’d lose the sum variable when the method returns.


To extend the communication metaphor, we sometimes refer to the parameters passed to a method as a message.

I don’t mean to make separating a variable’s declaration from its initialization sound like a problem. It is in fact a necessity. It is how one object communicates with another, by identifying a method through its name and parameter list and asking it to operate on the values provided. The only way the called object can remember that action is by providing a store for the result.

For simple code examples, this figurative language may sound fancy or academic. When discussing complex program design, however, it’s often easier to exchange ideas first using these abstract terms. I explain these terms now so you’ll be prepared for readings and conversations you’ll have in the future.

There is a problem in all of this, sort of, but it has to do with managing class code well, not the act of separating variable declaration from initialization. Consider a case where we declare a member variable in a class with many methods. Perhaps the first listed method that initializes the variable doesn’t appear in the code for another 50 lines. Perhaps there are several methods that initialize or modify the variable.

Class code doesn’t determine the order in which these methods are called. The callers do that. If there is a necessary process to setting or getting a member’s value, the programmer has some options. One is to document the necessary steps, thereby delegating responsibility to the calling programmer. Another is to enforce the behavior through code, which takes more knowledge and skill than we have at the moment. Either way, this interaction between member variables and callers through the class’s methods has to be thought out carefully in a large class.

One bit of good news, as mentioned previously, is that each method has its own scope and namespace that is separate from other methods. Any number of methods can use the same name for a local variable without worry, including reusing names declared at the class level. The variable names declared by method parameters belong to the same scope.

Keep to simple, obvious cases when reusing variable names between member and method scopes. As Listing 2-6 illustrates, it’s not hard to get into trouble.

Listing 2-6: ScopesTrial class with multiple place variables

public final class ScopesTrial {
   Object place = new Object();
   public void setObject(Object newObj) {
      Object place = newObj;
   }
}

It appears the setObject() method should update the member variable place. As written, however, a temporary variable place is assigned the value of the parameter, and the method exits. The member variable place doesn’t change. What was the intended result? It may be easy to figure out in a short code example, but as a class’s code gets longer and more complex, it gets harder to tell the difference between poor name choices and an actual code bug.
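If the intent was to update the member, one possible repair is to drop the local declaration and name the member through the this keyword. The sketch below assumes that intent; the class name ScopesFixed is invented for the example.

```java
// Hypothetical repair of the ScopesTrial listing, assuming the
// method was meant to update the member variable.
final class ScopesFixed {
    Object place = new Object();

    public void setObject(Object newObj) {
        // No type in front of the name, so no new local is declared;
        // "this.place" names the member explicitly.
        this.place = newObj;
    }

    public static void main(String args[]) {
        ScopesFixed trial = new ScopesFixed();
        Object replacement = new Object();
        trial.setObject(replacement);
        System.out.println(trial.place == replacement);  // true
    }
}
```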

If you allow more than one method to modify the same member variable, you must think through the consequences of call order. This issue will grow in proportion to the number of methods involved, so it’s important to catch it early in a growing class. Listing 2-7 is a simple case.

Listing 2-7: Converter class with implied call order

public final class Converter {
   float temp;

   void setTemperature(float newTemp) {
      temp = newTemp;
   }

   void subtract() {
      temp = temp - 32;
   }

   void divide() {
      temp = temp / 1.8f;
   }

   public static void main(String args[]) {
      Converter co = new Converter();
      float myTemp = 68.0f;
      co.setTemperature(myTemp);
      System.out.println(co.temp + " degrees F");
      co.divide();
      co.subtract();
      System.out.println(co.temp + " degrees C");
   }
}

$ javac Converter.java
$ java Converter
68.0 degrees F
5.7777786 degrees C

In the main() method we create a Converter object and call setTemperature() to set a Fahrenheit value. We then call the methods that each perform part of the conversion and print the result. If you live in the United States, it may take longer for the horror to set in, but the result is wrong: 68 degrees Fahrenheit is much closer to 20 degrees Celsius. To produce the right result, the program must subtract 32 first and divide by 1.8 second.

It’s an easy problem to fix, but it’s far better to prevent it altogether. You could document the class, warning potential callers about the need to follow the order. You could combine the methods into a single call so the class member temp is updated in one place. You could write another method that calls subtract() and divide() in the right order on behalf of the caller.

Each solution has benefits and trade-offs. Documenting sounds weak, but it means you don’t have to change the code at all, and you might need it the way it is for other reasons. Combining the methods seems ideal to me, but maybe the methods in question are used by other callers with different needs. A method that calls these methods for us also works, but then we have to make sure the caller uses the right one.

For code that just doesn’t work as expected, we don’t have to be so delicate. But for code that works mostly well, the most straightforward fix is sometimes out of the question.
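As one illustration of the third option, a method that calls subtract() and divide() on the caller's behalf might look like the sketch below. The class name OrderedConverter and the method name toCelsius() are assumptions made for this example, not part of the original Converter class.

```java
// Hypothetical variant of the Converter listing; toCelsius() is an added name.
final class OrderedConverter {
    float temp;

    void setTemperature(float newTemp) {
        temp = newTemp;
    }

    void subtract() {
        temp = temp - 32;
    }

    void divide() {
        temp = temp / 1.8f;
    }

    // Calls the two steps in the order the formula requires
    void toCelsius() {
        subtract();
        divide();
    }

    public static void main(String args[]) {
        OrderedConverter co = new OrderedConverter();
        co.setTemperature(68.0f);
        co.toCelsius();
        System.out.println(co.temp + " degrees C");  // close to 20
    }
}
```

The trade-off mentioned above still applies: subtract() and divide() remain callable, so a caller can still get the order wrong by ignoring toCelsius().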

Default Initialization of Variables

I hope you noticed something in reading the previous section. The state of an object “in between” method calls can become a question mark in otherwise legal code. As programmers writing Java classes, we have to plan how we will let methods touch the state of the object, that is, the fields that store its values.

The compiler and the JVM have a similar job at a lower level. In order to support type-safety in our code, their checks must ensure that each member has a legal value as soon as it is declared. But as you just saw, declaring and initializing every member ourselves may not have any meaning to our code: What is a default temperature? Fortunately, the compiler and JVM don’t have to figure that out either. They just have to make sure each member has a type-appropriate value before it becomes accessible.

Let’s say you wrote a class like the one in Listing 2-8.

Listing 2-8: A busy Java class

public final class Mess {
   Object obj1;

   public void setObject1(Object obj1) {
      this.obj1 = obj1;
   }

   Object obj2;

   public void setObject2(Object obj2) {
      this.obj2 = obj2;
   }

   // additional code here
}

Notice you can declare members anywhere in the class’s scope. This way you can declare variables close to where they are initialized.

It’s ugly to read but legal code. Once a Mess object is created, its members obj1 and obj2 become accessible to a caller. To protect the system’s type rules, the compiler has to ensure that these members hold legal values before they are ever accessed. It doesn’t matter to the compiler whether a value is usable in a program, only that it’s legal. This issue is important enough that the compiler will always initialize members in class scope if the programmer doesn’t. Consequently, it is always the programmer’s responsibility to establish a starting value for each field if the compiler’s default assignment makes no sense to the program.

The default value for each type is shown in Table 2-2.

Table 2-2: Default initialization values by type

Variable type            Default initialization value
boolean                  false
byte, short, int, long   0 (in the type’s bit-length)
float, double            0.0 (in the type’s bit-length)
char                     '\u0000' (NUL)
All object references    null
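These defaults are easy to confirm by declaring one member of each kind, assigning nothing, and printing what a new object holds. The class name Defaults and its field names are made up for this sketch.

```java
// Hypothetical demo class: every field is declared but never assigned.
final class Defaults {
    boolean flag;
    byte tiny;
    int count;
    long big;
    float ratio;
    double precise;
    char letter;
    Object thing;

    public static void main(String args[]) {
        Defaults d = new Defaults();
        System.out.println(d.flag);          // false
        System.out.println(d.tiny);          // 0
        System.out.println(d.count);         // 0
        System.out.println(d.big);           // 0
        System.out.println(d.ratio);         // 0.0
        System.out.println(d.precise);       // 0.0
        System.out.println((int) d.letter);  // 0: the NUL character shown as a number
        System.out.println(d.thing);         // null
    }
}
```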

Inside method code, it’s a different matter. The compiler does not assign default values to variables that are declared inside a method. The programmer is responsible for initializing local variables before they are used in an expression. If you use a local variable in an expression without initializing it, the compiler will complain, as shown in Listing 2-9.

Listing 2-9: Using a variable before initializing it

public final class DeclareInitialize {
   public void setValue() {
      int x;
      int y;
      y = x;
   }
}

$ javac DeclareInitialize.java
DeclareInitialize.java:5: variable x might not have been initialized
      y = x;
          ^
1 error

The variables x and y are declared inside the setValue() method. Notice the compiler does not complain about y; it hasn’t been used in an expression. It is x, once the compiler tests its value, that becomes suspect. The language of the complaint suggests the compiler doesn’t even know for sure. What it does know is that x is not initialized for every possible case, and that’s the standard.

Seems like a raw deal, doesn’t it? The compiler won’t do the work and complains if you don’t. Why? The compiler only has to enforce type-safety in your program. It doesn’t have to assist, according to the rules in the Java Language Specification. It provides default initialization for class members; it doesn’t for local variables. And those are the rules.

This error can also occur if the variable x was redeclared by mistake and should have referred to a class member. Remember, every method body gets its own namespace. In a large class, it’s easy to forget why or if you meant to shadow a member variable with a local one by the same name. Limit reuse of variable names for simple cases, like giving an incoming parameter the same name as the field you mean to modify, and you’ll make this mistake less often.
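To see what "initialized for every possible case" means in practice, here is a sketch of a method the compiler accepts even though x is declared without a value. Every path through the if statement assigns x before it is read. The class and method names are invented, and the if statement is used ahead of its formal introduction.

```java
// Hypothetical demo class: x is assigned on every path before use.
final class DeclareThenInitialize {
    static int pick(boolean first) {
        int x;        // declared, not yet initialized
        if (first) {
            x = 1;
        } else {
            x = 2;
        }
        int y = x;    // accepted: x has a value whichever branch ran
        return y;
    }

    public static void main(String args[]) {
        System.out.println(pick(true));   // 1
        System.out.println(pick(false));  // 2
    }
}
```

Removing the else branch would bring the "might not have been initialized" complaint back, because one path would then reach y = x without assigning x.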

Reading and Writing Object Fields

By now you might imagine there’s not much left to say about object fields. They’re variables in a referent that must be addressed by an object reference, or by using the this keyword. In Chapter 6, “Encapsulating Data and Exposing Methods in Java,” we’ll discuss why, in typical Java code, you don’t want to allow open access to your fields in this manner. For now, however, you’re learning one thing at a time, starting with Java code that is legal. Your long-term goal is writing legal code that is easy to read, reflects common practices, and looks more and more like working code as you learn what working code should look like.

Relying on direct field access for now, we can use a version of our Point3D class without methods and just set the values for x, y, z as shown in Listing 2-10.

Listing 2-10: Using a Point3D class without methods

public final class Point3D {
   int x, y, z;
   int uselessSum;

   public static void main(String args[]) {
      Point3D point = new Point3D();
      point.x = 5;
      point.y = 12;
      point.z = 13;
      point.uselessSum = point.x + point.y + point.z;
   }
}

It is legal to declare multiple variables of one type in a single statement, as was done in Listing 2-10. Mostly it saves a little space.

Once we’ve made a Point3D object, we can use the reference to access each field. In this example we just set each member’s value, but we can also use them in expressions. If we wanted to calculate point’s linear distance from (0,0,0), for example, we could write an expression using the appropriate formula and store the results to a variable, either a temporary one or a class member.
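The distance calculation mentioned here could be sketched as follows, using the usual formula: the square root of the sum of the squared coordinates. The class name Distance and the helper fromOrigin() are hypothetical; Math.sqrt() is a standard JDK method.

```java
// Hypothetical demo class: reads three coordinates in an expression.
final class Distance {
    int x, y, z;

    // Straight-line distance from (0,0,0) to (x,y,z)
    static double fromOrigin(int x, int y, int z) {
        return Math.sqrt(x * x + y * y + z * z);
    }

    public static void main(String args[]) {
        Distance point = new Distance();
        point.x = 5;
        point.y = 12;
        point.z = 13;
        // Store the result in a temporary (local) variable
        double length = fromOrigin(point.x, point.y, point.z);
        System.out.println(length);  // roughly 18.38
    }
}
```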


A Word on Direct Field Access in Objects
As simple as this mechanism is, it is not useful in general programming. Direct field access implies that whoever handles the object reference will always set values in a useful way or retrieve values at the right time (before or after a state change).
Say you wanted to use some Point3D objects to populate a map of 10000×10000×10000 points. How do you make sure no Point3D object has coordinate values that exceed the map’s limit? And if the map supports only nonnegative coordinates, how do you guard against an object with any of its coordinates set to a negative value?
You need code to guard the storage and validate any changes made to it. That’s where methods come in. Whatever services a method provides to its callers, it must also help protect the integrity of the object data from accidental or intentional misuse. For reasons that are less obvious, you might also want to protect a field from being read directly at any time by any reference holder. Again, we’ll investigate the means for managing fields in Chapter 6.

We’ve discussed type-safety a few times so far and emphasized the compiler’s role in enforcing it. This discussion might lead you to wonder how exactly you can print an object reference. How is it that any type can be printed? It’s a fine question. Answering it will help you better understand how all objects, in some way, can be converted to a stream of printable characters. Let’s start by adding the following lines to the end of the main() method in Listing 2-10.

System.out.println(point.x);
System.out.println(point.y);
System.out.println(point.z);

With these lines added, the program prints 5, 12, and 13, each on a separate line. When we read an object field, we treat it as an expression, meaning we ask the JVM to evaluate it. Whether it evaluates to a literal or a referent location, it also resolves to a type. The receiving method must accept that type as a valid parameter. The type is int in all three cases here, but this println() method will in fact accept any type our expression resolves to.

It’s not magic; the output still has to be converted to a number of printable characters (in proper sequence) for the operating system to receive and process them to an output channel (such as a terminal window). Therefore, the JVM must perform some kind of conversion behind the scenes, first by reading the object field, then by converting its type to something the operating system can handle. With that in mind, we should discuss string types; they are fundamental to every meaningful program you will ever write.
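One way to picture that conversion is through the String.valueOf() methods, which the println() documentation names as the step that turns a value into text. The sketch below calls them directly; the class ToText and its text() helpers are invented for illustration.

```java
// Hypothetical demo class: performs the value-to-text step by hand.
final class ToText {
    static String text(int value) {
        return String.valueOf(value);  // digits of the number
    }

    static String text(Object obj) {
        return String.valueOf(obj);    // obj.toString(), or "null" for no referent
    }

    public static void main(String args[]) {
        Object obj = new Object();
        System.out.println(text(5));                           // 5
        System.out.println(text(obj).equals(obj.toString()));  // true
        System.out.println(text(null));                        // null
    }
}
```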

Using the StringBuilder Class and Its Methods

You now have both motive and information enough to learn your first class interface. A class interface is just a list of fields and methods you can access through an object of that class.

On a higher level, a class interface also communicates its capabilities by the types it uses (as parameters and return values) and the names it uses for its methods and fields. When these elements are well-chosen and documented, the class usually takes less time to learn and apply. It also tends to be easier to maintain and debug.

One thing you’ll notice about string types in general is how many of their methods let you manage a string as an array of characters. A string is basically that: an ordered list of characters. It could make sense as a human-readable word or a stream of words. It could also encode complex information intended for a specific application, perhaps one that can parse it into smaller units of meaning and convert it to primitive values or other types.


You can find other JDK classes by replacing the package name elements of this link with the one you want and appending .html to the class name.

You’re ready to start reading JDK class interfaces, so let’s examine the one for java.lang.StringBuilder. All JDK classes are published online by their major release version, so make sure you find the Java SE 7 version, which is here:

http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html

Just scrolling, it looks like the StringBuilder class has a lot of methods, but many of them are variants of the same method. Two of StringBuilder’s methods, append() and insert(), account for 25 of the method entries, so it’s not as much to absorb as it looks like.


A Word on the java.lang.String Class
Strings are an essential type. Historically, Java programmers have learned the java.lang.String class first for that reason. And they have learned an odd thing along the way: If you declare and initialize a java.lang.String type in one statement, you don’t have to make an object. It is in fact legal to write String name = "Michael"; the compiler knows what to do.
This exception seems to provide a minor convenience—you don’t have to type the new keyword—but it really has to do with performance. Understanding why Java allows assignment without construction, however, means you have to think about this exception to your class-object model before you’ve spent enough time with the rules. We discuss java.lang.StringBuilder first for that reason. We’ll address the String class and the reasons behind its exceptional syntax in the next section.

I’ll use the diagram in Figure 2-1 to cut things down and focus on the key methods. In diagrams of this sort, you should usually be less concerned with full details of parameters and more concerned with the names of methods, or operations, the class makes available.

Figure 2-1: Selected methods from the StringBuilder class

c02f001.eps

The class name appears at the top followed by fields (if they are significant to what the diagram wants to illustrate), which are in turn followed by methods. Diagrams of this sort aren’t used just to convert Java code into a graphic form, although that is a useful benefit. They’re meant to focus on aspects of the class that promote one discussion topic or another. Think of these diagrams as suggestive rather than exhaustive.


You’ll learn how to overload a method in Chapter 7, “Using Java Methods to Communicate.”

Notice that I’ve replaced some parameter lists with an ellipsis (…) to indicate that the method accepts several different parameters. This technique is called overloading. If we couldn’t vary parameter lists for the same method easily, we’d probably have to rename the methods to signify their accepted types, like appendBoolean() or appendDouble(). It’s better to have two methods to think about generally instead of 25 to think about specifically. Overloading, when it’s well done, can take a big bite out of the chore of learning a new class.
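Here is a small sketch of what overloading buys you as a caller. Every call below uses the name append(), and the compiler picks the variant that matches the argument's type. The class name AppendOverloads and the helper build() are invented for this example.

```java
// Hypothetical sketch: AppendOverloads and build() are illustrative names.
final class AppendOverloads {
    static String build() {
        StringBuilder sb = new StringBuilder();
        sb.append(true);     // append(boolean)
        sb.append(' ');      // append(char)
        sb.append(42);       // append(int)
        sb.append(' ');
        sb.append(2.5);      // append(double)
        sb.append(' ');
        sb.append("done");   // append(String)
        return sb.toString();
    }

    public static void main(String args[]) {
        System.out.println(build());   // prints: true 42 2.5 done
    }
}
```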

In Figure 2-1, the methods are also grouped with related behavior to hint at the general services the class’s methods provide. The intentions of each method group are described here:

Modifying object content

append(): Converts to string form any primitive or object and adds it to the end of the current object
insert(): Puts the parameter data supplied at a specified location in the current object
delete(): Removes characters using starting and ending index numbers
setCharAt(): Changes the character at the specified index

Retrieving part of a string

charAt(): Retrieves the character at the specified index
substring(): Retrieves characters using starting and ending index numbers

Managing the actual and available size of object data

length(): Returns the number of characters in the object
capacity(): Returns how many characters the object could hold without allocating more space
trimToSize(): Reduces the capacity of the object to its current length, if possible

Providing surprisingly useful services

reverse(): Inverts the current order of the characters
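A short test program can exercise one method from each group. Treat the following as a sketch rather than a reference: the class name StringBuilderTour and its helper methods are made up, and the comments track the content I expect after each call. The capacity line carries no expected value on purpose, because trimToSize() only promises to reduce capacity if possible.

```java
// Hypothetical sketch: StringBuilderTour, modify(), and reversed() are illustrative names.
final class StringBuilderTour {
    // Modifying object content
    static String modify() {
        StringBuilder sb = new StringBuilder("Java");
        sb.append(7);           // "Java7"
        sb.insert(4, " SE ");   // "Java SE 7"
        sb.delete(4, 7);        // removes indexes 4 through 6: "Java 7"
        sb.setCharAt(5, '8');   // "Java 8"
        return sb.toString();
    }

    // Providing surprisingly useful services
    static String reversed(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String args[]) {
        StringBuilder sb = new StringBuilder(modify());
        System.out.println(sb);                   // Java 8
        System.out.println(sb.charAt(0));         // J
        System.out.println(sb.substring(0, 4));   // Java
        System.out.println(sb.length());          // 6
        sb.trimToSize();
        System.out.println(sb.capacity());        // run it to see what you get
        System.out.println(reversed("Java 8"));   // 8 avaJ
    }
}
```

Notice that the ending index in delete() and substring() is exclusive: the character at that position is left alone.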

Also, there are four ways to construct a StringBuilder object:

  • No parameter: the initial capacity is 16 characters; initial length is 0.
  • With an int: capacity is set to the value given (must be non-negative).
  • With a String object.
  • With a CharSequence object.
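The four forms look like this in code. This is only a sketch: the class name StringBuilderConstructors and the helper capacityFor() are hypothetical, and I've left out the expected output so you can predict it first and then check the documentation.

```java
// Hypothetical sketch: StringBuilderConstructors and capacityFor() are illustrative names.
final class StringBuilderConstructors {
    // Reports the initial capacity of a StringBuilder made from the given text.
    static int capacityFor(String text) {
        return new StringBuilder(text).capacity();
    }

    public static void main(String args[]) {
        StringBuilder empty = new StringBuilder();           // no parameter
        StringBuilder sized = new StringBuilder(64);         // with an int
        StringBuilder fromString = new StringBuilder("Java");
        CharSequence chars = "Java";                         // String is one kind of CharSequence
        StringBuilder fromChars = new StringBuilder(chars);

        System.out.println("empty: " + empty.capacity() + ", " + empty.length());
        System.out.println("sized: " + sized.capacity() + ", " + sized.length());
        System.out.println("fromString: " + fromString.capacity() + ", " + fromString.length());
        System.out.println("fromChars: " + fromChars.capacity() + ", " + fromChars.length());
        System.out.println("capacityFor: " + capacityFor("Java"));
    }
}
```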

A Word on Reading Class Interfaces
Breaking a class interface into smaller, logical groups takes time. You’ll get faster and more precise with experience. Here are some basic tips to start:
  • For one method name, ignore any parameter differences.
  • Look for logical name pairs (open/close, add/remove).
  • Create your own categories. You can refine them as you go.

To learn how to apply these methods and constructors, you’ll want to do two things. One, write simple test programs for them to confirm your understanding. Two (and more important), get into the habit of writing test code. Sooner or later, working knowledge of any class comes from using it; there is no substitute. Listing 2-11, as an example, tests the StringBuilder no-arg constructor.

Listing 2-11: A test of the default StringBuilder constructor

public final class StringBuilderNoParam {
   public static void main(String args[]) {
      StringBuilder sb = new StringBuilder();
      System.out.println("capacity: " + sb.capacity());
      System.out.println("length: " + sb.length());
   }
}
$ javac StringBuilderNoParam.java
$ java StringBuilderNoParam
capacity: 16
length: 0
$

As expected, the object’s capacity was set to 16 and length was set to zero. It’s nice (or a relief!) when your findings match the documentation, but you should also write tests you expect to fail. A good starting point for such tests is to apply values that are type-legal but don’t make sense in the context of the class, as Listing 2-12 shows.

Listing 2-12: A test of the StringBuilder(int capacity) constructor

public final class StringBuilderTestNegativeInt {
   public static void main(String args[]) {
      StringBuilder sb = new StringBuilder(-5);
      System.out.println("capacity: " + sb.capacity());
      System.out.println("length: " + sb.length());
   }
}

The code in Listing 2-12 will compile because -5 falls within a Java integer’s range. It should not work because a new StringBuilder with negative capacity has no meaning. What we need to know is how the StringBuilder class handles this input. Let’s run it and find out:

$ javac StringBuilderTestNegativeInt.java
$ java StringBuilderTestNegativeInt
Exception in thread "main" java.lang.NegativeArraySizeException
        at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:64)
        at java.lang.StringBuilder.<init>(StringBuilder.java:97)
        at StringBuilderTestNegativeInt.main(StringBuilderTestNegativeInt.java:3)

Exceptions are the topic of Chapter 11, “Throwing and Catching Exceptions in Java.”

The output we receive is what’s called a stack trace. At the top of the stack, we are told what went wrong. In Java, the term exception applies to an error in the program. Errors in Java also take the form of a class, in this case a NegativeArraySizeException class.


We say exceptions are thrown because they alter the desired flow of the program when they occur.

The JVM traced this exception back to line 3 of our code, where we called the constructor and passed in -5. It turns out the StringBuilder constructor passes this value along to a class called AbstractStringBuilder, which throws the exception. In short, our code got past the compiler but then failed at run time: nothing caught the exception, so the JVM ended the program. We’re fortunate in a sense, though: the name of the exception alone tells us what’s wrong with the program or, in our case, confirms that the class won’t tolerate a negative value for the object’s capacity.
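If you'd rather have a test report the failure than stop on it, you can catch the exception. The sketch below borrows the try/catch syntax from Chapter 11, so don't worry about the details yet. The class name NegativeCapacityCheck and the helper tryCapacity() are made up for this example.

```java
// Hypothetical sketch: NegativeCapacityCheck and tryCapacity() are illustrative names.
final class NegativeCapacityCheck {
    // Returns the name of the exception thrown, or "none" if construction succeeded.
    static String tryCapacity(int capacity) {
        try {
            new StringBuilder(capacity);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String args[]) {
        System.out.println(tryCapacity(16));   // none
        System.out.println(tryCapacity(-5));   // NegativeArraySizeException
    }
}
```

Written this way, one program can try several values in a row, including the ones you expect to fail.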

That’s all it takes to write a simple test for any StringBuilder method, including the ones that were outlined in this chapter and the ones that weren’t. I didn’t describe the methods that I don’t think will mean much to you right away, but you may feel differently upon browsing the documentation. Express what you think is useful by testing it! Make sure the methods you want to use behave as expected and fail as expected. Sure, it can be slow and tedious work, but the idea that you can blindly rely on code from another source has far bigger consequences for code you will truly depend on.

Creating and Manipulating Strings

There are actually quite a few classes that operate on strings. There’s no simple way to list all of them from the JDK, but you can find the class names that start with the word String. For starters, browse this web page:

http://docs.oracle.com/javase/7/docs/api/allclasses-noframe.html

You can search for all the class names that have the word String in them to give you an idea. So why so many string types? Why isn’t java.lang.String enough? One reason is that certain kinds of string data or forms, such as XML content or a URL, are easier to manage using an object with methods that are dedicated to handling them. But we seem to have more than one general-purpose string class available too. I said before that the String class has a special place in the compiler’s heart. If I simply assign a literal string, signified by double-quotes around a sequence of characters, to a String reference, the compiler will accept it:

String name = "Michael";

But why? Shouldn’t we have to construct an object? As it turns out, there are a couple of factors that make the String class special.

It turns out the String class tries to reduce the cost of its own objects. In some production applications, String objects can consume 25 to 40% of the memory space required. Object creation cost adds up fast on that scale, and repeatedly using the same strings is usually a big part of that. To mitigate object creation cost, the String class maintains what is called an intern pool, a cache that stores String values. These are referents that survive even when no object reference points to them.


Caching is a common technique for storing frequently reused data in a more convenient memory location.

You know from Chapter 1 that the String class loads very early in the start process of any program, so the intern pool is available at the beginning. “Interned” strings, or just “interns,” when they are added, stick with the class. If a String assignment introduces a new value, it’s added to the intern pool.

When a subsequent assignment that does not use the new keyword matches an existing intern, the reference is made to point to it. The intern pool itself is not free of course, but it provides a net reduction in resource demand. By trading off object construction for a lookup of existing strings, the class prefers to pay with additional memory for the savings in performance.
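You can watch the intern pool at work with the == operator, which compares two references to see whether they point to the same referent (it does not compare characters). This is a sketch under that one assumption. The class name InternCheck and the helper sameReferent() are invented for illustration, while intern() and equals() are methods of the String class.

```java
// Hypothetical sketch: InternCheck and sameReferent() are illustrative names.
final class InternCheck {
    // True only when both references point to the same referent.
    static boolean sameReferent(String a, String b) {
        return a == b;
    }

    public static void main(String args[]) {
        String first = "Michael";               // new value: added to the intern pool
        String second = "Michael";              // matches the existing intern
        String built = new String("Michael");   // new keyword: a separate referent

        System.out.println(sameReferent(first, second));          // true
        System.out.println(sameReferent(first, built));           // false
        System.out.println(sameReferent(first, built.intern()));  // true
        System.out.println(first.equals(built));                  // true: same characters
    }
}
```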

The intern pool stores these referents as efficiently as it can. One thing it cannot do, as a result, is allow for the referents to change, either in size or in content. Unlike a StringBuilder referent, which directly supports changing its capacity, a String referent is immutable. Its content and size are not open to change.

You can think of a String object as a storage box you have packed perfectly full and whose sides can’t bulge. There’s no way to add objects, nor can you replace objects without disturbing the entire arrangement. The trade-off for optimal packing is zero flexibility. Notice, however, that the String class interface doesn’t say anything about immutable objects. In fact, methods like concat(), replace(), and toLowerCase() all encourage modifying a referent’s contents, whether you used direct assignment or construction to make it.

In fact, these methods end up constructing new objects behind the scenes and pointing you to the new referent. There’s a great deal of convenience in this approach, but it’s expensive. You may not appreciate the cost until you’ve used a very large number of these operations. Consider the example in Listing 2-13.

Listing 2-13: Sample code manipulating a string

public final class ImmutableTest {
   public void stringFun() {
      String test = new String("Jackpot!");
      test = "bingo";
      test = test.toUpperCase();
      test = test.concat(test);
      System.out.println(test);
   }
   public static void main(String args[]) {
      ImmutableTest it = new ImmutableTest();
      it.stringFun();
   }
}
$ javac ImmutableTest.java
$

Here we treat the method stringFun() as a procedure. It changes the object’s state over a number of statements but returns no value. If you have several tests you want to run as a single program, you can write each one as its own procedure, then use main() to call each one in turn. Take care that each method returns the data to a state the other tests can use, or write a testSuite() method that governs the call order, resetting data, and so on.

In Listing 2-13, we assign a constructed referent for test, then reassign it to an interned string. We call some methods to modify the content and print the result. You can of course insert additional println() calls to observe each change.

First ask yourself what the output should be. Answer before you run this code. It is practice that will absolutely pay off on your exams, so don’t cheat yourself. Add in the println() statements if you can’t see why the output is what it is. It’s a common and honorable tactic when you’re really unsure why your code works the way it does. However you go about it, you can figure out the results.

What’s completely hidden is the number of String objects that were made in the course of this program. It’s certainly more than the one explicit construction we have. If String referents are immutable, then every change to the data requires a new referent to assign the object reference.
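A small experiment can at least show that a “modifying” method leaves the original referent alone. It doesn't count the objects created, and it leans on the == operator to compare references. The class name ImmutableCheck and its two helper methods are hypothetical.

```java
// Hypothetical sketch: ImmutableCheck and its helper methods are illustrative names.
final class ImmutableCheck {
    // Calls toUpperCase() but ignores what it returns.
    static String afterIgnoredCall() {
        String word = "hello";
        word.toUpperCase();    // the uppercase result is a new referent we throw away
        return word;           // the original referent is untouched
    }

    // Keeps the result and compares the two references.
    static boolean sameReferentAfterChange() {
        String word = "hello";
        String upper = word.toUpperCase();
        return word == upper;  // false: upper points to a different referent
    }

    public static void main(String args[]) {
        System.out.println(afterIgnoredCall());          // hello
        System.out.println(sameReferentAfterChange());   // false
    }
}
```

That's why Listing 2-13 assigns each result back to test. Without the assignment, the changed string would simply be lost.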

Proving this action, unfortunately, takes more knowledge and skill than we’ve covered so far. But if you wondered why you should learn StringBuilder in the first place, and you trust what you’ve read so far, you have a likely answer. StringBuilder is a better choice for performance than String if you intend to manipulate your referent’s content with wild abandon. It was in fact introduced to the JDK for that specific purpose.
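To see the two approaches side by side, consider this sketch, which builds the same text both ways. It uses a for loop, which we haven't covered yet, and it makes no attempt to measure speed. The class name RepeatedAppend and both helper methods are invented for the example.

```java
// Hypothetical sketch: RepeatedAppend, withString(), and withBuilder() are illustrative names.
final class RepeatedAppend {
    // Each pass through the loop points result at a brand-new String referent.
    static String withString(int count) {
        String result = "";
        for (int i = 0; i < count; i++) {
            result = result.concat("*");
        }
        return result;
    }

    // Each pass changes the content of the one StringBuilder referent.
    static String withBuilder(int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append('*');
        }
        return sb.toString();
    }

    public static void main(String args[]) {
        System.out.println(withString(5));                          // *****
        System.out.println(withBuilder(5));                         // *****
        System.out.println(withString(5).equals(withBuilder(5)));   // true
    }
}
```

The results are identical. The difference lies in how many referents each approach goes through to get there.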


If you haven’t already, do look at the class interface and documentation for the String class.

Let’s not assume, however, that StringBuilder never constructs temporary objects behind the scenes on its own. It certainly does, and when we get to Chapter 4, “Using Java Arrays,” we’ll dig into the dark secret all string types share. To know exactly how each string class goes about this business, you’d need documentation that details their inner workings or the source code itself. Until you’re ready for that, you have to rely on what the methods tell you and hope it’s all to the good.

On a closing note, here’s the output for Listing 2-13:

$ javac ImmutableTest.java
$ java ImmutableTest
BINGOBINGO
$

The Essentials and Beyond
This chapter covered the fundamentals of Java’s two general data types: objects and primitives. You learned how to declare and initialize them at the class and method level and why the compiler’s role is different for each. You also learned how to distinguish between variables that are object references and those that store primitive values. We then reviewed the interface of the StringBuilder class and why you should prefer it, in the general case, to the String class.
Additional Exercises
1. Using Listing 2-3, draw boxes to contain each reference and circles to contain each referent as declared in the main() method. Draw arrows from each reference to its referent.
2. Write a class called BadShort. In its main() method, declare a short variable and assign it the value 60000. Compile the code. What does the output mean?
3. Write a class that declares a member of each primitive type and one Object reference. Do not initialize them. In the main() method, create an object of the class and print out the value of each member using System.out.println().
4. Using the code in Listing 2-8, write a colloquial expression that will calculate the distance between (0,0,0) and (point.x, point.y, point.z).
5. Write some code to construct a StringBuilder object using a familiar phrase or sequence of characters. Print out the reverse of the phrase.
Review Questions
1. What would the following statement in a main() method do?
new Object();
A. Nothing; there’s no reference for it.
B. Create a referent.
C. Halt the program.
D. Throw an exception.
2. What does immutability mean?
A. The object reference can’t be modified.
B. The object referent can’t be released.
C. The object reference can’t be released.
D. The object referent can’t be modified.
3. What’s the largest positive value the long type can contain?
A. 2^64 + 1
B. 2^63 + 1
C. 2^64 - 1
D. 2^63 - 1
4. Which primitive types can hold the value 256?
A. byte only
B. short, int, and long
C. byte and short only
D. byte, short, and int
5. What is the capacity value of a StringBuilder object that is constructed as follows:
StringBuilder sb = new StringBuilder("singleandLOVINGit!");
A. 16 characters
B. 18 characters
C. 19 characters
D. 34 characters
6. Identify the true statements. (Choose all that apply.)
A. For a String, capacity would be a redundant property.
B. A StringBuilder’s capacity is always greater than its length.
C. An empty String has a length of one.
D. A StringBuilder’s length is never greater than its capacity.
7. Which of the following is a Java keyword for a primitive type?
A. Boolean
B. integer
C. Char
D. byte
8. Which of the following values can be used to create a StringBuilder object?
A. 15 empty spaces
B. Nothing
C. Any Java keyword
D. All of the above
9. In which case will the String class intern a new value?
A. When it has been constructed the second time
B. When it has been assigned for the first time
C. When it has been declared in the main() method
D. Anytime it has been directly assigned
10. Which variable type stores a copy of its assigned value?
A. Primitives
B. Object references
C. Both
