Chapter 5. Names, Arrays, Operators, and Accuracy

This chapter covers more of the language basics: names, arrays, how operators work, and the accuracy you can expect in arithmetic. The chapter finishes up by presenting the standard class java.lang.Math.

Names

What is the difference between an identifier and a name? As we saw in Chapter 4, “Identifiers, Keywords, and Types,” an identifier is just a sequence of letters and digits that doesn’t match a keyword or the literals “true,” “false,” or “null.” A name, on the other hand, can be prefixed with any number of further identifiers to pinpoint the namespace from which it comes. An identifier is thus the simplest form of name. The general case of name looks like the following:

package1.Package2.PackageN.Class1.Class2.ClassM.memberN 

Since packages can be nested in packages, and classes nested in classes, there can be an arbitrary number of identifiers separated by periods, as in:

java.lang.System.out.println( "goober" ); 

That name starts with the java.lang package. There are several packages in the java hierarchy, and java.lang is the one that contains basic language support. One of the classes in the java.lang package is System. The class System contains a field called out, which is an object of the PrintStream class. PrintStream supports several methods, including one called println() that takes a String as an argument. It’s the way to send text to the standard output of a program.

By looking at a lengthy name in isolation, you can’t tell where the package identifiers stop and the class and member identifiers start. You have to do the same kind of evaluation that the compiler does. Since the namespaces are hierarchical, if you have two identifiers that are the same, you can say which you mean by providing another level of name. This is called qualifying the name. For example, if you define your own class called BitSet, and you also want to reference the class of the same name that is in the java.util package, you can distinguish them like this:

          BitSet myBS = new BitSet(); 
java.util.BitSet theirBS = new java.util.BitSet(); 

A namespace isn’t a term that occurs in the Java Language Specification. Instead, it’s a compiler term meaning “place where a group of names are organized as a whole.” By this definition, all the members in a class form a namespace. All the variables in a method form a namespace. A package forms a namespace. Even a local block inside a method forms a namespace.

A compiler will look for an identifier in the namespace that most closely encloses it. If not found, it will look in successively wider namespaces until it finds the first occurrence of the correct identifier. Java also uses the context to resolve names. You won’t confuse Java if you give the same name to a method, to a data field, and to a label. It puts them in different namespaces. When the compiler is looking for a method name, it doesn’t bother looking in the field namespace.
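As a minimal sketch of these separate namespaces, the hypothetical class below (SameName is an illustrative name, not something from the text) gives one identifier to a field, a method, and a label. The compiler picks the right one from the context in which the name is used.

```java
class SameName {
    int grams = 5;                          // a field called grams

    int grams() { return grams * 2; }       // a method called grams

    int count() {
        int total = 0;
        grams:                              // a label called grams
        for (int i = 0; i < 10; i++) {
            if (i == 3) break grams;        // label context
            total += grams;                 // field context
        }
        return total + grams();             // method context
    }

    public static void main(String[] args) {
        System.out.println(new SameName().count());   // prints 25
    }
}
```

This compiles, but reusing one name this way is shown only to illustrate the rule. It is not a style to imitate.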

When Can an Identifier Be Forward-Referenced?

A forward reference is the use of a name before that name has been defined, as in the following:

class Fruit {
     void foo() { grams = 22; } // grams not yet declared 

     int grams; 
} 

A primitive field needs to appear before it is used only when the use is in the initialization of a field, like this:

int i = grams; 

In the first example above, the use is in a method, so this is a valid forward reference to the field grams. The declaration of a class never needs to appear before the use of that class, as long as the compiler finds it at some point during that compilation.
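The two cases can be put side by side in one small sketch. The class name ForwardRef and its members are made up for illustration; the commented-out line is the one a compiler would reject.

```java
class ForwardRef {
    // Legal: a method body may use a field that is declared further down.
    int doubled() { return grams * 2; }

    // Not legal if uncommented: a field initializer may not use the simple
    // name of a field that is declared later in the class.
    // int early = grams;

    int grams = 22;

    int late = grams;       // fine, grams has already been declared

    public static void main(String[] args) {
        ForwardRef f = new ForwardRef();
        System.out.println(f.doubled() + " " + f.late);   // prints 44 22
    }
}
```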

Expressions

There’s a lengthy chapter in the Java Language Specification on expressions covering many cases that would be interesting only to language lawyers. What it boils down to is that an expression is any of the alternatives shown in Table 5-1.

Table 5-1. Expressions in Java

Expression                               Example of Expression
a literal                                245
this object reference                    this
a field access                           plum.grams
a method call                            plum.total_calories()
an object creation                       new Fruit( 3.5 )
an array creation                        new int[27]
an array access                          myArray[i][j]
any expression connected by operators    plum.grams * 1000
any expression in parens                 ( plum.grams * 1000 )

You evaluate an expression to get a result that will be a variable (as in evaluating this gives you an object you can store into), a value, or nothing (a void expression). You get the last by calling a method with a return value of void.

An expression can appear on either side of an assignment. If it is on the left-hand side of an assignment, the result designates where the evaluated right-hand side should be stored.

The type of an expression is known either at compile time or checked at runtime to be compatible with whatever you are doing with the expression. There is no escape from strong typing in Java.

Arrays

In this section we introduce arrays and describe how to use them.

In Java, arrays are objects. That means array types are reference types, and your array variable is really a reference to an array.

Java arrays are allocated dynamically and keep track of their length. It’s the same drill that we saw with classes. What looks like the declaration of an array:

int day[]; 

is actually a variable that will point to an array-of-ints. When we finally fill in the pointer, it’s good for any size of int array, and in the course of execution it can point to different arrays of different sizes.

For example, when you write an array as the parameter to a method, you write it like this:

void main(String[] args) { ... 

That allows arrays of arbitrary bounds as parameters, because String[] matches any size array of strings.

Array Subscripts Start at Zero

Array subscripts start at zero, so the valid indexes of an array run from 0 up to one less than its length. All array indexes are checked at runtime. If a subscript attempts to access an element outside the bounds of its array, it causes an exception and the program will cease execution rather than overwrite some other part of memory. Exceptions are described in a later chapter.
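Here is a small sketch of the runtime check. The class and method names are hypothetical. An uncaught exception would stop the program; the sketch catches it only so that each outcome can be reported.

```java
class BoundsDemo {
    // Returns true if the store succeeded, false if the index was rejected.
    static boolean tryStore(int[] a, int index) {
        try {
            a[index] = 1;
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] day = new int[3];                   // valid subscripts are 0, 1, 2
        System.out.println(tryStore(day, 0));     // prints true
        System.out.println(tryStore(day, 2));     // prints true
        System.out.println(tryStore(day, 3));     // prints false
        System.out.println(tryStore(day, -1));    // prints false
    }
}
```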

Here are some ways in which arrays are like objects:

  • They are objects because the language specification says so (section 4.3.1).

  • Array types are reference types, just like object types.

  • Arrays are allocated with the “new” operator, similar to constructors.

  • Arrays are always allocated on the heap, never on the stack.

  • The parent class of all arrays is Object, and you can call any of the methods of Object, such as toString(), on an array.

On the other hand, here are some ways arrays are not like objects:

  • You can’t make an array be the child of some class other than Object.

  • Arrays have a different syntax from other object classes.

  • You can’t define your own methods for arrays.

Regard arrays as funny kinds of objects that share some key characteristics with regular objects. Operations that are common to all objects can be done on arrays. Operations that require an Object as an operand can be passed an array. The length of an array (the number of elements in it) is a data field in the array class. For example, you can get the size of an array by referencing the following:

myArray.length      // yes 

People always want to treat that as a method call, and write the following:

myArray.length()    // NO! NO! NO! 

To remember this, remind yourself that arrays only have the method calls defined in java.lang.Object, and length() isn’t one of them. So length must be a field for arrays. java.lang.String, on the other hand, is a regular class in all respects, and it has a length() method.

Creating an Array

When you declare an array, as in the following example, that declaration says “carrot can hold a reference to any size array of int.”

int carrot []; 

You have to make the reference point to an array before you can use it, just as with class types. You might make it point to an existing array, or you might create the array with a new expression, just as with objects.

carrot = new int[100]; 

Once an array has been created, it cannot change in size. You can make the reference variable point to a bigger array into which you copy the same contents.
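One way to "grow" an array, then, is to allocate a bigger one and copy the old contents across. The helper below is a sketch with a made-up name (grow); it uses the standard System.arraycopy() method and assumes the new size is at least the old length.

```java
class GrowDemo {
    // Returns a new, larger array holding the contents of old.
    // Assumes newSize >= old.length.
    static int[] grow(int[] old, int newSize) {
        int[] bigger = new int[newSize];
        System.arraycopy(old, 0, bigger, 0, old.length);
        return bigger;
    }

    public static void main(String[] args) {
        int[] carrot = { 1, 2, 3 };
        carrot = grow(carrot, 5);                 // carrot now refers to the bigger array
        System.out.println(carrot.length);        // prints 5
        System.out.println(carrot[2] + " " + carrot[4]);   // prints 3 0
    }
}
```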

When you create an object, the fields that are primitive types are created and initialized to zero. The fields that are reference types are initialized to null (don’t point to anything yet). It is exactly the same with arrays.

If the array element type is a primitive type, the values are created when the array is new’d.

carrot = new int [256]; // creates 256 ints 
carrot[7] = 32;          // ok, accesses 1 element. 

If the array elements are a reference type, you get 256 references to objects and you must fill them in before use!

Fruit carrot [] = new Fruit[256]; // creates 256 references 
carrot[7].grams = 32; // NO! NO! NO! 

You need to make each individual reference element point to an object before you can access the object. You need to do something like this:

Fruit carrot [] = new Fruit[256]; 
for (int i=0; i<carrot.length; i++) {
     carrot[i] = new Fruit(); 
} 

Failing to create the objects in an array of reference types is the most common novice mistake with arrays, and it causes a NullPointerException error.
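The difference between the two element kinds can be seen in a short sketch. It uses String rather than Fruit as the reference type so that it is self-contained, and the class and method names are invented for illustration.

```java
class DefaultsDemo {
    static boolean touchingUnfilledElementFails() {
        String[] names = new String[4];       // four references, all null
        try {
            names[0].length();                // no String object there yet
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] counts = new int[4];            // four ints, all zero
        System.out.println(counts[0]);                        // prints 0
        System.out.println(touchingUnfilledElementFails());   // prints true
    }
}
```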

Initializing an Array

You can initialize an array in its declaration with an array initializer like this:

byte b[] = { 0, 1, 1, 2, 3 }; 
String wkdays[] = { "Mon", "Tue", "Wed", "Thu", "Fri", }; 

A superfluous trailing comma is allowed in an initialization list—an unnecessary carryover from C. The permissible extra trailing comma is claimed to be of use when a list of initial values is being generated automatically.

A new array object is implicitly created when an array initializer expression is evaluated. You can’t use an array initializer anywhere outside a declaration, like in an assignment statement. So this is not valid:

wkdays = { "Mon", "Tues" }; // NO! NO! NO! 

But it is really useful to be able to allocate and initialize an array in a statement, so array creation expressions were brought to the rescue. An array creation expression provides the explicit extra information about the type of the thing in braces. This is valid:

wkdays = new String[] { "Mon", "Tues" }; 

That new type[] {values... } is an array creation expression, and it was introduced in JDK 1.1. You can use an array creation expression in a declaration or anywhere a reference to an array is expected. Here is one in a declaration.

Fruit orchard[] = new Fruit [] {new Fruit(), 
                                new Fruit(4,3), 
                                null }; 

There is a method called arraycopy() in class java.lang.System that will copy part or all of an array, like this:

String midweek[] = new String[3]; 
System.arraycopy ( wkdays /*src*/, 1 /*offset*/, 
                   midweek /*dest*/, 0 /*offset*/, 3 /*len*/ ); 

You can clone an array, like this:

int p[] = new int[10]; 
int p2[] = (int[]) p.clone(); // makes a copy of p 

Whereas cloning creates the new array, arraycopy just copies elements into an existing array. As with the clone of anything, you get back an Object which must be cast to the correct type. That’s what ( int[] ) is doing.
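The sketch below puts the two copying techniques next to each other. The class and method names are hypothetical, and String.join() is used only to print the result on one line.

```java
class CopyDemo {
    // Copies three elements, starting at index 1, into an array that already exists.
    static String[] midweek(String[] wkdays) {
        String[] result = new String[3];
        System.arraycopy(wkdays, 1, result, 0, 3);
        return result;
    }

    public static void main(String[] args) {
        String[] wkdays = { "Mon", "Tue", "Wed", "Thu", "Fri" };
        System.out.println(String.join(",", midweek(wkdays)));   // prints Tue,Wed,Thu

        int[] p = { 1, 2, 3 };
        int[] p2 = (int[]) p.clone();      // clone creates the new array itself
        p2[0] = 99;                        // changes only the copy
        System.out.println(p[0] + " " + p2[0]);                  // prints 1 99
    }
}
```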

Arrays of Arrays of …

The language specification says there are no multidimensional arrays in Java, meaning the language doesn’t use the convention of Pascal or Ada to put several indexes into one set of subscript brackets. Ada allows multidimensional arrays like this:

year : array(1..12, 1..31) of real;

year(i,j) := 924.4;

Ada code for multidimensional array.

Ada also allows arrays of arrays, like this:

type month is array(1..31) of real;

year : array(1..12) of month; 
year(i)(j) := 924.4; 

Ada code for array of arrays.

Java arrays of arrays are declared like this:

Fruit plums [] [] ; 

Array “plums” is an array whose elements are themselves arrays, and the elements of those arrays are references to Fruit objects. You can allocate and assign to any of the arrays individually.

plums = new Fruit [23] [9]; // an array[23] of array[9] 
plums [i] = new Fruit [17]; // an array[17] 
plums [i][j] = new Fruit(); // an individual Fruit 

Because object declarations do not create objects (I am nagging about this repeatedly because it’s an important point), you will need to fill out or instantiate the elements in an array before using it. If you have an array of arrays, like the one below, you will need to instantiate both the top-level array and at least one bottom-level array before you can start storing ints:

int cabbage[][]; 

The bottom-level arrays do not have to all be a single uniform size. Here are several alternative and equivalent ways you could create and fill a triangular array of arrays:

  • Use several array creation expressions, like this:

    int myTable[][] = new int[][] {
                          new int[] {0}, 
                          new int[] {0,1}, 
                          new int[] {0,1,2}, 
                          new int[] {0,1,2,3}, 
                      }; 
  • Lump all the initializers together in a big array initializer, like this:

    int myTable[][] = new int[][] {
                          {0}, 
                          {0,1}, 
                          {0,1,2}, 
                          {0,1,2,3}, 
                      }; 
  • Initialize individual arrays with array creation expressions, like this:

    int myTable[][] = new int[4][]; 
    // then in statements 
    myTable[0] = new int[] {0}; 
    myTable[1] = new int[] {0, 1}; 
    myTable[2] = new int[] {0, 1, 2}; 
    myTable[3] = new int[] {0, 1, 2, 3}; 
  • Use a loop, like this:

    int myTable[][] = new int[4][]; 
    ... // later in statements 
    for( int i=0; i<myTable.length; i++) {
           myTable[i] = new int [i+1]; 
           for (int j=0; j<=i; j++) 
                  myTable[i][j]=j; 
    } 

    This could be done in a static block (if myTable is static) or in a constructor.

If you don’t instantiate all the dimensions at one time, you must instantiate the most significant dimensions first. For example:

int cabbage[][] = new int[5][];   // ok 
int cabbage[][] = new int[5][3];  // ok 

but:

int cabbage[][] = new int[][3];   // NO! NO! NO! 

Arrays with the same element type, and the same number of dimensions (in the C sense, Java doesn’t have multidimensional arrays) can be assigned to each other. The arrays do not need to have the same number of elements because (as you would expect) the assignment just copies one reference variable into another. For example:

int eggs[] = {1,2,3,4}; 
int ham[] = new int[] {77, 96}; 
ham = eggs; 
ham[3] = 0;    // OK, because ham now has 4 elements. 

This doesn’t make a new copy of eggs; it makes ham and eggs reference the same array object.
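A runnable sketch of the same aliasing follows. The class and method names are illustrative only. After the assignment there is one array and two references to it, so a store through one name is visible through the other.

```java
class AliasDemo {
    static int[] share() {
        int[] eggs = { 1, 2, 3, 4 };
        int[] ham = { 77, 96 };
        ham = eggs;          // copies the reference, not the elements
        ham[3] = 0;          // writes through to the one shared array
        return eggs;
    }

    public static void main(String[] args) {
        System.out.println(share()[3]);    // prints 0
    }
}
```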

Watch the size of those arrays of arrays. The following declaration allocates an array of 4 * 250 * 1000 * 1000 = 1GB.

int bubba[][][] = new int[250][1000][1000]; 

Do you have that much memory on your system? In 1998, that was a joke. In 2001, 1GB was about $120 worth of synchronous 133MHz DRAM, with free shipping from the Yahoo store, and you could buy a Solaris UltraSPARC 64bit workstation running a 64 bit operating system, with a 64 bit version of Java preinstalled for $995. This is the power of Moore’s Law.

Have Array Brackets, Will Travel

There is a quirk of syntax in that the array declaration bracket pairs can “float” to be next to the element type, to be next to the data name, or to be in a mixture of the two. The following are all valid array declarations:

int a [] ; 
int [] b = { a.length, 2, 3 } ; 

char c [][] = new char[12][31]; 
char[] d [] = { {1,1,1,1}, {2,2,2,2} }; // creates d[2][4] 
char[][] e; 

byte f [][][] = new byte [3][3][7]; 
byte [][] g[] = new byte [3][3][7]; 

short [] h, i[], j, k[][]; 

If array brackets appear next to the type, they are part of the type, and apply to every variable in that declaration. In the code above, “j” is an array of short, and “i” is an array of arrays of short.

This is mostly so declarations of functions returning arrays can be read more normally. Here is an example of how returning an array value from a function would look following C rules (you can’t return an array in C, but this is how C syntax would express it if you could):

int funarray()[] { ... }           Pseudo-C CODE 

Here are the alternatives for expressing it in Java (and it is permissible in Java), first following the C paradigm:

int ginger ()[]  { return new int[20]; }      Java CODE 

A better way is to express it like this:

int[] ginger ()  { return new int[20]; }      Java CODE 

The latter allows the programmer to see all the tokens that comprise the return type grouped together.

Arrays are never allocated on the stack in Java, so you cannot get into trouble returning an array stack variable. If you declare an array as a local variable (perhaps in a method), that actually creates a reference to the array. You need a little more code to create the array itself and that will allocate the array safely on the heap. In C, it is too easy to return a pointer to an array on the stack that will be overwritten by something else pushed on the stack after returning from the method.

Operators

Most of the operators in Java will be readily familiar to any programmer. One novel aspect is that the order of operand evaluation in Java is well-defined. For many older languages, the order of evaluation has been deliberately left unspecified. In other words, in C and C++ the following operands can be evaluated and added together in any order:

i + myArray[i] + functionCall(); 

The function may be called before, during (on adventurous multiprocessing hardware), or after the array reference is evaluated, and the additions may be executed in any order. If the functionCall() adjusts the value of i, the overall result depends on the order of evaluation. The trade-off is that some programs give different results depending on the order of evaluation. A professional programmer would consider such programs to be badly written, but they exist nonetheless.

The order of evaluation was left unspecified in earlier languages so that compiler-writers could reorder operations to optimize register use. Java makes the trade-off in a different place. It recognizes that getting consistent results on all computer systems is much more important than getting varying results a trifle faster on one system. In practice, the opportunities for speeding up expression evaluation through reordering operands seem to be quite limited in many programs. As processor speed and cost improve, it is appropriate that modern languages optimize for programmer sanity instead of performance.

Java specifies not just left-to-right operand evaluation, but the order of everything else, too, such as:

  • The left operand is evaluated before the right operand of a binary operator. This is true even for the assignment operator, which must evaluate the left operand (where the result will be stored) fully before starting on the right operand (what the result is).

  • In an array reference, the expression before the square brackets “[]” is fully evaluated before any part of the index is evaluated.

  • A method call for an object has this general form:

    objectInstance.methodName(arguments); 

    The objectInstance is fully evaluated before the methodName and arguments. This can make a difference if the objectInstance is given to you by a method that has side effects. Any arguments are evaluated one by one from left to right.

  • In an allocation expression for an array of several dimensions, the dimension expressions are evaluated one by one from left to right.
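Two of the rules in this list can be observed with a method that has a side effect. The sketch below is illustrative only; OrderDemo, next(), and args3() are invented names.

```java
class OrderDemo {
    static int counter;

    static int next() { return ++counter; }    // side effect: bumps counter

    static int[] assign() {
        counter = 0;
        int[] a = new int[5];
        a[next()] = next() * 10;    // index is evaluated first (1), then the right side (2 * 10)
        return a;
    }

    static String args3(int x, int y, int z) { return x + "," + y + "," + z; }

    static String callArgs() {
        counter = 0;
        return args3(next(), next(), next());   // arguments evaluated left to right
    }

    public static void main(String[] args) {
        System.out.println(assign()[1]);   // prints 20
        System.out.println(callArgs());    // prints 1,2,3
    }
}
```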

The Java Language Specification (James Gosling, Bill Joy, and Guy L. Steele, Addison-Wesley, 1996) uses the phrase, “Java guarantees that the operands to operators appear to be evaluated from left-to-right.” This is an escape clause that allows clever compiler-writers to do brilliant optimizations, as long as the appearance of left-to-right evaluation is maintained.

For example, compiler-writers can rely on the associativity of integer addition and multiplication. This means that a+b+c will produce the same result as (a+b)+c or a+(b+c). This is true in Java even in the presence of overflow, because what happens on overflow is well-defined. We have a section on overflow later in this chapter.

If one of the subexpressions occurs again in the same basic block, a clever compiler-writer might be able to arrange for its reuse. In general, because of complications involving infinity and not-a-number (NaN) results, floating-point operands cannot be trivially reordered.
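The contrast between integer and floating-point regrouping can be checked directly. This is a sketch with made-up names. The floating-point case shown here loses a small term to rounding, which is a further reason (alongside the infinity and NaN cases mentioned above) that such operands cannot be freely regrouped.

```java
class AssocDemo {
    // Integer addition: both groupings agree even though a + b overflows.
    static boolean intGroupingsAgree() {
        int a = Integer.MAX_VALUE, b = 1, c = -1;
        return ((a + b) + c) == (a + (b + c));
    }

    static double leftFirst(double x, double y, double z)  { return (x + y) + z; }
    static double rightFirst(double x, double y, double z) { return x + (y + z); }

    public static void main(String[] args) {
        System.out.println(intGroupingsAgree());              // prints true
        // -1e20 + 1.0 rounds back to -1e20, so the 1.0 is lost on the right.
        System.out.println(leftFirst(1e20, -1e20, 1.0));      // prints 1.0
        System.out.println(rightFirst(1e20, -1e20, 1.0));     // prints 0.0
    }
}
```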

Note that the usual operator precedence still applies. In an expression like the one below, the multiplication is always done before the addition.

b + c * d 

What the Java order of evaluation says is that for all binary (two argument) operators the left operand is always fully evaluated before the right operand. Therefore, the operand “b” above must be evaluated before the multiplication is done (because the multiplied result is the right operand to the addition).

Left-to-right evaluation means in practice that all operands in an expression (if they are evaluated at all) are evaluated in the left-to-right order in which they are written down on a page. Sometimes an evaluated result must be stored while a higher precedence operation is performed. Although The Java Language Specification only talks about the apparent order of evaluation of operands to individual operators, this is a necessary consequence of the rules.
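To watch this happen, each operand can be replaced by a call that records its name. The sketch below does that for an expression of the same shape as b + c * d; the helper operand() and the class name are hypothetical.

```java
class TraceDemo {
    static StringBuilder log = new StringBuilder();

    // Records the operand's name when it is evaluated, then yields its value.
    static int operand(String name, int value) {
        log.append(name);
        return value;
    }

    static int evaluate() {
        log.setLength(0);
        // Same shape as b + c * d
        return operand("b", 2) + operand("c", 3) * operand("d", 4);
    }

    public static void main(String[] args) {
        System.out.println(evaluate());   // prints 14
        System.out.println(log);          // prints bcd
    }
}
```

The multiplication is still performed before the addition, but b is fetched first and held while c * d is worked out.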

Java Operators

The Java operators and their precedence are shown in Table 5-2. The arithmetic operators are undoubtedly familiar to the reader. We’ll outline some of the other operators in the next section.

Table 5-2. Java Operators and Their Precedence

(Precedence: highest number = highest precedence. COFFEEPOT property: see next section.)

Symbol                                  Note                              Precedence  COFFEEPOT Property
++ --                                   pre-increment, decrement          16          right
++ --                                   post-increment, decrement         15          left
~                                       flip the bits of an integer       14          right
!                                       logical not (reverse a boolean)   14          right
- +                                     arithmetic negation, plus         14          right
( typename )                            type conversion (cast)            13          right
* / %                                   multiplicative operators          12          left
- +                                     additive operators                11          left
<< >> >>>                               left and right bitwise shift      10          left
instanceof < <= > >=                    relational operators              9           left
== !=                                   equality operators                8           left
&                                       bitwise and                       7           left
^                                       bitwise exclusive or              6           left
|                                       bitwise inclusive or              5           left
&&                                      conditional and                   4           left
||                                      conditional or                    3           left
? :                                     conditional operator              2           right
= *= /= %= += -= <<= >>= >>>= &= ^= |=  assignment operators              1           right

Char is considered to be an arithmetic type, acting as a 16-bit unsigned integer.

The ++ and -- Operators

The pre- and post-increment and decrement operators are shorthand for the common operation of adding or subtracting one from an arithmetic type. You write the operator next to the operand, and the variable is adjusted by one.

++i;      // pre-increment 
j++;      // post-increment 

It makes a difference if you bury the operator in the middle of a larger expression, like this:

int result = myArray[++i]; // pre-increment 

This will increment i before using it as the index. The post-increment version will use the current value of i, and after it has been used, add one to it. It makes a very compact notation. Pre- and post-decrement operators (--x) work in a similar way.
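A side-by-side sketch of the two forms follows. The class and method names are invented; each method returns the element that was read together with the final value of the index.

```java
class IncDemo {
    // Returns { element read, final index } using pre-increment.
    static int[] preIncrement(int[] myArray) {
        int i = 0;
        int result = myArray[++i];     // i becomes 1 first, so this reads myArray[1]
        return new int[] { result, i };
    }

    // Returns { element read, final index } using post-increment.
    static int[] postIncrement(int[] myArray) {
        int i = 0;
        int result = myArray[i++];     // reads myArray[0], then i becomes 1
        return new int[] { result, i };
    }

    public static void main(String[] args) {
        int[] myArray = { 10, 20, 30 };
        int[] pre = preIncrement(myArray);
        int[] post = postIncrement(myArray);
        System.out.println(pre[0] + " " + pre[1]);     // prints 20 1
        System.out.println(post[0] + " " + post[1]);   // prints 10 1
    }
}
```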

The % and / Operators

The division operator “ / ” is regular division on integer types and floating point types. Integer division just cuts off any decimal part, so -9/2 is -4.5 which is cut to -4. This is also (less meaningfully) termed “rounding towards zero.”

The remainder operator “ % ” means “what is left over after dividing by the right operand a whole number of times.” Thus, -7%2 is -1. This is because -7 divided by 2 is -3, with -1 left over.

Some people call “%” the modulus operator, so “-7 % 2” can be read as “-7 modulo 2”. If you have trouble remembering what modulo does, it may help to recall that all integer arithmetic in Java is modular, meaning that the answer is modulo the range. If working with 32 bits, the answer is the part of the mathematically correct answer that fits in a 32-bit range. If working with 64 bits, the answer is the part of the answer that fits in 64 bits. If doing “-8 modulo 3”, the answer is that remainder part of the division answer that fits in 3, i.e., -2.

The equality shown below is true for division and remainder on integer types:

(x / y) * y + x%y == x 

If you need to work out what sign some remainder will have, just plug the values into that formula.
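The examples from this section, and the identity itself, can be checked with a few lines. This is a sketch; identityHolds() is a hypothetical helper and assumes y is not zero.

```java
class RemainderDemo {
    // The identity from the text: (x / y) * y + x % y == x, for y != 0.
    static boolean identityHolds(int x, int y) {
        return (x / y) * y + x % y == x;
    }

    public static void main(String[] args) {
        System.out.println(-9 / 2);    // prints -4
        System.out.println(-7 % 2);    // prints -1
        System.out.println(7 % -2);    // prints 1
        System.out.println(-8 % 3);    // prints -2
        System.out.println(identityHolds(-7, 2) && identityHolds(7, -2));   // prints true
    }
}
```

In these cases the remainder takes the sign of the left operand, which is what the formula predicts when the quotient is cut towards zero.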

The << >> and >>> Operators

In Java the “>>” operator does an arithmetic or signed shift right, meaning that the sign bit is propagated. In C, it has always been implementation-defined whether this was a logical shift (fill with 0 bits) or an arithmetic shift (fill with copies of the sign bit). This occasionally caused grief, as programmers discovered the implementation dependency when debugging or porting a system. Here’s how you use the operator in Java:

int eighth = x >> 3; // shift right 3 places; same as div by 8 for non-negative x 

One new Java operator is “>>>” which means “shift right and zero fill” or “unsigned shift” (do not propagate the sign bit). The “>>>” operator is not very useful in practice. It works as expected on numbers of canonical size, ints, and longs.

It is broken, however, for short and byte, because negative operands of these types are promoted to int with sign propagation before the shift takes place, leaving bits 7-or-15 to 31 as ones. The zero fill thus starts at bit 31! Not at all what you probably intended!

If you want to do unsigned shift on a short or a byte, mask the bits you want and use >>.

byte b = -1; 
b = (byte)((b & 0xff) >> 4); 

That way programs won’t mysteriously stop working when someone changes a type from int to short.
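The following sketch contrasts the shifts described in this section. The method names naiveShift() and maskedShift() are made up; the masked version is the fix shown in the text.

```java
class ShiftDemo {
    static byte naiveShift(byte b)  { return (byte) (b >>> 4); }           // sign bits leak back in
    static byte maskedShift(byte b) { return (byte) ((b & 0xff) >> 4); }   // the fix from the text

    public static void main(String[] args) {
        System.out.println(-16 >> 2);                 // prints -4 (sign bit propagated)
        System.out.println(-1 >>> 28);                // prints 15 (zero fill on an int)
        System.out.println(naiveShift((byte) -1));    // prints -1
        System.out.println(maskedShift((byte) -1));   // prints 15
    }
}
```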

The instanceof Operator

The other new operator is instanceof. We’ve said a couple of times that a class can be set up as a subclass of another class. The instanceof operator is used with superclasses to tell if you have a particular subclass object. For example, we may see the following:

class vehicle { ... 
class car extends vehicle { ... 
class convertible extends car { ... 

vehicle v; ... 
if (v instanceof convertible) ... 

The instanceof operator is often followed by a statement that casts the object from the base type to the subclass, if it turns out that the one is an instance of the other. Before attempting the cast, instanceof lets us check that it is valid. There is more about this in the next chapter.
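A minimal sketch of the check-then-cast pattern follows. The class names are capitalized variants of the ones above, and lowerRoof() is a hypothetical method added only so that the subclass has something the base class lacks.

```java
class InstanceofDemo {
    static String describe(Vehicle v) {
        if (v instanceof Convertible) {
            Convertible c = (Convertible) v;      // safe: the check above succeeded
            return c.lowerRoof();
        }
        return "no roof to lower";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Convertible()));   // prints roof down
        System.out.println(describe(new Car()));           // prints no roof to lower
        System.out.println(describe(null));                // prints no roof to lower
    }
}

class Vehicle { }
class Car extends Vehicle { }
class Convertible extends Car {
    String lowerRoof() { return "roof down"; }    // hypothetical subclass-only method
}
```

Note that instanceof yields false for a null reference, so the last call falls through without attempting the cast.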

The & | and ^ Operators

The “&” operator takes two boolean operands, or two integer operands. It always evaluates both operands. For booleans, it ANDs the operands, producing a boolean result. For integer types, it bitwise ANDs the operands, producing a result that is the promoted type of the operands (as in long or int).

int flags = ... ; 
int bitResult = (flags & 0x0F ); 

You can get the two nibbles out of a byte with this code:

byte byteMe = (byte) 0xC5;   // cast needed: 0xC5 is outside the range of byte 
byte loNibble = (byte) (byteMe & 0x0F); 
byte hiNibble = (byte) ((byteMe >> 4) & 0x0F); 

If that looks like a lot of casting, the section coming up called “Widening and Narrowing Conversions” explains what is happening.

“|” is the corresponding bitwise OR operation.

“^” is the corresponding bitwise XOR operation.

The && and || Operators

The “&&” is a conditional AND that takes only boolean operands. It avoids evaluating its second operand if possible. In the expression a && b, if a evaluates to false, the AND result must be false and the b operand is not evaluated. This is sometimes called short-circuited evaluation. “||” is the corresponding short-circuited OR operation. There is no short-circuited XOR operation.

You often use a short-circuited operation to check if a variable refers to something before calling a method on it.

if ((anObject != null) && (anObject instanceof String)) { ... 

In the example above, if the variable anObject is null, then the second half of the expression is skipped. Possible mnemonic: The longer operators “ && ” or “ || ” try to shorten themselves by not evaluating the second operand if they can.
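The difference between the short-circuited and the always-evaluate forms shows up as soon as the second operand can fail. The sketch below uses invented method names and a String argument.

```java
class ShortCircuitDemo {
    static boolean nonEmpty(String s) {
        return (s != null) && (s.length() > 0);   // right side skipped when s is null
    }

    static boolean nonEmptyEager(String s) {
        return (s != null) & (s.length() > 0);    // & always evaluates both sides
    }

    public static void main(String[] args) {
        System.out.println(nonEmpty(null));       // prints false
        System.out.println(nonEmpty("goober"));   // prints true
        try {
            nonEmptyEager(null);
        } catch (NullPointerException e) {
            System.out.println("& evaluated s.length() on a null reference");
        }
    }
}
```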

The ? ... : Operator

The “? ... :” operator is unusual in that it is a ternary or three-operand operator. It is best understood by comparing it to an equivalent if statement:

if (someCondition)     truePart     else falsePart 
someCondition ? trueExpression      :   falseExpression 

The conditional operator can appear in the middle of an expression, whereas an if statement cannot. The value of the expression is either the true expression or the false expression. Only one of the two expressions is evaluated. If you do use this operator, don’t nest one inside another, as it quickly becomes hard to follow. This example of ? is from the Java runtime library:

int maxValue = (a >= b) ? a : b; 

The parentheses are not required, but they make the code more legible.

The Assignment Operators

Assignment operators are another notational shortcut. They are a combination of an assignment and an operation where the same variable is the left operand and the place to store the result. For example, these two lines are equivalent:

i += 4;      // i gets increased by 4. 
i = i + 4;   // same thing. 

There are assignment operator versions of all the arithmetic, shifting, and bit-twiddling operators where the same variable is the left operand and the place to store the result. Here’s another example:

ypoints[i] += deltaY; 

Assignment operators were carried over into Java from C, where they were originally intended to help the compiler-writer generate efficient code by leaving off a repetition of one operand. That way it was trivial to identify and reuse quantities that were already in a register.

The Comma Operator Is Gone

Finally, note that Java cut back on the use of the obscure comma operator. Even if you’re quite an experienced C programmer, you might never have seen the comma operator, as it was rarely used. The only place it occurs in Java is in “for” loops. The comma allows you to put several expressions (separated by commas) into each clause of a “for” loop.

for (i=0, j=0; i<10; i++, j++) 

It’s not actually counted as an operator in Java, so it doesn’t appear in Table 5-2. It’s treated as an aspect of the for statement.

Associativity

Associativity is one of those subjects that is poorly explained in many programming texts, especially the ones that come from authors who are technical writers, not programmers. In fact, a good way to judge a programming text is to look for its explanation of associativity. Silence is not golden.

There are three factors that influence the ultimate value of an expression in any algorithmic language, and they work in this order: precedence, associativity, and order of evaluation.

Precedence says that some operations bind more tightly than others. Precedence tells us that the multiplication in a + b * c will be done before the addition, i.e., we have a + (b * c) rather than (a + b) * c. Precedence tells us how to bind operands in an expression that contains different operators.

Associativity is the tie breaker for deciding the binding when we have several operators of equal precedence strung together. If we have 3 * 5 % 3, should we evaluate it as (3 * 5) % 3, that is 15 % 3, which is 0 ? Or should we evaluate it as 3 * (5 % 3), that is 3 * 2, which is 6 ? Multiplication and the “%” remainder operation have the same precedence, so precedence does not give the answer. But they are left-associative, meaning when you have a bunch of them strung together you start associating operators with operands from the left. Push the result back as a new operand, and continue until the expression is evaluated. In this case, (3 * 5) % 3 is the correct grouping.

Associativity is a terrible name for the process of deciding which operands belong with which operators of equal precedence. A more meaningful description would be “Code Order For Finding/Evaluating Equal Precedence Operator Text-strings.” This is the “COFFEEPOT property” mentioned in Table 5-2.

Note that associativity deals solely with deciding which operands go with which of a sequence of adjacent operators of equal precedence. It doesn’t say anything about the order in which those operands are evaluated.

Order of evaluation, if it is specified in a language, tells us the sequence for each operator in which the operands are evaluated. In a strict left-to-right language like Java, the order of evaluation tells us that in (i=2) * i++, the left operand to the multiplication will be evaluated before the right operand, then the multiplication will be done, yielding a result of 4, with i set to 3. Why isn’t the auto-increment done before the multiplication? It has a higher precedence after all. The reason is that it is a post-increment, so by definition it yields the value the operand held before the increment. In C and C++, this expression is undefined because it modifies the same l-value more than once. It is legal in Java because the order of evaluation is well defined.
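The expression discussed above can be run directly. The class name EvalOrder and the helper demo() are invented for this sketch; the expression itself is the one from the text.

```java
// Illustrative sketch: EvalOrder and demo() are made-up names.
class EvalOrder {
    // Returns { product, final value of i }.
    static int[] demo() {
        int i = 0;
        // Left operand first: i becomes 2 and the operand's value is 2.
        // Right operand next: i++ yields 2, then i becomes 3.
        int product = (i = 2) * i++;
        return new int[] { product, i };
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("product = " + r[0] + ", i = " + r[1]);
        // prints product = 4, i = 3
    }
}
```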

How Accurate Are Calculations?

The accuracy when evaluating a result is referred to as the precision of an expression. The precision may be expressed either as number of bits (64 bits), or as the data type of the result (double precision, meaning 64-bit floating-point format). In Java, the precision of evaluating each operator depends on the types of the operands. Java looks at the types of the operands around an operator and picks the biggest of what it sees: double, float, and long, in that order of preference. Both operands are then promoted to this type, and that is the type of the result. If there are no doubles, floats, or longs in the expression, both operands are promoted to int, and that is the type of the result. This continues from left to right through the entire expression.

A Java compiler follows this algorithm to compile each operation:

  • If either operand is a double, do the operation in double precision.

  • Otherwise, if either operand is a float, do the operation in single precision.

  • Otherwise, if either operand is a long, do the operation at long precision.

  • Otherwise, do the operation at 32-bit int precision.

In summary, Java expressions end up with the type of the biggest, floatiest type (double, float, long) in the expression. They are otherwise 32-bit integers.
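One way to watch these rules in action is to let overload resolution tell you the type of an expression. The sketch below does that; the class Promotion and its typeOf() methods are invented for the illustration and are not part of any library. The last two lines show why the left-to-right, operator-by-operator rule matters: 1 / 2 is done at int precision before the double ever gets involved.

```java
// Illustrative sketch: Promotion and typeOf() are made-up names.
class Promotion {
    // The compiler picks the overload that matches the static type
    // of the argument expression, which reveals the result type.
    static String typeOf(int x)    { return "int"; }
    static String typeOf(long x)   { return "long"; }
    static String typeOf(float x)  { return "float"; }
    static String typeOf(double x) { return "double"; }

    public static void main(String[] args) {
        byte b = 1;
        short s = 2;
        System.out.println(typeOf(b + s));        // prints int
        System.out.println(typeOf(b + 3L));       // prints long
        System.out.println(typeOf(3L + 1.0F));    // prints float
        System.out.println(typeOf(1.0F + 1.0));   // prints double

        System.out.println(1 / 2 * 2.0);   // int division first: prints 0.0
        System.out.println(2.0 * 1 / 2);   // double from the start: prints 1.0
    }
}
```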

Most programmers already understand that floating-point numbers are approximations to real numbers. They may inherently contain tiny inaccuracies that can mount up as you iterate through an expression. (Actually, most programmers learn this the hard way.) Do not expect ten iterations of adding 0.1 to a float variable to cause it to exactly equal 1.0F! If this comes as a surprise to you, try this test program immediately, and thank your good fortune at having the chance to learn about it before you stumble over it as a difficult debugging problem.

public class inexact1 {
         public static void main(String s[]) {
                  float pear = 0.0F; 
                  for (int i=0; i<10; i++) pear = pear + 0.1F; 

                  if (pear==1.0F) System.out.println("pear is 1.0F"); 
                  if (pear!=1.0F) System.out.println("pear is NOT 1.0F"); 
         } 
} 

You will see this results in the following:

pear is NOT 1.0F 

Since 0.1 is not a fraction that can be represented exactly with powers of two, summing ten of them does not exactly sum to one. This is why you should never use a floating-point variable as a loop counter. A longer explanation of this thorny topic is in “What Every Computer Scientist Should Know About Floating-Point Arithmetic” by David Goldberg, in the March 1991 issue of Computing Surveys (volume 23, number 1). You can find that paper from a link on the CD, or with a web search at the site docs.sun.com. Note that this is a characteristic of floating-point numbers in all programming languages, and not a quality unique to Java.
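If you need a floating-point value that steps through a range, one common approach is to count with an int and derive the float from it. The sketch below shows that, along with a comparison that allows a small tolerance instead of demanding exact equality. Neither technique comes from the text above; the class LoopCounter, its methods, and the tolerance of 1.0e-5F are all assumptions picked for the illustration.

```java
// Illustrative sketch: all names and the tolerance are made up.
class LoopCounter {
    // Accumulates 0.1F ten times, as in the inexact1 program.
    static float sumOfTenths() {
        float pear = 0.0F;
        for (int i = 0; i < 10; i++) pear = pear + 0.1F;
        return pear;
    }

    // Counts with an int and derives the float from the counter,
    // so rounding errors do not build up from step to step.
    static float lastStep() {
        float x = 0.0F;
        for (int i = 0; i <= 10; i++) {
            x = i / 10.0F;
        }
        return x;
    }

    // Treats two floats as equal if they differ by no more than tolerance.
    static boolean nearlyEqual(float a, float b, float tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        float pear = sumOfTenths();
        System.out.println(pear == 1.0F);                      // prints false
        System.out.println(nearlyEqual(pear, 1.0F, 1.0e-5F));  // prints true
        System.out.println(lastStep() == 1.0F);                // prints true
    }
}
```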

Accuracy is not just the range of values of a type, but also (for real types) the number of decimal places that can be stored. Since the type float can store about six to seven digits accurately, when a long (which can hold at least 18 places of integer values) is implicitly or explicitly converted to a float, some precision may be lost.

public class inexact2 {
         public static void main(String s[]) {
                  long  orig = 9000000000000000000L; 
                  float castMe = orig; // assign the long into a float 

                  orig = (long) castMe; //cast the float back into a long 
                  System.out.println(
                             "orig (started as 9e18, assigned to float) is: "
                             + orig); 
           } 
} 

The output is as follows:

orig (started as 9e18, assigned to float) is: 9000000202358128640 

As you can see, after being assigned to and retrieved back from the float variable, the long has lost all accuracy after six or seven significant figures. The truth is that if a float has the value shown, it could stand for any real value in the interval between the nearest representable floating-point number on each side. The library is entitled, within the bounds of good taste and consistency, to print it out as any value within that interval. If all this makes you fidget uncomfortably in your seat, maybe you better take a look at that Goldberg article.

The limitations of floating point arithmetic apply to all programming languages. But people notice them a lot more in Java because Java doesn’t round floating point numbers to six decimal places when it prints them. The C and C++ languages do round by default, hiding the floating point limitations from the unwary. The chapter on I/O has some examples of formatting numbers using class java.text.DecimalFormat to get the desired number of decimal places displayed.
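As a quick taste, here is a sketch of the difference between the default output and a formatted one. The class name Rounded and the method twoPlaces() are invented; DecimalFormat and DecimalFormatSymbols are the real classes from java.text. The locale is pinned to Locale.US here only so that the decimal separator is always a period.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Illustrative sketch: Rounded and twoPlaces() are made-up names.
class Rounded {
    // Formats a value with exactly two decimal places.
    static String twoPlaces(double value) {
        DecimalFormat df =
            new DecimalFormat("0.00", new DecimalFormatSymbols(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum);             // prints 0.30000000000000004
        System.out.println(twoPlaces(sum));  // prints 0.30
    }
}
```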

Floating-Point Extension

A new keyword was added to JDK 1.2: strictfp. Without this keyword, a method or class is free to use IEEE 754 extended precision (80-bit) when calculating intermediate results.

This matches what the hardware does by default on an Intel x86 processor or an IBM PowerPC. The keyword strictfp disallows this time optimization and requires the use of standard precision. The two alternatives can produce slightly different results.

There are more details at java.sun.com/docs/books/jls/strictfp-changes.pdf.

Here’s the wording for telling a method not to use extended precision:

strictfp void doCalc (float x, float y) {
     // some calculations not to be done in extended precision... 
} 

The strictfp keyword is used when you need strictly portable floating point arithmetic on all your different platforms, and this is more important than slightly more accurate results on some platforms. If the strictfp keyword is not present, extended accuracy can be used, allowing slightly different arithmetic results.

Widening and Narrowing Conversions

This section provides more details on when a cast is needed, and also introduces the terminology of type conversions. This is explained in terms of an assignment between one variable and another, and exactly the same rules apply in the transfer of values from actual parameters to formal parameters.

When you assign an expression to a variable, a conversion must be done. Conversions among the primitive types are either identity, widening, or narrowing conversions.

  • Identity conversions are assignments between two identical types, like an int to int assignment. The conversion is trivial: just copy the bits unchanged.

  • Widening conversions occur when you assign from a less capacious type (such as a short) to a more capacious one (such as a long). You may lose some digits of precision when you convert either way between an integer type and a floating point type. An example of this appeared in the previous section with a long-to-float assignment. Widening conversions preserve the approximate magnitude of the result, even if it cannot be represented exactly in the new type.

  • Narrowing conversions are the remaining conversions. These are assignments from one type to a different type with a smaller range. They may lose the magnitude information. Magnitude means the largeness of a number, as in the phrase “order of magnitude.” So a conversion from a long to a byte will lose information about the millions and billions, but will preserve the least significant digits.

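Widening conversions are inserted automatically by the compiler. Narrowing conversions require an explicit cast, with one exception: a compile-time constant of type int may be assigned to a byte, short, or char variable without a cast when its value fits in that type, which is why byte b = 10; compiles.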

The previous section explained how expressions are evaluated in one of the canonical types (int, long, float or double). That means if your expression is assigned to a non-canonical type (byte, short, or char), an identity or a narrowing conversion is going to be required. If a narrowing conversion is required, you must write a cast. The cast tells the compiler, “OK, I am aware that the most significant digits are being lost here. Just go ahead and do it.”

Now it should be clear why we have to use a cast in:

byte loNibble = (byte) (byteMe & 0x0F); 

Each operand in the expression “byteMe & 0x0F” is promoted to int, then the “and” operation is done, yielding a 32-bit int result. The variable receiving the expression is an 8-bit byte, so a narrowing conversion is required. Hence, the cast “(byte)” is applied to the entire right-hand side result to mollify the compiler.
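To see what a narrowing cast actually throws away, try a sketch like the following. The class Narrowing and the helper lowByte() are invented names; loNibble() simply wraps the expression from the text in a method.

```java
// Illustrative sketch: Narrowing and lowByte() are made-up names.
class Narrowing {
    // Narrowing a long to a byte keeps only the low-order 8 bits.
    static byte lowByte(long value) {
        return (byte) value;
    }

    // The & is done at int precision, so the result must be
    // cast back down to byte.
    static byte loNibble(byte byteMe) {
        return (byte) (byteMe & 0x0F);
    }

    public static void main(String[] args) {
        short small = 1000;
        long wide = small;                           // widening: no cast needed
        System.out.println(wide);                    // prints 1000

        System.out.println(lowByte(0x12345678L));    // low byte is 0x78: prints 120
        System.out.println(lowByte(200L));           // 0xC8 as a signed byte: prints -56
        System.out.println(loNibble((byte) 0x5A));   // low nibble is 0x0A: prints 10
    }
}
```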

What Happens on Overflow?

When a result is too big for the type intended to hold it because of a cast, an implicit type conversion, or the evaluation of an expression, something has to give! What happens depends on whether the result type is integer or floating-point.

Integer Overflow

When an integer-valued expression is too big for its type, only the low-end (least significant) bits get stored. Because of the way two’s-complement numbers are stored, adding one to the highest positive integer value gives the most negative integer value. Watch out for this (it’s true for all languages that use standard arithmetic, not just Java).

There is one case in which integer calculation ceases and an error is reported to the programmer: integer division by zero (using / or %) will throw an exception. To “throw an exception” is covered in a later chapter.
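Both behaviors are easy to demonstrate. In this sketch the class IntOverflow and its two methods are invented names; ArithmeticException is the standard exception that Java throws for integer division by zero. The try and catch syntax is explained in the later chapter on exceptions, so treat it as a preview.

```java
// Illustrative sketch: IntOverflow and its methods are made-up names.
class IntOverflow {
    // Integer addition wraps around silently on overflow.
    static int addOne(int n) {
        return n + 1;
    }

    // Reports whether dividing n by zero throws ArithmeticException.
    static boolean dividingByZeroThrows(int n) {
        int zero = 0;
        try {
            int unused = n / zero;
            return false;              // not reached for int division
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addOne(Integer.MAX_VALUE) == Integer.MIN_VALUE); // prints true
        System.out.println(dividingByZeroThrows(42));                       // prints true
    }
}
```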

There is a class called Integer that is part of the standard Java libraries. It contains some useful constants relating to the primitive type int.

public static final int   MIN_VALUE = 0x80000000; // class Integer 
public static final int   MAX_VALUE = 0x7fffffff; // class Integer 

There are similar values in the related class Long. Notice how these constants (final) are also static. If something is constant, you surely don’t need a copy of it in every object. You can use just one copy for the whole class, so make it static.

One possible use for these constants would be to evaluate an expression at long precision, and then compare the result to these int endpoints. If it is between them, then you know the result can be cast into an int without losing bits, unless it overflowed long, of course.
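A minimal sketch of that idea follows, with invented names (RangeCheck, fitsInInt, productFits). Multiplication is used as the example because the product of any two int values always fits in a long, so the long calculation itself cannot overflow here.

```java
// Illustrative sketch: RangeCheck and its methods are made-up names.
class RangeCheck {
    // True if the long value lies between the int endpoints.
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    // Multiplies at long precision, then checks the result
    // against the int range.
    static boolean productFits(int a, int b) {
        long wide = (long) a * b;   // the cast makes this a long multiplication
        return fitsInInt(wide);
    }

    public static void main(String[] args) {
        System.out.println(productFits(46340, 46340));  // 2147395600: prints true
        System.out.println(productFits(46341, 46341));  // 2147488281: prints false
    }
}
```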

Floating Point Overflow

When a floating-point expression (double or float) overflows, the result becomes infinity. When it underflows (reaches a value that is too small to represent), the result goes to zero. When you do something undefined like divide zero by zero, you get a NaN. Under no circumstances is an exception ever raised from a floating-point expression.

The class Float, which is part of the standard Java libraries, contains some useful constants relating to the primitive type float.

public static final float POSITIVE_INFINITY; 
public static final float NEGATIVE_INFINITY; 
public static final float NaN; 
public static final float MAX_VALUE = 3.40282346638528860e+38f; 
public static final float MIN_VALUE = 1.40129846432481707e-45f; 

One pitfall is that it doesn’t help to compare a value to NaN, for NaN compares false to everything (including itself)! Instead, test the NaNiness of a value like this:

if (Float.isNaN( myfloat ) ) ... // It’s a NaN 

There are similar values in the class Double.
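Here is a short sketch that provokes each of these cases with the Float constants. The class FloatOverflow and its helper methods are invented names. Notice that nothing here throws an exception, not even the division by zero.

```java
// Illustrative sketch: FloatOverflow and its methods are made-up names.
class FloatOverflow {
    static float doubled(float x)        { return x * 2.0F; }
    static float halved(float x)         { return x / 2.0F; }
    static float ratio(float a, float b) { return a / b; }

    public static void main(String[] args) {
        // Overflow: the result becomes infinity.
        float inf = doubled(Float.MAX_VALUE);
        System.out.println(inf == Float.POSITIVE_INFINITY);  // prints true

        // Underflow: the result goes to zero.
        System.out.println(halved(Float.MIN_VALUE));         // prints 0.0

        // Zero divided by zero gives NaN, which is not equal to itself.
        float nan = ratio(0.0F, 0.0F);
        System.out.println(nan == nan);                      // prints false
        System.out.println(Float.isNaN(nan));                // prints true

        // Floating-point division by zero gives infinity, not an exception.
        System.out.println(ratio(1.0F, 0.0F));               // prints Infinity
    }
}
```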

Arithmetic That Cannot Overflow

There is a class called java.math.BigInteger that supports arithmetic on unbounded integers, and a class called java.math.BigDecimal that does the same thing for real numbers. We’ll give some examples of these classes later. They simulate arbitrary precision arithmetic in software. They are not as fast as arithmetic on the primitive types, but they offer arbitrary precision operands and results.
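In the meantime, here is a brief sketch to show the flavor. The class BigDemo and its two methods are invented for this illustration; BigInteger and BigDecimal are the real classes. The factorial of 21 is too large for a long, and the sum of ten tenths comes out exactly right because BigDecimal works in decimal, not binary.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Illustrative sketch: BigDemo and its methods are made-up names.
class BigDemo {
    // Computes n! with no risk of overflow.
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    // Adds 0.1 ten times using exact decimal arithmetic.
    static BigDecimal sumOfTenths() {
        BigDecimal tenth = new BigDecimal("0.1");
        BigDecimal sum = new BigDecimal("0");
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(factorial(21));   // prints 51090942171709440000
        System.out.println(sumOfTenths().compareTo(BigDecimal.ONE) == 0); // prints true
    }
}
```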

The Math Package

Let’s introduce another of the standard classes. This one is called java.lang.Math and it has a couple of dozen useful mathematical functions and constants, including trig routines (watch out—these expect an argument in radians, not degrees), pseudorandom numbers, square root, rounding, and the constants pi and e.

There are two methods in Math to convert between degrees and radians:

public static double toDegrees(double); // new in JDK 1.2 
public static double toRadians(double); // new in JDK 1.2 

You’ll need these when you call the trig functions if your measurements are in degrees.
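For example, a sine function that accepts degrees might be sketched like this. The class Trig and the method sinDegrees() are invented names. The result is a floating-point approximation, so compare it to the expected value with a small tolerance instead of testing for exact equality.

```java
// Illustrative sketch: Trig and sinDegrees() are made-up names.
class Trig {
    // Converts the angle to radians before calling Math.sin().
    static double sinDegrees(double degrees) {
        return Math.sin(Math.toRadians(degrees));
    }

    public static void main(String[] args) {
        double s = sinDegrees(30.0);                   // mathematically 0.5
        System.out.println(Math.abs(s - 0.5) < 1e-12); // prints true
        System.out.println(Math.toDegrees(Math.PI));   // very close to 180
    }
}
```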

You can review the source of the Math class at $JAVA-HOME/src/java/lang/Math.java, or browse its documentation in the Java API.

The Math.log() function returns a natural (base e) logarithm. Convert natural logarithms to base 10 logarithms with code like this:

double nat_log = ... 

double base10log = nat_log / Math.log(10.0); 
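Wrapped up as a complete program, the conversion might look like the sketch below. The class Logs and the method log10() are names made up for the illustration. As the earlier section on accuracy warns, the quotient of two approximate logarithms may differ from the mathematically exact answer in the last bit or so, which is why the check uses a tolerance.

```java
// Illustrative sketch: Logs and log10() are made-up names.
class Logs {
    // Base 10 logarithm by way of natural logarithms.
    static double log10(double x) {
        return Math.log(x) / Math.log(10.0);
    }

    public static void main(String[] args) {
        double result = log10(1000.0);                      // mathematically 3
        System.out.println(Math.abs(result - 3.0) < 1e-12); // prints true
    }
}
```

Later releases of Java (5.0 onward) also provide Math.log10() directly.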

The list of members in the java.lang.Math class is:

public final class Math {
         public static final double E = 2.7182818284590452354; 
         public static final double PI = 3.14159265358979323846; 
         public static native double IEEEremainder(double, double); 
         public static double abs(double); 
         public static float abs(float); 
         public static int abs(int); 
         public static long abs(long); 

// trig functions 
         public static double toDegrees(double); 
         public static double toRadians(double); 
         public static native double sin(double); 
         public static native double cos(double); 
         public static native double tan(double); 
         public static native double asin(double); 
         public static native double acos(double); 
         public static native double atan(double); 
         public static native double atan2(double, double); 
         public static native double exp(double); 
         public static native double pow(double, double); 
         public static native double log(double); 
         public static native double sqrt(double); 

// rounding and comparing 
         public static native double ceil(double); 
         public static native double floor(double); 
         public static double max(double, double); 
         public static float max(float, float); 
         public static int max(int, int); 
         public static long max(long, long); 
         public static double min(double, double); 
         public static float min(float, float); 
         public static int min(int, int); 
         public static long min(long, long); 
         public static long round(double); 
         public static int round(float); 

// returns a pseudorandom number >= 0.0 and < 1.0 
         public static synchronized double random(); 

// rounds the argument to an integer, stored as a double 
         public static native double rint(double); 
} 

All the Math methods are “strictfp”.

Further Reading

“What Every Computer Scientist Should Know About Floating-Point Arithmetic” by David Goldberg, Computing Surveys, March 1991, published by the Association for Computing Machinery.

It explains why all floating-point arithmetic is approximate and how errors can creep in. There is a hyperlink to an online copy of this paper in the Java Programmers FAQ which is on the CD.

ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic

Institute of Electrical and Electronic Engineers, New York, published 1985. Reprinted in SIGPLAN 22(2) pages 9–25.

Some Light Relief—Too Much Bread

Dough. Spondulicks. Moolah. Cabbage. Oof. Bread.

It’s what we get for writing programs that other people want, instead of spending all our time writing programs that interest us. Since we program for money, we might as well try to maximize the flow that comes our way. There are various ways to do that, all involving trade-offs. One way, not unusual in Silicon Valley where I work, is to put the perfection of the programming craft above all other life-style considerations. That can get a bit dire in the long run, though. This story concerns a programmer who tried to maximize his income with unexpected results. We’ll call him “Archie” here, because that’s his name.

Archie made a vast pile of loot by contract programming. Archie wasn’t all that good at programming and often had to skip to a new contract every few months. He was very good at selling himself, though, and giving the appearance of competence, which is how he survived as a contract programmer for so long. He fiddled his taxes by fixing things so that he appeared to be an employee of a shell company in an offshore tax haven. Actually, he was a partner in that shell company, paid himself a pittance in the taxable country, and stashed a tax-free fortune offshore. He did this for a number of years.

So, Archie lived a jet-set life and never missed a chance to rub it in the faces of us wage slaves. Archie had everything: badly-written code, oodles of cash, regular ski trips, and many girlfriends (whom he gleefully two-timed). In short, he was a bit of a reptile who thought he’d figured out a way to beat the system.

Eventually, Archie had so much stashed away in his offshore bank account that he decided to put the money to use. Since he couldn’t easily spend the money domestically, he bought a sandwich shop franchise in Barcelona. The sandwich shop was doing fine under its current owner and was almost certain to continue to prosper. Archie hired a local manager to replace the old owner, and everything ran well for about a month.

Then Archie received an urgent call from the bank in Spain saying that the venture wasn’t generating enough cash to meet its payroll. Archie transferred the funds and put it down to start-up costs. The same thing happened the next month. At great cost, Archie hired a local auditor who eventually determined the local manager was skimming from the profits! Archie flew to Barcelona, fired the manager, and stayed in a hotel for three weeks until he was able to hire a new manager. This investment was already proving expensive.

The new manager wasn’t any more honest than the last one, and Archie had to fly out and hire another one. That manager lasted only a couple of weeks, and Archie realized that he had a serious problem on his hands. So far, all the flights, hotels, payroll, auditing, and advertising had cost him about as much again as his original investment. He needed to put more cash in every month to keep the venture solvent. He didn’t have a new manager, and worst of all, since it now had a track record of six months of losses, he could no longer sell the business as a going concern.

Archie quickly figured out that he’d have to take charge personally to solve these problems. This is why you could find Archie—jet-setter and lavishly paid contract programmer—waiting tables in a sandwich shop in Barcelona throughout 1998. While he’s serving sandwiches and trying to build up a record of profitability for the business so he can unload it, Archie has no time to work as a contract programmer. Thus, his losses also include what he’d make in a year of contract programming. It couldn’t happen to a nicer guy.

The last time I passed through Barcelona on business, I felt like having a snack, so I called in at the sandwich store. Archie was neatly dressed in a green apron and a hair net, and he was standing behind a counter slicing a towering pile of loaves. “Hi Archie. How’s things?” I called out. Archie glanced up and gestured at the loaves all around him. “Too much bread,” he muttered. I could only nod my silent agreement with his pronouncement. Yes, way too much bread.
