Chapter 9. Operators and Expressions

 

Work is of two kinds: first, altering the position of matter at or near the earth's surface relative to other matter; second, telling other people to do so.

 
 --Bertrand Russell

This chapter teaches you about the fundamental computational building blocks of the programming language—namely, operators and expressions. You have already seen a lot of code and have gained familiarity with its components. This chapter describes these basic elements in detail.

Each operator is described in terms of its basic operation, upon operands of the base expected type—such as a numeric primitive type or a reference type. The actual evaluation of an expression in a program may involve type conversions as described in Section 9.4 on page 216.

Arithmetic Operations

There are several binary arithmetic operators that operate on any of the primitive numerical types:

+    addition
-    subtraction
*    multiplication
/    division
%    remainder

You can also use unary - for negation. The sign of a number can be inverted with code like this:

val = -val;

For completeness there is also a unary +, as in +2.0.

The exact actions of the binary arithmetic operators depend on the types of operands involved. The following sections look at the different rules for integer and floating-point arithmetic.

Integer Arithmetic

Integer arithmetic is modular two's-complement arithmetic—that is, if a value exceeds the range of its type (int or long), it is reduced modulo the range. So integer arithmetic never overflows or underflows but only wraps.

Integer division truncates toward zero (7/2 is 3, and -7/2 is -3). For integer types, division and remainder obey the rule

(x/y)*y + x%y == x

So 7%2 is 1, and -7%2 is -1. Division by zero or remainder by zero is invalid for integer arithmetic and throws ArithmeticException.
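The integer rules above can be checked with a short program. This is only an illustrative sketch; the class and method names are invented for the example.

```java
class IntegerArithmeticDemo {
    // Adding 1 to the largest int wraps around to the smallest int.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    // Checks the rule (x/y)*y + x%y == x for a non-zero divisor y.
    static boolean ruleHolds(int x, int y) {
        return (x / y) * y + x % y == x;
    }

    // Reports whether evaluating x / y throws ArithmeticException.
    static boolean divisionThrows(int x, int y) {
        try {
            int quotient = x / y;
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapAround() == Integer.MIN_VALUE); // true
        System.out.println(7 / 2);                             // 3
        System.out.println(-7 / 2);                            // -3
        System.out.println(-7 % 2);                            // -1
        System.out.println(ruleHolds(-7, 2));                  // true
        System.out.println(divisionThrows(7, 0));              // true
    }
}
```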

Character arithmetic is integer arithmetic after the char is implicitly converted to int—see “Expression Type” on page 215.

Floating-Point Arithmetic

Floating-point arithmetic can overflow to infinity (become too large for a double or float) or underflow (become too small for a double or float). Underflow results in a loss of precision, possibly enough to yield a zero value.[1] The result of an invalid expression, such as dividing infinity by infinity, is a NaN value—for “Not-a-Number.”

Arithmetic with finite operands performs as expected, within the limits of precision of double or float. Signs of floating-point arithmetic results are also as expected. Multiplying two numbers having the same sign results in a positive value; multiplying two numbers having opposite signs results in a negative value.

Adding two infinities results in the same infinity if their signs are the same, and NaN if their signs differ. Subtracting infinities of the same sign produces NaN; subtracting infinities of opposite signs produces an infinity of the same sign as the left operand. For example, (∞ - (-∞)) is ∞. Arithmetic operations involving any value that is NaN have a result that is also NaN. Overflows result in a value that is an infinity of the proper sign. Underflows also result in values of the proper sign. Floating-point arithmetic has a negative zero -0.0, which compares equal to +0.0. Although they compare equal, the two zeros can produce different results. For example, the expression 1f/0f yields positive infinity and 1f/-0f yields negative infinity.

If the result of an underflow is -0.0 and if -0.0 == 0.0, how do you test for a negative zero? You must use the zero in an expression where sign matters and then test the result. For example, if x has a zero value, the expression 1/x will yield negative infinity if x is negative zero, or positive infinity if x is positive zero.
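A minimal sketch of that test follows. The helper name isNegativeZero is made up for this illustration and is not a library method.

```java
class NegativeZeroDemo {
    // True only for -0.0: the value must compare equal to zero, and
    // dividing 1 by it must give negative infinity.
    static boolean isNegativeZero(double x) {
        return x == 0.0 && 1.0 / x == Double.NEGATIVE_INFINITY;
    }

    public static void main(String[] args) {
        System.out.println(-0.0 == 0.0);           // true
        System.out.println(isNegativeZero(-0.0));  // true
        System.out.println(isNegativeZero(0.0));   // false
    }
}
```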

The rules for operations on infinities match normal mathematical expectations. Adding or subtracting any number to or from either infinity results in that infinity. For example, (- ∞ + x) is -∞ for any finite number x.

You can get an infinity value from the constants POSITIVE_INFINITY and NEGATIVE_INFINITY in the wrapper classes Float and Double. For example, Double.NEGATIVE_INFINITY is the double value of minus infinity.

Multiplying infinity by zero yields NaN. Multiplying infinity by a non-zero finite number produces an infinity of the appropriate sign.

Floating-point division and remainder can produce infinities or NaN but never throw an exception. This table shows the results of the various combinations:

x         y         x/y       x%y
Finite    ±0.0      ±∞        NaN
Finite    ±∞        ±0.0      x
±0.0      ±0.0      NaN       NaN
±∞        Finite    ±∞        NaN
±∞        ±∞        NaN       NaN

Otherwise, floating-point remainder (%) acts analogously to integer remainder as described earlier. See the IEEEremainder method in “Math and StrictMath” on page 657 for a different remainder calculation.
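To make the difference between the two remainder calculations concrete, here is a small sketch that also exercises two rows of the table. The class and method names are assumptions for the example; the operand values were chosen because they are exactly representable.

```java
class FloatRemainderDemo {
    // Remainder as computed by %: the quotient is truncated toward zero.
    static double truncatingRemainder(double x, double y) {
        return x % y;
    }

    // Remainder as computed by Math.IEEEremainder: the quotient is
    // rounded to the nearest integer instead.
    static double ieeeRemainder(double x, double y) {
        return Math.IEEEremainder(x, y);
    }

    public static void main(String[] args) {
        System.out.println(truncatingRemainder(5.5, 2.0)); // 1.5
        System.out.println(ieeeRemainder(5.5, 2.0));       // -0.5
        System.out.println(5.5 / 0.0);                     // Infinity
        System.out.println(5.5 % 0.0);                     // NaN
        System.out.println(5.5 % Double.POSITIVE_INFINITY); // 5.5
    }
}
```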

Exercise 9.1: Write a program that uses the operators +, -, *, and / on two infinite operands and shows the result. Ensure that you try both same-signed and opposite-signed infinity values.

Strict and Non-Strict Floating-Point Arithmetic

Floating-point arithmetic can be executed in one of two modes: FP-strict or not FP-strict. For simplicity, we refer to these as strict and non-strict, respectively. Strict floating-point evaluation follows constrained rules about exact floating-point operations: When you execute strict floating-point code you will always get exactly equivalent results on all Java virtual machine implementations. Floating-point arithmetic that is non-strict can be executed with somewhat relaxed rules. These rules allow the use of floating-point representations that may avoid some overflows or underflows that would occur if the arithmetic were executed strictly. This means that some applications may run differently on different virtual machines when under non-strict floating point. Non-strict floating point might also execute faster than strict floating point, because the relaxed rules may allow the use of representations that are supported directly by the underlying hardware.

The strictness of floating-point arithmetic is determined by the presence of the modifier strictfp, which can be applied to a class, interface, or method. When you declare a method strictfp, all the code in the method will be executed according to strict constraints. When you use strictfp on a class or interface, all code in the class, including initializers and code in nested types, will be evaluated strictly. To determine if an expression is strict, all methods, classes, and interfaces within which the expression is contained are examined; if any of them is declared strictfp then the expression is strict. For example, an inner class declared within a strict outer class, or within a strict method, is also strict. However, strictness is not inherited—the presence of strictfp on a class or interface declaration does not cause extended classes or interfaces to be strict.

Constant expressions that require floating point are always evaluated strictly. Otherwise, any code that is not marked as strictfp can be executed under rules that do not require repeatable results. If you want to guarantee bit-for-bit exact results across all Java virtual-machine implementations, you should use strictfp on relevant methods, classes, and interfaces. You should also note that a virtual machine can satisfy the rules for non-strict floating point by always acting strictly: Non-strict floating point does not require the virtual machine to act differently, it offers the virtual machine a degree of freedom to optimize code where repeatable results are not required.

Non-strict floating-point execution allows intermediate results within expressions to take on values with a greater range than allowed by the usual float or double representations. These extended range representations avoid some underflows and overflows that can happen with the standard representations. Fields, array elements, local variables, parameters, and return values always use the standard representations. When an extended range representation is converted to a standard representation, the closest representable value is used.

If you need a complete understanding of these issues, you should consult The Java Language Specification, Third Edition.

General Operators

In addition to the main arithmetic operators, there are other useful operators for comparing and manipulating values. Member access operators, method invocation and type conversion operators are discussed in following sections.

Increment and Decrement Operators

The ++ and -- operators are the increment and decrement operators, respectively, and can only be applied to numeric variables or numeric array elements. The expression i++ is equivalent to i = i + 1 except that i is evaluated only once. For example, the statement

++arr[where()];

invokes where only once and uses the result as an index into the array only once. On the other hand, in the statement

arr[where()] = arr[where()] + 1;

the where method is called twice: once to determine the index on the right-hand side, and a second time to determine the index on the left-hand side. If where returns a different value each time it is invoked, the results will be quite different from those of the ++ expression. To avoid the second invocation of where you would have to store its result in a temporary—hence the increment (and decrement) operator allows a simpler, succinct expression of what you want to do.
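The difference can be observed by counting invocations. In this sketch, where is a hypothetical stand-in that always returns index 0 and records how often it was called; the text does not say what the real where does.

```java
class IncrementOnceDemo {
    private static int calls;

    // Hypothetical index method: counts its invocations, always returns 0.
    static int where() {
        calls++;
        return 0;
    }

    // Number of times where() runs for the ++ form.
    static int callsForIncrement(int[] arr) {
        calls = 0;
        ++arr[where()];
        return calls;
    }

    // Number of times where() runs for the expanded form.
    static int callsForExpandedForm(int[] arr) {
        calls = 0;
        arr[where()] = arr[where()] + 1;
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsForIncrement(new int[1]));    // 1
        System.out.println(callsForExpandedForm(new int[1])); // 2
    }
}
```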

The increment and decrement operators can be either prefix or postfix operators—they can appear either before or after what they operate on. If the operator comes before (prefix), the operation is applied before the value of the expression is returned. If the operator comes after (postfix), the operation is applied after the original value is used. For example:

class IncOrder {
    public static void main(String[] args) {
        int i = 16;
        System.out.println(++i + " " + i++ + " " + i);
    }
}

The output is

17 17 18

The expression ++i preincrements the value of i to 17 and evaluates to that value (17); the expression i++ evaluates to the current value of i (17) and postincrements i to have the value 18; finally the expression i is the value of i after the postincrement from the middle term. Modifying a variable more than once in an expression makes code hard to understand, and should be avoided.

The increment and decrement operators ++ and -- can also be applied to char variables to get to the next or previous Unicode character.

Relational and Equality Operators

The language provides a standard set of relational and equality operators, all of which yield boolean values:

>     greater than
>=    greater than or equal to
<     less than
<=    less than or equal to
==    equal to
!=    not equal to

Both the relational and equality operators can be applied to the primitive numeric types, with the usual mathematical interpretation applying.

Floating-point values follow normal ordering (-1.0 is less than 0.0 is less than positive infinity) except that NaN is an anomaly. All relational and equality operators that test a number against NaN return false, except !=, which always returns true. This is true even if both values are NaN. For example,

Double.NaN == Double.NaN

is always false. To test whether a value is NaN, use the type-specific NaN testers: the static methods Float.isNaN(float) and Double.isNaN(double).[2]
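The NaN rules can be seen in a few lines. This is an illustrative sketch; the helper differsFromItself is invented for the example.

```java
class NaNDemo {
    // Only NaN is unequal to itself.
    static boolean differsFromItself(double d) {
        return d != d;
    }

    public static void main(String[] args) {
        double nan = 0.0 / 0.0;                      // an invalid operation yields NaN
        System.out.println(nan == nan);              // false
        System.out.println(nan != nan);              // true
        System.out.println(nan < 1.0);               // false
        System.out.println(nan >= 1.0);              // false
        System.out.println(Double.isNaN(nan));       // true
        System.out.println(differsFromItself(1.0));  // false
    }
}
```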

Only the equality operators == and != are allowed to operate on boolean values.

These operators can be combined to create a “logical XOR” test. The following code invokes sameSign only if both x and y have the same sign (or zero); otherwise, it invokes differentSign:

if ((x < 0) == (y < 0))
    sameSign();
else
    differentSign();

The equality operators can also be applied to reference types. The expression ref1==ref2 is true if the two references refer to the same object or if both are null, even if the two references are of different declared types. Otherwise, it is false.

The equality operators test for reference identity, not object equivalence. Two references are identical if they refer to the same object; two objects are equivalent if they logically have the same value. Equivalence is tested with the equals method defined by Object, which you should override in classes for which equivalence and identity are different. Object.equals assumes an object is equal only to itself. For example, the String class overrides equals to test whether two String objects have the same contents—see Chapter 13.
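As a small illustration of identity versus equivalence, the following sketch compares two distinct String objects with the same contents. The class and method names are assumptions for the example.

```java
class IdentityDemo {
    // Reference identity: do both references refer to the same object?
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    // Object equivalence, as defined by the equals method of a's class.
    static boolean equivalent(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = new String("boo");
        String s2 = new String("boo");   // a second, distinct object
        System.out.println(sameObject(s1, s2));  // false
        System.out.println(equivalent(s1, s2));  // true
        System.out.println(sameObject(s1, s1));  // true
    }
}
```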

Logical Operators

The logical operators combine boolean expressions to yield boolean values and provide the common operations of boolean algebra:

&     logical AND
|     logical inclusive OR
^     logical exclusive or (XOR)
!     logical negation
&&    conditional AND
||    conditional OR

A “logical AND” is true if and only if both its operands are true, while a “logical OR” is true if and only if at least one of its operands is true. The “exclusive OR” operator yields true if either, but not both, of its operands is true. That is the same as testing the two operands for inequality, so we can rewrite our earlier example, with the two branches swapped, as:

if ((x < 0) ^ (y < 0))
    differentSign();
else
    sameSign();

The unary operator ! negates, or inverts, a boolean, so !true is the same as false and !false is the same as true.

Boolean values are normally tested directly—if x and y are booleans, the code

if (x || !y) {
    // ...
}

is considered cleaner than the equivalent, but more verbose

if (x == true || y == false) {
    // ...
}

The && (“conditional AND”) and || (“conditional OR”) operators perform the same logical function as the simple & and | operators, but they avoid evaluating their right operand if the truth of the expression is determined by the left operand—for this reason they are sometimes referred to as “short-circuit operators.” For example, consider:

if (w && x) {      // outer "if"
    if (y || z) {  // inner "if"
        // ...        inner "if" body
    }
}

The inner if is executed only if both w and x are true. If w is false then x will not be evaluated because the expression is already guaranteed to be false. The body of the inner if is executed if either y or z is true. If y is true, then z will not be evaluated because the expression is already guaranteed to be true.

A lot of code relies on this rule for program correctness or efficiency. For example, the evaluation shortcuts make the following code safe:

if (0 <= ix && ix < array.length && array[ix] != 0) {
    // ...
}

The range checks are done first. The value array[ix] will be accessed only if ix is within bounds. There is no “conditional XOR” because the truth of XOR always depends on the value of both operands.
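To see why the short-circuit behavior matters here, the sketch below wraps the test in a method and contrasts it with a version written with the non-short-circuit & operator, which evaluates array[ix] even when ix is out of range. The class and method names are invented for the example.

```java
class ShortCircuitDemo {
    // Safe: array[ix] is evaluated only after both range checks pass.
    static boolean nonZeroAt(int[] array, int ix) {
        return 0 <= ix && ix < array.length && array[ix] != 0;
    }

    // Unsafe: & evaluates every operand, so array[ix] is always accessed.
    static boolean nonZeroAtEager(int[] array, int ix) {
        return (0 <= ix) & (ix < array.length) & (array[ix] != 0);
    }

    public static void main(String[] args) {
        int[] data = {0, 5};
        System.out.println(nonZeroAt(data, 1));  // true
        System.out.println(nonZeroAt(data, 7));  // false
        try {
            nonZeroAtEager(data, 7);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("eager version accessed array[7]");
        }
    }
}
```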

instanceof

The instanceof operator evaluates whether a reference refers to an object that is an instance of a particular class or interface. The left-hand side is a reference to an object, and the right-hand side is a class or interface name. You learned about instanceof on page 92.

Bit Manipulation Operators

The binary bitwise operators are:

&    bitwise AND
|    bitwise inclusive OR
^    bitwise exclusive or (XOR)

The bitwise operators apply only to integer types (including char) and perform their operation on each pair of bits in the two operands. The AND of two bits yields a 1 if both bits are 1, the OR of two bits yields a 1 if either bit is 1, and the XOR of two bits yields a 1 only if the two bits have different values. For example:

0xf00f & 0x0ff0 yields 0x0000
0xf00f | 0x0ff0 yields 0xffff
0xaaaa ^ 0xffff yields 0x5555

There is also a unary bitwise complement operator ~, which toggles each bit in its operand. An int with value 0x00003333 has a complemented value of 0xffffcccc.

Although the same characters are used for the bitwise operators and the logical operators, they are quite distinct. The types of the operands determine whether, for example, & is a logical or a bitwise AND. Since logical operators only apply to booleans and bitwise operators only apply to integer types, any expression involving operands of the different types, such as true & 0xaaaa, is a compile-time error.

There are other bit manipulation operators to shift bits within an integer value:

<<     Shift bits left, filling with zero bits on the right-hand side
>>     Shift bits right, filling with the highest (sign) bit on the left-hand side
>>>    Shift bits right, filling with zero bits on the left-hand side

The left-hand side of a shift expression is what is shifted, and the right-hand side is how much to shift. For example, var >>> 2 will shift the bits in var two places to the right, dropping the bottom two bits from the end and filling the top two bits with zero.

The two right-shift operators provide for an arithmetic shift (>>) and a logical shift (>>>). The arithmetic shift preserves the sign of the value by filling in the highest bit positions with the original sign bit (the bit in the highest position). The logical shift inserts zeroes into the high-order bits. It is often used when extracting subsets of bits from a value. For example, in binary coded decimal (BCD) each decimal digit is represented by four bits (0x00 to 0x09; the remaining bit patterns are invalid) and so every byte can encode two decimal digits. To extract the low-order digit you AND the byte with a mask of 0x0f to zero out the high-order digit. To extract the high-order digit you logically right-shift the value by four positions, moving the valid bits down to the least significant positions and filling the new high-order bits with zero. Because a byte is promoted to int, with sign extension, before it is shifted, the result must also be masked with 0x0f; otherwise a high-order digit of 8 or 9 would leave extra one bits above the digit:

class BCD {
    static int getBCDLowDigit(byte val) {
        return (val & 0x0f);
    }
    static int getBCDHighDigit(byte val) {
        // val is sign-extended to int before the shift, so mask off
        // everything above the four digit bits
        return (val >>> 4) & 0x0f;
    }
}

Shift operators have a slightly different type rule from most other binary integer operations. For shift operators, the resulting type is the type of the left-hand operand—that is, the value that is shifted. If the left-hand side of the shift is an int, the result of the shift is an int, even if the shift count is provided as a long.

If the shift count is larger than the number of bits in the word, or if it is negative, the actual count will be different from the provided count. The actual count used in a shift is the count you provide, masked by the size of the type minus one. For a 32-bit int, for example, the mask used is 0x1f (31), so both (n << 35) and (n << -29) are equivalent to (n << 3).
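The shift rules described above can be tried out with a short sketch. The class and method names are assumptions made for the example.

```java
class ShiftDemo {
    // Left shift of an int; only the low five bits of count are used.
    static int shiftLeft(int n, int count) {
        return n << count;
    }

    public static void main(String[] args) {
        int n = -8;                       // bit pattern 0xfffffff8
        System.out.println(n >> 1);       // -4: the sign bit is copied in
        System.out.println(n >>> 1);      // 2147483644: a zero is shifted in
        System.out.println(shiftLeft(n, 35) == shiftLeft(n, 3));   // true
        System.out.println(shiftLeft(n, -29) == shiftLeft(n, 3));  // true
        // The result type follows the left operand: an int shifted by a
        // long count is still an int, and the count is masked to 3.
        int viaLongCount = 1 << 35L;
        System.out.println(viaLongCount); // 8
        // With a long left operand the mask is 0x3f, so 35 is used as is.
        System.out.println(1L << 35);     // 34359738368
    }
}
```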

Shift operators can be used only on integer types. In the rare circumstance when you actually need to manipulate the bits in a floating-point value, you can use the conversion methods on the classes Float and Double, as discussed in “The Floating-Point Wrapper Classes” on page 191.

Exercise 9.2: Write a method that determines the number of 1 bits in a passed-in int, by using just the bit manipulation operators (that is, don't use Integer.bitCount). Compare your solution with published algorithms for doing this; see the related reading for “General Programming Techniques” on page 758 for one source.

The Conditional Operator ?:

The conditional operator provides a single expression that yields one of two values based on a boolean expression. The statement

value = (userSetIt ? usersValue : defaultValue);

is equivalent to

if (userSetIt)
    value = usersValue;
else
    value = defaultValue;

The primary difference between the if statement and the ?: operator is that the latter has a value and so can be used as part of an expression. The conditional operator results in a more compact expression, but programmers disagree about whether it is clearer. We use whichever seems clearer at the time. When to use parentheses around a conditional operator expression is a matter of personal style, and practice varies widely. Parentheses are not required by the language.

The conditional operator expression has a type that is determined by the types of the second and third expressions (usersValue and defaultValue in the above example). If the types of these expressions are the same then that is the type of the overall expression. Otherwise, the rules get somewhat complicated:

  • If one expression is a primitive type and the other can be unboxed to become a compatible primitive type, then the unboxing occurs and the expressions are reconsidered.

  • If both expressions are numeric primitive types then the resulting type is also a numeric primitive type, obtained by numeric promotion if needed. For example, in

    double scale = (halveIt ? 0.5 : 1);
    

    the two expressions are of type double (0.5) and type int (1). An int is assignable to a double, so the 1 is promoted to 1.0 and the type of the conditional operator is double.

  • If one expression is an int constant, and the other is byte, short, or char, and the int value can fit in the smaller type, then the resulting type is that smaller type.

  • If one of the expressions is a primitive type and the other is a reference type that can't be unboxed to get a compatible value, or both expressions are primitive but incompatible, then the primitive type is boxed so that we have two reference types.

  • Given two reference types that are different, the type of the expression is the first common parent type. For example, if both expressions were unrelated class types that implemented Cloneable then Cloneable would be the type of the expression; if one expression was int while the other was String, then Object would be the resulting type.

Note that if either expression has a type of void (possible if the expression invokes a method with the void return type) then a compile-time error occurs.

This operator is also called the question/colon operator because of its form, and the ternary operator because it is the only ternary (three-operand) operator in the language.
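Two of the typing rules above can be observed by boxing the result of a conditional expression and inspecting its class. This is only a sketch with invented names; it covers the numeric-promotion case and the primitive-versus-reference case, not every rule in the list.

```java
class ConditionalTypeDemo {
    // int and double operands: numeric promotion makes the whole
    // expression a double, so the int 1 is returned as 1.0.
    static Object mixedNumeric(boolean flag) {
        return flag ? 1 : 2.0;
    }

    // int and String operands: the int is boxed to Integer so that
    // both operands are reference types.
    static Object intOrString(boolean flag) {
        return flag ? 1 : "one";
    }

    public static void main(String[] args) {
        System.out.println(mixedNumeric(true));                    // 1.0
        System.out.println(mixedNumeric(true) instanceof Double);  // true
        System.out.println(intOrString(true) instanceof Integer);  // true
        System.out.println(intOrString(false));                    // one
    }
}
```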

Assignment Operators

The assignment operator = assigns the value of its right-operand expression to its left operand, which must be a variable (either a variable name or an array element). The type of the expression must be assignment compatible with the type of the variable—an explicit cast may be needed—see Section 9.4 on page 216.

An assignment operation is itself an expression and evaluates to the value being assigned. For example, given an int z, the assignment

z = 3;

has the value 3. This value can be assigned to another variable, which also evaluates to 3 and so that can be assigned to another variable and so forth. Hence, assignments can be chained together to give a set of variables the same value:

x = y = z = 3;

This also means that assignment can be performed as a side effect of evaluating another expression—though utilizing side effects in expressions is often considered poor style. An acceptable, and common, example of this is to assign and test a value within a loop expression. For example:

while ((v = stream.next()) != null)
    processValue(v);

Here the next value is read from a stream and stored in the variable v. Provided the value read was not null, it is processed and the next value read. Note that as assignment has a lower precedence than the inequality test (see “Operator Precedence and Associativity” on page 221), you have to place the assignment expression within parentheses.
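The same assign-and-test idiom appears with the standard library. The sketch below swaps in BufferedReader.readLine, which returns null at end of input, in place of the stream.next call used above; the class name and the countLines helper are assumptions for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadLoopDemo {
    // Counts the lines in text, reading them with the assign-and-test idiom.
    static int countLines(String text) {
        BufferedReader reader = new BufferedReader(new StringReader(text));
        int count = 0;
        String line;
        try {
            // The parentheses make the assignment happen before the != test.
            while ((line = reader.readLine()) != null) {
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countLines("first\nsecond\nthird")); // 3
    }
}
```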

The simple = is the most basic form of assignment operator. There are many other assignment forms. Any binary arithmetic, logical, or bit manipulation operator can be concatenated with = to form another assignment operator—a compound assignment operator. For example,

arr[where()] += 12;

is the same as

arr[where()] = arr[where()] + 12;

except that the expression on the left-hand side of the assignment is evaluated only once. In the example, arr[where()] is evaluated only once in the first expression, but twice in the second expression—as you learned earlier with the ++ operator.

Given the variable var of type T, the value expr, and the binary operator op, the expression

var op= expr

is equivalent to

var = (T) ((var) op (expr))

except that var is evaluated only once. This means that op= is valid only if op is valid for the types involved. You cannot, for example, use <<= on a double variable because you cannot use << on a double value.

Note the parentheses used in the expanded form you just saw. The expression

a *= b + 1

is analogous to

a = a * (b + 1)

and not to

a = a * b + 1
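Both points, the implicit cast to the variable's type and the parentheses around the right-hand expression, show up in a short sketch. The class and method names are invented for the example.

```java
class CompoundAssignDemo {
    // b += amount behaves like b = (byte) (b + amount); writing
    // b = b + amount without the cast would not compile.
    static byte addToByte(byte b, int amount) {
        b += amount;
        return b;
    }

    // i *= factor behaves like i = (int) (i * factor).
    static int scale(int i, double factor) {
        i *= factor;
        return i;
    }

    // a *= b + 1 behaves like a = a * (b + 1).
    static int timesSum(int a, int b) {
        a *= b + 1;
        return a;
    }

    public static void main(String[] args) {
        System.out.println(addToByte((byte) 120, 10)); // -126: 130 wraps in a byte
        System.out.println(scale(3, 2.5));             // 7: 7.5 is truncated
        System.out.println(timesSum(2, 3));            // 8, not 7
    }
}
```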

Although a += 1 is the same as ++a and a++, using ++ is considered idiomatic and is preferred.

Exercise 9.3: Review your solution to Exercise 7.3 to see if it can be written more clearly or succinctly with the operators you've learned about in this chapter.

String Concatenation Operator

You can use + to concatenate two strings. Here is an example:

String boo = "boo";
String cry = boo + "hoo";
cry += "!";
System.out.println(cry);

And here is its output:

boohoo!

The + operator is interpreted as the string concatenation operator whenever at least one of its operands is a String. If only one of the operands is a String then the other is implicitly converted to a String as discussed in “String Conversions” on page 220.
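Because + groups from left to right, whether a given + means addition or concatenation depends on what has been evaluated so far. The following sketch, with invented names, shows the effect.

```java
class ConcatDemo {
    // 1 + 2 is integer addition; the result is then concatenated with "x".
    static String numbersFirst() {
        return 1 + 2 + "x";
    }

    // "x" + 1 is already a String, so 2 is concatenated as well.
    static String stringFirst() {
        return "x" + 1 + 2;
    }

    // Parentheses force the addition to happen first.
    static String grouped() {
        return "x" + (1 + 2);
    }

    public static void main(String[] args) {
        System.out.println(numbersFirst()); // 3x
        System.out.println(stringFirst());  // x12
        System.out.println(grouped());      // x3
    }
}
```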

new

The new operator is a unary prefix operator—it has one operand that follows the operator. Technically, the use of new is known as an instance creation expression—because it creates an instance of a class or array. The value of the expression is a reference to the object created. The use of new and the associated issue of constructors was discussed in detail on page 50.

Expressions

An expression consists of operators and their operands, which are evaluated to yield a result. This result may be a variable or a value, or even nothing if the expression was the invocation of a method declared void. An expression may be as simple as a single variable name, or it may be a complex sequence of method invocations, variable accesses, object creations, and the combination of the results of those subexpressions using other operators, further method invocations, and variable accesses.

Order of Evaluation

Regardless of their complexity, the meanings of expressions are always well-defined. Operands to operators will be evaluated left-to-right. For example, given x+y+z, the compiler evaluates x, evaluates y, adds the values together, evaluates z, and adds that to the previous result. The compiler does not evaluate, say, y before x, or z before either y or x. Similarly, argument expressions for method, or constructor, invocations are evaluated from left to right, as are array index expressions for multidimensional arrays.

Order of evaluation matters if x, y, or z has side effects of any kind. If, for instance, x, y, or z are invocations of methods that affect the state of the object or print something, you would notice if they were evaluated in any other order. The language guarantees that this will not happen.

Except for the operators &&, ||, and ?:, every operand of an operator will be evaluated before the operation is performed. This is true even for operations that throw exceptions. For example, an integer division by zero results in an ArithmeticException, but it will do so only after both operands have been fully evaluated. Similarly, all arguments for a method or constructor invocation are evaluated before the invocation occurs.

If evaluation of the left operand of a binary operator causes an exception, no part of the right-hand operand is evaluated. Similarly, if an expression being evaluated for a method, or constructor, argument causes an exception, no argument expressions to the right of it will be evaluated—and likewise for array index expressions. The order of evaluation is very specific and evaluation stops as soon as an exception is encountered.
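These guarantees can be made visible by logging each operand as it is evaluated. In this sketch the operand and fail helpers are hypothetical; they exist only to record the order of evaluation.

```java
class EvalOrderDemo {
    private static final StringBuilder log = new StringBuilder();

    // Records that the named operand was evaluated, then yields its value.
    static int operand(String name, int value) {
        log.append(name);
        return value;
    }

    // Records the name, then fails instead of producing a value.
    static int fail(String name) {
        log.append(name);
        throw new IllegalStateException(name);
    }

    static String orderOfSum() {
        log.setLength(0);
        int sum = operand("x", 1) + operand("y", 2) + operand("z", 3);
        return log.toString();
    }

    static String orderWithException() {
        log.setLength(0);
        try {
            int sum = operand("x", 1) + fail("y") + operand("z", 3);
        } catch (IllegalStateException e) {
            // z was never evaluated
        }
        return log.toString();
    }

    static String orderOfDivisionByZero() {
        log.setLength(0);
        try {
            int quotient = operand("x", 1) / operand("y", 0);
        } catch (ArithmeticException e) {
            // thrown only after both operands were evaluated
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(orderOfSum());            // xyz
        System.out.println(orderWithException());    // xy
        System.out.println(orderOfDivisionByZero()); // xy
    }
}
```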

One further detail concerns object creation with new. If insufficient memory is available for the new object, an OutOfMemoryError exception is thrown. This occurs before evaluation of the constructor arguments occurs—because the value of those arguments is not needed to allocate memory for the object—in which case those arguments won't be evaluated. In contrast, when an array is created, the array dimension expressions must be evaluated first to find out how much memory to allocate—consequently, array creation throws the OutOfMemoryError after the dimension expressions are evaluated.

Expression Type

Every expression has a type. The type of an expression is determined by the types of its component parts and the semantics of operators.

If an arithmetic or bit manipulation operator is applied to integer values, the result of the expression is of type int unless one or both sides are long, in which case the result is long. The exception to this rule is that the type of a shift operator expression is not affected by the type of the right-hand side. All integer operations are performed in either int or long precision, so the smaller byte and short integer types are always promoted to int before evaluation.

If either operand of an arithmetic operator is floating point, the operation is performed in floating-point arithmetic. Such operations are done in float unless at least one operand is a double, in which case double is used for the calculation and result.

A + operator is a String concatenation when either operand to + is of type String or if the left-hand side of a += is a String.

When used in an expression, a char value is converted to an int by setting the top 16 bits to zero. For example, the Unicode character \uffff would be treated as equivalent to the integer 0x0000ffff. This treatment is different from the way a short with the value 0xffff would be treated: sign extension makes the short equivalent to -1, and its int equivalent would be 0xffffffff.
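The promotion rules in this section can be checked directly. The sketch below uses two overloaded helper methods, named widen purely for illustration, to show the different treatment of char and short.

```java
class ExpressionTypeDemo {
    // A char is widened to int by zero extension.
    static int widen(char c) {
        return c;
    }

    // A short is widened to int by sign extension.
    static int widen(short s) {
        return s;
    }

    public static void main(String[] args) {
        byte b1 = 10, b2 = 20;
        int sum = b1 + b2;       // byte operands are promoted to int
        // byte bad = b1 + b2;   // would not compile: the result is an int
        System.out.println(sum);                               // 30
        System.out.println(widen('\uffff') == 0x0000ffff);     // true
        System.out.println(widen((short) 0xffff));             // -1
        System.out.println(widen((short) 0xffff) == 0xffffffff); // true
    }
}
```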

The above are all examples of different type conversions that can occur within an expression, to determine the type of that expression. The complete set of conversions is discussed next.

Type Conversions

The Java programming language is a strongly typed language, which means that it checks for type compatibility at compile time in almost all cases. Incompatible assignments are prevented by forbidding anything questionable. It also provides cast operations for times when the compatibility of a type can be determined only at run time, or when you want to explicitly force a type conversion for primitive types that would otherwise lose range, such as assigning a double to a float. You learned about type compatibility and conversion for reference types on page 90. In this section you will learn about conversions as a whole, for both primitive and reference types, and the contexts in which those conversions are automatically applied.

Implicit Type Conversions

Some kinds of conversions happen automatically, without any work on your part—these are implicit conversions.

Any numeric value can be assigned to any numeric variable whose type supports a larger range of values—a widening primitive conversion. A char can be used wherever an int is valid. A floating-point value can be assigned to any floating-point variable of equal or greater precision.

You can also use implicit widening conversion of integer types to floating-point, but not vice versa. There is no loss of range going from integer to floating point, because the range of any floating-point type is larger than the range of any integer.

Preserving magnitude is not the same as preserving the precision of a value. You can lose precision in some implicit conversions. Consider, for example, assigning a long to a float. The float has 32 bits of data and the long has 64 bits of data. A float stores fewer significant digits than a long, even though a float stores numbers of a larger range. You can lose data in an assignment of a long to a float. Consider the following:

long orig = 0x7effffff00000000L;
float fval = orig;
long lose = (long) fval;

System.out.println("orig = " + orig);
System.out.println("fval = " + fval);
System.out.println("lose = " + lose);

The first two statements create a long value and assign it to a float value. To show that this loses precision, we explicitly cast fval to a long and assign it to another variable (explicit casts are covered next). If you examine the output, you can see that the float value lost some precision: The long variable orig that was assigned to the float variable fval has a different value from the one generated by the explicit cast back into the long variable lose:

orig = 9151314438521880576
fval = 9.1513144E18
lose = 9151314442816847872

As a convenience, compile-time constants of integer type can be assigned to smaller integer types, without a cast, provided the value of the constant can actually fit in the smaller type and the integer type is not long. For example, the first two assignments are legal while the last is not:

short s1 = 27;    // implicit int to short
byte  b1 = 27;    // implicit int to byte
short s3 = 0x1ffff; // INVALID: int value too big for short

Such a conversion, from a larger type to a smaller type, is a narrowing primitive conversion.
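The constant rule can be sketched as follows. The names ConstantNarrowing, LIMIT, fromConstant, and fromVariable are hypothetical; the commented-out lines show assignments that we would expect the compiler to reject.

```java
class ConstantNarrowing {
    static final int LIMIT = 100;      // compile-time constant of type int

    static byte fromConstant() {
        byte b = LIMIT;                // legal: a constant expression that fits in byte
        return b;
    }

    static byte fromVariable(int n) {
        // byte b = n;                 // INVALID: n is not a constant expression
        return (byte) n;               // an explicit cast is required
    }

    public static void main(String[] args) {
        char c = 65;                   // legal: the constant 65 fits in char
        // byte big = 128;             // INVALID: 128 is outside the range of byte
        // int i = 27L;                // INVALID: a long constant is never narrowed implicitly
        System.out.println(fromConstant() + " " + fromVariable(27) + " " + c);  // 100 27 A
    }
}
```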

In all, seven different kinds of conversions might apply when an expression is evaluated:

  • Widening or narrowing primitive conversions

  • Widening or narrowing reference conversions

  • Boxing or unboxing conversions

  • String conversions

You have previously seen all of these. There are then five different contexts in which these conversions might be applied, but only some conversions apply in any given context:

  • Assignment: This occurs when assigning the value of an expression to a variable and can involve the following: a widening primitive conversion; a widening reference conversion; a boxing conversion optionally followed by a widening reference conversion; or an unboxing conversion optionally followed by a widening primitive conversion.

    If the resulting expression is of type byte, char, short, or int, and is a constant expression, then a narrowing primitive conversion can be applied if the variable is of type byte, short, or char and the value will fit in that type—for example, the assignment of the int literal 27 to a short variable that we saw previously. If the type of the variable is Byte, Short, or Character, then after the narrowing primitive conversion a boxing conversion can be applied:

         Short s1 = 27;    // implicit int to short to Short
    
  • Method invocation: This occurs when the type of an expression being passed as an argument to a method invocation is checked. Basically, the same conversions apply here as they do for assignment, with the exception that the narrowing primitive conversions are not applied. This means, for example, that a method expecting a short parameter will not accept the argument 27 because it is of type int; an explicit cast must be used. This restriction makes it easier to determine which method to invoke when overloaded forms of it exist. For example, if you try to pass a short to a method that could take an int or a byte, should the short get widened to an int, or narrowed to a byte?

  • Numeric promotion: Numeric promotion, as discussed in Section 9.1 on page 201, ensures that all the operands of an arithmetic expression are of the appropriate type by performing widening primitive conversions, preceded if necessary by unboxing conversions.

  • Casts: Casts potentially allow for any of the conversions, but may fail at run time; this is discussed in the next section.

  • String conversions: String conversions occur when the string concatenation operation is used (see page 214) and are discussed further in “String Conversions” on page 220.
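To see the five contexts side by side, consider this small sketch. The class ConversionContexts and the methods takeShort and takeLong are hypothetical helpers introduced only for this illustration.

```java
class ConversionContexts {
    static String takeShort(short s) { return "short " + s; }
    static String takeLong(long l)   { return "long " + l; }

    static String demo() {
        short s = 27;                      // assignment: constant narrowing is allowed
        Short boxed = 27;                  // assignment: narrowing, then boxing
        // takeShort(27);                  // INVALID: no narrowing in method invocation
        String a = takeShort((short) 27);  // cast: explicit narrowing
        String b = takeLong(s);            // method invocation: widening short -> long
        int sum = s + boxed;               // numeric promotion: unbox, then widen both to int
        return a + ", " + b + ", sum " + sum;   // string conversion of sum
    }

    public static void main(String[] args) {
        System.out.println(demo());        // short 27, long 27, sum 54
    }
}
```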

Explicit Type Casts

When one type cannot be assigned to another type with implicit conversion, often it can be explicitly cast to the other type—usually to perform a narrowing conversion. A cast requests a new value of a new type that is the best available representation of the old value in the old type. Some casts are not allowed—for example, a boolean cannot be cast to an int—but explicit casting can be used to assign a double to a long, as in this code:

double d = 7.99;
long l = (long) d;

When a floating-point value is cast to an integer, the fractional part is lost by rounding toward zero; for instance, (int)-72.3 is -72. Methods available in the Math and StrictMath classes round floating-point values to integers in other ways. See “Math and StrictMath” on page 657 for details. A floating-point NaN becomes the integer zero when cast. Values that are too large, or too small, to be represented as an integer become MAX_VALUE or MIN_VALUE for the types int and long. For casts to byte, short, or char, the floating-point value is first converted to an int, and then to the smaller integer type by chopping off the upper bits as described below.
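The following sketch exercises each of these cases. The class FloatToIntCasts and its helper methods are hypothetical names; Math.round and Math.floor are the standard library methods.

```java
class FloatToIntCasts {
    static int toInt(double d)   { return (int) d; }
    static long toLong(double d) { return (long) d; }
    static byte toByte(double d) { return (byte) d; }

    public static void main(String[] args) {
        System.out.println(toInt(-72.3));                       // -72
        System.out.println(toLong(7.99));                       // 7
        System.out.println(toInt(Double.NaN));                  // 0
        System.out.println(toInt(1e20) == Integer.MAX_VALUE);   // true
        System.out.println(toLong(-1e30) == Long.MIN_VALUE);    // true
        System.out.println(toByte(300.7));                      // 44 (300 with upper bits dropped)
        System.out.println(Math.round(7.99));                   // 8
        System.out.println(Math.floor(-72.3));                  // -73.0
    }
}
```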

A double can also be explicitly cast to a float, or an integer type can be explicitly cast to a smaller integer type. When you cast from a double to a float, three things can go wrong: you can lose precision, you can get a zero, or you can get an infinity where you originally had a finite value outside the range of a float.
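Here is a brief sketch of the three failure modes, using a hypothetical class DoubleToFloat with a helper method narrow:

```java
class DoubleToFloat {
    static float narrow(double d) { return (float) d; }

    public static void main(String[] args) {
        System.out.println(narrow(0.1) == 0.1);   // false: precision was lost
        System.out.println(narrow(1e-50));        // 0.0: too small for a float
        System.out.println(narrow(1e50));         // Infinity: too large for a float
    }
}
```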

Integer types are converted by chopping off the upper bits. If the value in the larger integer fits in the smaller type to which it is cast, no harm is done. But if the larger integer has a value outside the range of the smaller type, dropping the upper bits changes the value, including possibly changing sign. The code

short s = -134;
byte b = (byte) s;

System.out.println("s = " + s + ", b = " + b);

produces the following output because the upper bits of s are lost when the value is stored in b:

s = -134, b = 122

A char can be cast to any integer type and vice versa. When an integer is cast to a char, only the bottom 16 bits of data are used; the rest are discarded. When a char is cast to an integer type, any additional upper bits are filled with zeros.

Once those bits are assigned, they are treated as they would be in any other value. Here is some code that casts a large Unicode character to both an int (implicitly) and a short (explicitly). The int is a positive value equal to 0x0000ffff, because the upper bits of the character were set to zero. But the same bits in the short are a negative value, because the top bit of the short is the sign bit:

class CharCast {
    public static void main(String[] args) {
        int i = '\uffff';
        short s = (short) '\uffff';

        System.out.println("i = " + i);
        System.out.println("s = " + s);
    }
}

And here is the program's output:

i = 65535
s = -1
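Going in the other direction, a cast from int to char keeps only the bottom 16 bits. The class IntToChar and method toChar below are hypothetical names used to sketch that behavior.

```java
class IntToChar {
    static char toChar(int i) { return (char) i; }

    public static void main(String[] args) {
        System.out.println(toChar(0x10041));     // A: only the low 16 bits (0x0041) survive
        System.out.println((int) toChar(-1));    // 65535: upper bits are zero-filled on the way back
    }
}
```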

String Conversions

One special type of implicit conversion involves both the primitive and reference types: string conversion. Whenever a + operator has at least one String operand, it is interpreted as the string concatenation operator and the other operand, if not a String, is implicitly converted into a String. Such conversions are predefined for all primitive types. Objects are converted via the toString method, which is either inherited from Object or overridden to provide a meaningful string representation. For example, the following method brackets a string with the guillemet characters used for quotation marks in many European languages:

public static String guillemete(String quote) {
    return '«' + quote + '»';
}

This implicit conversion of primitive types and objects to strings happens only when you're using + or += in expressions involving strings. It does not happen anywhere else. A method, for example, that takes a String parameter must be passed a String. You cannot pass it an object or primitive type and have it converted implicitly.

When a null reference is converted to a String, the result is the string "null", hence a null reference can be used freely within any string concatenation expression.
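A short sketch of these rules follows; the class StringConversionDemo and the method describe are hypothetical. Note that the conversion applies only once a String operand is involved, so operands to the left of the first String are still combined arithmetically.

```java
class StringConversionDemo {
    static String describe(Object o, int n) {
        // o is converted via toString (or to "null"); n via the int conversion
        return "value=" + o + ", count=" + n;
    }

    public static void main(String[] args) {
        System.out.println(describe(null, 3));   // value=null, count=3
        System.out.println(1 + 2 + "x");         // 3x
        System.out.println("x" + 1 + 2);         // x12
        System.out.println('a' + 'b' + "c");     // 195c: char + char is integer addition
    }
}
```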

Operator Precedence and Associativity

Operator precedence is the “stickiness” of operators relative to each other. Operators have different precedences. For example, relational operators have a higher precedence than boolean logic operators, so you can say

if (min <= i && i <= max)
    process(i);

without any confusion. Because * (multiply) has a higher precedence than - (minus), the expression

3 - 3 * 5

has the value -12, not zero. Precedence can be overridden with parentheses; if zero were the desired value, for example, the following would do the trick:

(3 - 3) * 5

When two operators with the same precedence appear next to each other, the associativity of the operators determines which is evaluated first. Because + (add) is left-associative, the expression

a + b + c

is equivalent to

(a + b) + c

The following table lists all the operators in order of precedence from highest to lowest. All the operators are binary, except those shown as unary with expr, the creation and cast operators (which are also unary), and the conditional operator (which is ternary). Operators with the same precedence appear on the same line of the table:

postfix operators     [] . (params) expr++ expr--
unary operators       ++expr --expr +expr -expr ~ !
creation or cast      new (type)expr
multiplicative        * / %
additive              + -
shift                 << >> >>>
relational            < > >= <= instanceof
equality              == !=
AND                   &
exclusive OR          ^
inclusive OR          |
conditional AND       &&
conditional OR        ||
conditional           ?:
assignment            = += -= *= /= %= >>= <<= >>>= &= ^= |=

All binary operators except assignment operators are left-associative. Assignment is right-associative. In other words, a=b=c is equivalent to a=(b=c), so it is convenient to chain assignments together. The conditional operator ?: is right-associative.
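The following sketch shows both directions of associativity at work. The class AssociativityDemo and the method grade are hypothetical names used for illustration.

```java
class AssociativityDemo {
    static String grade(int score) {
        // Parsed as: score >= 90 ? "A" : (score >= 80 ? "B" : "C")
        return score >= 90 ? "A" : score >= 80 ? "B" : "C";
    }

    public static void main(String[] args) {
        int a, b, c;
        a = b = c = 7;                     // a = (b = (c = 7))
        System.out.println(a + b + c);     // 21
        System.out.println(10 - 4 - 3);    // 3, that is, (10 - 4) - 3
        System.out.println(grade(85));     // B
    }
}
```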

Parentheses are often needed in expressions in which assignment is embedded in a boolean expression, or in which bitwise operations are used. For an example of the former, examine the following code:

while ((v = stream.next()) != null)
    processValue(v);

Assignment operators have lower precedence than equality operators; without the parentheses, it would be equivalent to

while (v = (stream.next() != null)) // INVALID
    processValue(v);

and probably not what you want. It is also likely to be invalid code since it would be valid only in the unusual case in which v is boolean.

Many people find the precedence of the bitwise and logical operators &, ^, and | hard to remember. In complex expressions you should parenthesize these operators for readability and to ensure correct precedence.
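As an illustration of why the parentheses matter, consider this sketch. The class BitwisePrecedence and its READ and WRITE flag constants are hypothetical.

```java
class BitwisePrecedence {
    static final int READ = 0x1, WRITE = 0x2;

    static boolean canWrite(int flags) {
        // return flags & WRITE != 0;    // INVALID: parsed as flags & (WRITE != 0)
        return (flags & WRITE) != 0;
    }

    public static void main(String[] args) {
        System.out.println(canWrite(READ | WRITE));    // true
        System.out.println(true | false & false);      // true: & binds tighter than |
        System.out.println((true | false) & false);    // false
        System.out.println(6 | 5 ^ 3 & 1);             // 6, that is, 6 | (5 ^ (3 & 1))
    }
}
```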

Our use of parentheses is sparse—we use them only when code seems otherwise unclear. Operator precedence is part of the language and should be generally understood. Others inject parentheses liberally. Try not to use parentheses everywhere—code becomes illegible, looking like LISP with none of LISP's saving graces.

Exercise 9.4: Using what you've learned in this chapter but without writing code, figure out which of the following expressions are invalid and what the type and values are of the valid expressions:

3 << 2L - 1
(3L << 2) - 1
10 < 12 == 6 > 17
10 << 12 == 6 >> 17
13.5e-1 % Float.POSITIVE_INFINITY
Float.POSITIVE_INFINITY + Double.NEGATIVE_INFINITY
Double.POSITIVE_INFINITY - Float.NEGATIVE_INFINITY
0.0 / -0.0 == -0.0 / 0.0
Integer.MAX_VALUE + Integer.MIN_VALUE
Long.MAX_VALUE + 5
(short) 5 * (byte) 10
(i < 15 ? 1.72e3f : 0)
i++ + i++ + --i     // i = 3 at start

Member Access

You use the dot (.) operator, as in ref.method(), to access both instance members and static members of types. Because types can inherit members from their supertypes, there are rules regarding which member is accessed in any given situation. Most of these rules were covered in detail in Chapter 2 and Chapter 3, but we briefly recap them.

You access static members by using either the type name or an object reference. When you use a type name, the member referred to is the member declared in that type (or inherited by it if there was no declaration in that type). When you use an object reference, the declared type of the reference determines which member is accessed, not the type of the object being referred to. Within a class, reference to a static member always refers to the member declared in, or inherited by, that class.

Non-static members are accessed through an object reference—either an explicit reference or implicitly this (or one of the enclosing objects) if the non-static member is a member of the current object (or enclosing object). Fields and nested types are accessed based on the declared type of the object reference. Similarly, within a method, a reference to a field or nested type always refers to the declaration within that class or else the inherited declaration. In contrast, methods are accessed based on the class of the object being referred to. Further, the existence of method overloading means that the system has to determine which method to invoke based on the compile-time type of the arguments used in the invocation; this process is described in detail in the next section. The only operator that can be applied to a method member is the method invocation operator ().

You will get a NullPointerException if you use . on a reference with the value null, unless you are accessing a static member. In that case the value of the reference is never considered, because only the type of the reference determines the class in which to locate the member.
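These access rules can be sketched with a pair of hypothetical classes, Parent and Child, in which the subclass hides a field and a static method and overrides an instance method:

```java
class Parent {
    String field = "Parent.field";
    static String kind() { return "Parent.kind"; }
    String name() { return "Parent.name"; }
}

class Child extends Parent {
    String field = "Child.field";                      // hides Parent.field
    static String kind() { return "Child.kind"; }      // hides Parent.kind
    @Override String name() { return "Child.name"; }   // overrides Parent.name
}

class MemberAccessDemo {
    static String viaParentRef() {
        Parent ref = new Child();
        // Field and static method: chosen by the declared type (Parent).
        // Instance method: chosen by the class of the object (Child).
        return ref.field + " " + ref.kind() + " " + ref.name();
    }

    static String viaNullRef() {
        Parent none = null;
        return none.kind();    // static member: the null value is never examined
    }

    public static void main(String[] args) {
        System.out.println(viaParentRef());   // Parent.field Parent.kind Child.name
        System.out.println(viaNullRef());     // Parent.kind
    }
}
```

Accessing a static member through a reference, as done here, is legal but usually draws a compiler warning; using the type name is the clearer style.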

Finding the Right Method

For an invocation of a method to be correct, you must provide arguments of the proper number and type so that exactly one matching method can be found in the class. If a method is not overloaded, determining the correct method is simple, because only one parameter count is associated with the method name. When overloaded methods are involved, choosing the correct method is more complex. The compiler uses a “most specific” algorithm to do the match, the general form of which is as follows:

  1. Determine which class or interface to search for the method. Exactly how this is done depends on the form of the method invocation. For example, if the invocation is of a static method using a class name—such as Math.exp—then the class to search for exp is Math. On the other hand, if the method name is not qualified in any way—such as exp(n)—then there must be a method by that name in scope at the point where the code invokes the method. That method will be defined in a particular class or interface, and that is the class or interface that will be searched in step 2. You should note that it is only the name of the method that is used to determine where to search.

  2. Find all the methods in that class or interface that could possibly apply to the invocation—namely, all accessible methods that have the correct name, can take the number of arguments being passed, and whose parameters are of types that can be assigned the values of all the arguments. This matching process occurs in three phases:[3]

    1. The match is attempted without performing any boxing conversions, and without considering the possibility of a variable number of arguments—that is, the compiler looks for a method whose declared number of parameters matches the number of arguments, and whose kinds of parameters (either primitive or reference types) match the corresponding arguments provided.

    2. If no matches have been found, the match is attempted again, but this time boxing conversions are considered, so primitives can be passed for objects and vice versa.

    3. If no matches have been found, the match is attempted again, but this time the possibility of a variable number of arguments is considered, so now the number of arguments might exceed the number of declared parameters.

  3. If any method in the set has parameter types that are all assignable to the corresponding parameters of any other method in the set, that other method is removed from the set because it is a less specific method. For example, if the set has a method that takes an Object parameter and another that takes a String parameter, the Object method is removed because a String can be assigned to an Object, and therefore the method that takes a String is more specific. If you pass a String argument you want it handled by the method that specializes in strings, not the general one that works with any object.

  4. If exactly one method remains, that method is the most specific and will be invoked. If more than one method remains, and they have different signatures, then the invocation is ambiguous and the invoking code invalid because there is no most specific method. If all the remaining methods have the same signature then: if all are abstract then one is chosen arbitrarily; otherwise if only one is not abstract then it is chosen; otherwise the invocation is again ambiguous and is invalid.

The exact details of the algorithm are quite complex, due mainly to the possibility of generic types or methods being involved—see Chapter 11. Interested readers should again consult The Java Language Specification, Third Edition, for those details.
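A small sketch can make the three matching phases and the most-specific rule concrete. The class PhaseDemo and its overloaded methods f, g, and h are hypothetical and exist only to show which form we would expect the compiler to select.

```java
class PhaseDemo {
    static String f(long x)    { return "long"; }      // found in phase 1: widening only
    static String f(Integer x) { return "Integer"; }   // would need boxing (phase 2)
    static String f(int... xs) { return "varargs"; }   // would need variable arity (phase 3)

    static String g(Integer x) { return "Integer"; }
    static String g(int... xs) { return "varargs"; }

    static String h(Object o)  { return "Object"; }
    static String h(String s)  { return "String"; }

    public static void main(String[] args) {
        System.out.println(f(5));        // long: phase 1 succeeds, later phases never run
        System.out.println(g(5));        // Integer: phase 1 fails, boxing matches in phase 2
        System.out.println(g(5, 6));     // varargs: only phase 3 finds a match
        System.out.println(h("text"));   // String: the more specific of two candidates
    }
}
```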

Once a method has been selected, that method determines the expected return type and the possible checked exceptions of that method invocation. If the expected return type is not acceptable in the code (for example, the method returns a String but the method call is used as an array subscript) or if the checked exceptions are not dealt with correctly you will get a compile-time error.

For instance, suppose you had the following type hierarchy:

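[Figure: type hierarchy. Dessert is the root type; Cake and Scone extend Dessert; ChocolateCake extends Cake; ButteredScone extends Scone.]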

Also suppose you had several overloaded methods that took particular combinations of Dessert parameters:

void moorge(Dessert d, Scone s)        { /* first form  */ }
void moorge(Cake c, Dessert d)         { /* second form */ }
void moorge(ChocolateCake cc, Scone s) { /* third form  */ }
void moorge(Dessert... desserts)       { /* fourth form */ }

Now consider the following invocations of moorge:

moorge(dessertRef, sconeRef);
moorge(chocolateCakeRef, dessertRef);
moorge(chocolateCakeRef, butteredSconeRef);
moorge(cakeRef, sconeRef);      // INVALID
moorge(sconeRef, cakeRef);

All of these invocations might appear to match the fourth form, and indeed without the other overloaded forms they would. But the fourth form, being a method that takes a variable number of arguments, will only be considered by the compiler if it fails to find any candidate methods that explicitly declare two parameters of assignable type.

The first invocation uses the first form of moorge because the parameter and argument types match exactly. The second invocation uses the second form because it is the only form for which the provided arguments can be assigned to the parameter types. In both cases, the method to invoke is clear after step 2 in the method-matching algorithm.

The third invocation requires more thought. The list of potential overloads includes all three two-argument forms, because a ChocolateCake reference is assignable to any of the first parameter types, a ButteredScone reference is assignable to either of the second parameter types, and none of the signatures matches exactly. So after step 2, you have a set of three candidate methods.

Step 3 requires you to eliminate less specific methods from the set. In this case, the first form is removed from the set because the third form is more specific—a ChocolateCake reference can be assigned to the first form's Dessert parameter and a Scone reference can be assigned to the first form's Scone parameter, so the first form is less specific. The second form is removed from the set in a similar manner. After this, the set of possible methods has been reduced to one—the third form of moorge—and that method form will be invoked.

The fourth invocation is invalid. After step 2, the set of possible matches includes the first and second forms. Because neither form's parameters are assignable to the other, neither form can be removed from the set in step 3. Therefore, you have an ambiguous invocation that cannot be resolved by the compiler, and so it is an invalid invocation of moorge. You can resolve the ambiguity by explicitly casting one of the arguments to the Dessert type: If you cast cakeRef, then the first form is invoked; if you cast sconeRef, then the second form is invoked.

The final invocation uses the fourth form. There are no potential candidates after the first phase of step 2, and no boxing conversions to consider, so the compiler now considers the fourth form in phase 3. An invocation with two arguments is compatible with a parameter list that has a single sequence parameter, and the types of the two arguments are assignable to the type of the sequence, so it matches. Because this is the only match after step 2, it is the method chosen.
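The whole walkthrough can be checked with a runnable sketch. It assumes the hierarchy implied by the discussion (Cake and Scone extend Dessert, ChocolateCake extends Cake, ButteredScone extends Scone). As a further assumption made only so the result is observable, the moorge forms here are static and return the name of the form rather than void; the wrapper class MoorgeDemo and method run are hypothetical.

```java
class Dessert {}
class Cake extends Dessert {}
class Scone extends Dessert {}
class ChocolateCake extends Cake {}
class ButteredScone extends Scone {}

class MoorgeDemo {
    static String moorge(Dessert d, Scone s)        { return "first"; }
    static String moorge(Cake c, Dessert d)         { return "second"; }
    static String moorge(ChocolateCake cc, Scone s) { return "third"; }
    static String moorge(Dessert... desserts)       { return "fourth"; }

    static String run() {
        Dessert dessertRef = new Dessert();
        Cake cakeRef = new Cake();
        Scone sconeRef = new Scone();
        ChocolateCake chocolateCakeRef = new ChocolateCake();
        ButteredScone butteredSconeRef = new ButteredScone();

        return moorge(dessertRef, sconeRef) + " "
             + moorge(chocolateCakeRef, dessertRef) + " "
             + moorge(chocolateCakeRef, butteredSconeRef) + " "
             // moorge(cakeRef, sconeRef) is INVALID: ambiguous between first and second.
             // Casting one argument to Dessert resolves the ambiguity:
             + moorge((Dessert) cakeRef, sconeRef) + " "
             + moorge(cakeRef, (Dessert) sconeRef) + " "
             + moorge(sconeRef, cakeRef);
    }

    public static void main(String[] args) {
        System.out.println(run());   // first second third first second fourth
    }
}
```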

These rules also apply to the primitive types. An int, for example, can be assigned to a float, and resolving an overloaded invocation will take that into account just as it considered that a ButteredScone reference was assignable to a Scone reference. However, implicit integer conversions to smaller types are not applied—if a method takes a short argument and you supply an int you will have to explicitly cast the int to short; it won't match the short parameter, regardless of its value.
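For primitives, a sketch along the same lines might look like this. The class PrimitiveOverloads and the methods pick and small are hypothetical.

```java
class PrimitiveOverloads {
    static String pick(float f)   { return "float"; }
    static String pick(double d)  { return "double"; }
    static String small(short s)  { return "short"; }

    public static void main(String[] args) {
        // An int can widen to float or double; float is assignable to double,
        // so the float form is the more specific one.
        System.out.println(pick(5));            // float
        System.out.println(pick(5.0));          // double
        // small(27);                           // INVALID: int is not narrowed to short
        System.out.println(small((short) 27));  // short
    }
}
```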

The method resolution process takes place at compile time based on the declared types of the object reference and the argument values. This process determines which form of a method should be invoked, but not which implementation of that method. At run time the actual type of the object on which the method is invoked is used to find an implementation of the method that was determined at compile time.
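The split between compile-time selection of the form and run-time selection of the implementation can be sketched as follows. The classes Printer, FancyPrinter, and DispatchDemo are hypothetical.

```java
class Printer {
    String show(Object o) { return "Printer/Object"; }
    String show(String s) { return "Printer/String"; }
}

class FancyPrinter extends Printer {
    @Override String show(Object o) { return "Fancy/Object"; }
    @Override String show(String s) { return "Fancy/String"; }
}

class DispatchDemo {
    static String run() {
        Printer p = new FancyPrinter();
        Object o = "hello";   // declared type Object, run-time type String

        // The form is chosen from the declared argument type (Object, then String);
        // the implementation is chosen from the run-time class of p (FancyPrinter).
        return p.show(o) + " " + p.show("hello");
    }

    public static void main(String[] args) {
        System.out.println(run());   // Fancy/Object Fancy/String
    }
}
```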

Methods may not differ only in return type or in the list of exceptions they throw, because there are too many ambiguities to determine which overloaded method is wanted. If, for example, there were two doppelgänger methods that differed only in that one returned an int and the other returned a short, both methods would make equal sense in the following statement:

double d = doppelgänger();

A similar problem exists with exceptions, because you can catch any, all, or none of the exceptions a method might throw in the code in which you invoke the overloaded method. There would be no way to determine which of two methods to use when they differed only in thrown exceptions.

Such ambiguities are not always detectable at compile time. For example, a superclass may be modified to add a new method that differs only in return type from a method in an extended class. If the extended class is not recompiled, then the error will not be detected. At run time there is no problem because the exact form of the method to be invoked was determined at compile time, so that is the method that will be looked for in the extended class. In a similar way, if a class is modified to add a new overloaded form of a method, but the class invoking that method is not recompiled, then the new method will never be invoked by that class—the form of method to invoke was already determined at compile time and that form is different from that of the new method.

 

Math was always my bad subject. I couldn't convince my teachers that many of my answers were meant ironically.

 
 --Calvin Trillin


[1] Detecting non-zero underflows is a non-trivial task that is beyond the scope of this book.

[2] There is one other way to test for NaN: If x has a NaN value, then x != x is true. This is interesting but very obscure, so you should use the methods instead.

[3] These phases were introduced to maintain backwards compatibility with older versions of the Java programming language.
