The guidelines in this chapter address areas where developers often make unwarranted assumptions about Java language and library behaviors, or where ambiguities can easily be introduced. Failure to follow these guidelines can result in programs that produce counterintuitive results.
This chapter contains guidelines that address
1. Misconceptions about Java APIs and language features
2. Assumptions and ambiguity-laced programs
3. Situations in which the programmer wanted to do one thing but ended up doing another
According to the Java Language Specification (JLS), §8.3.1.4, “volatile
Fields” [JLS 2013],
A field may be declared volatile
, in which case the Java Memory Model ensures that all threads see a consistent value for the variable (§17.4).
This safe publication guarantee applies only to primitive fields and object references. Programmers commonly use imprecise terminology and speak about “member objects.” For the purposes of this visibility guarantee, the actual member is the object reference; the objects referred to (aka, referents) by volatile object references are beyond the scope of this safe publication guarantee. Consequently, declaring an object reference to be volatile is insufficient to guarantee that changes to the members of the referent are published to other threads. A thread may fail to observe a recent write from another thread to a member field of such an object referent.
Furthermore, when the referent is mutable and lacks thread-safety, other threads might see a partially constructed object or an object in a (temporarily) inconsistent state [Goetz 2007]. However, when the referent is immutable, declaring the reference volatile suffices to guarantee safe publication of the members of the referent. Programmers cannot use the volatile
keyword to guarantee safe publication of mutable objects. Use of the volatile
keyword can only guarantee safe publication of primitive fields, object references, or fields of immutable object referents.
Confusing a volatile object with the volatility of its member objects is a similar error to the one described in Guideline 73, “Never confuse the immutability of a reference with that of the referenced object.”
This noncompliant code example declares a volatile reference to an array object:
final class Foo {
private volatile int[] arr = new int[20];
public int getFirst() {
return arr[0];
}
public void setFirst(int n) {
arr[0] = n;
}
// ...
}
Values assigned to an array element by one thread—for example, by calling setFirst()
—might not be visible to another thread calling getFirst()
, because the volatile
keyword guarantees safe publication only for the array reference; it makes no guarantee regarding the actual data contained within the array.
This problem arises when the thread that calls setFirst()
and the thread that calls getFirst()
lack a happens-before relationship. A happens-before relationship exists between a thread that writes to a volatile variable and a thread that subsequently reads it. However, setFirst()
and getFirst()
each read from a volatile variable—the volatile reference to the array. Neither method writes to the volatile variable.
To ensure that the writes to array elements are atomic and that the resulting values are visible to other threads, this compliant solution uses the AtomicIntegerArray
class defined in java.util.concurrent.atomic
:
final class Foo {
private final AtomicIntegerArray atomicArray =
new AtomicIntegerArray(20);
public int getFirst() {
return atomicArray.get(0);
}
public void setFirst(int n) {
atomicArray.set(0, 10);
}
// ...
}
AtomicIntegerArray
guarantees a happens-before relationship between a thread that calls atomicArray.set()
and a thread that subsequently calls atomicArray.get()
.
To ensure visibility, accessor methods must synchronize access while performing operations on nonvolatile elements of an array, whether the array is referred to by a volatile or a nonvolatile reference. Note that the code is thread-safe, even though the array reference is not volatile.
final class Foo {
private int[] arr = new int[20];
public synchronized int getFirst() {
return arr[0];
}
public synchronized void setFirst(int n) {
arr[0] = n;
}
}
Synchronization establishes a happens-before relationship between threads that synchronize on the same lock. In this case, the thread that calls setFirst()
and the thread that subsequently calls getFirst()
on the same object instance both synchronize on that instance, so safe publication is guaranteed.
This noncompliant code example declares the Map
instance field volatile. The instance of the Map
object is mutable because of its put()
method.
final class Foo {
private volatile Map<String, String> map;
public Foo() {
map = new HashMap<String, String>();
// Load some useful values into map
}
public String get(String s) {
return map.get(s);
}
public void put(String key, String value) {
// Validate the values before inserting
if (!value.matches("[\w]*")) {
throw new IllegalArgumentException();
}
map.put(key, value);
}
}
Interleaved calls to get()
and put()
may result in the retrieval of internally inconsistent values from the Map
object because put()
modifies its state. Declaring the object reference volatile is insufficient to eliminate this data race.
This noncompliant code example attempts to use the volatile-read, synchronized-write technique described in Java Theory and Practice [Goetz 2007]. The map
field is declared volatile to synchronize its reads and writes. The put()
method is also synchronized to ensure it is executed atomically.
final class Foo {
private volatile Map<String, String> map;
public Foo() {
map = new HashMap<String, String>();
// Load some useful values into map
}
public String get(String s) {
return map.get(s);
}
public synchronized void put(String key, String value) {
// Validate the values before inserting
if (!value.matches("[\w]*")) {
throw new IllegalArgumentException();
}
map.put(key, value);
}
}
The volatile-read, synchronized-write technique uses synchronization to preserve atomicity of compound operations, such as increment, and provides faster access times for atomic reads. However, it fails for mutable objects because the safe publication guarantee provided by volatile
extends only to the field itself (the primitive value or object reference); the referent is excluded from the guarantee, as are the referent’s members. In effect, a write and a subsequent read of the map
lack a happens-before relationship.
This technique is also discussed in The CERT® Oracle® Secure Coding Standard for Java™ [Long 2012], “VNA02-J. Ensure that compound operations on shared variables are atomic.”
This compliant solution uses method synchronization to guarantee visibility:
final class Foo {
private final Map<String, String> map;
public Foo() {
map = new HashMap<String, String>();
// Load some useful values into map
}
public synchronized String get(String s) {
return map.get(s);
}
public synchronized void put(String key, String value) {
// Validate the values before inserting
if (!value.matches("[\w]*")) {
throw new IllegalArgumentException();
}
map.put(key, value);
}
}
It is unnecessary to declare the map
field volatile because the accessor methods are synchronized. The field is declared final
to prevent publication of its reference when the referent is in a partially initialized state (see “TSM03-J. Do not publish partially initialized objects,” for more information [Long 2012]).
In this noncompliant code example, the volatile format
field stores a reference to a mutable object, java.text.DateFormat
:
final class DateHandler {
private static volatile DateFormat format =
DateFormat.getDateInstance(DateFormat.MEDIUM);
public static java.util.Date parse(String str)
throws ParseException {
return format.parse(str);
}
}
Because DateFormat
is not thread-safe [API 2013], the value for Date
returned by the parse()
method may not correspond to the str
argument.
This compliant solution creates and returns a new DateFormat
instance for each invocation of the parse()
method [API 2013]:
final class DateHandler {
public static java.util.Date parse(String str)
throws ParseException {
return DateFormat.getDateInstance(
DateFormat.MEDIUM).parse(str);
}
}
This compliant solution makes DateHandler
thread-safe by synchronizing statements within the parse()
method [API 2013]:
final class DateHandler {
private static DateFormat format =
DateFormat.getDateInstance(DateFormat.MEDIUM);
public static java.util.Date parse(String str)
throws ParseException {
synchronized (format) {
return format.parse(str);
}
}
}
This compliant solution uses a ThreadLocal
object to create a separate DateFormat
instance per thread:
final class DateHandler {
private static final ThreadLocal<DateFormat> format =
new ThreadLocal<DateFormat>() {
@Override protected DateFormat initialValue() {
return DateFormat.getDateInstance(DateFormat.MEDIUM);
}
};
// ...
}
Incorrectly assuming that declaring a field volatile guarantees safe publication of a referenced object’s members can cause threads to observe stale or inconsistent values.
Technically, strict immutability of the referent is a stronger condition than is fundamentally required for safe publication. When it can be determined that a referent is thread-safe by design, the field that holds its reference may be declared volatile. However, this approach to using volatile
decreases maintainability and should be avoided.
[API 2013]
Class DateFormat
[Goetz 2007]
Pattern 2, “One-Time Safe Publication”
[JLS 2013]
§8.3.1.4, “volatile
Fields”
[Long 2012]
OBJ05-J. Defensively copy private mutable class members before returning their references
TSM03-J. Do not publish partially initialized objects
VNA02-J. Ensure that compound operations on shared variables are atomic
[Miller 2009]
“Mutable Statics”
According to the JLS, §17.3, “Sleep and Yield” [JLS 2013],
It is important to note that neither Thread.sleep
nor Thread.yield
have any synchronization semantics. In particular, the compiler does not have to flush writes cached in registers out to shared memory before a call to Thread.sleep
or Thread.yield
, nor does the compiler have to reload values cached in registers after a call to Thread.sleep
or Thread.yield
.
Code that bases its concurrency safety on thread suspension or yields to processes that
Flush cached registers,
Reload any values,
Or provide any happens-before relationships when execution resumes.
is incorrect and is consequently disallowed. Programs must ensure that communication between threads has proper synchronization, happens-before, and safe publication semantics.
This noncompliant code attempts to use the nonvolatile primitive Boolean member done
as a flag to terminate execution of a thread. A separate thread sets done
to true
by calling the shutdown()
method.
final class ControlledStop implements Runnable {
private boolean done = false;
@Override public void run() {
while (!done) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// Reset interrupted status
Thread.currentThread().interrupt(); }
}
}
public void shutdown() {
this.done = true;
}
}
The compiler, in this case, is free to read the field this.done
once and to reuse the cached value in each execution of the loop. Consequently, the while
loop might never terminate, even when another thread calls the shutdown()
method to change the value of this.done
[JLS 2013]. This error could have resulted from the programmer incorrectly assuming that the call to Thread.sleep()
causes cached values to be reloaded.
This compliant solution declares the flag field volatile
to ensure that updates to its value are made visible across multiple threads:
final class ControlledStop implements Runnable {
private volatile boolean done = false;
@Override public void run() {
//...
}
// ...
}
The volatile
keyword establishes a happens-before relationship between this thread and any other thread that sets done
.
A better solution for methods that call sleep()
is to use thread interruption, which causes the sleeping thread to wake immediately and handle the interruption.
final class ControlledStop implements Runnable {
@Override public void run() {
// Record current thread so others can interrupt it
myThread = currentThread();
while (!Thread.interrupted()) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
}
}
public void shutdown(Thread th) {
th.interrupt();
}
}
Note that the interrupting thread must know which thread to interrupt; logic for tracking this relationship has been omitted from this solution.
This noncompliant code example contains a doSomething()
method that starts a thread. The thread supports interruption by checking a flag and waits until notified. The stop()
method checks to see whether the thread is blocked on the wait; if so, it sets the flag to true and notifies the thread so that the thread can terminate.
public class Waiter {
private Thread thread;
private boolean flag;
private final Object lock = new Object();
public void doSomething() {
thread = new Thread(new Runnable() {
@Override public void run() {
synchronized (lock) {
while (!flag) {
try {
lock.wait();
// ...
} catch (InterruptedException e) {
// Forward to handler
}
}
}
}
});
thread.start();
}
public boolean stop() {
if (thread != null) {
if (thread.getState() == Thread.State.WAITING) {
synchronized (lock) {
flag = true;
lock.notifyAll();
}
return true;
}
}
return false;
}
}
Unfortunately, the stop()
method incorrectly uses the Thread.getState()
method to check whether the thread is blocked and has not terminated before delivering the notification. Using the Thread.getState()
method for synchronization control, such as checking whether a thread is blocked on a wait, is inappropriate. Java Virtual Machines (JVMs) are permitted to implement blocking using spin-waiting; consequently, a thread can be blocked without entering the WAITING
or TIMED_WAITING
state [Goetz 2006]. Because the thread may never enter the WAITING
state, the stop()
method might fail to terminate the thread.
If doSomething()
and stop()
are called from different threads, the stop()
method could fail to see the initialized thread
, even though doSomething()
was called earlier, unless there is a happens-before relationship between the two calls. If the two methods are invoked by the same thread, they automatically have a happens-before relationship and consequently cannot encounter this problem.
This compliant solution removes the check for determining whether the thread is in the WAITING
state. This check is unnecessary because invoking notifyAll()
affects only threads that are blocked on an invocation of wait()
:
public class Waiter {
// . . .
private Thread thread;
private volatile boolean flag;
private final Object lock = new Object();
public boolean stop() {
if (thread != null) {
synchronized (lock) {
flag = true;
lock.notifyAll();
}
return true;
}
return false;
}
}
Relying on the Thread
class’s sleep()
, yield()
, and getState()
methods for synchronization control can cause unexpected behavior.
[Goetz 2006]
[JLS 2013]
§17.3, “Sleep and Yield”
The JLS, §15.17.3, “Remainder Operator %
” [JLS 2013], states,
The remainder operation for operands that are integers after binary numeric promotion (§5.6.2) produces a result value such that (a/b)*b+(a%b)
is equal to a
. This identity holds even in the special case that the dividend is the negative integer of largest possible magnitude for its type and the divisor is -1
(the remainder is 0
). It follows from this rule that the result of the remainder operation can be negative only if the dividend is negative, and can be positive only if the dividend is positive; moreover, the magnitude of the result is always less than the magnitude of the divisor.
The result of the remainder operator has the same sign as the dividend (the first operand in the expression):
5 % 3 produces 2
5 % (-3) produces 2
(-5) % 3 produces -2
(-5) % (-3) produces -2
As a result, code that depends on the remainder operation to always return a positive result is erroneous.
This noncompliant code example uses the integer hashKey
as an index into the hash
array:
private int SIZE = 16;
public int[] hash = new int[SIZE];
public int lookup(int hashKey) {
return hash[hashKey % SIZE];
}
A negative hash key produces a negative result from the remainder operator, causing the lookup()
method to throw a java.lang.ArrayIndexOutOfBoundsException
.
This compliant solution calls the imod()
method which always returns a positive remainder:
// Method imod() gives nonnegative result
private int SIZE = 16;
public int[] hash = new int[SIZE];
private int imod(int i, int j) {
int temp = i % j;
// Unary minus will succeed without overflow
// because temp cannot be Integer.MIN_VALUE
return (temp < 0) ? -temp : temp;
}
public int lookup(int hashKey) {
return hash[imod(hashKey, SIZE)];
}
Incorrectly assuming a positive remainder from a remainder operation can result in erroneous code.
[JLS 2013]
§15.17.3, “Remainder Operator %
”
Java defines the equality operators ==
and !=
for testing reference equality, but uses the equals()
method defined in Object
and its subclasses for testing abstract object equality. Naïve programmers often confuse the intent of the ==
operation with that of the Object.equals()
method. This confusion is frequently evident in the context of processing String
objects.
As a general rule, use the Object.equals()
method to check whether two objects have equivalent contents, and use the equality operators ==
and !=
to test whether two references specifically refer to the same object. This latter test is referred to as referential equality. For classes that require overriding the default equals()
implementation, care must be taken to also override the hashCode()
method (see The CERT® Oracle® Secure Coding Standard for Java™ [Long 2012], “MET09-J. Classes that define an equals()
method must also define a hashCode()
method”).
Numeric boxed types (for example, Byte
, Character
, Short
, Integer
, Long
, Float
, and Double
) should also be compared using Object.equals()
, rather than the ==
operator. While reference equality may appear to work for Integer
values between the range −128 and 127, it may fail if either of the operands in the comparison are outside that range. Numeric relational operators other than equality (such as <
, <=
, >
, and >=
) can be safely used to compare boxed primitive types (see “EXP03-J. Do not use the equality operators when comparing values of boxed primitives” [Long 2012], for more information).
This noncompliant code example declares two distinct String
objects that contain the same value:
public class StringComparison {
public static void main(String[] args) {
String str1 = new String("one");
String str2 = new String("one");
System.out.println(str1 == str2); // Prints "false"
}
}
The reference equality operator ==
evaluates to true
only when the values it compares refer to the same underlying object. The references in this example are unequal because they refer to distinct objects.
This compliant solution uses the Object.equals()
method when comparing string values:
public class StringComparison {
public static void main(String[] args) {
String str1 = new String("one");
String str2 = new String("one");
System.out.println(str1.equals(str2)); // Prints "true"
}
}
Reference equality behaves like abstract object equality when it is used to compare two strings that are results of the String.intern()
method. This compliant solution uses String.intern()
and can perform fast string comparisons when only one copy of the string one
is required in memory.
public class StringComparison {
public static void main(String[] args) {
String str1 = new String("one");
String str2 = new String("one");
str1 = str1.intern();
str2 = str2.intern();
System.out.println(str1 == str2); // Prints "true"
}
}
Use of String.intern()
should be reserved for cases in which the tokenization of strings either yields an important performance enhancement or dramatically simplifies code. Examples include programs engaged in natural language processing and compiler-like tools that tokenize program input. For most other programs, performance and readability are often improved by the use of code that applies the Object.equals()
approach and that lacks any dependence on reference equality.
The JLS provides few guarantees about the implementation of String.intern()
. For example,
The cost of String.intern()
grows as the number of intern strings grows. Performance should be no worse than O(n log n), but the JLS lacks a specific performance guarantee.
In early Java Virtual Machine (JVM) implementations, interned strings became immortal: they were exempt from garbage collection. This can be problematic when large numbers of strings are interned. More recent implementations can garbage-collect the storage occupied by interned strings that are no longer referenced. However, the JLS lacks any specification of this behavior.
In JVM implementations prior to Java 1.7, interned strings are allocated in the permgen
storage region, which is typically much smaller than the rest of the heap. Consequently, interning large numbers of strings can lead to an out-of-memory condition. In many Java 1.7 implementations, interned strings are allocated on the heap, relieving this restriction. Once again, the details of allocation are unspecified by the JLS; consequently, implementations may vary.
String interning may also be used in programs that accept repetitively occurring strings. Its use boosts the performance of comparisons and minimizes memory consumption.
When canonicalization of objects is required, it may be wiser to use a custom canonicalizer built on top of ConcurrentHashMap
; see Joshua Bloch’s Effective Java™, Second Edition, Item 69 [Bloch 2008] for details.
Confusing reference equality and object equality can lead to unexpected results.
Using reference equality in place of object equality is permitted only when the defining classes guarantee the existence of at most one object instance for each possible object value. The use of static factory methods, rather than public constructors, facilitates instance control; this is a key enabling technique. Another technique is to use an enum
type.
Use reference equality to determine whether two references point to the same object.
[Bloch 2008]
Item 69, “Prefer Concurrency Utilities to wait and notify”
[FindBugs 2008]
ES, “Comparison of String Objects Using ==
or !=
”
[JLS 2013]
§3.10.5, “String Literals”
§5.6.2, “Binary Numeric Promotion”
[Long 2012]
EXP03-J. Do not use the equality operators when comparing values of boxed primitives
MET09-J. Classes that define an equals()
method must also define a hashCode()
method
The conditional AND and OR operators (&&
and ||,
respectively) exhibit short-circuit behavior. That is, the second operand is evaluated only when the result of the conditional operator cannot be deduced solely by evaluating the first operand. Consequently, when the result of the conditional operator can be deduced solely from the result of the first operand, the second operand will remain unevaluated; its side effects, if any, will never occur.
The bitwise AND and OR operators (&
and |
) lack short-circuit behavior. Similar to most Java operators, they evaluate both operands. They return the same Boolean results as &&
and ||
respectively, but can have different overall effects depending on the presence or absence of side effects in the second operand.
Consequently, either the &
or the &&
operator can be used when performing Boolean logic. However, there are times when the short-circuiting behavior is preferred and other times when the short-circuiting behavior causes subtle bugs.
This noncompliant code example, derived from Flanagan [Flanagan 2005], has two variables with unknown values. The code must validate its data, and then check whether array[i]
is a valid index.
int array[]; // May be null
int i; // May be an invalid index for array
if (array != null & i >= 0 &
i < array.length & array[i] >= 0) {
// Use array
} else {
// Handle error
}
This code can fail as a result of the same errors it is trying to prevent. When array
is NULL
or i
is not a valid index, the reference to array
and array[i]
will cause either a NullPointerException
or an ArrayIndexOutOfBoundsException
to be thrown. The exception occurs because the &
operator fails to prevent evaluation of its right operand, even when evaluation of its left operand proves that the right operand is inconsequential.
This compliant solution mitigates the problem by using &&
, which causes the evaluation of the conditional expression to terminate immediately if any of the conditions fail, thereby preventing a runtime exception:
int array[]; // May be null
int i; // May be an invalid index for array
if (array != null && i >= 0 &&
i < array.length && array[i] >= 0) {
// Handle array
} else {
// Handle error
}
This compliant solution uses multiple if
statements to achieve the proper effect.
int array[]; // May be null
int i; // May be a valid index for array
if (array != null) {
if (i >= 0 && i < array.length) {
if (array[i] >= 0) {
// Use array
} else {
// Handle error
}
} else {
// Handle error
}
} else {
// Handle error
}
Although correct, this solution is more verbose and could be more difficult to maintain. Nevertheless, this solution is preferable when the error-handling code for each potential failure condition is different.
This noncompliant code example demonstrates code that compares two arrays for ranges of members that match. Here i1
and i2
are valid array indices in array1
and array2
, respectively. Variables end1
and end2
are the indices of the ends of the matching ranges in the two arrays.
if (end1 >= 0 & i2 >= 0) {
int begin1 = i1;
int begin2 = i2;
while (++i1 < array1.length &&
++i2 < array2.length &&
array1[i1] == array2[i2]) {
// Arrays match so far
}
int end1 = i1;
int end2 = i2;
assert end1 - begin1 == end2 - begin2;
}
The problem with this code is that when the first condition in the while
loop fails, the second condition does not execute. That is, once i1
has reached array1.length
, the loop terminates after i1
is incremented. Consequently, the apparent range over array1
is larger than the apparent range over array2
, causing the final assertion to fail.
This compliant solution mitigates the problem by judiciously using &
, which guarantees that both i1
and i2
are incremented, regardless of the outcome of the first condition:
public void exampleFuntion() {
while (++i1 < array1.length & // Not &&
++i2 < array2.length &&
array1[i1] == array2[i2]){
// Do something
}
}
Failure to understand the behavior of the bitwise and conditional operators can cause unintended program behavior.
[Flanagan 2005]
§2.5.6., “Boolean Operators”
[JLS 2013]
§15.23, “Conditional-And Operator &&
”
§15.24, “Conditional-Or Operator ||
”
Many classes allow inclusion of escape sequences in character and string literals. Examples include java.util.regex.Pattern
as well as classes that support XML- and SQL-based actions by passing string arguments to methods. According to the JLS, §3.10.6, “Escape Sequences for Character and String Literals” [JLS 2013],
The character and string escape sequences allow for the representation of some nongraphic characters as well as the single quote, double quote, and backslash characters in character literals (§3.10.4) and string literals (§3.10.5).
Correct use of escape sequences in string literals requires understanding how the escape sequences are interpreted by the Java compiler, as well as how they are interpreted by any subsequent processor, such as an SQL engine. SQL statements may require escape sequences (for example, sequences containing
,
,
) in certain cases, such as when storing raw text in a database. When representing SQL statements in Java string literals, each escape sequence must be preceded by an extra backslash for correct interpretation.
As another example, consider the Pattern
class used in performing regular expression-related tasks. A string literal used for pattern matching is compiled into an instance of the Pattern
type. When the pattern to be matched contains a sequence of characters identical to one of the Java escape sequences—""
and "n"
, for example—the Java compiler treats that portion of the string as a Java escape sequence and transforms the sequence into an actual newline character. To insert a newline escape sequence, rather than a literal newline character, the programmer must precede the "
"
sequence with an additional backslash to prevent the Java compiler from replacing it with a newline character. The string constructed from the resulting sequence,
\n
consequently contains the correct two-character sequence
and correctly denotes the escape sequence for newline in the pattern.
In general, for a particular escape character of the form X
, the equivalent Java representation is
\X
This noncompliant code example defines a method, splitWords()
, that finds matches between the string literal (WORDS
) and the input sequence. It is expected that WORDS
would hold the escape sequence for matching a word boundary. However, the Java compiler treats the ""
literal as a Java escape sequence, and the string WORDS
silently compiles to a regular expression that checks for a single backspace character.
public class Splitter {
// Interpreted as backspace
// Fails to split on word boundaries
private final String WORDS = "";
public String[] splitWords(String input){
Pattern pattern = Pattern.compile(WORDS);
String[] input_array = pattern.split(input);
return input_array;
}
}
This compliant solution shows the correctly escaped value of the string literal WORDS
that results in a regular expression designed to split on word boundaries:
public class Splitter {
// Interpreted as two chars, '' and 'b'
// Correctly splits on word boundaries
private final String WORDS = "\b";
public String[] split(String input){
Pattern pattern = Pattern.compile(WORDS);
String[] input_array = pattern.split(input);
return input_array;
}
}
This noncompliant code example uses the same method, splitWords()
. This time the WORDS
string is loaded from an external properties file.
public class Splitter {
private final String WORDS;
public Splitter() throws IOException {
Properties properties = new Properties();
properties.load(new FileInputStream("splitter.properties"));
WORDS = properties.getProperty("WORDS");
}
public String[] split(String input){
Pattern pattern = Pattern.compile(WORDS);
String[] input_array = pattern.split(input);
return input_array;
}
}
In the properties file, the WORD
property is once again incorrectly specified as .
WORDS=
This is read by the Properties.load()
method as a single character b
, which causes the split()
method to split strings along the letter b
. Although the string is interpreted differently than if it were a string literal, as in the previous noncompliant code example, the interpretation is incorrect.
This compliant solution shows the correctly escaped value of the WORDS
property:
WORDS=\b
Incorrect use of escape characters in string inputs can result in misinterpretation and potential corruption of data.
[API 2013]
Class Pattern, “Backslashes, Escapes, and Quoting”
Package java.sql
[JLS 2013]
§3.10.6, “Escape Sequences for Character and String Literals”
Java supports overloading methods and can distinguish between methods with different signatures. Consequently, with some qualifications, methods within a class can have the same name if they have different parameter lists. In method overloading, the method to be invoked at runtime is determined at compile time. Consequently, the overloaded method associated with the static type of the object is invoked even when the runtime type differs for each invocation.
For program understandability, do not introduce ambiguity while overloading (see Guideline 65, “Avoid ambiguous or confusing uses of overloading”), and use overloaded methods sparingly [Tutorials 2013].
This noncompliant code example attempts to use the overloaded display()
method to perform different actions depending on whether the method is passed an ArrayList<Integer>
or a LinkedList<String>
:
public class Overloader {
private static String display(ArrayList<Integer> arrayList) {
return "ArrayList";
}
private static String display(LinkedList<String> linkedList) {
return "LinkedList";
}
private static String display(List<?> list) {
return "List is not recognized";
}
public static void main(String[] args) {
// Single ArrayList
System.out.println(display(new ArrayList<Integer>()));
// Array of lists
List<?>[] invokeAll = new List<?>[] {
new ArrayList<Integer>(),
new LinkedList<String>(),
new Vector<Integer>()};
for (List<?> list : invokeAll) {
System.out.println(display(list));
}
}
}
At compile time, the type of the object array is List
. The expected output is ArrayList
, ArrayList
, LinkedList
, and List is not recognized
(because java.util.Vector
is neither an ArrayList
nor a LinkedList
). The actual output is ArrayList
followed by List is not recognized
repeated three times. The cause of this unexpected behavior is that overloaded method invocations are affected only by the compile-time type of their arguments: ArrayList
for the first invocation and List
for the others.
This compliant solution uses a single display
method and instanceof
to distinguish between different types. As expected, the output is ArrayList
, ArrayList
, LinkedList
, List is not recognized
:
public class Overloader {
private static String display(List<?> list) {
return (
list instanceof ArrayList ? "Arraylist" :
(list instanceof LinkedList ? "LinkedList" :
"List is not recognized")
);
}
public static void main(String[] args) {
// Single ArrayList
System.out.println(display(new ArrayList<Integer>()));
List<?>[] invokeAll = new List<?>[] {
new ArrayList<Integer>(),
new LinkedList<String>(),
new Vector<Integer>()};
for (List<?> list : invokeAll) {
System.out.println(display(list));
}
}
}
Ambiguous uses of overloading can lead to unexpected results.
[API 2013]
Interface Collection<E>
[Bloch 2008]
Item 41, “Use Overloading Judiciously”
[Tutorials 2013]
Defining Methods
Immutability helps to support security reasoning. It is safe to share immutable objects without risking modification by the recipient [Mettler 2010].
Programmers often incorrectly assume that declaring a field or variable final
makes the referenced object immutable. Declaring variables that have a primitive type to be final
does prevent changes to their values after initialization (by normal Java processing). However, when the variable has a reference type, the presence of a final
clause in the declaration only makes the reference itself immutable. The final
clause has no effect on the referenced object. Consequently, the fields of the referenced object may be mutable. For example, according to the JLS, §4.12.4, “final
Variables” [JLS 2013],
If a final
variable holds a reference to an object, then the state of the object may be changed by operations on the object, but the variable will always refer to the same object.
This also applies to arrays, because arrays are objects; if a final
variable holds a reference to an array, then the components of the array may still be changed by operations on the array, but the variable will always refer to the same array.
Similarly, a final
method parameter obtains an immutable copy of the object reference. Again, this has no effect on the mutability of the referenced data.
In this noncompliant code example, the programmer has declared the reference to the point
instance to be final
under the incorrect assumption that doing so prevents modification of the values of the instance variables x
and y
. The values of the instance variables can be changed after their initialization because the final
clause applies only to the reference to the point
instance and not to the referenced object.
class Point {
private int x;
private int y;
Point(int x, int y) {
this.x = x;
this.y = y;
}
void set_xy(int x, int y) {
this.x = x;
this.y = y;
}
void print_xy() {
System.out.println("the value x is: " + this.x);
System.out.println("the value y is: " + this.y);
}
}
public class PointCaller {
public static void main(String[] args) {
final Point point = new Point(1, 2);
point.print_xy();
// Change the value of x, y
point.set_xy(5, 6);
point.print_xy();
}
}
When the values of the x
and y
instance variables must remain immutable after their initialization, they should be declared final
. However, this invalidates the set_xy()
method because it can no longer change the values of x
and y
:
class Point {
private final int x;
private final int y;
Point(int x, int y) {
this.x = x;
this.y = y;
}
void print_xy() {
System.out.println("the value x is: " + this.x);
System.out.println("the value y is: " + this.y);
}
// set_xy(int x, int y) no longer possible
With this modification, the values of the instance variables become immutable and consequently match the programmer’s intended usage model.
If the class must remain mutable, another compliant solution is to provide copy functionality. This compliant solution provides a clone()
method in the class Point
, avoiding the elimination of the setter method:
final public class Point implements Cloneable {
private int x;
private int y;
Point(int x, int y) {
this.x = x;
this.y = y;
}
void set_xy(int x, int y) {
this.x = x;
this.y = y;
}
void print_xy() {
System.out.println("the value x is: "+ this.x);
System.out.println("the value y is: "+ this.y);
}
public Point clone() throws CloneNotSupportedException {
Point cloned = (Point) super.clone();
// No need to clone x and y as they are primitives
return cloned;
}
}
public class PointCaller {
public static void main(String[] args)
throws CloneNotSupportedException {
Point point = new Point(1, 2); // Is not changed in main()
point.print_xy();
// Get the copy of original object
Point pointCopy = point.clone();
// pointCopy now holds a unique reference to the
// newly cloned Point instance
// Change the value of x,y of the copy
pointCopy.set_xy(5, 6);
// Original value remains unchanged
point.print_xy();
}
}
The clone()
method returns a copy of the original object that reflects the state of the original object at the moment of cloning. This new object can be used without exposing the original object. Because the caller holds the only reference to the newly cloned instance, the instance variables cannot be changed without the caller’s cooperation. This use of the clone()
method allows the class to remain securely mutable. (See The CERT® Oracle® Secure Coding Standard for Java™ [Long 2012], “OBJ04-J. Provide mutable classes with copy functionality to safely allow passing instances to untrusted code.”)
The Point
class is declared final
to prevent subclasses from overriding the clone()
method. This enables the class to be suitably used without any inadvertent modifications of the original object.
This noncompliant code example uses a public static final
array, items
:
public static final String[] items = {/* . . . */};
Clients can trivially modify the contents of the array, even though declaring the array reference to be final
prevents modification of the reference itself.
This compliant solution makes the array private
and provides public methods to get individual items and array size:
private static final String[] items = {/* . . . */};
public static final String getItem(int index) {
return items[index];
}
public static final int getItemCount() {
return items.length;
}
Providing direct access to the array objects themselves is safe because String
is immutable.
This compliant solution defines a private
array and a public
method that returns a copy of the array:
private static final String[] items = {/* . . . */};
public static final String[] getItems() {
return items.clone();
}
Because a copy of the array is returned, the original array values cannot be modified by a client. Note that a manual deep copy could be required when dealing with arrays of objects. This generally happens when the objects do not export a clone()
method. Refer to “OBJ06-J. Defensively copy mutable inputs and mutable internal components” [Long 2012], for more information.
As before, this method provides direct access to the array objects themselves, but this is safe because String
is immutable. If the array contained mutable objects, the getItems()
method could return an array of cloned objects instead.
This compliant solution declares a private
array from which a public
immutable list is constructed:
private static final String[] items = {/* . . . */};
public static final List<String> itemsList =
Collections.unmodifiableList(Arrays.asList(items));
Neither the original array values nor the public
list can be modified by a client. For more details about unmodifiable wrappers, refer to Guideline 3, “Provide sensitive mutable classes with unmodifiable wrappers.” This solution can also be used when the array contains mutable objects.
Incorrectly assuming that final
references cause the contents of the referenced object to remain mutable can result in an attacker modifying an object believed to be immutable.
[Core Java 2003]
Chapter 6, “Interfaces and Inner Classes”
[JLS 2013]
§4.12.4, “final
Variables”
§6.6, “Access Control”
[Long 2012]
OBJ04-J. Provide mutable classes with copy functionality to safely allow passing instances to untrusted code
OBJ06-J. Defensively copy mutable inputs and mutable internal components
[Mettler 2010]
“Class Properties for Security Review in an Object-Capability Subset of Java”
When objects are serialized using the writeObject()
method, each object is written to the output stream only once. Invoking the writeObject()
method on the same object a second time places a back-reference to the previously serialized instance in the stream. Correspondingly, the readObject()
method produces at most one instance for each object present in the input stream that was previously written by writeObject()
.
According to the Java API [API 2013], the writeUnshared()
method
writes an “unshared” object to the ObjectOutputStream
. This method is identical to writeObject
, except that it always writes the given object as a new, unique object in the stream (as opposed to a back-reference pointing to a previously serialized instance).
Similarly, the readUnshared()
method
reads an “unshared” object from the ObjectInputStream
. This method is identical to readObject
, except that it prevents subsequent calls to readObject
and read-Unshared
from returning additional references to the deserialized instance obtained via this call.
Consequently, the writeUnshared()
and readUnshared()
methods are unsuitable for round-trip serialization of data structures that contain reference cycles.
Consider the following code example:
public class Person {
private String name;
Person() {
// Do nothing — needed for serialization
}
Person(String theName) {
name = theName;
}
// Other details not relevant to this example
}
public class Student extends Person implements Serializable {
private Professor tutor;
Student() {
// Do nothing — needed for serialization
}
Student(String theName, Professor theTutor) {
super(theName);
tutor = theTutor;
}
public Professor getTutor() {
return tutor;
}
}
public class Professor extends Person implements Serializable {
private List<Student> tutees = new ArrayList<Student>();
Professor() {
// Do nothing — needed for serialization
}
Professor(String theName) {
super(theName);
}
public List<Student> getTutees () {
return tutees;
}
/**
* checkTutees checks that all the tutees
* have this Professor as their tutor
*/
public boolean checkTutees () {
boolean result = true;
for (Student stu: tutees) {
if (stu.getTutor() != this) {
result = false;
break;
}
}
return result;
}
}
// ...
Professor jane = new Professor("Jane");
Student able = new Student("Able", jane);
Student baker = new Student("Baker", jane);
Student charlie = new Student("Charlie", jane);
jane.getTutees().add(able);
jane.getTutees().add(baker);
jane.getTutees().add(charlie);
System.out.println("checkTutees returns: " + jane.checkTutees());
// Prints "checkTutees returns: true"
Professor
and Student
are types that extend the basic type Person
. A student (that is, an object of type Student
) has a tutor of type Professor
. A professor (that is, an object of type Professor
) has a list (actually, an ArrayList
) of tutees (of type Student
). The method checkTutees()
checks whether all of the tutees of this professor have this professor as their tutor, returning true
if that is the case and false
otherwise.
Suppose that Professor Jane has three students, Able, Baker, and Charlie, all of whom have Professor Jane as their tutor. Issues can arise if the writeUnshared()
and readUnshared()
methods are used with these classes, as demonstrated in the following noncompliant code example.
This noncompliant code example attempts to serialize data using writeUnshared()
.
String filename = "serial";
try (ObjectOutputStream oos = new ObjectOutputStream(new
FileOutputStream(filename))) {
// Serializing using writeUnshared
oos.writeUnshared(jane);
} catch (Throwable e) {
// Handle error
}
// Deserializing using readUnshared
try (ObjectInputStream ois = new ObjectInputStream(new
FileInputStream(filename))){
Professor jane2 = (Professor)ois.readUnshared();
System.out.println("checkTutees returns: " +
jane2.checkTutees());
} catch (Throwable e) {
// Handle error
}
However, when the data is deserialized using readUnshared()
, the checkTutees()
method no longer returns true
because the tutor objects of the three students are different from the original Professor
object.
This compliant solution uses the writeObject()
and readObject()
methods to ensure that the tutor object referred to by the three students has a one-to-one mapping with the original Professor
object. The checkTutees()
method correctly returns true
.
String filename = "serial";
try (ObjectOutputStream oos = new ObjectOutputStream(new
FileOutputStream(filename))) {
// Serializing using writeUnshared
oos.writeObject(jane);
} catch (Throwable e) {
// Handle error
}
// Deserializing using readUnshared
try (ObjectInputStream ois = new ObjectInputStream(new
FileInputStream(filename))) {
Professor jane2 = (Professor)ois.readObject();
System.out.println("checkTutees returns: " +
jane2.checkTutees());
} catch (Throwable e) {
// Handle error
}
Using the writeUnshared()
and readUnshared()
methods may produce unexpected results when used for the round-trip serialization of data structures containing reference cycles.
[API 2013]
Class ObjectOutputStream
Class ObjectInputStream
Setting local reference variables to null
to “help the garbage collector” is unnecessary. It adds clutter to the code and can make maintenance difficult. Java just-in-time compilers (JITs) can perform an equivalent liveness analysis and most implementations do so.
A related bad practice is the use of a finalizer to null
out references. See The CERT® Oracle® Secure Coding Standard for Java™ [Long 2012], “MET12-J. Do not use finalizers,” for additional details.
This guideline applies specifically to local variables. For a case where explicitly erasing objects is useful, see Guideline 49, “Remove short-lived objects from long-lived container objects.”
In this noncompliant code example, buffer
is a local variable that holds a reference to a temporary array. The programmer attempts to help the garbage collector by assigning null
to the buffer
array when it is no longer needed.
{ // Local scope
int[] buffer = new int[100];
doSomething(buffer);
buffer = null;
}
Program logic occasionally requires tight control over the lifetime of an object referenced from a local variable. In the unusual cases where such control is necessary, use a lexical block to limit the scope of the variable, because the garbage collector can collect the object immediately when it goes out of scope [Bloch 2008].
This compliant solution uses a lexical block to control the lifetime of the buffer
object:
{ // Limit the scope of buffer
int[] buffer = new int[100];
doSomething(buffer);
}
It is unnecessary to set local reference variables to null
when they are no longer needed in a mistaken attempt to help the garbage collector reclaim the associated memory.
[Bloch 2008]
Item 6, “Eliminate Obsolete Object References”
[Long 2012]
MET12-J. Do not use finalizers
3.144.227.9