Java API

This section covers the built-in synchronization mechanisms in Java. These are convenient to have as intrinsic mechanisms in the language. There are, however, potential dangers of misusing or overusing Java synchronization mechanisms.

The synchronized keyword

In Java, the keyword synchronized is used to define a critical section. Both code blocks inside a method and entire methods can be synchronized. The following code example illustrates a synchronized method:

public synchronized void setGadget(Gadget g) {
    this.gadget = g;
}

As the method is synchronized, only one thread at a time can write to the gadget field in a given object.

In a synchronized method, the monitor object is implicit. Static synchronized methods use the class object of the method's class as monitor object, while synchronized instance methods use this. So, the previous code would be equivalent to:

public void setGadget(Gadget g) {
    synchronized(this) {
        this.gadget = g;
    }
}
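
For a static synchronized method, the implicit monitor is instead the Class object of the method's class. The following sketch, using the hypothetical class name GadgetRegistry, shows the equivalent explicit form:

public class GadgetRegistry {
    private static Gadget defaultGadget;

    //equivalent to declaring the method static synchronized
    public static void setDefaultGadget(Gadget g) {
        synchronized (GadgetRegistry.class) { //the Class object is the monitor
            defaultGadget = g;
        }
    }
}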

The java.lang.Thread class

The built-in thread abstraction in Java is represented by the class java.lang.Thread. This class is a somewhat more generic thread representation than that of corresponding OS implementations. It contains, among other things, fundamental methods for starting threads and for inserting the thread payload code. This is symmetrical with typical OS thread implementations where payload is passed as a function pointer to the main thread function by the creator of the thread. Java uses an object-oriented approach instead, but the semantics are the same. Any class implementing the java.lang.Runnable interface can become a thread. The run method inherited from the interface must be implemented and filled with payload code. java.lang.Thread can also be subclassed directly.

There is also a simple priority mechanism in the java.lang.Thread class that may or may not be efficiently mapped to the underlying OS variant. The setPriority method can be used to change the priority level of a thread, hinting to the JVM that it's more important (real-time) or less important. Normally, for most JVMs, little is gained by setting thread priorities explicitly from Java. The JRockit JVM may even ignore Java thread priorities when the runtime "knows better".

Threads can be made to yield the rest of their scheduled time slice to other threads, go to sleep or join (that is, wait for this thread to die).
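
The following sketch pulls these pieces together. The class name Worker and the priority choice are made up for the example; it implements Runnable, starts a thread with a priority hint, sleeps in the payload code, and finally joins the thread:

public class Worker implements Runnable {
    public void run() { //the thread payload code
        for (int i = 0; i < 5; i++) {
            System.out.println("working: " + i);
            try {
                Thread.sleep(100); //go to sleep for a while
            } catch (InterruptedException e) {
                return; //exit if interrupted
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(new Worker(), "worker-1");
        t.setPriority(Thread.MIN_PRIORITY); //a hint only; the JVM may ignore it
        t.start();                          //begin executing run() in a new thread
        t.join();                           //wait for the worker thread to die
    }
}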

Threads can be arranged in java.lang.ThreadGroups, a *NIX process-like abstraction that can also contain other thread groups. Thread operations may be applied to all threads in a thread group.
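
A minimal sketch of grouping threads, reusing the hypothetical Worker class from the previous example:

public static void main(String[] args) {
    ThreadGroup workers = new ThreadGroup("workers");
    new Thread(workers, new Worker(), "worker-1").start();
    new Thread(workers, new Worker(), "worker-2").start();
    System.out.println(workers.activeCount()); //approximate count of live threads in the group
    workers.interrupt();                       //interrupt every thread in the group
}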

A thread may hold thread local object data, represented by the java.lang.ThreadLocal class. Each thread that accesses a ThreadLocal gets its own, independently initialized copy of its value. This is a very useful mechanism that has been around since Java 1.2. Even though it is a somewhat clumsy retrofit for a language without the concept of stack local object allocation, it can be a performance life saver. Given that the programmer knows what he is doing, explicitly declaring data thread local in Java may lead to significant speedups.
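
As an illustration, a common use is to give each thread its own instance of a class that is not thread safe, such as java.text.SimpleDateFormat, instead of synchronizing on a shared instance. A minimal sketch, with the hypothetical class name DateFormatter:

import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;

public class DateFormatter {
    //each thread gets its own DateFormat instance on first use
    private static final ThreadLocal<DateFormat> FORMAT = new ThreadLocal<DateFormat>() {
        protected DateFormat initialValue() {
            return new SimpleDateFormat("yyyy-MM-dd");
        }
    };

    public static String format(Date date) {
        return FORMAT.get().format(date); //no synchronization needed
    }
}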

The java.lang.Thread class has suffered some changes and deprecations to its API during its lifetime. Originally, it came with methods for stopping, suspending, and resuming threads. These turned out to be inherently unsafe. They still occur from time to time in Java programs, and we will discuss why they are dangerous in the section Pitfalls and false optimizations, later in this chapter.

The java.util.concurrent package

The java.util.concurrent package, introduced in JDK 1.5, contains several classes that implement data structures useful for concurrent programming. One example is the BlockingQueue, which blocks on insertion until space becomes available in the queue, and blocks on retrieval until an element has been inserted. This is the classic synchronized producer/consumer pattern.
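
For example, a bounded java.util.concurrent.ArrayBlockingQueue can replace hand-rolled wait/notify code: put blocks while the queue is full and take blocks while it is empty. The following producer/consumer sketch (the class name QueueDemo is made up) illustrates this:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueDemo {
    public static void main(String[] args) {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(10);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    queue.put("hello"); //blocks if the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    System.out.println(queue.take()); //blocks until an element is available
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
    }
}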

The java.util.concurrent package helps the programmer spend less effort on re-implementing the most fundamental building blocks of synchronization mechanisms. Effort has also been made to ensure that the concurrent classes are optimized for scalability and performance.

Possibly even more useful is the child package java.util.concurrent.atomic, which contains lightweight thread safe mechanisms for modifying fields. For example, it provides representations of integers (java.util.concurrent.atomic.AtomicInteger) and longs (java.util.concurrent.atomic.AtomicLong) that can be atomically incremented and decremented, and that support native-style atomic compare operations. Using the atomic package, when applicable, can be a good way of avoiding explicit heavyweight synchronization in the Java program.
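
For instance, a shared counter can be maintained without any synchronized blocks. A minimal sketch, with the hypothetical class name HitCounter:

import java.util.concurrent.atomic.AtomicInteger;

public class HitCounter {
    private final AtomicInteger hits = new AtomicInteger(0);

    public int registerHit() {
        return hits.incrementAndGet(); //atomic, no explicit lock needed
    }

    public boolean resetIfEquals(int expected) {
        return hits.compareAndSet(expected, 0); //native-style atomic compare and set
    }
}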

Finally, the concurrent package includes the subpackage java.util.concurrent.locks, which contains implementations of data structures with common locking semantics. This includes reader/writer locks, another useful pattern that the programmer no longer has to implement from scratch.

Note

A reader/writer lock is a lock that allows multiple threads to read the data it protects concurrently, but enforces exclusive access for writes to the data.
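
For example, java.util.concurrent.locks.ReentrantReadWriteLock hands out separate read and write locks. The following cache sketch (the class name Cache is made up) lets many readers proceed concurrently while writers get exclusive access:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class Cache {
    private final Map<String, String> map = new HashMap<String, String>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    public String get(String key) {
        lock.readLock().lock(); //many readers may hold the read lock at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    public void put(String key, String value) {
        lock.writeLock().lock(); //writers get exclusive access
        try {
            map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}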

Semaphores

A semaphore is a synchronization mechanism that comes in handy when one thread tries to acquire a resource and fails because the resource is already held by another thread. On failure, the thread that wanted the resource may want to go to sleep until it is explicitly woken up once the resource has been released. This is what semaphores are for. Semaphores are a common locking mechanism, with abstractions and library calls present in every operating system, modern as well as antique. They are also supported by an intrinsic feature of the Java language.

In Java, each object contains methods named wait, notify, and notifyAll that may be used to implement semaphores. They are all inherited from the java.lang.Object class. The methods are meant to be used in the context of a monitor object, for example in a synchronized block. If there is no monitor available in the context they are called from, an IllegalMonitorStateException will be thrown at runtime.

Calling wait suspends the executing thread: it goes to sleep and blocks until a notification is received. When notify is called, one of the threads waiting on the monitor is arbitrarily selected and woken up by the thread scheduler in the JVM. When notifyAll is called, all threads waiting on the monitor are woken up. Only one of them will succeed in acquiring the lock and the rest will go to sleep again. The notifyAll method is safer than notify, as every waiting thread gets a chance to acquire the lock, and deadlock situations are easier to avoid. The downside to notifyAll is that it carries a greater overhead than notify. So, if you know what you are doing, notifyAll should probably be avoided.

The wait method also comes with an optional timeout argument, which, when exceeded, always results in the suspended thread being woken up again.

To exemplify how semaphores work in Java, we can study the following code. The code is a component that can be used in a classic producer/consumer example, a message port, with the instance this used as an implicit monitor object in its synchronized methods.

public class Mailbox {
    private String message;
    private boolean messagePending;

    /**
     * Places a message in the mailbox
     */
    public synchronized void putMessage(String message) {
        while (messagePending) { //wait for consumers to consume
            try {
                wait(); //blocks until notified
            } catch (InterruptedException e) {
            }
        }
        this.message = message; //store message in mailbox
        messagePending = true;  //raise flag on mailbox
        notifyAll();            //wake up all waiting threads
    }

    /**
     * Retrieves a message from the mailbox
     */
    public synchronized String getMessage() {
        while (!messagePending) { //wait for producer to produce
            try {
                wait(); //blocks until notified
            } catch (InterruptedException e) {
            }
        }
        messagePending = false; //lower flag on mailbox
        notifyAll();            //wake up all waiting threads
        return message;
    }
}

Multiple producer and consumer threads can easily use a Mailbox object for synchronized message passing between them. Any consumer wanting to retrieve a message from an empty Mailbox by calling getMessage will block until a producer has used putMessage to place a message in the Mailbox. Symmetrically, if the Mailbox is already full, any producer will block in putMessage until a consumer has emptied the Mailbox.

Note

We have deliberately simplified things here. Semaphores can be either binary or counting. Binary semaphores are similar to the Mailbox example described above: there is an explicit "true or false" control over a single resource. Counting semaphores can instead limit access to a given number of accessors. This is exemplified by the class java.util.concurrent.Semaphore, which is another excellent tool that can be used for synchronization.
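
A counting semaphore sketch, limiting concurrent access to a fixed number of permits (the class name ConnectionLimiter and the permit count are made up):

import java.util.concurrent.Semaphore;

public class ConnectionLimiter {
    private final Semaphore permits = new Semaphore(3); //at most three concurrent users

    public void useConnection() throws InterruptedException {
        permits.acquire(); //blocks if all permits are taken
        try {
            //use the shared resource here
        } finally {
            permits.release(); //return the permit, waking up one waiting thread, if any
        }
    }
}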

The volatile keyword

In a multi-threaded environment, it is not guaranteed that a write to a field or a memory location will be seen simultaneously by all executing threads. We will get into some more details of this in the section on The Java Memory Model, later in this chapter. However, if program execution relies on all threads seeing the same value of a field at any given time, Java provides the volatile keyword.

Declaring a field volatile will ensure that any writes to the field go directly to memory. The data cannot linger in caches and cannot be written back later, which is what would otherwise cause different threads to simultaneously see different values of the same field. The underlying virtual machine typically implements this by having the JIT insert memory barrier code after stores to the field, which naturally is bad for program performance.

While people usually find it hard to accept that different threads can end up with different values for a field load, they tend not to suffer from the phenomenon in practice. Usually, the memory model of the underlying machine is strong enough, or the structure of the program itself isn't too prone to causing problems with non-volatile fields. However, bringing an optimizing JIT compiler into the picture can wreak additional havoc on the unsuspecting programmer. Hopefully, the following example explains why it is important to think about memory semantics in all kinds of Java programs, even (especially) in those where problems do not readily manifest themselves:

public class MyThread extends Thread {
    private volatile boolean finished;

    public void run() {
        while (!finished) {
            //
        }
    }

    public void signalDone() {
        this.finished = true;
    }
}

If finished isn't declared volatile here, the JIT compiler may theoretically choose, as an optimization, to load its value from memory only once, before the while loop is run, thus breaking the thread ending criterion. In that case, as finished starts out as false, the while loop condition will be forever true and the thread will never exit, even though signalDone is called later on. The Java Language Specification basically allows the compiler to create its own thread local copies of non-volatile fields if it sees fit to do so.

For further insight about volatile fields, consider the following code:

public class Test {
    volatile int a = 1;
    volatile int b = 1;

    void add() {
        a++;
        b++;
    }

    void print() {
        System.out.println(a + " " + b);
    }
}

Here, the volatile keyword implicitly guarantees that b never appears greater than a to any thread, even if the add and print methods are frequently called in a multithreaded environment. An even tougher restriction would be to declare the add method synchronized, in which case a and b would always have the same value when print is called (as they both start at 1). If neither field is declared volatile and the methods are not synchronized, it is important to remember that Java guarantees no relationship between a and b!

Note

volatile fields should be used with caution, as their implementation in the JIT usually involves expensive barrier instructions that may ruin CPU caches and slow down program execution.

Naturally, synchronized mechanisms incur more runtime overhead than unsynchronized ones. Instead of reflexively using volatile and synchronized declarations, with their potential slowdowns, the programmer should sometimes consider other ways of propagating information between threads, provided this doesn't change the semantics of the memory model.
