Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Switches

The Java bytecode specification allows a switch statement to be compiled into one of two different bytecodes. One compiled switch type works as follows:

Given a particular value passed to the switch block to be compared, the passed value is successively compared against the value associated with each case statement in order. If, after testing all cases, no statements match, then the default label is matched. When a case statement that matches is found, the body of that statement and all subsequent case bodies are executed (until one body exits the switch statement, or the last one is reached).

The operation of this switch statement is equivalent to holding an ordered collection of values that are compared to the passed value, one after the other in order, until a match is determined. This means that the time taken for the switch to find the case that matches depends on how many case statements there are and where in the list the matched case is. If no cases match, and the default must be used, that always takes the longest matching time.

The other switch bytecode works for switch statements where the case values all lie in a particular range (or can be made to lie in a particular range). It works as follows:

Given a particular value passed to the switch block to be compared, the passed value is tested to see if it lies in the range. If it does not, the default label is matched; otherwise, the offset of the case is calculated and the corresponding case is matched directly. The body of that matched label and all subsequent case bodies are executed (until one body exits the switch statement, or the last one is reached).

For this latter switch bytecode, the time taken for the switch statement to match the case is constant. The time is not dependent on the number of cases in the switch, and if no cases match, the time to carry out the matching and go to the default is still the same. This switch statement operates as an ordered collection with switch value first being checked to see if it is a valid index into the ordered collection, and then that value is used as the index to arrive immediately at the matched location.

Clearly, the second type of switch statement is faster than the first. Sometimes compliers can add dummy cases to a switch statement, converting the first type of switch into the second (faster) kind. (A compiler is not obliged to use the second type of switch bytecode at all, but generally it does if it can easily be used.) You can determine which switch a particular statement has been compiled into using javap , the disassembler available with the JDK. Using the -c option so that the code is disassembled, examine the method that contains the switch statement. It contains either a “tableswitch” bytecode identifier, or a “lookupswitch” bytecode identifier. The tableswitch keyword is the identifier for the faster (second) type of switch.

If you identify a bottleneck that involves a switch statement, do not leave the decision to the compiler. You are better off constructing switch statements that use contiguous ranges of case values, ideally by inserting dummy case statements to specify all the values in the range, or possibly by breaking up the switch into multiple switches that each use contiguous ranges. You may need to apply both of these optimizations as in the next example.

Our tuning.loop.SwitchTest class provides a repeated test on three methods with switch statements, and one other array-access method for comparison. The first method, switch1( ), contains some noncontiguous values for the cases, with each returning a particular integer value. The second method, switch2( ), converts the single switch statement in switch1( ) into four switch statements, with some of those four switch statements containing extra dummy cases to make each switch statement contain a contiguous set of cases. This second method, switch2( ), is functionally identical to switch1( ).

The third method, switch3( ), replaces the cases with a contiguous set of cases, integers 1 to 13. This method is not directly comparable to the first two methods; it is present as a control test. The fourth method, switch4( ), is functionally identical to switch3( ), but uses an array access instead of the switch statement, essentially doing in Java code what the compiler implicitly does in bytecodes for switch3( ). I run two sets of tests. The first set passes in a different integer for each call to the switches. This means that most of the time, the default label is matched. The second set of tests always passes in the integer 8 to the switches. The results are shown in Table 7-2 for various VMs. “Varying” and “constant” refer to the value passed to the switch statement. Tests labeled varying passed different integers for each iteration of the test loop; tests labeled constant passed the integer 8 for each iteration of the loop.

Table 7-2. Speedup Using Exception-Driven Loop Termination

		1.2	1.3	HotSpot 1.0	HotSpot 2nd Run^[a]	1.1.6
1	`switch1` varying	100%	55%	208%	29%	109%
2	`switch2` varying	12%	53%	218%	27%	12%
3	`switch3` varying	23%	79%	212%	36%	23%
4	`switch4` varying	9%	36%	231%	15%	9%
5	`switch1` constant	41%	33%	195%	30%	45%
6	`switch2` constant	17%	42%	207%	30%	15%
7	`switch3` constant	20%	48%	186%	24%	20%
8	`switch4` constant	6%	42%	200%	12%	6%
^[a]HotSpot is tuned for long-lived server applications, and so applies its optimizations after the first run of the test indicates where the bottlenecks are.

There is a big difference in optimizations gained depending on whether the VM has a JIT or uses HotSpot technology. The times are all relative to the JDK 1.2 "switch1 varying” case. From the variation in timings, it is not clear whether the HotSpot technology fails to compile the handcrafted switch in an optimal way, or whether it does optimally compile all the switch statements but adds overheads that cancel some of the optimizations.

For the JIT results, the first and second lines of output show the speedup you can get by recrafting the switch statements. Here, both switch1( ) and switch2( ) are using the default for most of the tests. In this situation, switch1( ) requires 13 failed comparisons before executing the default statement. switch2(), on the other hand, checks the value against the range of each of its four switch statements, then immediately executes the default statement.

The first and third lines of output show the worst-case comparison between the two types of switch statements. In this test, switch1( ) almost always fails all its comparison tests. On the other hand, switch3( ), with the contiguous range, is much faster than switch1() ( JIT cases only). This is exactly what is expected, as the average case for switch1( ) here consists of 13 failed comparisons followed by a return statement. The average case for switch3( ) in this test is only a pair of checks followed by a return statement. The two checks are that the integer is smaller than or equal to 13, and larger than or equal to 1. Both checks fail in most of the calls for this “varying” case.

Even when the case statement in switch1( ) is always matched, the fifth and sixth lines show that switch2() can be faster (though again, not with HotSpot). In this test, the matched statement is about halfway down the list of cases in switch1( ), so the seven or so failed comparisons for switch1( ) compared to two range checks for switch2( ) translate into switch2( ) being more than twice as fast as switch1( ).

In each set of tests, switch2( ), which is functionally identical to switch1( ), is faster. The output for switch4( ) is included for comparison, and it turns out to be faster than the functionally identical switch3( ), thus indicating that it is worth considering dispensing with the switch tests completely when you can convert the switch to an array access. In this example, the switch merely returns an integer, so the conversion to an array access is feasible; in general, it may be difficult to convert a set of body statements into an array access and subsequent processing:

package tuning.loop;

public class SwitchTest
{
  //Use a default size for the loop of 1 million iterations
  static int SIZE = 10000000;

  public static void main(String args[])
  {
    //Allow an argument to set the size of the loop.
    if (args.length != 0)
      SIZE = Integer.parseInt(args[0]);
    int result = 0;
    //run tests looking mostly for the default (switch
    //test uses many different values passed to it)
    long time = System.currentTimeMillis( );
    for (int i = SIZE; i >=0 ; i--)
      result += switch1(i);
    System.out.println("Switch1 took " + 
      (System.currentTimeMillis( )-time) + " millis to get " + result);

    //and the same code to test timings on switch2( ),
    //switch3( ) and switch4( )
    ...

    //run tests using one particular passed value (8)
    result = 0;
    time = System.currentTimeMillis( );
    for (int i = SIZE; i >=0 ; i--)
      result += switch1(8);
    System.out.println("Switch1 took " + 
      (System.currentTimeMillis( )-time) + " millis to get " + result);

    //and the same code to test timings on switch2( ),
    //switch3( ) and switch4( )
    ...
  }

  public static int switch1(int i)
  {
    //This is one big switch statement with 13 case statements
    //in no particular order.
    switch(i)
    {
      case 318: return 99;
      case 320: return 55;
      case 323: return -1;
      case 14: return 6;
      case 5: return 8;
      case 123456: return 12;
      case 7: return 15;
      case 8: return 29;
      case 9: return 11111;
      case 123457: return 12345;
      case 112233: return 6666;
      case 112235: return 9876;
      case 112237: return 12;
      default: return -1;
    }
  }
  public static int switch2(int i)
  {
    //In this method we break up the 13 case statements from
    //switch1( ) into four almost contiguous ranges. Then we
    //add in a few dummy cases so that the four ranges are
    //definitely contiguous. This should ensure that the compiler
    //will generate the more optimal tableswitch bytcodes
    switch(i)
    {
      case 318: return 99;
      case 319: break;      //dummy
      case 320: return 55;
      case 321: break;      //dummy
      case 322: break;      //dummy
      case 323: return -1;
    }
    switch(i)
    {
      case 5: return 8;
      case 6: break;        //dummy
      case 7: return 15;
      case 8: return 29;
      case 9: return 11111;
      case 10: break;       //dummy
      case 11: break;       //dummy
      case 12: break;       //dummy
      case 13: break;       //dummy
      case 14: return 6;
    }
    switch(i)
    {
      case 112233: return 6666;
      case 112234: break;       //dummy
      case 112235: return 9876;
      case 112236: break;       //dummy
      case 112237: return 12;
    }
    switch(i)
    {
      case 123456: return 12;
      case 123457: return 12345;
      default: return -1;
    }
  }
  public static int switch3(int i)
  {
    switch(i)
    {
      //13 contiguous case statements as a kind of fastest control
      case 1: return 99;
      case 2: return 55;
      case 3: return -1;
      case 4: return 6;
      case 5: return 8;
      case 6: return 12;
      case 7: return 15;
      case 8: return 29;
      case 9: return 11111;
      case 10: return 12345;
      case 11: return 6666;
      case 12: return 9876;
      case 13: return 12;
      default: return -1;
    }
  }
  final static int[] RETURNS = {
      99, 55, -1, 6, 8, 12, 15, 29,
      11111, 12345, 6666, 9876, 12
    };
  public static int switch4(int i)
  {
    //equivalent to switch3( ), but using an array lookup
    //instead of a switch statement.
    if (i < 1 || i > 13)
      return -1;
    else
      return RETURNS[i-1];
  }
}

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Switches

Create new playlist

Sign In

Sign Up

Switches

Table of Contents for
Switches