Puzzler 4

Now You See Me, Now You Don't

Scala supports object-oriented programming concepts, and inheritance is a prominent one. When working with inheritance, it is often useful to override default values set in parent classes and traits. Adding multiple levels of inheritance makes things more interesting, such as in the following program. What does it print?

  trait A {
    val foo: Int
    val bar = 10
    println("In A: foo: " + foo + ", bar: " + bar)
  }
  
  class B extends A {
    val foo: Int = 25
    println("In B: foo: " + foo + ", bar: " + bar)
  }
  
  class C extends B {
    override val bar = 99
    println("In C: foo: " + foo + ", bar: " + bar)
  }
  
  new C()

Possibilities

  1. Prints:
      In A: foo: 0, bar: 0
      In B: foo: 25, bar: 0
      In C: foo: 25, bar: 99
    
  2. Prints:
      In A: foo: 0, bar: 10
      In B: foo: 25, bar: 10
      In C: foo: 25, bar: 99
    
  3. Prints:
      In A: foo: 0, bar: 0
      In B: foo: 25, bar: 99
      In C: foo: 25, bar: 99
    
  4. Prints:
      In A: foo: 25, bar: 99
      In B: foo: 25, bar: 99
      In C: foo: 25, bar: 99
    

Explanation

The correct answer is number 1. To understand why, you need to look into the details of the program execution.

First, you should remember that every Scala class has a primary constructor that is not explicitly defined, but interwoven with the class definition.[1] All statements in the class definition form the body of the primary constructor, and that includes field definitions (which is, by the way, the reason Scala does not intrinsically differentiate between class fields and values local to the constructor). Hence, all the code in trait A and classes B and C belongs to the constructor body.
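A minimal sketch of this point, using a hypothetical Greeter class and a ConstructorTrace log that are not part of the puzzler: statements and field definitions in a class body run when an instance is constructed, whereas a method body runs only when the method is called.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical helper that records the order in which statements run.
object ConstructorTrace {
  val log: ListBuffer[String] = ListBuffer.empty[String]
}

class Greeter {
  ConstructorTrace.log += "body start"             // plain statement: part of the constructor
  val greeting: String = "hello"                   // field definition: also part of the constructor
  ConstructorTrace.log += "greeting = " + greeting

  // Method body: evaluated only when shout is called, not during construction.
  def shout: String = {
    ConstructorTrace.log += "shout called"
    greeting.toUpperCase
  }
}
```

After new Greeter, the log should hold only the two entries written by the class body; "shout called" is added once shout is invoked.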

The following rules control the initialization and overriding behavior of vals:[2]

  1. Superclasses are fully initialized before subclasses.
  2. Members are initialized in the order they are declared.
  3. When a val is overridden, it can still only be initialized once.
  4. Like an abstract val, an overridden val will have a default initial value during the construction of superclasses.

Therefore, even though bar appears to be assigned the value 10 in trait A, and hence to hold that value in class B as well, that is not the case, because it is overridden in class C. During the construction of trait A and class B, bar has the default initial value of 0 and not the assigned value of 10. Essentially, initialization order gets in the way: the assignment of 10 to bar in trait A is completely invisible, because bar is overridden in class C, where it is initialized to 99 only when C's own constructor body runs. Similarly, foo is abstract in A and is assigned only in class B, so it has the value 0 in A and then 25 in B and C.

This issue can manifest itself with abstract fields, when such a field is used after it is declared, but before it is certain to have been initialized in a subclass. Basically, all constructs that are initialized during class construction (including non-abstract fields) and depend on abstract fields are prone to initialization order problems.
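As an illustration of a non-abstract field that depends on an abstract one, consider the following sketch. The names Named, Person, and greeting are hypothetical and serve only to show the effect.

```scala
trait Named {
  val name: String                          // abstract: initialized later, by the subclass
  val greeting: String = "Hello, " + name   // evaluated while name still holds its default, null
}

class Person extends Named {
  val name: String = "Ada"                  // runs only after Named has been constructed
}
```

Here new Person().greeting yields "Hello, null" even though name itself reads "Ada" once construction has finished, because greeting was computed during the construction of Named.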

Default initial values

For the record, Scala specifies default initial values as:

  • 0 for Byte, Short, and Int
  • 0L, 0.0f, and 0.0d for Long, Float, and Double, respectively
  • '\u0000' (the null character) for Char
  • false for Boolean
  • () for Unit
  • null for all other types
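These defaults can be observed by taking a snapshot of abstract vals while the parent trait is being constructed. The following is a small sketch with hypothetical names (Defaults, Filled, seen), not code from the puzzler:

```scala
trait Defaults {
  val i: Int
  val d: Double
  val b: Boolean
  val c: Char
  val s: String

  // Snapshot taken during the construction of Defaults, that is,
  // before the subclass has run any of its own initializers.
  val seen: List[Any] = List(i, d, b, c, s)
}

class Filled extends Defaults {
  val i: Int = 1
  val d: Double = 2.5
  val b: Boolean = true
  val c: Char = 'x'
  val s: String = "text"
}
```

The list new Filled().seen contains 0, 0.0, false, the null character, and null, rather than the values assigned in Filled.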

Discussion

Scala inherits initialization order rules from Java. Java makes sure that superclasses are initialized first to allow safe use of superclass fields from the subclass constructors, guaranteeing that the fields will be properly initialized. Traits compile into interfaces and concrete (i.e., non-abstract) classes, so the same rules apply.

You might wonder if the compiler could somehow warn you about abstract fields that are used before being initialized to non-default values.[3] Unfortunately, no warning about uninitialized values is given by default—only testing can catch them. However, there is an advanced compiler option that can be used to detect them:

  • -Xcheckinit Wrap field accessors to throw an exception on uninitialized accesses.

This option adds a wrapper around all potentially uninitialized field accesses, and throws an exception rather than using a default value. The addition of a runtime check to field accessors adds significant overhead, so it's not recommended that you use it in production code.

If you start the Scala REPL session with the -Xcheckinit flag, the following exception will be thrown upon executing new C():

  scala> new C()
  scala.UninitializedFieldError: Uninitialized field: 
      <console>: 10
    at C.bar(<console>:10)
    at A$class.$init$(<console>:10)
    ...

As a good practice, you may want to turn on this flag in your automated builds to spot such problems early.
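One way to do that, sketched here for an sbt build of a Scala 2 project, is to add the flag only when the build runs on a build server. Both the use of sbt and the CI environment variable are assumptions made for illustration, not something this chapter prescribes.

```scala
// build.sbt (hypothetical fragment)
// Add -Xcheckinit only when a CI environment variable is present,
// so that other builds are compiled without the runtime checks.
scalacOptions ++= {
  if (sys.env.contains("CI")) Seq("-Xcheckinit") else Seq.empty
}
```

Because the flag changes the generated field accessors, the artifacts compiled this way are best kept for testing rather than released.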

Now that you are aware of the nature of the problem, is there something you can do about it? The following sections provide some workarounds.

Methods

One option is to declare bar as a def instead of a val, which in this case results in the behavior you expect:

  trait A {
    val foo: Int
    def bar: Int = 10
    println("In A: foo: " + foo + ", bar: " + bar)
  }
  
  class B extends A {
    val foo: Int = 25
    println("In B: foo: " + foo + ", bar: " + bar)
  }
  
  class C extends B {
    override def bar: Int = 99
    println("In C: foo: " + foo + ", bar: " + bar)
  }
  
  scala> new C
  In A: foo: 0, bar: 99
  In B: foo: 25, bar: 99
  In C: foo: 25, bar: 99

The reason defining bar as a def works here is that method bodies do not belong to the primary constructor and, therefore, take no part in class initialization. In addition, bar is overridden in C, and the polymorphic resolution selects that definition as the most specific one. Therefore, a call to bar in all three println statements invokes the overridden definition in class C.

One drawback of using methods is that they are evaluated upon each and every invocation. Also, Scala conforms to the Uniform Access Principle,[4] so defining a parameterless method in the superclass does not prevent it from being overridden as a val in a subclass, which would cause the puzzling behavior to reappear, ruining all the careful planning.
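The second drawback can be sketched as follows. The names Base, WithDef, WithVal, and seenInBase are hypothetical; seenInBase simply records what the parent's constructor observes when it reads bar.

```scala
trait Base {
  def bar: Int = 10
  val seenInBase: Int = bar        // reads bar while Base is being constructed
}

class WithDef extends Base {
  override def bar: Int = 99       // a method: available before any constructor runs
}

class WithVal extends Base {
  override val bar: Int = 99       // a val: not yet initialized while Base is constructed
}
```

With these definitions, new WithDef().seenInBase is 99, but new WithVal().seenInBase is 0: overriding the method with a val brings the initialization order problem back.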

Lazy vals

Another way to avoid such surprises is to declare bar as a lazy val. Lazy vals are initialized when accessed the first time. "Regular" vals, called strict or eager, are initialized when defined. Here's how that looks:

  trait A {
    val foo: Int
    lazy val bar = 10
    println("In A: foo: " + foo + ", bar: " + bar)
  }
  class B extends A {
    val foo: Int = 25
    println("In B: foo: " + foo + ", bar: " + bar)
  }
  class C extends B {
    override lazy val bar = 99
    println("In C: foo: " + foo + ", bar: " + bar)
  }
  
  new C()

This program also works as expected:

  In A: foo: 0, bar: 99
  In B: foo: 25, bar: 99
  In C: foo: 25, bar: 99

Declaring bar as a lazy val means it will be initialized to 99 during the construction of trait A, since that is where it is accessed for the first time. Lazy vals are initialized using compiler-generated methods, and here, the overridden version in class C is the one that is called.

Note that lazy vals are typically used to defer expensive initializations to the last possible moment (sometimes they may never be initialized). That is not the goal here: in this case, lazy vals are used to ensure the proper order of initialization at runtime.
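The difference in evaluation between a lazy val and a method can be made visible with a counter. This is a sketch with hypothetical names (Expensive, cached, recomputed), assuming single-threaded use:

```scala
class Expensive {
  var evaluations = 0   // counts how often an initializer or method body runs

  lazy val cached: Int = { evaluations += 1; 42 }   // runs at most once, on first access
  def recomputed: Int = { evaluations += 1; 42 }    // runs on every call
}
```

Constructing an Expensive leaves evaluations at 0. Reading cached twice raises it to 1, whereas each call to recomputed increments it again.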

Be aware, however, that lazy vals can have some disadvantages:

  1. They incur a slight performance cost, due to synchronization that happens under the hood.
  2. You cannot declare an abstract lazy val.
  3. Using lazy vals is prone to creating cyclic references that can result in stack overflow errors on first access, or possibly even deadlock.
  4. You can even get a deadlock when a cyclic dependency does not exist between lazy vals, but between objects that declare them. Such scenarios can be very subtle and non-obvious.[5]

Pre-initialized fields

The same effect can also be achieved by using pre-initialized fields (also known as early initializers):

  trait A {
    val foo: Int
    val bar = 10
    println("In A: foo: " + foo + ", bar: " + bar)
  }
  
  class B extends A {
    val foo: Int = 25
    println("In B: foo: " + foo + ", bar: " + bar)
  }
  
  class C extends {
    override val bar = 99
  } with B {
    println("In C: foo: " + foo + ", bar: " + bar)
  }
  
  scala> new C
  In A: foo: 0, bar: 99
  In B: foo: 25, bar: 99
  In C: foo: 25, bar: 99

The only difference between this and the original program is that bar is initialized in the early field definition clause of class C. An early field definition clause is the code within curly braces immediately following the extends keyword.[6] It is the part of a subclass that is intended to run before its superclass constructor.[7] By doing that, you make sure bar is initialized before trait A is constructed.

The best way to address potential initialization order problems depends on your use case. If evaluating expressions upon each access is not too expensive, you might reach for method definitions. Or, lazy vals might turn out to be the simplest solution for the users of your class so long as you avoid any circular dependencies. Otherwise, assuming you can make it clear to users that they should use early field definitions, plain old abstract vals can be a good choice.

Footnotes for Chapter 4:

[1] Odersky, The Scala Language Specification, Section 5.3. [Ode14]

[2] "Why is my abstract or overridden val null?" [Why]

[3] In the manner of the "X may be used uninitialized in this function" warning from the C compiler, for instance.

[4] Odersky, Spoon, Venners, Programming in Scala, Glossary online. [Odeb]

[5] SIP-20, Improved Lazy Vals Initialization aims to significantly reduce the possibility of deadlocks by re-implementing the initialization mechanism.

[6] Odersky, The Scala Language Specification, Section 5.1.6. [Ode14]

[7] See Puzzler 3 for a more in-depth discussion of initialization order.
