As your programs get larger, you may end up with modules that have cyclic dependencies. Reliably initializing such modules can be challenging.
What is the result of executing the following code in the REPL?
object XY {
  object X {
    val value: Int = Y.value + 1
  }
  object Y {
    val value: Int = X.value + 1
  }
}
println(if (math.random > 0.5) XY.X.value else XY.Y.value)
1. Prints: 1
2. Prints: 2
3. Prints: 1 or: 2
You may wonder whether the Scala compiler can even handle cyclic definitions of this kind, or whether you will run into an endless loop at runtime.
If you are confident that Scala can indeed handle such definitions without blowing up, you may suspect that the values will be initialized in declaration order. Since you randomly print either the value that is declared first (XY.X.value) or second (XY.Y.value), you would expect to see a non-deterministic result in this case.
Alternatively, you may guess that, while initializing the object accessed first, you will see default values for the other as-yet-uninitialized object, resulting in 1 being printed every time.
In fact, the correct answer is number 2. The value 2 is printed every time:
scala> println(if (math.random > 0.5) XY.X.value else XY.Y.value)
2

scala> println(s"X: ${XY.X.value} Y: ${XY.Y.value}")
X: 2 Y: 1
And after a few :reset commands, you should eventually see:[1]
scala> println(if (math.random > 0.5) XY.X.value else XY.Y.value)
2

scala> println(s"X: ${XY.X.value} Y: ${XY.Y.value}")
X: 1 Y: 2
To understand what is going on, we'll first demonstrate that the Scala compiler has no problem with cycles in val definitions. It does, however, require at least one explicit type specification:
scala> lazy val x = y; lazy val y = x
<console>:12: error: recursive value y needs type
       lazy val x = y; lazy val y = x
                                    ^

scala> lazy val x: Int = y; lazy val y = x
x: Int = <lazy>
y: Int = <lazy>
The language specification says that "the value defined by an object definition is instantiated lazily,"[2] and goes even further by remarking that an object can indeed be seen as "roughly equivalent to [...] a lazy value." This explains why the random choice of printing XY.X.value or XY.Y.value does not influence the outcome: the declaration order is irrelevant, since the values are not initialized when the objects are declared, but when they are accessed. The chosen object is always the first to be initialized, and the object initialized first ends up with the value 2.
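The lazy instantiation of objects can be observed directly. The following is a minimal sketch, not part of the original puzzler: LazyObjectDemo, First, Second, and the log buffer are illustrative names. Each nested object records a message when its body runs, so the log should show the bodies running in access order rather than declaration order.

```scala
import scala.collection.mutable.ListBuffer

object LazyObjectDemo {
  // Records the order in which the object bodies below actually run.
  val log = ListBuffer.empty[String]

  // Declared first, but accessed second in run().
  object First {
    log += "First initialized"
    val value = 1
  }

  // Declared second, but accessed first in run().
  object Second {
    log += "Second initialized"
    val value = 2
  }

  // Touches Second before First and returns the recorded order.
  // Intended to be called once: the objects are initialized only on first access.
  def run(): List[String] = {
    log += "before access"
    Second.value
    First.value
    log.toList
  }
}
```

On its first call, LazyObjectDemo.run() returns List("before access", "Second initialized", "First initialized"): neither body runs when the objects are declared, and Second is initialized before First because it is accessed first.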
But how does the object initialized first end up with the value 2, and why do you not run into an endless loop once you access it? Here, it helps to examine the output produced by the compiler when you compile just the XY object with scalac -print:
[[syntax trees at end of cleanup]] // XY.scala
package <empty> {
  object XY extends Object {
    def <init>(): XY.type = {
      XY.super.<init>();
      ()
    }
  };
  object XY$X extends Object {
    private[this] val value: Int = _;
    <stable> <accessor> def value(): Int = XY$X.this.value;
    def <init>(): XY$X.type = {
      XY$X.super.<init>();
      XY$X.this.value = XY$Y.value().+(1);
      ()
    }
  };
  object XY$Y extends Object {
    private[this] val value: Int = _;
    <stable> <accessor> def value(): Int = XY$Y.this.value;
    def <init>(): XY$Y.type = {
      XY$Y.super.<init>();
      XY$Y.this.value = XY$X.value().+(1);
      ()
    }
  }
}
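What happens when you access the randomly chosen object? Assume you are trying to get XY.Y.value. Accessing XY.Y for the first time starts the initialization of XY$Y, whose constructor calls XY$X.value(). That call in turn starts the initialization of XY$X, whose constructor calls XY$Y.value(). Because XY$Y is already being initialized, the JVM does not start a second initialization,[3] so the call returns the current content of the field, which is still the default value 0. XY.X.value therefore becomes 1, and once control returns to the constructor of XY$Y, XY.Y.value becomes 2.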
If you happen to choose XY.X.value as the value to print, the initialization takes place with roles reversed. This explains why the first-accessed object will always receive a value of 2, with 1 being assigned to the value of the other object.
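The effect of the "initialization already in progress" check can be modelled by hand. The sketch below is only an illustration under stated assumptions, not what the compiler or the JVM literally does: Module, CyclicPair, and InitModel are hypothetical names, and the started flag stands in for the JVM's bookkeeping of an initialization that has begun.

```scala
// Hypothetical stand-in for a lazily initialized object with a single Int field.
final class Module(compute: () => Int) {
  private var started = false // set as soon as initialization has begun
  private var field = 0       // the JVM default for an Int field

  def value: Int = {
    if (!started) {
      started = true    // a re-entrant request must not run the initializer again
      field = compute() // may call back into this module via the other one
    }
    field
  }
}

// A fresh pair of mutually dependent modules, mirroring XY.X and XY.Y.
final class CyclicPair {
  val x: Module = new Module(() => y.value + 1)
  val y: Module = new Module(() => x.value + 1)
}

object InitModel {
  // Returns (x, y) after touching one of the two modules first.
  def valuesAfter(accessXFirst: Boolean): (Int, Int) = {
    val pair = new CyclicPair
    if (accessXFirst) pair.x.value else pair.y.value
    (pair.x.value, pair.y.value)
  }
}
```

InitModel.valuesAfter(accessXFirst = true) returns (2, 1) and InitModel.valuesAfter(accessXFirst = false) returns (1, 2): whichever module is touched first ends up with 2, because the re-entrant read of its own field sees the default 0.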
The observed behavior becomes more surprising when you compare it to what happens with similar kinds of cyclic definitions. For example, given that The Scala Language Specification says that objects are "roughly equivalent to [lazy values]," you might try:[4]
object XY2 {
  lazy val xvalue: Int = yvalue + 1
  lazy val yvalue: Int = xvalue + 1
}

scala> println(if (math.random > 0.5) XY2.xvalue else XY2.yvalue)
java.lang.StackOverflowError
  ...
  at XY2$.xvalue(<console>:8)
  at XY2$.yvalue$lzycompute(<console>:9)
  at XY2$.yvalue(<console>:9)
  at XY2$.xvalue$lzycompute(<console>:8)
  at XY2$.xvalue(<console>:8)
Or you could stick with objects, but put them inside an enclosing class instead of an object:
class XY3 {
  object X {
    val value: Int = Y.value + 1
  }
  object Y {
    val value: Int = X.value + 1
  }
}

scala> val xy3 = new XY3()
xy3: XY3 = XY3@770b07b9

scala> println(if (math.random > 0.5) xy3.X.value else xy3.Y.value)
java.lang.StackOverflowError
  ...
  at XY3.Y$lzycompute(<console>:11)
  at XY3.Y(<console>:11)
  at XY3$X$.<init>(<console>:9)
  at XY3.X$lzycompute(<console>:8)
  at XY3.X(<console>:8)
  at XY3$Y$.<init>(<console>:12)
  at XY3.Y$lzycompute(<console>:11)
In both cases, you are missing the "endless loop protection" provided by the JVM's inability to initialize the same instance more than once. The compiler happily allows two functions to call each other, so the problem surfaces only as an exception at runtime.
In the second example, Y$lzycompute starts creating a new instance of Y to assign to the XY3.Y member. This tries to access XY3.X, which triggers X$lzycompute and, because XY3.Y has not been initialized yet, invokes Y$lzycompute again. Y$lzycompute tries to create another instance of Y, and so on.
Alternatively, you can be slightly "less lazy":
object XY4 {
  lazy val xvalue: Int = yvalue + 1
  val yvalue: Int = xvalue + 1
}

scala> println(if (math.random > 0.5) XY4.xvalue else XY4.yvalue)
2

scala> println(s"X: ${XY4.xvalue} Y: ${XY4.yvalue}")
X: 1 Y: 2
Now, it is no longer the order in which the values are accessed that determines their values: for XY4, yvalue will be evaluated as soon as XY4 is initialized. This triggers the evaluation of xvalue, which sees the default value 0 for yvalue and becomes 1, with yvalue always becoming 2. The order in which xvalue and yvalue are declared still does not matter, though:
object XY4a {
  val yvalue: Int = xvalue + 1
  lazy val xvalue: Int = yvalue + 1
}

scala> println(if (math.random > 0.5) XY4a.xvalue else XY4a.yvalue)
1

scala> println(s"X: ${XY4a.xvalue} Y: ${XY4a.yvalue}")
X: 1 Y: 2
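The evaluation order of a strict val that depends on a lazy val can be made visible with a small trace. This is a sketch under assumptions: TracedMixed and its trace buffer are illustrative names, and a class is used instead of an object so that every instance starts from a fresh state.

```scala
import scala.collection.mutable.ListBuffer

class TracedMixed {
  // Declared first so that it exists before yvalue is evaluated.
  val trace = ListBuffer.empty[String]

  // Evaluated on first access, which here happens while yvalue is being computed.
  lazy val xvalue: Int = {
    trace += s"xvalue sees yvalue = $yvalue"
    yvalue + 1
  }

  // Evaluated eagerly when the instance is constructed.
  val yvalue: Int = {
    trace += "yvalue starts"
    xvalue + 1
  }
}
```

After val m = new TracedMixed, m.trace.toList is List("yvalue starts", "xvalue sees yvalue = 0"), m.xvalue is 1, and m.yvalue is 2. The strict val drives the initialization, and the lazy val reads the still-unassigned field as 0.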
You can also avoid lazy values entirely:
object XY5 {
  val xvalue: Int = yvalue + 1
  val yvalue: Int = xvalue + 1
}

scala> println(if (math.random > 0.5) XY5.xvalue else XY5.yvalue)
1

scala> println(s"X: ${XY5.xvalue} Y: ${XY5.yvalue}")
X: 1 Y: 2
Here, both xvalue and yvalue are immediately evaluated on initialization of XY5. xvalue tries to retrieve the value of the as-yet-unassigned yvalue, again sees the default value 0, and is set to 1. yvalue is then always set to 2. Here, though, the problem is so predictable that the compiler emits a warning as soon as XY5 is defined:
scala> object XY5 {
     |   val xvalue: Int = yvalue + 1
     |   val yvalue: Int = xvalue + 1
     | }
<console>:8: warning: Reference to uninitialized value yvalue
         val xvalue: Int = yvalue + 1
                           ^
defined object XY5
Furthermore, unlike the other examples, here the declaration order determines the values of xvalue and yvalue. Inverting the order flips the values:
object XY5a {
  val yvalue: Int = xvalue + 1
  val xvalue: Int = yvalue + 1
}

scala> println(s"X: ${XY5a.xvalue} Y: ${XY5a.yvalue}")
X: 2 Y: 1
In summary, cyclic dependencies and definitions are tricky and hard to reason about. Some forms are dependent on the declaration order, others on the order of initialization, yet others result in endless loops. Avoid them where possible.
Avoid cyclic dependencies and definitions where possible. If you really can find no way to remove the cycle, ensure you understand the initialization behavior of all its components and values. Test thoroughly to ensure you get the intended result, especially if the order in which elements will be initialized is not deterministic.
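One possible way to remove such a cycle is to move the shared starting point into a definition of its own, so that the dependencies form a chain. The sketch below assumes that the intended result is X = 1 and Y = 2. Acyclic and Base are hypothetical names introduced only for this illustration.

```scala
object Acyclic {
  // The starting point that the cyclic version obtained implicitly from the default value 0.
  object Base {
    val value: Int = 0
  }

  // X depends only on Base.
  object X {
    val value: Int = Base.value + 1
  }

  // Y depends only on X, so no object refers back to one that depends on it.
  object Y {
    val value: Int = X.value + 1
  }
}
```

With this layout, Acyclic.X.value is 1 and Acyclic.Y.value is 2 regardless of which of the two is accessed first, because each object is fully initialized before any object that depends on it reads its value.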
[1] The :reset command tells the REPL to "forget" all definitions, allowing you to initialize XY.X.value and XY.Y.value again.
[2] Odersky, The Scala Language Specification, Section 5.4. [Ode14]
[3] Lindholm et al., The Java Virtual Machine Specification, Section 5.5. [Lin13]
[4] See Puzzler 4 for a more detailed discussion of initialization options for variables.