Puzzler 10

A Case of Equality

Scala's case classes are an easy way to represent entities, with factory methods, extractors, and several convenience methods implemented "for free":

  class Country(val isoCode: String, val name: String)
  case class CountryCC(isoCode: String, name: String)

  val homeOfScala = new Country("CH", "Switzerland")
  val homeOfScalaCC =
    CountryCC("CH", "Switzerland") // factory method

  scala> println(homeOfScala equals
           new Country("CH", "Switzerland"))
  false

  scala> println(homeOfScalaCC equals
           CountryCC("CH", "Switzerland"))
  true

  scala> println(homeOfScala.toString)
  $line348.$read$$iw$$iw$Country@39eb8ede

  scala> println(homeOfScalaCC.toString)
  CountryCC(CH,Switzerland)

To give you a better idea of what's going on, we'll trace the invocation of hashCode, one of the convenience methods. We'll mix a "debugging" trait into the declaration or instantiations of the case class, then add case class instances to HashSets to see how hashCode is used. What is the result of executing the following code in the REPL?

  trait TraceHashCode {
    override def hashCode: Int = {
      println(s"TRACE: In hashCode for ${this}")
      super.hashCode
    }
  }
  
  // mix in trait at instantiation
  case class Country(isoCode: String)
  def newSwitzInst = new Country("CH") with TraceHashCode

  // mix in trait at declaration time
  case class CountryWithTrace(isoCode: String) extends
    TraceHashCode
  def newSwitzDecl = CountryWithTrace("CH")

  import collection.immutable.HashSet
  val countriesInst = HashSet(newSwitzInst)
  println(countriesInst.iterator contains newSwitzInst)
  println(countriesInst contains newSwitzInst)

  val countriesDecl = HashSet(newSwitzDecl)
  println(countriesDecl.iterator contains newSwitzDecl)
  println(countriesDecl contains newSwitzDecl)

Possibilities

  1. Prints:
      true
      TRACE: In hashCode for Country(CH)
      true
      true
      TRACE: In hashCode for CountryWithTrace(CH)
      true
    
  2. Prints:
      true
      TRACE: In hashCode for Country(CH)
      true
      true
      TRACE: In hashCode for CountryWithTrace(CH)
      false
    
  3. Prints:
      true
      TRACE: In hashCode for Country(CH)
      false
      true
      TRACE: In hashCode for CountryWithTrace(CH)
      false
    
  4. Prints:
      true
      TRACE: In hashCode for Country(CH)
      true
      false
      TRACE: In hashCode for CountryWithTrace(CH)
      false
    

Explanation

The generated implementation of equals and hashCode for case classes is based on structural equality: two instances are equal if they have the same type and equal constructor arguments. Since mixing in TraceHashCode does not affect that structure, you might assume that instances created by newSwitzInst are equal and have identical hash codes, and the same holds true for newSwitzDecl. And if this is true, countriesInst should contain newSwitzInst, and countriesDecl should contain newSwitzDecl.
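
As a baseline for that assumption, here is a minimal sketch of structural equality for an unmodified case class. The object name StructuralEqualityDemo is ours, chosen only for illustration:

```scala
// Plain case class: the compiler generates equals and hashCode from the fields.
case class Country(isoCode: String)

object StructuralEqualityDemo {
  def main(args: Array[String]): Unit = {
    val a = Country("CH")
    val b = Country("CH")
    println(a == b)                   // true: structural comparison of isoCode
    println(a eq b)                   // false: two distinct objects
    println(a.hashCode == b.hashCode) // true: hash codes derived from equal fields
  }
}
```

The puzzler asks whether this behavior survives once a trait that overrides hashCode enters the picture.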

Or, you may wonder whether mixing in TraceHashCode at declaration time "switches off" the generated structural equality for CountryWithTrace. Different instances created by newSwitzDecl would have different hash codes and not be considered equal, and therefore the second instance created by newSwitzDecl would not be a member of countriesDecl. Surely, though, it makes no difference whether you check the set or the iterator?

Actually, it does. Mixing in TraceHashCode on instantiation leaves equals and hashCode behavior unaffected, as you might hope. But declaring CountryWithTrace as extending from TraceHashCode switches off the generated hashCode method for case classes, so the new instance created by newSwitzDecl is not found in the set. The generated equals implementation, on which the iterator depends, is not affected. The correct answer is number 2:

  scala> println(countriesInst.iterator contains
           newSwitzInst)
  true
  
  scala> println(countriesInst contains newSwitzInst)
  TRACE: In hashCode for Country(CH)
  true

  scala> println(countriesDecl.iterator contains
           newSwitzDecl)
  true

  scala> println(countriesDecl contains newSwitzDecl)
  TRACE: In hashCode for CountryWithTrace(CH)
  false

This is especially problematic because you are inadvertently violating the equals/hashCode contract here, which states, "it is required that if two objects are equal [...] they have identical hash codes."[1] Note that both instances created by newSwitzInst are considered equal (and have equal hash codes), so mixing in TraceHashCode at instantiation time does not have any unintended effects.

The language specification's explanation of case classes[2] can help clarify what is going on (our emphasis):

Every case class implicitly overrides some method definitions of class scala.AnyRef unless a definition of the same method is already given in the case class itself or a concrete definition of the same method is given in some base class of the case class different from AnyRef.

So the compiler will generate overrides only if explicit implementations of the methods are not present in the case class or inherited from a parent class or trait. In addition, the conditions under which the methods (equals and hashCode, in this case) are overridden are independent of each other, so coherence between equals and hashCode is left to the developer.
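
The same rule can be observed with toString, where the effect is easier to see. The following is a small sketch; the names Labelled, Tagged, and Untagged are hypothetical and do not appear in the puzzler:

```scala
// Hypothetical trait that supplies a concrete toString.
trait Labelled {
  override def toString: String = "<labelled>"
}

// toString is inherited from Labelled, so the compiler generates no toString
// here. equals and hashCode are still generated, because Labelled does not
// define them: each method is decided independently.
case class Tagged(value: Int) extends Labelled

// No inherited definitions: all the usual case class methods are generated.
case class Untagged(value: Int)

object GeneratedOverridesDemo {
  def main(args: Array[String]): Unit = {
    println(Tagged(1))                               // <labelled>
    println(Untagged(1))                             // Untagged(1)
    println(Tagged(1) == Tagged(1))                  // true
    println(Tagged(1).hashCode == Tagged(1).hashCode) // true
  }
}
```

Swap toString for hashCode in the trait and you have exactly the situation of CountryWithTrace.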

In our example, the compiler generates an overridden implementation for CountryWithTrace's equals method, so comparing two instances created by newSwitzDecl via newSwitzDecl == newSwitzDecl evaluates to true. The hashCode method, however, is not overridden, so the super.hashCode call in TraceHashCode invokes the default implementation in AnyRef, which is consistent with reference equality. Hence, newSwitzDecl.hashCode == newSwitzDecl.hashCode returns false, and therefore new instances created by newSwitzDecl are not found in the countriesDecl set.
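
One way to confirm that hashCode falls through to the default implementation is to compare it with System.identityHashCode, which on the JVM returns the value that the default Object.hashCode would return. This is a sketch that assumes a JVM-based Scala runtime; the object name DeclarationMixinDemo is ours:

```scala
trait TraceHashCode {
  override def hashCode: Int = {
    println(s"TRACE: In hashCode for ${this}")
    super.hashCode
  }
}

// Trait mixed in at declaration time: no hashCode is generated for this class.
case class CountryWithTrace(isoCode: String) extends TraceHashCode

object DeclarationMixinDemo {
  def main(args: Array[String]): Unit = {
    val a = CountryWithTrace("CH")
    val b = CountryWithTrace("CH")
    // The generated equals still compares isoCode.
    println(a == b)
    // super.hashCode reaches the default implementation (JVM assumption),
    // so the result matches the identity hash of the object.
    println(a.hashCode == System.identityHashCode(a))
  }
}
```

Both comparisons are expected to print true, with a TRACE line before the second result.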

In the case of new Country("CH") with TraceHashCode, the generated overrides are added by the compiler when case class Country is declared, at which point neither equals nor hashCode is explicitly implemented. By the time TraceHashCode is mixed in during the creation of new instances by newSwitzInst, Country already has equals and hashCode methods based on structural equality. The super.hashCode call in TraceHashCode thus invokes the compiler-generated hashCode method in Country, as intended.
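
A quick check of this claim is to compare a traced instance with a plain one. In this sketch the object name InstantiationMixinDemo is ours:

```scala
trait TraceHashCode {
  override def hashCode: Int = {
    println(s"TRACE: In hashCode for ${this}")
    super.hashCode
  }
}

// equals and hashCode are generated here, before any trait is mixed in.
case class Country(isoCode: String)

object InstantiationMixinDemo {
  def main(args: Array[String]): Unit = {
    val traced = new Country("CH") with TraceHashCode
    val plain = Country("CH")
    println(traced == plain)                   // true: generated equals
    println(traced.hashCode == plain.hashCode) // TRACE line, then true
  }
}
```

The traced instance hashes to the same value as the plain one because super.hashCode resolves to the generated method in Country.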

Discussion

Adding the "debugging" trait at instantiation time seems to be the way to go. However, you want to avoid having to mix in the TraceHashCode trait every time you create an instance. You can achieve this by (temporarily) creating a subclass of Country:

  case class _Country(isoCode: String) // renamed
  // use :paste in the REPL
  class Country(isoCode: String) extends
    _Country(isoCode) with TraceHashCode
  object Country {
    def apply(isoCode: String): Country = new Country(isoCode)
  }
  // ctrl-D to end :paste mode
  def newSwitzSubcl = Country("CH")

  scala> println(newSwitzSubcl == newSwitzSubcl)
  true

  scala> println(newSwitzSubcl.hashCode
           == newSwitzSubcl.hashCode)
  TRACE: In hashCode for _Country(CH)
  TRACE: In hashCode for _Country(CH)
  true

However, extending case classes is not considered good practice. You can do a little better by "replacing" the case class factory method. If you simply define your own apply method, though, the compiler will still attempt to generate one as well, which causes a compilation error. To redefine the standard apply factory method in a case class's companion object, you need to declare the case class abstract:

  // use :paste in the REPL
  abstract case class Country(isoCode: String)
  object Country {
    def apply(isoCode: String): Country =
      new Country(isoCode) with TraceHashCode
  }
  // ctrl-D to end :paste mode
  def newSwitzFact = Country("CH")
  
  scala> println(newSwitzFact == newSwitzFact)
  true

  scala> println(newSwitzFact.hashCode
           == newSwitzFact.hashCode)
  TRACE: In hashCode for Country(CH)
  TRACE: In hashCode for Country(CH)
  true

Conveniently, the compiler will still add an implementation of unapply to the companion object, so your case class will still work with pattern matching. You will, however, be unable to make instances using new (i.e., new Country("CH")), since Country is now abstract.
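
To illustrate that pattern matching keeps working, here is a sketch with the tracing trait left out for brevity. The object PatternMatchDemo and its describe method are hypothetical helpers, not part of the puzzler:

```scala
abstract case class Country(isoCode: String)

object Country {
  // Hand-written factory; an anonymous subclass makes the abstract class
  // instantiable.
  def apply(isoCode: String): Country = new Country(isoCode) {}
}

object PatternMatchDemo {
  // Relies on the extractor that the compiler still generates.
  def describe(country: Country): String = country match {
    case Country(code) => s"ISO code: $code"
  }

  def main(args: Array[String]): Unit =
    println(describe(Country("CH"))) // ISO code: CH
}
```

Only the factory method is replaced; the extractor, equals, hashCode, and toString remain the generated versions.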

If you are going to mess with the declaration of the case class, the easiest approach is to avoid super.hashCode and simply ensure that the implementation of hashCode is consistent with structural equality. Calling isoCode.hashCode would meet this requirement, but you have to be careful since isoCode could conceivably be null. The ## method, Scala's null-safe version of hashCode, avoids this problem:

  case class CountryWithTrace(isoCode: String) {
    // avoiding super.hashCode
    override def hashCode: Int = {
      println(s"TRACE: In hashCode for ${this}")
      isoCode.##
    }
  }
  def newSwitzHCImpl = CountryWithTrace("CH")
  
  scala> println(newSwitzHCImpl == newSwitzHCImpl)
  true

  scala> println(newSwitzHCImpl.hashCode
           == newSwitzHCImpl.hashCode)
  TRACE: In hashCode for CountryWithTrace(CH)
  TRACE: In hashCode for CountryWithTrace(CH)
  true
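
The difference between hashCode and ## for a null field can be seen in isolation. This is a minimal sketch; the object name NullSafeHashDemo is ours:

```scala
object NullSafeHashDemo {
  def main(args: Array[String]): Unit = {
    val code: String = "CH"
    val missing: String = null

    // For a String, ## agrees with hashCode.
    println(code.## == code.hashCode) // true

    // ## treats null as hash value 0 instead of throwing.
    println(missing.##) // 0

    // Calling hashCode on null throws a NullPointerException.
    val threw =
      try { missing.hashCode; false }
      catch { case _: NullPointerException => true }
    println(threw) // true
  }
}
```
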

When supplying your own implementation of equals or hashCode for a case class:
  1. Ensure that it obeys structural equality if specifying only one of the two methods.
  2. If not, implement both methods according to the equals/hashCode contract.
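
As an illustration of the second point, here is a sketch of a case class that deliberately departs from structural equality and therefore overrides both methods. Comparing ISO codes case-insensitively is our own assumption for the example, as is the private helper key:

```scala
import java.util.Locale

// Hypothetical variant: ISO codes are compared case-insensitively.
case class Country(isoCode: String) {
  // Normalized form shared by equals and hashCode; tolerates a null isoCode.
  private def key: String =
    if (isoCode == null) null else isoCode.toUpperCase(Locale.ROOT)

  override def equals(other: Any): Boolean = other match {
    case that: Country => that.canEqual(this) && key == that.key
    case _             => false
  }

  // Derived from the same normalized value, so equal instances hash alike.
  override def hashCode: Int = key.##
}

object CustomEqualityDemo {
  def main(args: Array[String]): Unit = {
    println(Country("ch") == Country("CH"))                   // true
    println(Country("ch").hashCode == Country("CH").hashCode) // true
    println(Set(Country("ch")).contains(Country("CH")))       // true
  }
}
```

Because both methods are built on the same normalized value, hash-based collections find equal instances.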

Footnotes for Chapter 10:

[1] See the Scaladoc for scala.Any. [EPF]

[2] Odersky, The Scala Language Specification, Section 5.3.2. [Ode14]
