Chapter 8. for Comprehensions in Depth

ForComprehensions described many of the details. At this point, they look like a nice, more flexible version of the venerable for loop, but not much more. In fact, lots of sophistication lies below the surface, connected to some of the functional combinators we discussed in the previous chapter. You can write concise code with elegant solutions to a number of design problems.

In this chapter, we’ll dive below the surface to understand for comprehensions and how they are implemented in Scala. You’ll understand how your own types can exploit them.

We’ll finish with some practical design problems implemented using for comprehensions, such as error handling during the execution of a sequence of processing steps.

Recap: The Elements of for Comprehensions

A for comprehension contains one or more generator expressions, optional guard expressions for filtering, and optional value definitions. The output can be “yielded” to create new collections or a side-effecting block of code can be executed on each pass, such as printing output. The following example demonstrates all these features. It removes blank lines from a text file. This is a full program with an example of how to parse input arguments (although there are libraries available for this purpose), handle help messages, etc.

// src/main/scala/progscala3/forcomps/RemoveBlanks.scala
package progscala3.forcomps

object RemoveBlanks:
  def apply(path: String, compress: Boolean, numbers: Boolean): Seq[String] =
    for                                                              1
      (line, i) <- scala.io.Source.fromFile(path).getLines.toSeq.zipWithIndex
      if line.matches("""^\s*$""") == false                         2
      line2 = if compress then line.trim.replaceAll("\\s+", " ")    3
              else line
      numLine = if numbers then "%4d: %s".format(i, line2)           4
              else line2
    yield numLine

  def main(params: Array[String]): Unit =                            5

    val Args(compress, numbers, paths) = parseParams(params.toSeq, Args())
    for                                                              6
      path <- paths
      seq = s"\n== File: $path\n" +: RemoveBlanks(path, compress, numbers)
      line <- seq
    do println(line)

  protected val helpMessage = """
    |usage: RemoveBlanks [-h|--help] [-c|--compress] [-n|--numbers] file ...
    |where:
    | -h | --help     Print this message and quit
    | -c | --compress Compress whitespace
    | -n | --numbers  Print line numbers
    | file ...        One or more files to print without blanks
    |""".stripMargin

  protected case class Args(                                         7
    compress: Boolean = false,
    numbers: Boolean = false,
    paths: Vector[String] = Vector.empty)

  protected def help(messages: Seq[String], exitCode: Int) =
    messages.foreach(println)
    println(helpMessage)
    sys.exit(exitCode)

  protected def parseParams(params2: Seq[String], args: Args): Args =
    params2 match
      case ("-h" | "--help") +: tail =>
        println(helpMessage)
        sys.exit(0)
      case ("-c" | "--compress") +: tail =>
        parseParams(tail, args.copy(compress = true))
      case ("-n" | "--numbers") +: tail =>
        parseParams(tail, args.copy(numbers = true))
      case flag +: tail if flag.startsWith("-") =>
        println(s"ERROR: Unknown option $flag")
        println(helpMessage)
        sys.exit(1)
      case path +: tail =>
        parseParams(tail, args.copy(paths = args.paths :+ path))
      case Nil => args
1

Use scala.io.Source to open the file and get the lines, where getLines returns an Iterator[String], which we must convert to a sequence, because we can’t return an Iterator from the for comprehension and the return type is determined by the initial generator. Using zipWithIndex adds a line number.

2

Filter out blank lines using a regular expression. Note that this will result in line number gaps.

3

Define a local variable containing the nonblank line, if whitespace compression is not enabled, or a new string with all whitespace compressed to single spaces.

4

Format a string with the line number, if enabled.

5

The main method to process the argument list.

6

A second for comprehension to process the files. Note that we prepend a line with the file name, which will be printed before that file’s lines. Each line carries its optional line number, which was created using zipWithIndex in apply. Note that the numbers printed are zero-based indices, so they are one less than the original file’s conventional one-based line numbers.

7

Convenience class to parse the arguments, including flags to show help, whether or not to compress the whitespace in lines, and whether or not to print line numbers.

Try running it at the sbt prompt:

> runMain progscala3.forcomps.RemoveBlanks --help
> runMain progscala3.forcomps.RemoveBlanks README.md build.sbt -n -c

Try different files and different command line options.

For Comprehensions: Under the Hood

The for comprehension syntax is actually syntactic sugar provided by the compiler for calling the collection methods foreach, map, flatMap, and withFilter.

For nontrivial sequences of operations, the for comprehension syntax is often easier to understand than the equivalent API calls. After a while, you develop an intuition about which approach is best for a given context.

The method withFilter is used for filtering elements just like the filter method, but it doesn’t construct its own output collection. For better efficiency, it works with the other methods to combine filtering with their logic so that one less new collection is generated. Specifically, withFilter restricts the domain of the elements allowed to pass through subsequent combinators like map, flatMap, foreach, and other withFilter invocations.
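The following minimal sketch illustrates that behavior. The helper name withFilterDemo is hypothetical and not part of the book's sources. It counts how often the predicate runs, assuming the standard library's behavior for Vector, where withFilter only records the predicate and the subsequent map does the actual work:

```scala
// Hypothetical helper for illustration; the name withFilterDemo is an assumption.
def withFilterDemo(): (Int, Vector[Int], Int) =
  val states = Vector("Alabama", "Alaska", "Virginia", "Wyoming")
  var calls = 0                       // counts how often the predicate runs

  // withFilter only records the predicate; no new Vector is built here.
  val restricted = states.withFilter { s => calls += 1; s.startsWith("A") }
  val callsBeforeMap = calls

  // map applies the predicate and the mapping function together.
  val lengths = restricted.map(_.length)
  (callsBeforeMap, lengths, calls)
```

The first element of the returned tuple shows that the predicate has not run when withFilter returns. Only the map call evaluates it, once per element, while building the single output collection.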

To see what the for comprehension sugar encapsulates, let’s walk through several informal comparisons first, then we’ll discuss the details of the precise mapping. As you look at the examples that follow, ask yourself which syntax is easier to understand in each case, the for comprehension or the corresponding method calls.

Consider this example of a simple for comprehension and the equivalent use of foreach on a collection:

// src/script/scala/progscala3/forcomps/ForForeach.scala

scala> val states = Vector("Alabama", "Alaska", "Virginia", "Wyoming")

scala> var lower1a = Vector.empty[String]
scala> var lower1b = Vector.empty[String]
scala> var lower2  = Vector.empty[String]

scala> for
     |   s <- states
     | do lower1a = lower1a :+ s.toLowerCase
     |
     | for s <- states do lower1b = lower1b :+ s.toLowerCase
     |
     | states.foreach(s => lower2 = lower2 :+ s.toLowerCase)

var lower1a: Vector[String] = Vector(alabama, alaska, virginia, wyoming)
var lower1b: Vector[String] = Vector(alabama, alaska, virginia, wyoming)
var lower2: Vector[String] = Vector(alabama, alaska, virginia, wyoming)

When there is just one generator (the s <- states) in a for comprehension, it can be written on a single line, as shown for lower1b. You can still put the do clause on the next line, if you prefer.

A single generator expression with a do statement corresponds to an invocation of foreach on the collection.

What happens if we use yield instead?

// src/script/scala/progscala3/forcomps/ForMap.scala

scala> var upper1a = Vector.empty[String]
scala> var upper1b = Vector.empty[String]
scala> var upper2  = Vector.empty[String]

scala> val upper1a = for
     |   s <- states
     | yield s.toUpperCase
     |
     | val upper1b = for s <- states yield s.toUpperCase
     |
     | val upper2 = states.map(_.toUpperCase)

val upper1a: Vector[String] = Vector(ALABAMA, ALASKA, VIRGINIA, WYOMING)
val upper1b: Vector[String] = Vector(ALABAMA, ALASKA, VIRGINIA, WYOMING)
val upper2: Vector[String] = Vector(ALABAMA, ALASKA, VIRGINIA, WYOMING)

A single generator expression followed by a yield expression corresponds to an invocation of map. When yield is used to construct a new container, its type is determined by the first generator. This is consistent with how map works.

What if we have more than one generator?

// src/script/scala/progscala3/forcomps/ForFlatmap.scala

scala> val results1 = for
     |   s <- states
     |   c <- s
     | yield s"$c-${c.toUpper}"
     |
     | val results2 = states.
     |   flatMap(s => s.toSeq).
     |   map(c => s"$c-${c.toUpper}")
val results1: Vector[String] = Vector(A-A, l-L, a-A, b-B, a-A, m-M, a-A, ...)
val results2: Vector[String] = Vector(A-A, l-L, a-A, b-B, a-A, m-M, a-A, ...)

The second generator iterates through each character in the string s. The contrived yield statement returns the character and its uppercase equivalent, separated by a dash.

When there are multiple generators, all but the last are converted to flatMap invocations. The last is a map invocation. Already, you may find the for comprehension easier to understand.
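When the yield expression refers to variables from more than one generator, the map call has to be nested inside the function passed to flatMap, so the earlier variables stay in scope. Here is a small sketch of that nesting, using a shortened states list chosen for illustration:

```scala
val states = Vector("Alabama", "Alaska")

// Two generators; the yield expression uses both s and c.
val viaFor = for
  s <- states
  c <- s
yield s"$s:$c"

// Nested translation: flatMap for the first generator, map for the last.
// The inner map sits inside the flatMap function, so s is still in scope.
val viaMethods = states.flatMap(s => s.map(c => s"$s:$c"))
```

Both values hold the same elements, starting with Alabama:A and Alabama:l.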

What if we add a guard?

// src/script/scala/progscala3/forcomps/ForGuard.scala

scala> val results1 = for
     |   s <- states
     |   c <- s
     |   if c.isLower
     | yield s"$c-${c.toUpper}"
     |
     | val results2 = states.
     |   flatMap(s => s.toSeq).
     |   withFilter(c => c.isLower).
     |   map(c => s"$c-${c.toUpper}")
     |
val results1: Vector[String] = Vector(l-L, a-A, b-B, a-A, m-M, a-A, l-L, ...)
val results2: Vector[String] = Vector(l-L, a-A, b-B, a-A, m-M, a-A, l-L, ...)

Note that the withFilter invocation is injected before the final map invocation.

Finally, defining a variable works as follows:

// src/script/scala/progscala3/forcomps/ForVariable.scala

scala> val results1 = for
     |   s <- states
     |   c <- s
     |   if c.isLower
     |   c2 = s"$c-${c.toUpper}"
     | yield c2
     |
     | val results2 = states.       // Same as the previous example.
     |   flatMap(s => s.toSeq).
     |   withFilter(c => c.isLower).
     |   map(c => s"$c-${c.toUpper}")
val results1: Vector[String] = Vector(l-L, a-A, b-B, a-A, m-M, a-A, l-L, ...)
val results2: Vector[String] = Vector(l-L, a-A, b-B, a-A, m-M, a-A, l-L, ...)

Translation Rules of for Comprehensions

Now that we have an intuitive understanding of how for comprehensions are translated to collection methods, let’s define the details more precisely.

First, in a generator expression such as pat <- expr, pat is a pattern expression. For example, (x, y) <- Seq((1,2),(3,4)). Similarly, in a value definition pat2 = expr, pat2 is also interpreted as a pattern. For example, (x, y) = aPair.
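As a short illustration, the following sketch uses a tuple pattern in both a generator and a value definition. The names pairs and described are chosen only for this example:

```scala
val pairs = Seq((1, "one"), (2, "two"))

val described = for
  (n, name) <- pairs                            // generator with a tuple pattern
  (doubled, label) = (n * 2, name.toUpperCase)  // value definition with a tuple pattern
yield s"$label=$doubled"
```

The result contains the strings ONE=2 and TWO=4.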

Because they are interpreted as patterns, the compiler translates the expressions using partial functions. For example, the first step in the translation is to convert a simple comprehension with a generator, pat <- expr. The translation is similar to the following for comprehensions (yield) and loops (do):

// src/script/scala/progscala3/forcomps/ForTranslated.scala

scala> val seq = Seq(1,2,3)

scala> for i <- seq yield 2*i
val res0: Seq[Int] = List(2, 4, 6)

scala> seq.map { case i => 2*i }
val res1: Seq[Int] = List(2, 4, 6)

scala> var sum1 = 0
scala> for i <- seq do sum1 += 1
var sum1: Int = 3

scala> var sum2 = 0
scala> seq.foreach { case i => sum2 += 1 }
var sum2: Int = 3

A conditional is translated to withFilter conceptually as shown next:

scala> for
     |   i <- seq
     |   if i%2 != 0
     | yield 2*i
val res2: Seq[Int] = List(2, 6)

scala> for
     |   i <- seq if i%2 != 0           1
     | yield 2*i
val res3: Seq[Int] = List(2, 6)

scala> seq.withFilter {
     |   case i if i%2 != 0 => true
     |   case _ => false
     | }.map { case i => 2*i }
val res4: Seq[Int] = List(2, 6)
1

You can write the guard on the same line as the previous generator.

After this, the translations are applied repeatedly until all comprehension expressions have been replaced. Note that some steps generate new for comprehensions that subsequent iterations will translate.

First, a for comprehension with two generators and a yield expression:

scala> for
     |   i <- seq
     |   j <- (i to 3)
     | yield j
val res5: Seq[Int] = List(1, 2, 3, 2, 3, 3)

scala> seq.flatMap { case i => for j <- (i to 3) yield j }      1
val res6: Seq[Int] = List(1, 2, 3, 2, 3, 3)

scala> seq.flatMap { case i => (i to 3).map { case j => j } }   2
val res7: Seq[Int] = List(1, 2, 3, 2, 3, 3)
1

One level of translation. Note the nested for … yield.

2

Completed translation.

A for loop, with do, again translating in two steps:

scala> var sum3=0
scala> for
     |   i <- seq
     |   j <- (i to 3)
     | do sum3 += j
var sum3: Int = 14

scala> var sum4=0
scala> seq.foreach { case i => for j <- (i to 3) do sum4 += j }
var sum4: Int = 14

scala> var sum5=0
scala> seq.foreach { case i => (i to 3).foreach { case j => sum5 += j } }
var sum5: Int = 14

A generator followed by a value definition has a surprisingly complex translation. Here I show complete for … yield … expressions:

scala> for
     |   i <- seq
     |   i10 = i*10
     | yield i10
val res8: Seq[Int] = List(10, 20, 30)

scala> for
     |   (i, i10) <- for
     |     x1 @ i <- seq                    1
     |   yield
     |     val x2 @ i10 = x1*10             2
     |     (x1, x2)                         3
     | yield i10                            4
val res9: Seq[Int] = List(10, 20, 30)
1

Recall from PatternMatching that x1 @ i means assign to variable x1 the value corresponding to the whole expression on the right-hand side of @, which is trivially i in this case, but it could be an arbitrary pattern with nested variable bindings to the constituent parts.

2

Assign to x2 the value of i10.

3

Return the tuple.

4

Yield i10, which will be equivalent to x2.

Here is another example of x @ pat = expr:

scala> val z @ (x, y) = (1 -> 2)
val z: (Int, Int) = (1,2)
val x: Int = 1
val y: Int = 2
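Returning to the generator followed by a value definition, the nested for comprehensions can be carried one step further into method calls. This is a conceptual sketch of where the rules lead, not necessarily the exact code the compiler emits:

```scala
val seq = Seq(1, 2, 3)

val viaFor = for
  i <- seq
  i10 = i * 10
yield i10

// The inner comprehension pairs each element with the defined value;
// the outer one pattern matches on the pair and yields the result.
val viaMethods = seq
  .map { i => val i10 = i * 10; (i, i10) }
  .map { case (i, i10) => i10 }
```

Both expressions produce the same sequence of 10, 20, and 30.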

This completes the translation rules. Whenever you encounter a for comprehension, you can apply these rules to translate it into method invocations on containers. You won’t need to do this often, but sometimes it’s a useful skill for debugging problems.

Options and Other Container Types

We used collections like Vector and Seq for our examples, but any types that implement foreach, map, flatMap, and withFilter (or filter) can be used in for comprehensions and not just the obvious collection types. In the general case, these are containers and eligible for use in for comprehensions.
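To see that nothing collection-specific is required, consider this minimal sketch of a user-defined container. The Logged type and the step helper are hypothetical, invented for illustration. Logged defines only map and flatMap, which is enough for a for comprehension that has simple variable generators and a yield. Guards would also require withFilter, and a do body would require foreach:

```scala
// Hypothetical container: a value plus a log of how it was produced.
final case class Logged[A](value: A, log: Vector[String]):
  def map[B](f: A => B): Logged[B] = Logged(f(value), log)
  def flatMap[B](f: A => Logged[B]): Logged[B] =
    val next = f(value)
    Logged(next.value, log ++ next.log)   // concatenate the logs in order

// Hypothetical helper that records each step it performs.
def step(name: String, i: Int): Logged[Int] = Logged(i, Vector(s"$name=$i"))

val total = for
  a <- step("a", 2)
  b <- step("b", a * 10)
yield a + b
```

The comprehension is translated to step("a", 2).flatMap(a => step("b", a * 10).map(b => a + b)), so total holds the value 22 together with the log entries a=2 and b=20.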

Let’s consider several other container types. We’ll see how exploiting for comprehensions can transform your code in unexpected ways.

Option as a Container

Option is a binary container. It has an item or it doesn’t. It implements the four methods we need.

Here is a simplified version of the Option abstract class from the Scala library:

sealed abstract class Option[+A] { self =>                      1
  ...
  def isEmpty: Boolean = this eq None                           2

  final def foreach[U](f: A => U): Unit =
    if (!isEmpty) f(this.get)

  final def map[B](f: A => B): Option[B] =
    if (isEmpty) None else Some(f(this.get))

  final def flatMap[B](f: A => Option[B]): Option[B] =
    if (isEmpty) None else f(this.get)

  final def filter(p: A => Boolean): Option[A] =
    if (isEmpty || p(this.get)) this else None

  final def withFilter(p: A => Boolean): WithFilter = new WithFilter(p)

  class WithFilter(p: A => Boolean) {                           3
    def map[B](f: A => B): Option[B] = self filter p map f      4
    def flatMap[B](f: A => Option[B]): Option[B] = self filter p flatMap f
    def foreach[U](f: A => U): Unit = self filter p foreach f
    def withFilter(q: A => Boolean): WithFilter =
      new WithFilter(x => p(x) && q(x))
  }
}
1

The self => expression defines an alias for this for the Option instance. It is needed inside WithFilter below. See SelfTypeAnnotations for more details.

2

Test if this is actually the None instance, not value equality.

3

The WithFilter class, which withFilter uses together with the other operations to avoid creating an intermediate collection when filtering.

4

Here’s where the self reference defined above is used to operate on the enclosing Option instance. Using this would refer to the instance of WithFilter itself.

The final keyword prevents subclasses from overriding the implementation. It might be surprising to see the base class refer to derived classes. Normally, in object-oriented design this would be considered bad. However, with sealed type hierarchies, this file knows all the possible subclasses. Referring to derived classes makes the implementation more concise and efficient, overall, as well as safe.

The crucial feature about these Option methods shown is that the function arguments are only applied if the Option isn’t empty. This feature allows us to address a common design problem in an elegant way.

Say for example that you want to distribute some tasks around a cluster, then gather the results together. Suppose you want an elegant way to ignore those tasks that return empty results.

Wrap each task return value in an Option, where None is used for empty results and Some wraps a nonempty result. We want an easy way to filter out the None results. Here is an example, where we have the returned Options in a Vector:

// src/script/scala/progscala3/forcomps/ForOptionsFilter.scala

scala> val options: Seq[Option[Int]] = Vector(Some(10), None, Some(20))
val options: Seq[Option[Int]] = Vector(Some(10), None, Some(20))

scala> val results = for
     |   case Some(i) <- options
     | yield (2 * i)
val results: Seq[Int] = Vector(20, 40)

case Some(i) <- options pattern matches on each element in options and extracts the integers inside the Some values. Since a None won’t match, all of them are removed. We then yield the final expression we want.

As an exercise, let’s work through the translation rules. Try it yourself before reading on! Here is the first step, where we apply the first rule for converting each pat <- expr expression to a withFilter expression:

scala> val results2 = for
     |   case Some(i) <- options withFilter {
     |     case Some(i) => true
     |     case None => false
     |   }
     | yield (2 * i)
val results2: Seq[Int] = Vector(20, 40)

Finally, we convert the outer for x <- y yield (z) expression to a map call:

scala> val results3 = options withFilter {
     |   case Some(i) => true
     |   case None => false
     | } map {
     |   case Some(i) => (2 * i)
     |   case None => -1             // hack
     | }
val results3: Seq[Int] = Vector(20, 40)

The “hack” is there because we don’t actually need the case None clause, because the withFilter has already removed all Nones. However, the compiler doesn’t understand this, so it warns us we’ll risk a MatchError without the clause. Try removing this clause and observe the warning you get.
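When writing method calls by hand, one way to avoid the extra clause is a combinator that filters and maps in one step. The following sketch uses the standard collection methods collect and flatten. It is an alternative to the literal translation, not what the compiler generates:

```scala
val options: Seq[Option[Int]] = Vector(Some(10), None, Some(20))

// collect applies the partial function only where it is defined,
// so no clause for None is needed.
val viaCollect = options.collect { case Some(i) => 2 * i }

// flatten drops the Nones and unwraps the Somes before mapping.
val viaFlatten = options.flatten.map(_ * 2)
```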

Consider another design problem. Instead of independent tasks where we ignore the empty results and combine the nonempty results, consider the case where we run a sequence of dependent steps, and we want to stop the whole process as soon as we encounter a None.

Note that we have a limitation that using None means we receive no feedback about why the step returned nothing, such as a failure. We’ll address this limitation later.

We could write tedious conditional logic that tries each case, one at a time, and checks the results, but a for comprehension is more concise:

// src/script/scala/progscala3/forcomps/ForOptionsSeq.scala

scala> def positiveOption(i: Int): Option[Int] =
     |   if i > 0 then Some(i) else None

scala> val resultSuccess = for
     |   i1 <- positiveOption(5)
     |   i2 <- positiveOption(10 * i1)
     |   i3 <- positiveOption(25 * i2)
     |   i4 <- positiveOption(2  * i3)
     | yield (i1 + i2 + i3 + i4)
val resultSuccess: Option[Int] = Some(3805)

scala> val resultFail = for
     |   i1 <- positiveOption(5)
     |   i2 <- positiveOption(-1 * i1)       1
     |   i3 <- positiveOption(25 * i2)
     |   i4 <- positiveOption(-2 * i3)
     | yield (i1 + i2 + i3 + i4)
val resultFail: Option[Int] = None
1

None is returned. The subsequent generators don’t call positiveOption, they just pass the None through.

At each step, the integer in the Some returned by positiveOption is extracted and assigned to a variable. Subsequent generators use those values. It appears we assume the “happy path” always works, which is true for the first for comprehension. It also works fine for the second for comprehension, because once a None is returned, the subsequent generators simply propagate the None and don’t call positiveOption.
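The short-circuiting is easier to see in the translated form. The following sketch uses a hypothetical helper, runSteps, that hand-translates a three-step version of the comprehension and counts how many times positiveOption is actually invoked:

```scala
// Hypothetical helper: runs the chain and reports how often positiveOption ran.
def runSteps(secondFactor: Int): (Option[Int], Int) =
  var calls = 0
  def positiveOption(i: Int): Option[Int] =
    calls += 1
    if i > 0 then Some(i) else None

  // Hand translation of:
  //   for
  //     i1 <- positiveOption(5)
  //     i2 <- positiveOption(secondFactor * i1)
  //     i3 <- positiveOption(25 * i2)
  //   yield i1 + i2 + i3
  val result =
    positiveOption(5).flatMap { i1 =>
      positiveOption(secondFactor * i1).flatMap { i2 =>
        positiveOption(25 * i2).map { i3 => i1 + i2 + i3 }
      }
    }
  (result, calls)
```

With runSteps(10) all three steps run. With runSteps(-1) the second step returns None, so the function passed to its flatMap is never applied and the third call to positiveOption never happens.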

Let’s look at three more container types with similar properties, Either and Try from the Scala library, and Validated from the Typelevel Cats library. Validated is a more sophisticated tool for sequencing validation steps.

Either: A Logical Extension to Option

We noted that the use of Option has the disadvantage that None carries no information that could tell us why no value is available. Did an error occur? What kind? Using Either instead is one solution. As the name suggests, Either is a container that holds one and only one of two things. In other words, where Option handled the case of zero or one items, Either handles the case of one item or another.

Either is a parameterized type with two parameters, Either[+A, +B], where the A and B are the two possible types of the element contained in the Either. Recall that +A indicates that Either is covariant in the type parameter A and similarly for +B. This means that if you need a value of type Either[Any,Any] (for example, a method parameter), you can use an instance of type Either[String,Int], because String and Int are subtypes of Any, therefore Either[String,Int] is a subtype of Either[Any,Any].
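A brief sketch of what covariance allows here. The method name describe is hypothetical:

```scala
// Accepts the most general Either.
def describe(e: Either[Any, Any]): String = e match
  case Left(l)  => s"left: $l"
  case Right(r) => s"right: $r"

// An Either[String, Int] is accepted because Either is covariant
// in both type parameters.
val specific: Either[String, Int] = Right(42)
val described = describe(specific)
```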

Either is also a sealed abstract class with two subclasses defined, Left and Right. That’s how we distinguish between the two possible elements.

The concept of Either predates Scala. It has been used for a long time as an alternative to throwing exceptions. By historical convention, the Left value is used to hold the error indicator, such as a message string or thrown exception, and the normal return value is returned in a Right.

Let’s port our Option example. It’s almost identical:

// src/script/scala/progscala3/forcomps/ForEithersGood.scala

scala> def positiveEither(i: Int): Either[String,Int] =
     |   if i > 0 then Right(i) else Left(s"nonpositive number $i")

scala> val result1 = for
     |   i1 <- positiveEither(5)
     |   i2 <- positiveEither(10 * i1)
     |   i3 <- positiveEither(25 * i2)
     |   i4 <- positiveEither(2  * i3)
     | yield (i1 + i2 + i3 + i4)
val result1: Either[String, Int] = Right(3805)

scala> val result2 = for
     |   i1 <- positiveEither(5)
     |   i2 <- positiveEither(-1 * i1)   1
     |   i3 <- positiveEither(25 * i2)
     |   i4 <- positiveEither(-2 * i3)
     | yield (i1 + i2 + i3 + i4)
val result2: Either[String, Int] = Left(nonpositive number -5)
1

A Left is returned here, stopping the process.

Note how Left and Right objects are constructed in positiveEither. Note the types for result1 and result2. In particular, result2 now tells us where the first negative number was encountered, but not the second occurrence of one.

Either isn’t limited to this error-handling idiom. It could be used for any scenario where you want to hold one object or another, possibly of different types. However, union types, such as String | Int, are better for that purpose. Superficially, Either and union types appear to serve a similar function, but union types can’t be used in for comprehensions like these, because they don’t have the combinators like map, flatMap, etc.
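For comparison, here is a minimal sketch of a union type used to hold one value or another. The method name show is hypothetical. Values are distinguished by pattern matching on their types, and the union itself has no map or flatMap:

```scala
// A union type holds either a String or an Int, with no wrapper object.
def show(v: String | Int): String = v match
  case s: String => s"string: $s"
  case i: Int    => s"int: $i"

// The map call here belongs to Seq, not to the union type.
val shown = Seq[String | Int]("one", 2).map(show)
```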

That raises some questions, though. Why do Lefts stop the for comprehension and Rights don’t? It’s because Either isn’t really symmetric in the types. Since it is almost always used for this error-handling idiom, the implementations of Left and Right are biased toward the right as the “happy path”.

Let’s look at how the combinators and some other methods work for these two types, using result1 and result2:

scala> result1    // Reminder of these values:
     | result2
val res6: Either[String, Int] = Right(3805)
val res7: Either[String, Int] = Left(nonpositive number -5)

scala> var r1  = 0
     | result1.foreach(i => r1 = i * 2)
     | var r2  = 0
     | result2.foreach(i => r2 = i * 2)
var r1: Int = 7610
var r2: Int = 0                                            1

scala> val r3  = result1.map(_ * 2)
     | val r4  = result2.map(_ * 2)
     |
val r3: Either[String, Int] = Right(7610)
val r4: Either[String, Int] = Left(nonpositive number -5)

scala> val r5a = result1.flatMap(i => Right(i * 2))
     | val r5b = result1.flatMap(i => Left("hello"))
     | val r5c = result1.flatMap(i => Left[String,Double]("hello"))
     | val r5d: Either[String,Double] = result1.flatMap(i => Left("hello"))
     | val r6  = result2.flatMap(i => Right(i * 2))
     |
val r5a: Either[String, Int] = Right(7610)
val r5b: Either[String, Nothing] = Left(hello)              2
val r5c: Either[String, Double] = Left(hello)
val r5d: Either[String, Double] = Left(hello)
val r6:  Either[String, Int] = Left(nonpositive number -5)
1

No change is made to r2 after initialization.

2

Note the second type for r5b vs. r5c and r5d. Using Left("hello") alone provides no information about the desired second type, so Nothing is used.

The filter and withFilter methods aren’t supported. They are somewhat redundant in this case.

You can infer that the Left method implementations ignore the function and just return their value. Right.map extracts the value, applies the function, then constructs a new Right, while Right.flatMap simply returns the value the function returns.
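The inferred behavior can be sketched with a simplified, right-biased type. The names Result, Bad, Good, and positive are hypothetical, and this is not the library’s actual source for Either, only an illustration of the same idea:

```scala
// Hypothetical simplified analog of Either, biased toward Good.
sealed trait Result[+E, +A]:
  def map[B](f: A => B): Result[E, B] = this match
    case Bad(e)  => Bad(e)          // ignore f, keep the error
    case Good(a) => Good(f(a))      // apply f, rewrap the result
  def flatMap[E2 >: E, B](f: A => Result[E2, B]): Result[E2, B] = this match
    case Bad(e)  => Bad(e)          // ignore f, keep the error
    case Good(a) => f(a)            // return whatever f returns

final case class Bad[+E](error: E) extends Result[E, Nothing]
final case class Good[+A](value: A) extends Result[Nothing, A]

def positive(i: Int): Result[String, Int] =
  if i > 0 then Good(i) else Bad(s"nonpositive number $i")

val ok = for
  a <- positive(5)
  b <- positive(a * 10)
yield a + b

val failed = for
  a <- positive(-5)
  b <- positive(a * 10)
yield a + b
```

Because map and flatMap on Bad never call the function, the second generator in failed is never evaluated and the first error is carried through unchanged.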

Finally, here is a for comprehension that uses Eithers:

// src/script/scala/progscala3/forcomps/ForEithersSeq.scala

scala> val seq: Seq[Either[RuntimeException,Int]] =
     |   Vector(Right(10), Left(new RuntimeException("boo!")), Right(20))
     |
     | val results3 = for
     |   case Right(i) <- seq
     | yield 2 * i
val results3: Seq[Int] = Vector(20, 40)

Throwing exceptions versus returning Either values

Either encourages handling errors as normal return values, which makes error handling more uniform. Avoiding thrown exceptions is also valuable for maintaining referential transparency, which thrown exceptions violate. To see this, consider the following contrived example:

// src/script/scala/progscala3/forcomps/RefTransparency.scala

scala> def addInts(s1: String, s2: String): Int = s1.toInt + s2.toInt

scala> def addInts2(s1: String, s2: String): Either[String,Int] =
     |   try
     |     Right(s1.toInt + s2.toInt)
     |   catch
     |     case nfe: NumberFormatException => Left("NFE: "+nfe.getMessage)

scala> val add12a = addInts("1", "2")
     | val add12b = addInts2("1", "2")
val add12a: Int = 3
val add12b: Either[String, Int] = Right(3)

scala> val add1x  = addInts2("1", "x")
     | val addx2  = addInts2("x", "2")
     | val addxy  = addInts2("x", "y")
val add1x: Either[String, Int] = Left(NFE: For input string: "x")
val addx2: Either[String, Int] = Left(NFE: For input string: "x")
val addxy: Either[String, Int] = Left(NFE: For input string: "x")

We would like to believe that addInts is referentially transparent, so we could replace calls to it with values from a cache of previous invocations, for example. However, addInts will throw an exception if we pass a String that can’t be parsed as an Int. Hence, we can’t replace the function call with values that can be returned for all parameter lists.

Also, the type signature of addInts provides no indication that trouble lurks.

Using Either as the return type of addInts2 restores referential transparency and the type signature is explicit about potential errors. It is referentially transparent, because we could replace all calls with a value, even for bad string input.

Also, instead of grabbing control of the call stack by throwing the exception, we’ve reified the error by returning the exception as a Left value.

So, Either lets us maintain control of the call stack in the event of a wide class of failures. It also makes the behavior more explicit to users of your APIs, through type signatures.

However, look at the implementation of addInts2 again. Handling exceptions is quite common, so the try … catch … boilerplate shown appears a lot in code.

So, for handling exceptions, we should encapsulate this boilerplate with types and use names for these types that express more clearly when we have either a “failure” or a “success.” The Try type does just that.

Try: When There Is No Do

scala.util.Try is structurally similar to Either. It is a sealed abstract class with two subclasses, Success and Failure.

Success is analogous to the conventional use of Right. It holds the normal return value. Failure is analogous to Left, but Failure always holds a Throwable, which is why Try has one type parameter, instead of two, for the value held by Success.

Here are the signatures of these types (omitting some traits that aren’t relevant to the discussion):

sealed abstract class Try[+T] extends AnyRef {...}
final case class Success[+T](value: T) extends Try[T] {...}
final case class Failure[+T](exception: Throwable) extends Try[T] {...}

Try is clearly asymmetric, unlike Either, where the asymmetry isn’t clear from the type signature. For Either, the asymmetry just reflects convention, and that convention determined how the combinators were implemented.

Let’s see how Try is used, again porting our previous example. As with Option and Either, if you have a list of Try values and just want to discard the Failures, a simple for comprehension with a case Success(i) pattern does the trick. Here we port the sequence of dependent steps:

// src/script/scala/progscala3/forcomps/ForTries.scala

scala> import scala.util.{ Try, Success, Failure }

scala> def positiveTries(i: Int): Try[Int] = Try {
     |   assert (i > 0, s"nonpositive number $i")
     |   i
     | }

scala> val result4 = for
     |   i1 <- positiveTries(5)
     |   i2 <- positiveTries(10 * i1)
     |   i3 <- positiveTries(25 * i2)
     |   i4 <- positiveTries(2  * i3)
     | yield (i1 + i2 + i3 + i4)
val result4: scala.util.Try[Int] = Success(3805)

scala> val result5 = for
     |   i1 <- positiveTries(5)
     |   i2 <- positiveTries(-1 * i1)      // FAIL!
     |   i3 <- positiveTries(25 * i2)
     |   i4 <- positiveTries(-2 * i3)
     | yield (i1 + i2 + i3 + i4)
     |
val result5: scala.util.Try[Int] =
  Failure(java.lang.AssertionError: assertion failed: nonpositive number -5)

Note the concise definition of positiveTries. If the assertion fails, the Try block will return a Failure wrapping the thrown java.lang.AssertionError. Otherwise, the result of the Try expression is wrapped in a Success. A more explicit definition of positiveTries showing the boilerplate is the following:

def positiveTries2(i: Int): Try[Int] =
  if i > 0 then Success(i)
  else Failure(new AssertionError(s"assertion failed: nonpositive number $i"))

The for comprehensions look exactly like those for the original Option example. With type inference, there is very little boilerplate here, too. You can focus on the “happy path” logic and let Try capture errors.
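To connect this back to the try … catch … boilerplate in addInts2, here is a minimal sketch of a Try-based variant. The name addInts3 is hypothetical. The sketch also shows converting a Try to an Either with the standard toEither method, and discarding Failures from a sequence with a for comprehension:

```scala
import scala.util.{Try, Success}

// Hypothetical Try-based variant of addInts2; the name addInts3 is an assumption.
def addInts3(s1: String, s2: String): Try[Int] = Try(s1.toInt + s2.toInt)

val good = addInts3("1", "2")      // a Success holding 3
val bad  = addInts3("1", "x")      // a Failure holding a NumberFormatException

// Convert to the Either form used by addInts2 when a message string is preferred.
val asEither: Either[String, Int] =
  bad.toEither.left.map(e => "NFE: " + e.getMessage)

// Discard the Failures from a sequence of Try values.
val tries: Seq[Try[Int]] = Vector(good, bad, addInts3("10", "20"))
val doubled = for
  case Success(i) <- tries
yield 2 * i
```

The Try constructor takes its argument by name, so the parsing runs inside Try, which catches the NumberFormatException instead of letting it propagate.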

Cats Validator

While using Option, Either, or Try meets most needs, there is one common scenario where using any of them remains tedious. Consider the case of form validation, where a user submits a form with several fields, all of which need to be validated. Ideally, you would validate all at once and report all errors, rather than doing one at a time, which is not a friendly user experience. Using Option, Either, or Try in a for comprehension doesn’t support this need, because processing is short-circuited as soon as a failure occurs. This is where cats.data.Validated provides several useful approaches.
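To make the short-circuiting limitation concrete, here is a minimal sketch that uses only the standard library’s Either, not Cats. The helper names nonEmptyE and notTooShortE are hypothetical stand-ins for field checks. The last part accumulates errors by hand with partitionMap, which is the kind of bookkeeping a dedicated validation type is meant to take over:

```scala
type Check = Either[String, String]

// Hypothetical field checks returning an error message on the left.
def nonEmptyE(field: String, name: String): Check =
  if field.nonEmpty then Right(field)
  else Left(s"The $name field can't be empty")

def notTooShortE(field: String, name: String, n: Int): Check =
  if field.length >= n then Right(field)
  else Left(s"The $name field must have at least $n characters")

// Short-circuits: only the first failure is reported.
val firstErrorOnly = for
  _ <- nonEmptyE("", "user name")
  _ <- notTooShortE("abc", "password", 5)
yield ()

// Hand-rolled accumulation: run every check, then keep all the errors.
val checks = Seq(nonEmptyE("", "user name"), notTooShortE("abc", "password", 5))
val allErrors = checks.partitionMap(e => e)._1
```

The for comprehension reports only the empty user name, while allErrors contains both messages.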

We’ll consider one approach here. First, start with some domain specific classes:

// src/main/scala/progscala3/forcomps/LoginFormValidation.scala

package progscala3.forcomps

case class ValidLoginForm(userName: String, password: String)        1

sealed trait LoginValidation:                                        2
  def error: String

case class Empty(name: String) extends LoginValidation:
  val error: String = s"The $name field can't be empty"

case class TooShort(name: String, n: Int) extends LoginValidation:
  val error: String = s"The $name field must have at least $n characters"

case class BadCharacters(name: String) extends LoginValidation:
  val error: String = s"The $name field has invalid characters"
1

A case class with the form fields to validate.

2

A trait used by other case classes that encapsulate each error.

Now we use them in the following code, where the acronym Nec stands for “non empty chain”. In this context, that means that a failed validation will have a sequence (“chain”) of one or more error objects.

// src/main/scala/progscala3/forcomps/LoginFormValidatorNec.scala
package progscala3.forcomps

import cats.implicits._
import cats.data._
import cats.data.Validated._
import scala.language.implicitConversions

/**
 * Nec variant, where NEC stands for "non empty chain".
 * @see https://typelevel.org/cats/datatypes/validated.html
 */
object LoginFormValidatorNec:

  type V[T] = ValidatedNec[LoginValidation, T]                       1

  def nonEmpty(field: String, name: String): V[String] =             2
    if field.length > 0 then field.validNec
    else Empty(name).invalidNec

  def notTooShort(field: String, name: String, n: Int): V[String] =
    if field.length >= n then field.validNec
    else TooShort(name, n).invalidNec

  /** For simplicity, just disallow whitespace. */
  def goodCharacters(field: String, name: String): V[String] =
    val re = raw".*\s.*".r
    if re.matches(field) == false then field.validNec
    else BadCharacters(name).invalidNec

  def apply(                                                         3
      userName: String, password: String): V[ValidLoginForm] =
    (nonEmpty(userName, "user name"),
    notTooShort(userName, "user name", 5),
    goodCharacters(userName, "user name"),
    nonEmpty(password, "password"),
    notTooShort(password, "password", 5),
    goodCharacters(password, "password")).mapN {
      case (s1, _, _, s2, _, _) => ValidLoginForm(s1, s2)
    }

/**
 * These assertions compare results with ==. If you compile with
 * -language:strictEquality, these == expressions fail compilation,
 * so you would need to check the results with match clauses instead.
 */
@main def TryLoginFormValidatorNec =
  import LoginFormValidatorNec._
  assert(LoginFormValidatorNec("", "") ==
    Invalid(Chain(
      Empty("user name"), TooShort("user name", 5),
      Empty("password"), TooShort("password", 5))))

  assert(LoginFormValidatorNec("1234", "6789") ==
    Invalid(Chain(
      TooShort("user name", 5),
      TooShort("password", 5))))

  assert(LoginFormValidatorNec("12345", "") ==
    Invalid(Chain(
      Empty("password"), TooShort("password", 5))))

  assert(LoginFormValidatorNec("123 45", "678 90") ==
    Invalid(Chain(
      BadCharacters("user name"), BadCharacters("password"))))

  assert(LoginFormValidatorNec("12345", "67890") ==
    Valid(ValidLoginForm("12345", "67890")))
1

Shorthand type alias. ValidatedNec will encapsulate errors or successful results.

2

Several functions to test that fields meet desired criteria. Each returns a ValidatedNec, constructed by calling one of two Cats extension methods: validNec on the field when the check passes, or invalidNec on the error object when it fails.

3

The apply method uses the Cats function mapN to map over the N elements of a tuple. It returns a final ValidatedNec instance: an Invalid(Chain(…)) with all the accumulated errors or, if all validation criteria were met, a Valid(ValidLoginForm(…)) holding the passed-in field values.

For comparison, see src/main/scala/progscala3/forcomps/LoginFormValidatorSingle.scala in the example code, which follows a similar implementation approach, but uses Either and so reports only a single failure.

Without a tool like Validated, we would have to manage the chain of errors ourselves.
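
To get a feel for that bookkeeping, here is a rough sketch of manual error accumulation that uses only the standard library. All the names are hypothetical stand-ins for the validators above, and errors are plain strings rather than LoginValidation objects, to keep the sketch short:

```scala
// Hypothetical stand-in: each check returns Either[error message, field].
type Check = Either[String, String]

def nonEmptyCheck(field: String, name: String): Check =
  if field.nonEmpty then Right(field)
  else Left(s"The $name field can't be empty")

def minLengthCheck(field: String, name: String, n: Int): Check =
  if field.length >= n then Right(field)
  else Left(s"The $name field must have at least $n characters")

// Run every check, then separate the failures from the successes by hand.
def validateLogin(
    userName: String,
    password: String): Either[Seq[String], (String, String)] =
  val checks = Seq(
    nonEmptyCheck(userName, "user name"),
    minLengthCheck(userName, "user name", 5),
    nonEmptyCheck(password, "password"),
    minLengthCheck(password, "password", 5))
  val (errors, _) = checks.partitionMap(identity)
  if errors.isEmpty then Right((userName, password))
  else Left(errors)
```

This works, but nothing in the types guarantees that a Left holds at least one error, and the collecting logic has to be repeated or abstracted for every form. ValidatedNec and mapN take care of both concerns.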

Recap and What’s Next

Either, Try, and Validated express through types a fuller picture of how the program actually behaves. All three say that a valid value or values will (hopefully) be returned, but if not, they also encapsulate the failure information needed. Similarly, Option encapsulates the presence or absence of a value explicitly in the type signature.

Using these types instead of thrown exceptions keeps control of the call stack, signals to the reader the kinds of errors that might occur, and allows error conditions to be less “exceptional” and more amenable to programmatic handling, just like the happy path scenarios.

Another benefit, which we haven’t mentioned yet, applies to asynchronous (concurrent) code. Because asynchronous code isn’t guaranteed to be running on the same thread as the caller, it might not be possible to catch and handle an exception. However, by returning errors the same way normal results are returned, the caller can more easily intercept and handle the problem. We’ll explore the details in ToolsForConcurrency.
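
As a small preview, the following sketch uses scala.concurrent.Future from the standard library, which completes with a Try. The names parseAsync and outcome are hypothetical, and blocking with Await is done here only to keep the example short:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*
import scala.util.Try

// Hypothetical task: s.toInt may throw on another thread.
def parseAsync(s: String): Future[Int] = Future(s.toInt)

// Hypothetical helper: wait for completion, then return the Try that
// the Future completed with, rather than letting an exception escape.
def outcome(s: String): Try[Int] =
  val f = parseAsync(s)
  Await.ready(f, 2.seconds)
  f.value.get
```

The NumberFormatException thrown for bad input never propagates up the caller's stack. It arrives as a Failure that the caller can handle like any other value.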

You probably expected this chapter to be a perfunctory explanation of Scala’s fancy for loops. Instead, we broke through the facade to find a surprisingly powerful set of tools. We saw how a set of functions, map, flatMap, foreach, and withFilter, plug into for comprehensions to provide concise, flexible, yet powerful tools for building nontrivial application logic.

We saw how to use for comprehensions to work with collections, but we also saw how useful they are for other container types, specifically Option, Either, Try, and Cats Validated.

Now we have finished our exploration of the essential parts of functional programming and their support in Scala. We’ll learn more concepts when we discuss the type system in ScalasTypeSystemI and ScalasTypeSystemII and explore advanced concepts in AdvancedFunctionalProgramming.

Let’s now turn to Scala’s support for object-oriented programming. We’ve already covered many of the details in passing. Now we’ll complete the picture.
