Chapter 3. Rounding Out the Essentials

Before we dive into Scala’s support for object-oriented and functional programming, let’s finish our discussion of the essential features you’ll use in most of your programs.

Operator? Operator?

An important fundamental concept in Scala is that all operators are actually methods. Consider this most basic of examples:

// code-examples/Rounding/one-plus-two-script.scala

1 + 2

That plus sign between the numbers? It’s a method. First, Scala allows non-alphanumeric method names. You can call methods +, -, $, or whatever you desire. Second, this expression is identical to 1 .+(2). (We put a space after the 1 because 1. would be interpreted as a Double.) When a method takes one argument, Scala lets you drop both the period and the parentheses, so the method invocation looks like an operator invocation. This is called “infix” notation, where the operator is between the instance and the argument. We’ll find out more about this shortly.

Similarly, a method with no arguments can be invoked without the period. This is called “postfix” notation.
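To make the two notations concrete, here is a minimal sketch. The value names are ours, and the postfixOps import is an assumption about your compiler: recent Scala versions ask for it before they accept postfix calls.

```scala
import scala.language.postfixOps // assumption: required by recent compilers for postfix notation

// Infix notation: the method + appears between the instance and its argument.
val sumInfix = 1 + 2

// The same call written as an ordinary method invocation.
val sumDotted = (1).+(2)

// Postfix notation: a method with no arguments, called without the period.
val count = (List(1, 2, 3) size)
```

Both sums evaluate to 3, and count is 3 as well.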

Ruby and Smalltalk programmers should now feel right at home. As users of those languages know, these simple rules have far-reaching benefits when it comes to creating programs that flow naturally and elegantly.

So, what characters can you use in identifiers? Here is a summary of the rules for identifiers, used for method and type names, variables, etc. For the precise details, see [ScalaSpec2009]. Scala allows all the printable ASCII characters, such as letters, digits, the underscore (_), and the dollar sign ($), with two exceptions: the “parenthetical” characters, which are (, ), [, ], {, and }, and the “delimiter” characters, which are `, ', ", ., ;, and ,. Scala allows the other characters between \u0020–\u007F that are not in the sets just shown, such as mathematical symbols and “other” symbols. These remaining characters are called operator characters, and they include characters such as /, <, etc.

Reserved words can’t be used

As in most languages, you can’t reuse reserved words for identifiers. We listed the reserved words in Reserved Words. Recall that some of them are combinations of operator and punctuation characters. For example, a single underscore (_) is a reserved word!

Plain identifiers—combinations of letters, digits, $, _, and operators

Like Java and many languages, a plain identifier can begin with a letter or underscore, followed by more letters, digits, underscores, and dollar signs. Unicode-equivalent characters are also allowed. However, like Java, Scala reserves the dollar sign for internal use, so you shouldn’t use it in your own identifiers. After an underscore, you can have either letters and digits or a sequence of operator characters. The underscore is important. It tells the compiler to treat all the characters up to the next whitespace as part of the identifier. For example, val xyz_++= = 1 assigns the variable xyz_++= the value 1, while the expression val xyz++= = 1 won’t compile because the “identifier” could also be interpreted as xyz ++=, which looks like an attempt to append something to xyz. Similarly, if you have operator characters after the underscore, you can’t mix them with letters and digits. This restriction prevents ambiguous expressions like this: abc_=123. Is that an identifier abc_=123 or an assignment of the value 123 to abc_?

Plain identifiers—operators

If an identifier begins with an operator character, the rest of the characters must be operator characters.

“Back-quote” literals

An identifier can also be an arbitrary string (subject to platform limitations) between two back quote characters, e.g., val `this is a valid identifier` = "Hello World!". Recall that this syntax is also the way to invoke a method on a Java or .NET class when the method’s name is identical to a Scala reserved word, e.g., java.net.Proxy.`type`().

Pattern matching identifiers

In pattern matching expressions, tokens that begin with a lowercase letter are parsed as variable identifiers, while tokens that begin with an uppercase letter are parsed as constant identifiers. This restriction prevents some ambiguities because of the very succinct variable syntax that is used, e.g., no val keyword is present.
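The identifier rules above can be sketched in a few lines. The Ident object, the +++ method, the Limit constant, and the describe method are names we made up for illustration; only java.net.Proxy comes from the JDK.

```scala
object Ident {
  // Plain identifier: letters, an underscore, then operator characters.
  val xyz_++= = 1

  // Operator identifier: operator characters only.
  def +++(a: Int, b: Int): Int = a + b

  // Back-quote literal: an arbitrary string used as a name.
  val `this is a valid identifier` = "Hello World!"
}

// Back quotes also let us call a Java method whose name is a Scala reserved word.
val proxyType = java.net.Proxy.NO_PROXY.`type`()

// In patterns, an uppercase name refers to an existing constant,
// while a lowercase name introduces a new variable.
val Limit = 10

def describe(n: Int): String = n match {
  case Limit => "exactly the limit"
  case other => "some other value: " + other
}
```

Calling describe(10) selects the first case because Limit is treated as a constant, while describe(3) binds 3 to other.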

Syntactic Sugar

Once you know that all operators are methods, it’s easier to reason about unfamiliar Scala code. You don’t have to worry about special cases when you see new operators. When working with Actors in A Taste of Concurrency, you may have noticed that we used an exclamation point (!) to send a message to an Actor. Now you know that the ! is just another method, as are the other handy shortcut operators you can use to talk to Actors. Similarly, Scala’s XML library provides the \ and \\ operators to dive into document structures. These are just methods on the scala.xml.NodeSeq class.

This flexible method naming gives you the power to write libraries that feel like a natural extension of Scala itself. You could write a new math library with numeric types that accept all the usual mathematical operators, like addition and subtraction. You could write a new concurrent messaging layer that behaves just like Actors. The possibilities are constrained only by Scala’s method naming limitations.
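As a small illustration of the math library idea, here is a sketch of a two-dimensional vector type. Vec2 and its methods are hypothetical names of our own, not part of any Scala library.

```scala
// A hypothetical numeric type whose operators are ordinary methods.
final case class Vec2(x: Double, y: Double) {
  def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
  def -(other: Vec2): Vec2 = Vec2(x - other.x, y - other.y)
  def *(factor: Double): Vec2 = Vec2(x * factor, y * factor)
}

val a = Vec2(1.0, 2.0)
val b = Vec2(3.0, 5.0)

val sum = a + b             // Vec2(4.0, 7.0)
val scaled = (b - a) * 2.0  // Vec2(4.0, 6.0)
```

Client code reads like arithmetic on built-in numbers, even though every operator here is a method we defined ourselves.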

Caution

Just because you can doesn’t mean you should. When designing your own libraries and APIs in Scala, keep in mind that obscure punctuational operators are hard for programmers to remember. Overuse of these can contribute a “line noise” quality of unreadability to your code. Stick to conventions and err on the side of spelling method names out when a shortcut doesn’t come readily to mind.

Methods Without Parentheses and Dots

To facilitate a variety of readable programming styles, Scala is flexible about the use of parentheses in methods. If a method takes no parameters, you can define it without parentheses. Callers must invoke the method without parentheses. If you add empty parentheses, then callers may optionally add parentheses. For example, the size method for List has no parentheses, so you write List(1, 2, 3).size. If you try List(1, 2, 3).size(), you’ll get an error. However, the length method for java.lang.String does have parentheses in its definition, but Scala lets you write both "hello".length() and "hello".length.

The convention in the Scala community is to omit parentheses when calling a method that has no side effects. So, asking for the size of a sequence is fine without parentheses, but defining a method that transforms the elements in the sequence should be written with parentheses. This convention signals a potentially tricky method for users of your code.
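Here is a minimal sketch of that convention, using a hypothetical Counter class of our own. The method that only reports state is defined without parentheses, and the method that changes state is defined with them.

```scala
class Counter {
  private var current = 0

  // No side effects: defined and called without parentheses.
  def count: Int = current

  // Has a side effect: defined and called with parentheses.
  def increment(): Unit = { current += 1 }
}

val counter = new Counter
counter.increment()
counter.increment()
val seen = counter.count // 2
```

A reader of the calling code can tell at a glance which call changes the counter.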

It’s also possible to omit the dot (period) when calling a parameterless method or one that takes only one argument. With this in mind, our List(1, 2, 3).size example could be written as:

// code-examples/Rounding/no-dot-script.scala

List(1, 2, 3) size

Neat, but confusing. When does this syntactical flexibility become useful? When chaining method calls together into expressive, self-explanatory “sentences” of code:

// code-examples/Rounding/no-dot-better-script.scala

def isEven(n: Int) = (n % 2) == 0

List(1, 2, 3, 4) filter isEven foreach println

As you might guess, running this produces the following output:

2
4

Scala’s liberal approach to parentheses and dots on methods provides one building block for writing Domain-Specific Languages. We’ll learn more about them after a brief discussion of operator precedence.

Precedence Rules

So, if an expression like 2.0 * 4.0 / 3.0 * 5.0 is actually a series of method calls on Doubles, what are the operator precedence rules? Here they are in order from lowest to highest precedence (see [ScalaSpec2009]):

  1. All letters

  2. |

  3. ^

  4. &

  5. < >

  6. = !

  7. :

  8. + -

  9. * / %

  10. All other special characters

Characters on the same line have the same precedence. An exception is = when used for assignment, when it has the lowest precedence.

Since * and / have the same precedence, the two lines in the following scala session behave the same:

scala> 2.0 * 4.0 / 3.0 * 5.0
res2: Double = 13.333333333333332

scala> (((2.0 * 4.0) / 3.0) * 5.0)
res3: Double = 13.333333333333332

In a sequence of left-associative method invocations, they simply bind in left-to-right order. “Left-associative” you say? In Scala, any method with a name that ends with a colon : actually binds to the right, while all other methods bind to the left. For example, you can prepend an element to a List using the :: method (called “cons,” short for “constructor”):

scala> val list = List('b', 'c', 'd')
list: List[Char] = List(b, c, d)

scala> 'a' :: list
res4: List[Char] = List(a, b, c, d)

The second expression is equivalent to list.::('a'). In a sequence of right-associative method invocations, they bind from right to left. What about a mixture of left-binding and right-binding expressions?

scala> 'a' :: list ++ List('e', 'f')
res5: List[Char] = List(a, b, c, d, e, f)

(The ++ method appends two lists.) In this case, List(e, f) is appended to list first, because ++ has higher precedence than ::, and then a is prepended to create the final list. It’s usually better to add parentheses to remove any potential uncertainty.

Tip

Any method whose name ends with a : binds to the right, not the left.
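The same rule applies to methods you define yourself. In this sketch, Stack and its +: method are hypothetical names chosen for illustration.

```scala
// A hypothetical stack whose push method ends with a colon,
// so the stack appears on the right of the operator.
final case class Stack(items: List[Int]) {
  def +:(item: Int): Stack = Stack(item :: items)
}

// Binds to the right: 1 +: (2 +: Stack(Nil))
val stack = 1 +: 2 +: Stack(Nil)

// The equivalent chain of ordinary method calls.
val sameStack = Stack(Nil).+:(2).+:(1)
```

Both values are Stack(List(1, 2)). The method is invoked on the operand to the right of the operator, and the operand on the left becomes the argument.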

Finally, note that when you use the scala command, either interactively or with scripts, it may appear that you can define “global” variables and methods outside of types. This is actually an illusion; the interpreter wraps all definitions in an anonymous type before generating JVM or .NET CLR byte code.

Domain-Specific Languages

Domain-Specific Languages, or DSLs, provide a convenient syntactical means for expressing goals in a given problem domain. For example, SQL provides just enough of a programming language to handle the problems of working with databases, making it a Domain-Specific Language.

While some DSLs like SQL are self-contained, it’s become popular to implement DSLs as subsets of full-fledged programming languages. This allows programmers to leverage the entirety of the host language for edge cases that the DSL does not cover, and saves the work of writing lexers, parsers, and the other building blocks of a language.

Scala’s rich, flexible syntax makes writing DSLs a breeze. Consider this example of a style of test writing called Behavior-Driven Development (see [BDD]) using the Specs library (see Specs):

// code-examples/Rounding/specs-script.scala

"nerd finder" should {
  "identify nerds from a List" in {
    val actors = List("Rick Moranis", "James Dean", "Woody Allen")
    val finder = new NerdFinder(actors)
    finder.findNerds mustEqual List("Rick Moranis", "Woody Allen")
  }
}

Notice how much this code reads like English: “This should test that in the following scenario,” “This value must equal that value,” and so forth. This example uses the superb Specs library, which effectively provides a DSL for the Behavior-Driven Development testing and engineering methodology. By making maximum use of Scala’s liberal syntax and rich methods, Specs test suites are readable even by non-developers.
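To see how little machinery such a phrase needs, here is a toy sketch of a mustEqual-style check. MiniSpec, Expectation, and expect are names we invented for this illustration. This is not how Specs is implemented, and it is not the Specs API.

```scala
// A toy DSL, written only to show infix method calls reading like prose.
object MiniSpec {
  final class Expectation[A](actual: A) {
    def mustEqual(expected: A): Boolean = actual == expected
  }

  def expect[A](actual: A): Expectation[A] = new Expectation(actual)
}

import MiniSpec._

val actors = List("Rick Moranis", "James Dean", "Woody Allen")
val nerds = actors.filter(_ != "James Dean")

// Reads as a sentence because the dot and the parentheses are dropped.
val passed = expect(nerds) mustEqual List("Rick Moranis", "Woody Allen")
```

A real testing library would report failures rather than return a Boolean, but the reading experience comes from the same syntax rules.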

This is just a taste of the power of DSLs in Scala. We’ll see other examples later and learn how to write our own as we get more advanced (see Chapter 11).

Scala if Statements

Even the most familiar language features are supercharged in Scala. Let’s have a look at the lowly if statement. As in most every language, Scala’s if evaluates a conditional expression, then proceeds to a block if the result is true, or branches to an alternate block if the result is false. A simple example:

// code-examples/Rounding/if-script.scala

if (2 + 2 == 5) {
  println("Hello from 1984.")
} else if (2 + 2 == 3) {
    println("Hello from Remedial Math class?")
} else {
  println("Hello from a non-Orwellian future.")
}

What’s different in Scala is that if and almost all other statements are actually expressions themselves. So, we can assign the result of an if expression, as shown here:

// code-examples/Rounding/assigned-if-script.scala

val configFile = new java.io.File("~/.myapprc")

val configFilePath = if (configFile.exists()) {
  configFile.getAbsolutePath()
} else {
  configFile.createNewFile()
  configFile.getAbsolutePath()
}

Note that if statements are expressions, meaning they have values. In this example, the value configFilePath is the result of an if expression that handles the case of a configuration file not existing internally, then returns the absolute path to that file. This value can now be reused throughout an application, and the if expression won’t be reevaluated when the value is used.

Because if statements are expressions in Scala, there is no need for the special-case ternary conditional expressions that exist in C-derived languages. You won’t see x ? doThis() : doThat() in Scala. Scala provides a mechanism that’s just as powerful and more readable.
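For comparison, here is a short sketch of an if expression used where a C programmer would reach for the ternary operator. The parity method is a name we chose for the example.

```scala
// The if expression yields a value, so it can be a method body or an initializer.
def parity(n: Int): String = if (n % 2 == 0) "even" else "odd"

val label = parity(7)                          // "odd"
val limit = if (label == "odd") 10 else 20     // 10
```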

What if we omit the else clause in the previous example? Typing the code in the scala interpreter will tell us what happens:

scala> val configFile = new java.io.File("~/.myapprc")
configFile: java.io.File = ~/.myapprc

scala> val configFilePath = if (configFile.exists()) {
     |   configFile.getAbsolutePath()
     | }
configFilePath: Unit = ()

scala>

Note that configFilePath is now Unit. (It was String before.) The type inference picks a type that works for all outcomes of the if expression. Unit is the only possibility, since no value is one possible outcome.

Scala for Comprehensions

Another familiar control structure that’s particularly feature-rich in Scala is the for loop, referred to in the Scala community as a for comprehension or for expression. This corner of the language deserves at least one fancy name, because it can do some great party tricks.

Actually, the term comprehension comes from functional programming. It expresses the idea that we are traversing a set of some kind, “comprehending” what we find, and computing something new from it.

A Dog-Simple Example

Let’s start with a basic for expression:

// code-examples/Rounding/basic-for-script.scala

val dogBreeds = List("Doberman", "Yorkshire Terrier", "Dachshund",
                     "Scottish Terrier", "Great Dane", "Portuguese Water Dog")

for (breed <- dogBreeds)
  println(breed)

As you might guess, this code says, “For every element in the list dogBreeds, create a temporary variable called breed with the value of that element, then print it.” Think of the <- operator as an arrow directing elements of a collection, one by one, to the scoped variable by which we’ll refer to them inside the for expression. The left-arrow operator is called a generator, so named because it’s generating individual values from a collection for use in an expression.

Filtering

What if we want to get more granular? Scala’s for expressions allow for filters that let us specify which elements of a collection we want to work with. So to find all terriers in our list of dog breeds, we could modify the previous example to the following:

// code-examples/Rounding/filtered-for-script.scala

for (breed <- dogBreeds
  if breed.contains("Terrier")
) println(breed)

To add more than one filter to a for expression, separate the filters with semicolons:

// code-examples/Rounding/double-filtered-for-script.scala

for (breed <- dogBreeds
  if breed.contains("Terrier");
  if !breed.startsWith("Yorkshire")
) println(breed)

You’ve now found all the terriers that don’t hail from Yorkshire, and hopefully learned just how useful filters can be in the process.

Yielding

What if, rather than printing your filtered collection, you needed to hand it off to another part of your program? The yield keyword is your ticket to generating new collections with for expressions. In the following example, note that we’re wrapping up the for expression in curly braces, as we would when defining any block:

// code-examples/Rounding/yielding-for-script.scala

val filteredBreeds = for {
  breed <- dogBreeds
  if breed.contains("Terrier")
  if !breed.startsWith("Yorkshire")
} yield breed

Tip

for expressions may be defined with parentheses or curly braces, but using curly braces means you don’t have to separate your filters with semicolons. Most of the time, you’ll prefer using curly braces when you have more than one filter, assignment, etc.

Every time through the for expression, the filtered result is yielded as a value named breed. These results accumulate with every run, and the resulting collection is assigned to the value filteredBreeds (as we did with if statements earlier). The type of the collection resulting from a for-yield expression is inferred from the type of the collection being iterated over. In this case, filteredBreeds is of type List[String], since it is a subset of the dogBreeds list, which is also of type List[String].

Expanded Scope

One final useful feature of Scala’s for comprehensions is the ability to define variables inside the first part of your for expressions that can be used in the latter part. This is best illustrated with an example:

// code-examples/Rounding/scoped-for-script.scala

for {
  breed <- dogBreeds
  upcasedBreed = breed.toUpperCase()
} println(upcasedBreed)

Note that without declaring upcasedBreed as a val, you can reuse it within the body of your for expression. This approach is ideal for transforming elements in a collection as you loop through them.

Finally, in Options and for Comprehensions, we’ll see how using Options with for comprehensions can greatly reduce code size by eliminating unnecessary “null” and “missing” checks.

Other Looping Constructs

Scala provides several other looping constructs.

Scala while Loops

Familiar in many languages, the while loop executes a block of code as long as a condition is true. For example, the following code prints out a complaint once a day until the next Friday the 13th has arrived:

// code-examples/Rounding/while-script.scala
// WARNING: This script runs for a LOOOONG time!

import java.util.Calendar

def isFridayThirteen(cal: Calendar): Boolean = {
  val dayOfWeek = cal.get(Calendar.DAY_OF_WEEK)
  val dayOfMonth = cal.get(Calendar.DAY_OF_MONTH)

  // Scala returns the result of the last expression in a method
  (dayOfWeek == Calendar.FRIDAY) && (dayOfMonth == 13)
}

while (!isFridayThirteen(Calendar.getInstance())) {
  println("Today isn't Friday the 13th. Lame.")
  // sleep for a day
  Thread.sleep(86400000)
}

Table 3-1 later in this chapter shows the conditional operators that work in while loops.

Scala do-while Loops

Like the while loop, a do-while loop executes some code while a conditional expression is true. The only difference is that a do-while checks the condition after running the block, so the block always runs at least once. To count up to 10, we could write this:

// code-examples/Rounding/do-while-script.scala

var count = 0

do {
  count += 1
  println(count)
} while (count < 10)

As it turns out, there’s a more elegant way to loop through collections in Scala, as we’ll see in the next section.

Generator Expressions

Remember the arrow operator (<-) from the discussion about for loops? We can put it to work here, too. Let’s clean up the do-while example just shown:

// code-examples/Rounding/generator-script.scala

for (i <- 1 to 10) println(i)

Yup, that’s all that’s necessary. This clean one-liner is possible because of Scala’s RichInt class. An implicit conversion is invoked by the compiler to convert the 1, an Int, into a RichInt. (We’ll discuss these conversions in The Scala Type Hierarchy and in Implicit Conversions.) RichInt defines a to method that takes another integer and returns an instance of Range.Inclusive. That is, Inclusive is a nested class in the Range companion object (a concept we introduced briefly in Chapter 1; see Chapter 6 for details). This subclass of the class Range inherits a number of methods for working with sequences and iterable data structures, including those necessary to use it in a for loop.

By the way, if you wanted to count from 1 up to but not including 10, you could use until instead of to. For example: for (i <- 1 until 10).
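A quick sketch makes the difference between the two methods visible. The value names are ours.

```scala
// to includes the upper bound; until excludes it.
val inclusive = (1 to 10).toList    // 1, 2, ..., 10
val exclusive = (1 until 10).toList // 1, 2, ..., 9

val lastInclusive = inclusive.last  // 10
val lastExclusive = exclusive.last  // 9
```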

This should paint a clearer picture of how Scala’s internal libraries compose to form easy-to-use language constructs.

Note

When working with loops in most languages, you can break out of a loop or continue the iterations. Scala doesn’t have either of these statements, but when writing idiomatic Scala code, they’re not necessary. Use conditional expressions to test if a loop should continue, or make use of recursion. Better yet, filter your collections ahead of time to eliminate complex conditions within your loops. However, because of demand for it, Scala version 2.8 includes support for break, implemented as a library method, rather than a built-in break keyword.
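Here is a minimal sketch of both approaches, assuming Scala 2.8 or later. The library support lives in scala.util.control.Breaks, and the list and value names are ours.

```scala
import scala.util.control.Breaks.{break, breakable}

val numbers = List(3, 8, -2, 7, -5)

// Library-based break: leave the loop at the first negative number.
var firstNegative: Option[Int] = None
breakable {
  for (n <- numbers) {
    if (n < 0) {
      firstNegative = Some(n)
      break()
    }
  }
}

// Idiomatic alternatives that need no break at all.
val sameResult = numbers.find(_ < 0)       // Some(-2)
val leading = numbers.takeWhile(_ >= 0)    // List(3, 8)
```

The collection methods say what we want directly, which is why idiomatic code rarely needs the library version.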

Conditional Operators

Scala borrows most of the conditional operators from Java and its predecessors. You’ll find the ones listed in Table 3-1 in if statements, while loops, and everywhere else conditions apply.

Table 3-1. Conditional operators
Operator  Operation               Description

&&        and                     The values on the left and right of the operator are true. The righthand side is only evaluated if the lefthand side is true.

||        or                      At least one of the values on the left or right is true. The righthand side is only evaluated if the lefthand side is false.

>         greater than            The value on the left is greater than the value on the right.

>=        greater than or equals  The value on the left is greater than or equal to the value on the right.

<         less than               The value on the left is less than the value on the right.

<=        less than or equals     The value on the left is less than or equal to the value on the right.

==        equals                  The value on the left is the same as the value on the right.

!=        not equal               The value on the left is not the same as the value on the right.

Note that && and || are “short-circuiting” operators. They stop evaluating expressions as soon as the answer is known.
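Short-circuiting is easy to observe with a helper that counts how often it runs. The check method and the counter are hypothetical names used only for this sketch.

```scala
var evaluated = 0

// Records each evaluation, then returns its argument unchanged.
def check(result: Boolean): Boolean = {
  evaluated += 1
  result
}

// The lefthand side is false, so the righthand side is never evaluated.
val andResult = check(false) && check(true)
val afterAnd = evaluated // 1

// The lefthand side is true, so the righthand side is never evaluated.
val orResult = check(true) || check(false)
val afterOr = evaluated  // 2
```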

We’ll discuss object equality in more detail in Equality of Objects. For example, we’ll see that == has a different meaning in Scala versus Java. Otherwise, these operators should all be familiar, so let’s move on to something new and exciting.

Pattern Matching

An idea borrowed from functional languages, pattern matching is a powerful yet concise way to make a programmatic choice between multiple conditions. Pattern matching is the familiar case statement from your favorite C-like language, but on steroids. In the typical case statement you’re limited to matching against values of ordinal types, yielding trivial expressions like this: “In the case that i is 5, print a message; in the case that i is 6, exit the program.” With Scala’s pattern matching, your cases can include types, wildcards, sequences, regular expressions, and even deep inspections of an object’s variables.

A Simple Match

To begin with, let’s simulate flipping a coin by matching the value of a boolean:

// code-examples/Rounding/match-boolean-script.scala

val bools = List(true, false)

for (bool <- bools) {
  bool match {
    case true => println("heads")
    case false => println("tails")
    case _ => println("something other than heads or tails (yikes!)")
  }
}

It looks just like a C-style case statement, right? The only difference is the last case with the underscore (_) wildcard. It matches anything not defined in the cases above it, so it serves the same purpose as the default keyword in Java and C# switch statements.

Pattern matching is eager; the first match wins. So, if you try to put a case _ clause before any other case clauses, the compiler will throw an “unreachable code” error on the next clause, because nothing will get past the default clause!

Tip

Use case _ for the default, “catch-all” match.

What if we want to work with matches as variables?

Variables in Matches

In the following example, we assign the wildcard case to a variable called otherNumber, then print it in the subsequent expression. If we generate a 7, we’ll extol that number’s virtues. Otherwise, we’ll curse fate for making us suffer an unlucky number:

// code-examples/Rounding/match-variable-script.scala

import scala.util.Random

val randomInt = new Random().nextInt(10)

randomInt match {
  case 7 => println("lucky seven!")
  case otherNumber => println("boo, got boring ol' " + otherNumber)
}

Matching on Type

These simple examples don’t even begin to scratch the surface of Scala’s pattern matching features. Let’s try matching based on type:

// code-examples/Rounding/match-type-script.scala

val sundries = List(23, "Hello", 8.5, 'q')

for (sundry <- sundries) {
  sundry match {
    case i: Int => println("got an Integer: " + i)
    case s: String => println("got a String: " + s)
    case f: Double => println("got a Double: " + f)
    case other => println("got something else: " + other)
  }
}

Here we pull each element out of a List of Any type of element, in this case containing a String, a Double, an Int, and a Char. For the first three of those types, we let the user know specifically which type we got and what the value was. When we get something else (the Char), we just let the user know the value. We could add further elements to the list of other types and they’d be caught by the other wildcard case.

Matching on Sequences

Since working in Scala often means working with sequences, wouldn’t it be handy to be able to match against the length and contents of lists and arrays? The following example does just that, testing three lists to see if they contain four elements, the second of which is the integer 3:

// code-examples/Rounding/match-seq-script.scala

val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

for (l <- List(willWork, willNotWork, empty)) {
  l match {
    case List(_, 3, _, _) => println("Four elements, with the 2nd being '3'.")
    case List(_*) => println("Any other list with 0 or more elements.")
  }
}

In the second case we’ve used a special wildcard pattern to match a List of any size, even zero elements, and any element values. You can use this pattern at the end of any sequence match to remove length as a condition.
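The following sketch shows the wildcard at the end of longer patterns. The describe method is our own name for the example.

```scala
def describe(xs: List[Int]): String = xs match {
  // A first element of 1, followed by zero or more elements.
  case List(1, _*) => "starts with 1"
  // Any two elements, followed by zero or more elements.
  case List(_, _, _*) => "at least two elements"
  // Everything else, including the empty list.
  case List(_*) => "anything else"
}

val a = describe(List(1))         // "starts with 1"
val b = describe(List(4, 18, 52)) // "at least two elements"
val c = describe(Nil)             // "anything else"
```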

Recall that we mentioned the “cons” method for List, ::. The expression a :: list prepends a to a list. You can also use this operator to extract the head and tail of a list:

// code-examples/Rounding/match-list-script.scala

val willWork = List(1, 3, 23, 90)
val willNotWork = List(4, 18, 52)
val empty = List()

def processList(l: List[Any]): Unit = l match {
  case head :: tail =>
    printf("%s ", head)  // print the head followed by a space
    processList(tail)
  case Nil => println("")
}

for (l <- List(willWork, willNotWork, empty)) {
  print("List: ")
  processList(l)
}

The processList method matches on the List argument l. It may look strange to start the method definition like the following:

def processList(l: List[Any]): Unit = l match {
  ...
}

Hopefully hiding the details with the ellipsis makes the meaning a little clearer. The processList method is actually one statement that crosses several lines.

It first matches on head :: tail, where head will be assigned the first element in the list and tail will be assigned the rest of the list. That is, we’re extracting the head and tail from the list using ::. When this case matches, it prints the head and calls processList recursively to process the tail.

The second case matches the empty list, Nil. It prints an end of line and terminates the recursion.

Matching on Tuples (and Guards)

Alternately, if we just wanted to test that we have a tuple of two items, we could do a tuple match:

// code-examples/Rounding/match-tuple-script.scala

val tupA = ("Good", "Morning!")
val tupB = ("Guten", "Tag!")

for (tup <- List(tupA, tupB)) {
  tup match {
    case (thingOne, thingTwo) if thingOne == "Good" =>
        println("A two-tuple starting with 'Good'.")
    case (thingOne, thingTwo) =>
        println("This has two things: " + thingOne + " and " + thingTwo)
  }
}

In the second case in this example, we’ve extracted the values inside the tuple to scoped variables, then reused these variables in the resulting expression.

In the first case we’ve added a new concept: guards. The if condition after the tuple is a guard. The guard is evaluated when matching, but only after extracting any variables in the preceding part of the case. Guards provide additional granularity when constructing cases. In this example, the only difference between the two patterns is the guard expression, but that’s enough for the compiler to differentiate them.
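Guards can use any of the variables extracted by the pattern, as this sketch shows. The classify method is a hypothetical name of our own.

```scala
def classify(pair: (Int, Int)): String = pair match {
  // The guard runs after a and b have been extracted from the tuple.
  case (a, b) if a == b => "equal"
  case (a, b) if a > b  => "descending"
  // No guard: matches every remaining pair.
  case (_, _)           => "ascending"
}

val same = classify((2, 2)) // "equal"
val down = classify((5, 1)) // "descending"
val up = classify((1, 5))   // "ascending"
```

All three patterns match any two-tuple of integers, so the guards alone decide which case is chosen.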

Tip

Recall that the cases in a pattern match are evaluated in order. For example, if your first case is broader than your second case, the second case will never be reached. (Unreachable cases will cause a compiler error.) You may include a “default” case at the end of a pattern match, either using the underscore wildcard character or a meaningfully named variable. When using a variable, it should have no explicit type or it should be declared as Any, so it can match anything. On the other hand, try to design your code to avoid a catch-all clause by ensuring it only receives specific items that are expected.

Matching on Case Classes

Let’s try a deep match, examining the contents of objects in our pattern match:

// code-examples/Rounding/match-deep-script.scala

case class Person(name: String, age: Int)

val alice = new Person("Alice", 25)
val bob = new Person("Bob", 32)
val charlie = new Person("Charlie", 32)

for (person <- List(alice, bob, charlie)) {
  person match {
    case Person("Alice", 25) => println("Hi Alice!")
    case Person("Bob", 32) => println("Hi Bob!")
    case Person(name, age) =>
      println("Who are you, " + age + " year-old person named " + name + "?")
  }
}

Poor Charlie gets the cold shoulder, as we can see in the output:

Hi Alice!
Hi Bob!
Who are you, 32 year-old person named Charlie?

We first define a case class, a special type of class that we’ll learn more about in Case Classes. For now, it will suffice to say that a case class allows for very terse construction of simple objects with some predefined methods. Our pattern match then looks for Alice and Bob by inspecting the values passed to the constructor of the Person case class. Charlie falls through to the catch-all case; even though he has the same age value as Bob, we’re matching on the name property as well.

This type of pattern match becomes extremely useful when working with Actors, as we’ll see later on. Case classes are frequently sent to Actors as messages, and deep pattern matching on an object’s contents is a convenient way to “parse” those messages.

Matching on Regular Expressions

Regular expressions are convenient for extracting data from strings that have an informal structure, but are not “structured data” (that is, in a format like XML or JSON, for example). Commonly referred to as regexes, regular expressions are a feature of nearly all modern programming languages. They provide a terse syntax for specifying complex matches, one that is typically translated into a state machine behind the scenes for optimum performance.

Regexes in Scala should contain no surprises if you’ve used them in other programming languages. Let’s see an example:

// code-examples/Rounding/match-regex-script.scala

val BookExtractorRE = """Book: title=([^,]+),\s+authors=(.+)""".r
val MagazineExtractorRE = """Magazine: title=([^,]+),\s+issue=(.+)""".r

val catalog = List(
  "Book: title=Programming Scala, authors=Dean Wampler, Alex Payne",
  "Magazine: title=The New Yorker, issue=January 2009",
  "Book: title=War and Peace, authors=Leo Tolstoy",
  "Magazine: title=The Atlantic, issue=February 2009",
  "BadData: text=Who put this here??"
)

for (item <- catalog) {
  item match {
    case BookExtractorRE(title, authors) =>
      println("Book \"" + title + "\", written by " + authors)
    case MagazineExtractorRE(title, issue) =>
      println("Magazine \"" + title + "\", issue " + issue)
    case entry => println("Unrecognized entry: " + entry)
  }
}

We start with two regular expressions, one for records of books and another for records of magazines. Calling .r on a string turns it into a regular expression; we use raw (triple-quoted) strings here to avoid having to double-escape backslashes. Should you find the .r transformation method on strings unclear, you can also define regexes by creating new instances of the Regex class, as in: new Regex("""\W""").
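As a minimal sketch of the equivalence just described, the following builds the same pattern three ways: with .r on a triple-quoted string, with the Regex constructor, and with .r on an ordinary string whose backslash has to be doubled. The object name, the words helper, and the sample line are assumptions made for illustration only.

```scala
import scala.util.matching.Regex

object RegexConstructionDemo {
  // Three ways to build the same pattern: one or more non-word characters.
  val viaR: Regex = """\W+""".r             // triple-quoted string plus .r
  val viaNew: Regex = new Regex("""\W+""")  // explicit constructor
  val viaEscaped: Regex = "\\W+".r          // ordinary string, backslash doubled

  // Hypothetical helper: splits a line into words, using the regex as separator.
  def words(re: Regex, line: String): List[String] =
    re.split(line).toList.filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    val line = "War and Peace, by Leo Tolstoy"
    println(words(viaR, line))
    println(words(viaNew, line))
    println(words(viaEscaped, line))
  }
}
```

All three values wrap the same pattern text, so which form you use is a matter of readability.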

Notice that each of our regexes defines two capture groups, denoted by parentheses. Each group captures the value of a single field in the record, such as a book’s title or authors. Regexes in Scala translate those capture groups to extractors. On a successful match, each variable is bound to the text captured by its group; a group that takes no part in the match (an optional group, for example) yields null.

What does this mean in practice? If the text fed to the regular expression matches, case BookExtractorRE(title, authors) will assign the first capture group to title and the second to authors. We can then use those values on the righthand side of the case clause, as we have in the previous example. The variable names title and authors within the extractor are arbitrary; matches from capture groups are simply assigned from left to right, and you can call them whatever you’d like.
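Here is a small sketch of left-to-right binding with deliberately generic variable names. The regex is a variation on the book pattern in which the authors part is made optional; that change, along with the object and method names, is an assumption introduced so that a group that takes no part in the match can be shown.

```scala
object CaptureGroupDemo {
  // Variation on the book regex: the authors group is optional here
  // (an assumption for illustration, not the pattern used in the example above).
  val BookRE = """Book: title=([^,]+)(?:,\s+authors=(.+))?""".r

  def describe(record: String): String = record match {
    // The names are arbitrary: groups are bound strictly left to right.
    case BookRE(first, second) =>
      // A group that did not take part in the match is bound to null,
      // so we wrap it in Option before using it.
      val who = Option(second).getOrElse("unknown")
      first + " / " + who
    // The whole string must match the regex; anything else lands here.
    case _ => "no match"
  }

  def main(args: Array[String]): Unit = {
    println(describe("Book: title=War and Peace, authors=Leo Tolstoy"))
    println(describe("Book: title=Untitled"))
    println(describe("BadData: text=Who put this here??"))
  }
}
```

Wrapping a possibly null group in Option is one way to keep the null from leaking into the rest of the program.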

That’s regexes in Scala in a nutshell. The scala.util.matching.Regex class supplies several handy methods for finding and replacing matches in strings, both all occurrences of a match and just the first occurrence, so be sure to make use of them.
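A brief sketch of those finding and replacing methods follows. findFirstIn, findAllIn, replaceFirstIn, and replaceAllIn are methods of scala.util.matching.Regex; the object name, the four-digit pattern, and the sample text are made up for the example.

```scala
import scala.util.matching.Regex

object RegexMethodsDemo {
  // Hypothetical pattern and input: four consecutive digits in a line of text.
  val Year: Regex = """\d{4}""".r
  val text = "Issues: January 2009, February 2009, March 2010"

  // First occurrence only, wrapped in an Option (None if there is no match).
  def firstYear: Option[String] = Year.findFirstIn(text)

  // Every occurrence, in order of appearance.
  def allYears: List[String] = Year.findAllIn(text).toList

  // Replace just the first occurrence, or all of them.
  def maskFirst: String = Year.replaceFirstIn(text, "YYYY")
  def maskAll: String = Year.replaceAllIn(text, "YYYY")

  def main(args: Array[String]): Unit = {
    println(firstYear)
    println(allYears)
    println(maskFirst)
    println(maskAll)
  }
}
```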

What we won’t cover in this section are the details of writing regular expressions. Scala’s Regex class uses the underlying platform’s regular expression APIs (that is, Java’s or .NET’s). Consult references on those APIs for the hairy details, as they may be subtly different from the regex support in your language of choice.

Binding Nested Variables in Case Clauses

Sometimes you want to bind a variable to an object enclosed in a match, where you are also specifying match criteria on the nested object. Suppose we modify a previous example so we’re matching on the key-value pairs from a map. We’ll store our same Person objects as the values and use an employee ID as the key. We’ll also add another attribute to Person, a role field that points to an instance from a type hierarchy:

// code-examples/Rounding/match-deep-pair-script.scala

class Role
case object Manager extends Role
case object Developer extends Role

case class Person(name: String, age: Int, role: Role)

val alice = new Person("Alice", 25, Developer)
val bob = new Person("Bob", 32, Manager)
val charlie = new Person("Charlie", 32, Developer)

for (item <- Map(1 -> alice, 2 -> bob, 3 -> charlie)) {
  item match {
    case (id, p @ Person(_, _, Manager)) => format("%s is overpaid.\n", p)
    case (id, p @ Person(_, _, _)) => format("%s is underpaid.\n", p)
  }
}

The case objects are just singleton objects like we’ve seen before, but with the special case behavior. We’re most interested in the embedded p @ Person(...) inside the case clause. We’re matching on particular kinds of Person objects inside the enclosing tuple. We also want to assign the Person to a variable p, so we can use it for printing:

Person(Alice,25,Developer) is underpaid.
Person(Bob,32,Manager) is overpaid.
Person(Charlie,32,Developer) is underpaid.

If we weren’t using matching criteria in Person itself, we could just write p: Person. For example, the previous match clause could be written this way:

item match {
  case (id, p: Person) => p.role match {
    case Manager => format("%s is overpaid.\n", p)
    case _ => format("%s is underpaid.\n", p)
  }
}

Note that the p @ Person(...) syntax gives us a way to flatten this nesting of match statements into one statement. It is analogous to using “capture groups” in a regular expression to pull out substrings we want, instead of splitting the string in several successive steps to extract the substrings we want. Use whichever technique you prefer.
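The @ binding can be applied at more than one level of the same pattern. The sketch below binds the whole tuple, the nested Person, and a field inside it in a single case clause. It redeclares the types so that it stands alone, and it uses a sealed trait for Role, which is an assumption that differs slightly from the class Role shown earlier.

```scala
object NestedBindingDemo {
  // Redeclared so the sketch is self-contained; sealed trait is an assumption.
  sealed trait Role
  case object Manager extends Role
  case object Developer extends Role
  case class Person(name: String, age: Int, role: Role)

  def describe(entry: (Int, Person)): String = entry match {
    // pair binds the whole tuple, p binds the nested Person,
    // and name binds a field inside that Person.
    case pair @ (id, p @ Person(name, _, Manager)) =>
      name + " (id " + id + ") is a manager; matched " + p + " inside " + pair
    // No criteria on the Person itself, so a type pattern is enough.
    case (_, p: Person) =>
      p.name + " is not a manager"
  }

  def main(args: Array[String]): Unit = {
    println(describe((2, Person("Bob", 32, Manager))))
    println(describe((1, Person("Alice", 25, Developer))))
  }
}
```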

Using try, catch, and finally Clauses

Through its use of functional constructs and strong typing, Scala encourages a coding style that lessens the need for exceptions and exception handling. But where Scala interacts with Java, exceptions are still prevalent.

Note

Unlike Java, Scala does not have checked exceptions. Even Java’s checked exceptions are treated as unchecked by Scala. There is also no throws clause on method declarations. However, there is a @throws annotation that is useful for Java interoperability. See the section Annotations.
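As a rough sketch of the annotation mentioned in this note, the hypothetical method below is marked with @throws so that Java code calling it sees a throws IOException clause. The object, method names, and messages are invented for illustration.

```scala
import java.io.IOException

object ThrowsAnnotationDemo {
  // Hypothetical method. The annotation records IOException in the generated
  // method signature, so Java callers must catch or declare it.
  @throws(classOf[IOException])
  def readConfig(path: String): String =
    if (path.isEmpty) throw new IOException("empty path")
    else "config from " + path

  // Scala callers are not forced to handle the exception; we do so here by choice.
  def safeRead(path: String): String =
    try readConfig(path)
    catch { case e: IOException => "fallback (" + e.getMessage + ")" }

  def main(args: Array[String]): Unit = {
    println(safeRead("app.conf"))
    println(safeRead(""))
  }
}
```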

Thankfully, Scala treats exception handling as just another pattern match, allowing us to make smart choices when presented with a multiplicity of potential exceptions. Let’s see this in action:

// code-examples/Rounding/try-catch-script.scala

import java.util.Calendar

val then = null
val now = Calendar.getInstance()

try {
  now.compareTo(then)
} catch {
  case e: NullPointerException => println("One was null!"); System.exit(-1)
  case unknown => println("Unknown exception " + unknown); System.exit(-1)
} finally {
  println("It all worked out.")
  System.exit(0)
}

In this example, we explicitly catch the NullPointerException thrown when trying to compare a Calendar instance with null. We also define unknown as a catch-all case, just to be safe. If we weren’t hardcoding this program to fail, the finally block would be reached and the user would be informed that everything worked out just fine.

Note

You can use an underscore (Scala’s standard wildcard character) as a placeholder to catch any type of exception (really, to match any case in a pattern matching expression). However, you won’t be able to refer to the exception in the subsequent expression. Name the exception variable if you need it; for example, if you need to print the exception as we do in the catch-all case of the previous example.

Pattern matching aside, Scala’s treatment of exception handling should be familiar to those fluent in Java, Ruby, Python, and most other mainstream languages. And yes, you throw an exception by writing throw new MyBadException(...). That’s all there is to it.
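To make this concrete, here is a minimal sketch that throws a custom exception and catches it with case clauses. MyBadException is declared here as a hypothetical subclass of RuntimeException, and the parsing functions are invented for the example. It also shows two points from this section: try is an expression that yields a value, and an underscore can stand in for an exception you don’t need to reference.

```scala
object ThrowDemo {
  // Hypothetical exception type, defined only for this sketch.
  class MyBadException(msg: String) extends RuntimeException(msg)

  def parseAge(s: String): Int = {
    val n = s.toInt // throws NumberFormatException for non-numeric input
    if (n < 0) throw new MyBadException("negative age: " + n)
    n
  }

  // The whole try/catch is an expression; each branch yields an Int.
  def ageOrDefault(s: String): Int =
    try parseAge(s)
    catch {
      // Named, because we use the exception on the right-hand side.
      case e: MyBadException => println("Rejected: " + e.getMessage); 0
      // Underscore, because we only care about the type.
      case _: NumberFormatException => -1
    }

  def main(args: Array[String]): Unit = {
    println(ageOrDefault("42"))
    println(ageOrDefault("-5"))
    println(ageOrDefault("abc"))
  }
}
```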

Concluding Remarks on Pattern Matching

Pattern matching is a powerful and elegant way of extracting information from objects, when used appropriately. Recall from Chapter 1 that we highlighted the synergy between pattern matching and polymorphism. Most of the time, you want to avoid the problems of “switch” statements that know a class hierarchy, because they have to be modified every time the hierarchy is changed.

In our drawing Actor example, we used pattern matching to separate different “categories” of messages, but we used polymorphism to draw the shapes sent to it. We could change the Shape hierarchy and the Actor code would not require changes.

Pattern matching is also useful for the design problem where you need to get at data inside an object, but only in special circumstances. One of the unintended consequences of the JavaBeans (see [JavaBeansSpec]) specification was that it encouraged people to expose fields in their objects through getters and setters. This should never be a default decision. Access to “state information” should be encapsulated and exposed only in ways that make logical sense for the type, as viewed from the abstraction it exposes.

Instead, consider using pattern matching for those “rare” times when you need to extract information in a controlled way. As we will see in Unapply, the pattern matching examples we have shown use unapply methods defined to extract information from instances. These methods let you extract that information while hiding the implementation details. In fact, the information returned by unapply might be a transformation of the actual information in the type.
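The following sketch illustrates that last point with a hand-written unapply method. The Price type is hypothetical: it stores a single private count of cents, while its extractor exposes a transformed view of dollars and cents, so callers can pattern match without seeing the internal representation.

```scala
object UnapplyDemo {
  // Hypothetical type whose internal representation (total cents) stays hidden.
  class Price private (private val totalCents: Long)

  object Price {
    def apply(dollars: Long, cents: Long): Price =
      new Price(dollars * 100 + cents)

    // The extractor returns a transformation of the stored state:
    // a (dollars, cents) pair computed from the private field.
    def unapply(p: Price): Option[(Long, Long)] =
      Some((p.totalCents / 100, p.totalCents % 100))
  }

  def describe(p: Price): String = p match {
    case Price(0L, c) => c.toString + " cents"
    case Price(d, c)  => d.toString + " dollars and " + c.toString + " cents"
  }

  def main(args: Array[String]): Unit = {
    println(describe(Price(0, 99)))
    println(describe(Price(1, 250))) // 350 cents in total
  }
}
```

Because only unapply knows how the value is stored, the representation could later change without touching any of the match expressions that use it.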

Finally, when designing pattern matching statements, be wary of relying on a default case clause. Under what circumstances would “none of the above” be the correct answer? It may indicate that the design should be refined so you know more precisely all the possible matches that might occur. We’ll learn one technique that helps when we discuss sealed class hierarchies in Sealed Class Hierarchies.

Enumerations

Remember our examples involving various breeds of dog? In thinking about the types in these programs, we might want a top-level Breed type that keeps track of a number of breeds. Such a type is called an enumerated type, and the values it contains are called enumerations.

While enumerations are a built-in part of many programming languages, Scala takes a different route and implements them as a class in its standard library. This means that, unlike Java and C#, Scala has no special syntax for enumerations. Instead, you just define an object that extends the Enumeration class. Hence, at the byte code level, there is no connection between Scala enumerations and the enum constructs in Java and C#.

Here is an example:

// code-examples/Rounding/enumeration-script.scala

object Breed extends Enumeration {
  val doberman = Value("Doberman Pinscher")
  val yorkie = Value("Yorkshire Terrier")
  val scottie = Value("Scottish Terrier")
  val dane = Value("Great Dane")
  val portie = Value("Portuguese Water Dog")
}

// print a list of breeds and their IDs
println("ID\tBreed")
for (breed <- Breed) println(breed.id + "\t" + breed)

// print a list of Terrier breeds
println("\nJust Terriers:")
Breed.filter(_.toString.endsWith("Terrier")).foreach(println)

When run, you’ll get the following output:

ID      Breed
0       Doberman Pinscher
1       Yorkshire Terrier
2       Scottish Terrier
3       Great Dane
4       Portuguese Water Dog

Just Terriers:
Yorkshire Terrier
Scottish Terrier

We can see that our Breed enumerated type contains several variables of type Value, as in the following example:

val doberman = Value("Doberman Pinscher")

Each declaration is actually calling a method named Value that takes a string argument. We use this method to assign a long-form breed name to each enumeration value, which is what the Value.toString method returned in the output.

Note that there is no namespace collision between the type and method that both have the name Value. There are other overloaded versions of the Value method. One of them takes no arguments, another takes an Int ID value, and another takes both an Int and String. These Value methods return a Value object, and they add the value to the enumeration’s collection of values.
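A short sketch of the four overloaded Value methods follows. The Priority enumeration and its names and IDs are invented for illustration. Note that the text returned by toString for a value created without a String argument depends on the Scala version, so the sketch inspects only the IDs of those values.

```scala
// Hypothetical enumeration exercising each overload of the Value method.
object Priority extends Enumeration {
  val Low      = Value                   // no arguments: next free ID (0)
  val Medium   = Value("medium")         // String: next free ID (1), explicit name
  val High     = Value(10)               // Int: explicit ID
  val Critical = Value(20, "critical!")  // Int and String: explicit ID and name
}

object PriorityDemo {
  // IDs of all values, in ascending ID order.
  def ids: List[Int] = Priority.values.toList.map(_.id)

  def main(args: Array[String]): Unit = {
    println(ids)
    println(Priority.Medium)
    println(Priority.Critical)
  }
}
```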

In fact, Scala’s Enumeration class supports the usual methods for working with collections, so we can easily iterate through the breeds with a for loop and filter them by name. The output above also demonstrated that every Value in an enumeration is automatically assigned a numeric identifier, unless you call one of the Value methods where you specify your own ID value explicitly.

You’ll often want to give your enumeration values human-readable names, as we did here. However, sometimes you may not need them. Here’s another enumeration example adapted from the Scaladoc entry for Enumeration:

// code-examples/Rounding/days-enumeration-script.scala

object WeekDay extends Enumeration {
  type WeekDay = Value
  val Mon, Tue, Wed, Thu, Fri, Sat, Sun = Value
}
import WeekDay._

def isWorkingDay(d: WeekDay) = ! (d == Sat || d == Sun)

WeekDay filter isWorkingDay foreach println

Running this script with scala yields the following output:

Main$$anon$1$WeekDay(0)
Main$$anon$1$WeekDay(1)
Main$$anon$1$WeekDay(2)
Main$$anon$1$WeekDay(3)
Main$$anon$1$WeekDay(4)

When a name isn’t assigned using one of the Value methods that takes a String argument, Value.toString prints the name of the type that is synthesized by the compiler, along with the ID value that was generated automatically.

Note that we imported WeekDay._. This brought each enumeration value (Mon, Tue, etc.) into scope. Otherwise, you would have to write WeekDay.Mon, WeekDay.Tue, etc.

Also, the import brought the type alias, type WeekDay = Value, into scope, which we used as the type of the argument to the isWorkingDay method. If you don’t define a type alias like this, you would declare the method as def isWorkingDay(d: WeekDay.Value).
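For comparison, here is a sketch of an enumeration that defines no type alias and is used without an import, so both the value type and the values themselves are written with the enumeration’s name as a prefix. The Direction enumeration and the isVertical method are hypothetical.

```scala
// Hypothetical enumeration with no type alias.
object Direction extends Enumeration {
  val North, East, South, West = Value
}

object DirectionDemo {
  // Without an alias, the parameter type is written Direction.Value,
  // and without an import, each value needs the Direction prefix.
  def isVertical(d: Direction.Value): Boolean =
    d == Direction.North || d == Direction.South

  // IDs of the values that satisfy isVertical, in ascending ID order.
  def verticalIds: List[Int] =
    Direction.values.filter(isVertical).toList.map(_.id)

  def main(args: Array[String]): Unit =
    println(verticalIds)
}
```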

Since Scala enumerations are just regular objects, you could use any object with vals to indicate different “enumeration values.” However, extending Enumeration has several advantages. It automatically manages the values as a collection that you can iterate over, etc., as in our examples. It also automatically assigns unique integer IDs to each value.

Case classes (see Case Classes) are often used instead of enumerations in Scala because the “use case” for them often involves pattern matching. We’ll revisit this topic in Enumerations Versus Pattern Matching.
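As a rough preview of that alternative, the sketch below models a few breeds as case objects under a sealed parent class and selects on them with a pattern match. The type names, the fullName field, and the size function are assumptions made for illustration, not the design used later in the book.

```scala
object SealedBreedDemo {
  // Hypothetical hierarchy: each breed is a singleton case object.
  sealed abstract class Breed(val fullName: String)
  case object Doberman extends Breed("Doberman Pinscher")
  case object Yorkie   extends Breed("Yorkshire Terrier")
  case object Dane     extends Breed("Great Dane")

  // Every breed is covered explicitly, so no default clause is needed.
  def size(b: Breed): String = b match {
    case Doberman | Dane => "large"
    case Yorkie          => "small"
  }

  def main(args: Array[String]): Unit =
    for (b <- List(Doberman, Yorkie, Dane))
      println(b.fullName + ": " + size(b))
}
```

Unlike an Enumeration, this approach does not collect the values or assign IDs for you; in exchange, each value can carry its own fields and behavior.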

Recap and What’s Next

We’ve covered a lot of ground in this chapter. We learned how flexible Scala’s syntax can be, and how it facilitates the creation of Domain-Specific Languages. Then we explored Scala’s enhancements to looping constructs and conditional expressions. We experimented with different uses for pattern matching, a powerful improvement on the familiar case-switch statement. Finally, we learned how to encapsulate values in enumerations.

You should now be prepared to read a fair bit of Scala code, but there’s plenty more about the language to put in your tool belt. In the next four chapters, we’ll explore Scala’s approach to object-oriented programming, starting with traits.
