Chapter 6. Sequences

This chapter looks at Kotlin sequences, which are just like the streams introduced in Java version 1.8. Admittedly that’s a useful statement only if you already know how Java streams work,1 but the recipes in this chapter will highlight both their similarities and their differences.

With collections, processing is eager: when you invoke map or filter on a collection, every element in the collection is processed. Sequences, however, are lazy: when you use a sequence to process data, each element completes the entire pipeline before the next element is processed. This helps when you have a lot of data or if a short-circuiting operation, like first, lets you exit the sequence when a desired value is found.

6.1 Using Lazy Sequences

Problem

You want to process the minimum amount of data necessary to satisfy a certain condition.

Solution

Use a Kotlin sequence with a short-circuiting function.

Discussion

Kotlin adds extension functions to basic collections, so that List has functions like map and filter. Those functions are eager, however, meaning that they process every element in the collection.

Consider the following admittedly highly contrived problem: you want to take the numbers from 100 to 200 and double each one, and then find the first double that’s evenly divisible by 3. One way to solve that problem is to use the function in Example 6-1.

Example 6-1. Finding the first double divisible by 3 (version 1)
(100 until 200).map { it * 2 } 1
    .filter { it % 3 == 0 }    2
    .first()
1

100 computations

2

Another 100 (bad design—can be improved)

The problem with this approach is that it’s horribly inefficient. First, you double all 100 numbers in the range, then you perform the modulus operation on all 100 results, and then grab just the first element. Fortunately, there is an overload of the first function that takes a predicate (a lambda of one argument that returns a boolean), as shown in Example 6-2.

Example 6-2. Finding the first double divisible by 3 (version 2)
(100 until 200).map { it * 2 } 1
    .first { it % 3 == 0 }     2
1

100 computations

2

Only 3 computations

This version of the first method uses a loop to process each element of the collection, but stops when it hits the first one that satisfies the predicate. That process is known as short-circuiting—processing only as much data as needed until a condition is reached. If you forget to use the overloaded version of first, however, you’re back to doing much more work than needed.

Kotlin sequences process data differently. Example 6-3 is much better still.

Example 6-3. Finding the first double divisible by 3 (best)
(100 until 2_000_000).asSequence()  1
    .map { println("doubling $it"); it * 2 }
    .filter { println("filtering $it"); it % 3 == 0 }
    .first()
1

Converts the range into a sequence

The function this time executes only six operations before returning the right answer:

doubling 100
filtering 200
doubling 101
filtering 202
doubling 102
filtering 204

For sequences, it doesn’t matter whether you use the filter function shown or the other overload of first. The latter may be simpler, but either way, only six calculations are done, because each element of the sequence is processed by the entire pipeline completely before moving to the next one. Note that just to prove a point, the upper limit of the sequence this time was changed to two million, which didn’t affect the resulting behavior at all.

Incidentally, the first function used here throws an exception if the sequence is empty. If that might happen, consider using firstOrNull instead.

The API for Sequence contains the same functions as that for Collection, but the operations fall into two categories: intermediate and terminal. Intermediate operations, like map and filter, return a new sequence. Terminal operations, like first or toList, return anything else. The key is that without a terminal operation, a sequence will not process any data.

Tip

A sequence processes data only if the pipeline of chained functions called on it ends with a terminal operation.

Unlike Java streams, Kotlin sequences can be iterated multiple times, though some cannot and are documented as such.

6.2 Generating Sequences

Problem

You want to generate a sequence of values.

Solution

Use sequenceOf if you already have the elements, use asSequence if you already have an Iterable, or use a sequence generator otherwise.

Discussion

The first two options are trivial. The sequenceOf function works the same way arrayOf, listOf, or any of the other related functions work. The asSequence function converts an existing Iterable (most likely a list or other collection) into a Sequence. Both of those options are shown in Example 6-4.

Example 6-4. Creating sequences when you already have the values
val numSequence1 = sequenceOf(3, 1, 4, 1, 5, 9)
val numSequence2 = listOf(3, 1, 4, 1, 5, 9).asSequence()

Both of these statements produce a Sequence<Int>, either from the given values or from the provided list.

Life gets interesting when you have to generate the elements in the sequence, which may even be infinite. To see that in action, first consider Example 6-5, which is an extension function on Int that determines whether a number is prime by dividing by each number from 2 up to the square root of the number, looking for any value that divides it evenly.

Example 6-5. Checking whether an Int is prime
import kotlin.math.ceil
import kotlin.math.sqrt

fun Int.isPrime() =
    this == 2 || (2..ceil(sqrt(this.toDouble())).toInt())
        .none { divisor -> this % divisor == 0 }

This function first checks to see whether the number is 2. If not, it creates a range from 2 up to the square root of the given number, rounded up to the next integer. For each number in that range, the none function returns true only if none of the values evenly divide the original number.

Note

Test cases to verify the behavior of the isPrime function are included in the source code repository for this book.

Now say you have a particular integer and you want to know the next prime number that comes after it. As an example, given 6, the next prime is 7. Given 182, the next prime after that is 191. For 9,973, the next prime number is 10,007. What makes that problem interesting is that there’s no way to know how many numbers you’re going to need to check before you find the next prime. That makes a natural application for a sequence. An implementation of the nextPrime function is shown in Example 6-6.

Example 6-6. Finding the next prime after a given integer
fun nextPrime(num: Int) =
    generateSequence(num + 1) { it + 1 }  1
        .first(Int::isPrime)              2
1

Starts one past the given number and iterates by 1

2

Returns the first prime value

The generateSequence function takes two arguments: an initial value, and a function to produce the next value in the sequence. Its signature is given by the following:

fun <T : Any> generateSequence(
    seed: T?,
    nextFunction: (T) -> T?
): Sequence<T>

The seed in this case is the integer right after the provided one, and the function simply increments by one. By the normal Kotlin idiom, the lambda is placed after the parentheses in the generateSequence function. The first function on a sequence returns the first value that satisfies the supplied lambda, which in this case is a reference to the isPrime extension function.

In this case, the nextPrime function generates an infinite sequence of integers, evaluating them one by one until it finds the first prime. The first function returns a value rather than a sequence, so it is a terminal operation. Without a terminal operation, no values are processed by the sequence. In this case, the first operation is given a lambda known as a predicate (because it returns a boolean value), and the sequence keeps producing values until the predicate is satisfied.

See Also

Infinite sequences are also examined in Recipe 6.3.

6.3 Managing Infinite Sequences

Problem

You need a portion of an infinite sequence.

Solution

Use a sequence generator that returns a null, or use one of the sequence functions such as takeWhile.

Discussion

Sequences, like Java streams, have intermediate operations and terminal operations. An intermediate operation returns a new sequence, and a terminal operation returns anything else. When you create a pipeline of function calls on a sequence, no data is pulled through the sequence until you have a terminal operation.

The function firstNPrimes, shown in Example 6-7, calculates the first N prime numbers, starting from 2. The sample shown here leverages the nextPrime function given in Example 6-6 and described in Recipe 6.2, and repeated here for convenience:

fun nextPrime(num: Int) =
    generateSequence(num + 1) { it + 1 }
        .first(Int::isPrime)

This sequence uses the isPrime extension function from the same recipe.

Once again, there’s no way to know ahead of time how many numbers need to be examined in order to find that many primes, so a sequence is a natural way to solve the problem.

Example 6-7. Finding the first N prime numbers
fun firstNPrimes(count: Int) =
    generateSequence(2, ::nextPrime)  1
        .take(count)                  2
        .toList()                     3
1

Infinite sequence of primes, starting from 2

2

Intermediate operation to retrieve only the count requested

3

Terminal operation

The sequence generated is infinite. The take function is a stateless, intermediate operation that returns a sequence consisting of only the first count values. If you simply execute this function without the toList function at the end, no primes are computed. All you have is a sequence, but no values. The terminal operation, toList, is used to actually compute values and return them all in a list.

Note

On a 2017 MacBook Pro, even this nonoptimized algorithm generated 10,000 prime numbers in just under 50 milliseconds. Computers are fast now. FYI, the 10,000th prime is 104,729.

Another way of truncating an infinite sequence is to use a generation function that eventually returns null. Instead of asking for the first N primes, say you want all the prime numbers less than a particular limit. The primesLessThan function is shown in Example 6-8.

Example 6-8. Primes less than a given value (version 1)
fun primesLessThan(max: Int): List<Int> =
    generateSequence(2) { n -> if (n < max) nextPrime(n) else null }
        .toList().dropLast(1)

The function used in the generateSequence call in this case checks whether the current value is less than the supplied limit. If so, it computes the next prime. Otherwise, it returns null, and returning null terminates the sequence.

Of course, there’s no way to know whether the next prime will be greater than the limit, so this function actually produces a list that includes the first prime above the limit. The dropLast function then truncates the resulting list before returning it.

Most of the time, there’s an easier way to solve the same problem. In this case, rather than cause the generating function to return null, it’s arguably easier to use takeWhile on the sequence, as in Example 6-9.

Example 6-9. Primes less than a given value (version 2)
fun primesLessThan(max: Int): List<Int> =
    generateSequence(2, ::nextPrime)
        .takeWhile { it < max }
        .toList()

The takeWhile function pulls values from the sequence as long as the supplied predicate returns true.

Either of these approaches work, so which to use is a matter of personal preference.

6.4 Yielding from a Sequence

Problem

You want to produce values in a sequence, but at specified intervals.

Solution

Use the sequence function along with the yield suspend function.

Discussion

Another function associated with sequences is sequence, which has the signature shown in Example 6-10.

Example 6-10. The signature of the sequence function
fun <T> sequence(
    block: suspend SequenceScope<T>.() -> Unit
): Sequence<T>

The sequence function produces a sequence by evaluating the given block. The block is a lambda function of no arguments that returns void, acting on the receiver of type SequenceScope.

That all sounds complicated until you see how it is used. Normally, to generate a sequence, you produce one from existing data using sequenceOf, you transform a collection into a sequence using asSequence, or you produce values with a function supplied to generateSequence. In this case, however, you supply a lambda that produces a value, which you yield whenever you like.

A good example is generating Fibonacci numbers, as shown in Example 6-11, based on an example from the library documentation.

Example 6-11. Generating Fibonacci numbers as a sequence
fun fibonacciSequence() = sequence {
    var terms = Pair(0, 1)

    while (true) {
        yield(terms.first)
        terms = terms.second to terms.first + terms.second
    }
}

The lambda provided to the sequence function starts off with a Pair containing the first two Fibonacci numbers, 0 and 1. Then it uses an infinite loop to produce the subsequent values. Each time a new element is produced, the yield function returns the first element of the resulting pair.

The yield function is one of two similar functions that are part of SequenceScope, the receiver of the lambda provided to the sequence operation. The signatures of both yield and yieldAll are shown in Example 6-12, along with their overloaded versions.

Example 6-12. The yield and yieldAll functions from SequenceScope
abstract suspend fun yield(value: T)

abstract suspend fun yieldAll(iterator: Iterator<T>)
suspend fun yieldAll(elements: Iterable<T>)
suspend fun yieldAll(sequence: Sequence<T>)

The job of the yield function is to provide a value to an iterator and suspend until the next value is requested. Therefore, in the sequence generated by the suspend function, yield is used to output individual values. The fact that yield is a suspend function means it plays nicely with coroutines. In other words, the Kotlin runtime can provide a value and then put the current coroutine on hold until the next value is requested. That’s why the infinite loop—the while(true) loop in Example 6-11—provides values one by one when invoked by the take operation in Example 6-13.

Example 6-13. Pulling values from the sequence operation
@Test
fun `first 10 Fibonacci numbers from sequence`() {
    val fibs = fibonacciSequence()
        .take(10)
        .toList()

    assertEquals(listOf(0, 1, 1, 2, 3, 5, 8, 13, 21, 34), fibs)
}

As you might expect, yieldAll yields several values to the iterator. The example given in the Kotlin documentation is shown in Example 6-14.

Example 6-14. The yieldAll function inside a sequence
vale sequence = sequence {
    val start = 0
    yield(start)                          1
    yield(1..5 step 2)                    2
    yield(generateSequence(8) { it * 3 }) 3
}
1

Yields a single value (0)

2

Yields an iterable over the range (1, 3, 5)

3

Yields an infinite sequence that starts at 8 and multiplies each value by 3

The result of this code is the sequence 0, 1, 3, 5, 8, 24, 72.… When the sequence is accessed with a take function, it returns as many elements as are requested, using the given pattern.

The combination of yield and yieldAll inside suspend makes it easy to customize a sequence to any desired set of generated values.

See Also

The suspend keyword is covered in more detail in the section on coroutines in Chapter 13.

1 The most famous example of an explanation like that is, “Monads are just monoids in the category of endofunctors,” which may be true but is almost, but not quite, a completely useless statement.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.137.163.197