This chapter looks at Kotlin sequences, which are just like the streams introduced in Java version 1.8. Admittedly that’s a useful statement only if you already know how Java streams work,1 but the recipes in this chapter will highlight both their similarities and their differences.
With collections, processing is eager: when you invoke map
or filter
on a collection, every element in the collection is processed. Sequences, however, are lazy: when you use a sequence to process data, each element completes the entire pipeline before the next element is processed. This helps when you have a lot of data or if a short-circuiting operation, like first
, lets you exit the sequence when a desired value is found.
Use a Kotlin sequence with a short-circuiting function.
Kotlin adds extension functions to basic collections, so that List
has functions like map
and filter
. Those functions are eager, however, meaning that they process every element in the collection.
Consider the following admittedly highly contrived problem: you want to take the numbers from 100 to 200 and double each one, and then find the first double that’s evenly divisible by 3. One way to solve that problem is to use the function in Example 6-1.
(
1
0
0
until
2
0
0
)
.
map
{
it
*
2
}
.
filter
{
it
%
3
=
=
0
}
.
first
(
)
The problem with this approach is that it’s horribly inefficient. First, you double all 100 numbers in the range, then you perform the modulus operation on all 100 results, and then grab just the first element. Fortunately, there is an overload of the first
function that takes a predicate (a lambda of one argument that returns a boolean), as shown in Example 6-2.
(
1
0
0
until
2
0
0
)
.
map
{
it
*
2
}
.
first
{
it
%
3
=
=
0
}
This version of the first
method uses a loop to process each element of the collection, but stops when it hits the first one that satisfies the predicate. That process is known as short-circuiting—processing only as much data as needed until a condition is reached. If you forget to use the overloaded version of first
, however, you’re back to doing much more work than needed.
Kotlin sequences process data differently. Example 6-3 is much better still.
(
1
0
0
until
2
_000_000
)
.
asSequence
(
)
.
map
{
println
(
"doubling $it"
)
;
it
*
2
}
.
filter
{
println
(
"filtering $it"
)
;
it
%
3
=
=
0
}
.
first
(
)
The function this time executes only six operations before returning the right answer:
doubling 100 filtering 200 doubling 101 filtering 202 doubling 102 filtering 204
For sequences, it doesn’t matter whether you use the filter
function shown or the other overload of first
. The latter may be simpler, but either way, only six calculations are done, because each element of the sequence is processed by the entire pipeline completely before moving to the next one. Note that just to prove a point, the upper limit of the sequence this time was changed to two million, which didn’t affect the resulting behavior at all.
Incidentally, the first
function used here throws an exception if the sequence is empty. If that might happen, consider using firstOrNull
instead.
The API for Sequence
contains the same functions as that for Collection
, but the operations fall into two categories: intermediate and terminal. Intermediate operations, like map
and filter
, return a new sequence. Terminal operations, like first
or toList
, return anything else. The key is that without a terminal operation, a sequence will not process any data.
A sequence processes data only if the pipeline of chained functions called on it ends with a terminal operation.
Unlike Java streams, Kotlin sequences can be iterated multiple times, though some cannot and are documented as such.
The first two options are trivial. The sequenceOf
function works the same way arrayOf
, listOf
, or any of the other related functions work. The asSequence
function converts an existing Iterable
(most likely a list or other collection) into a Sequence
. Both of those options are shown in Example 6-4.
val
numSequence1
=
sequenceOf
(
3
,
1
,
4
,
1
,
5
,
9
)
val
numSequence2
=
listOf
(
3
,
1
,
4
,
1
,
5
,
9
).
asSequence
()
Both of these statements produce a Sequence<Int>
, either from the given values or from the provided list.
Life gets interesting when you have to generate the elements in the sequence, which may even be infinite. To see that in action, first consider Example 6-5, which is an extension function on Int
that determines whether a number is prime by dividing by each number from 2 up to the square root of the number, looking for any value that divides it evenly.
Int
is primeimport
kotlin.math.ceil
import
kotlin.math.sqrt
fun
Int
.
isPrime
()
=
this
==
2
||
(
2.
.
ceil
(
sqrt
(
this
.
toDouble
())).
toInt
())
.
none
{
divisor
->
this
%
divisor
==
0
}
This function first checks to see whether the number is 2. If not, it creates a range from 2 up to the square root of the given number, rounded up to the next integer. For each number in that range, the none
function returns true
only if none of the values evenly divide the original number.
Test cases to verify the behavior of the isPrime
function are included in the source code repository for this book.
Now say you have a particular integer and you want to know the next prime number that comes after it. As an example, given 6, the next prime is 7. Given 182, the next prime after that is 191. For 9,973, the next prime number is 10,007. What makes that problem interesting is that there’s no way to know how many numbers you’re going to need to check before you find the next prime. That makes a natural application for a sequence. An implementation of the nextPrime
function is shown in Example 6-6.
fun
nextPrime
(
num
:
Int
)
=
generateSequence
(
num
+
1
)
{
it
+
1
}
.
first
(
Int
::
isPrime
)
The generateSequence
function takes two arguments: an initial value, and a function to produce the next value in the sequence. Its signature is given by the following:
fun
<
T
:
Any
>
generateSequence
(
seed
:
T
?,
nextFunction
:
(
T
)
->
T
?
):
Sequence
<
T
>
The seed in this case is the integer right after the provided one, and the function simply increments by one. By the normal Kotlin idiom, the lambda is placed after the parentheses in the generateSequence
function. The first
function on a sequence returns the first value that satisfies the supplied lambda, which in this case is a reference to the isPrime
extension function.
In this case, the nextPrime
function generates an infinite sequence of integers, evaluating them one by one until it finds the first prime. The first
function returns a value rather than a sequence, so it is a terminal operation. Without a terminal operation, no values are processed by the sequence. In this case, the first
operation is given a lambda known as a predicate
(because it returns a boolean value), and the sequence keeps producing values until the predicate is satisfied.
Infinite sequences are also examined in Recipe 6.3.
Use a sequence generator that returns a null, or use one of the sequence functions such as takeWhile
.
Sequences, like Java streams, have intermediate operations and terminal operations. An intermediate operation returns a new sequence, and a terminal operation returns anything else. When you create a pipeline of function calls on a sequence, no data is pulled through the sequence until you have a terminal operation.
The function firstNPrimes
, shown in Example 6-7, calculates the first N
prime numbers, starting from 2. The sample shown here leverages the nextPrime
function given in Example 6-6 and described in Recipe 6.2, and repeated here for convenience:
fun
nextPrime
(
num
:
Int
)
=
generateSequence
(
num
+
1
)
{
it
+
1
}
.
first
(
Int
::
isPrime
)
This sequence uses the isPrime
extension function from the same recipe.
Once again, there’s no way to know ahead of time how many numbers need to be examined in order to find that many primes, so a sequence is a natural way to solve the problem.
N
prime numbersfun
firstNPrimes
(
count
:
Int
)
=
generateSequence
(
2
,
::
nextPrime
)
.
take
(
count
)
.
toList
(
)
The sequence generated is infinite. The take
function is a stateless, intermediate operation that returns a sequence consisting of only the first count
values. If you simply execute this function without the toList
function at the end, no primes are computed. All you have is a sequence, but no values. The terminal operation, toList
, is used to actually compute values and return them all in a list.
On a 2017 MacBook Pro, even this nonoptimized algorithm generated 10,000 prime numbers in just under 50 milliseconds. Computers are fast now. FYI, the 10,000th prime is 104,729.
Another way of truncating an infinite sequence is to use a generation function that eventually returns null. Instead of asking for the first N
primes, say you want all the prime numbers less than a particular limit. The primesLessThan
function is shown in Example 6-8.
fun
primesLessThan
(
max
:
Int
):
List
<
Int
>
=
generateSequence
(
2
)
{
n
->
if
(
n
<
max
)
nextPrime
(
n
)
else
null
}
.
toList
().
dropLast
(
1
)
The function used in the generateSequence
call in this case checks whether the current value is less than the supplied limit. If so, it computes the next prime. Otherwise, it returns null
, and returning null
terminates the sequence.
Of course, there’s no way to know whether the next prime will be greater than the limit, so this function actually produces a list that includes the first prime above the limit. The dropLast
function then truncates the resulting list before returning it.
Most of the time, there’s an easier way to solve the same problem. In this case, rather than cause the generating function to return null
, it’s arguably easier to use takeWhile
on the sequence, as in Example 6-9.
fun
primesLessThan
(
max
:
Int
):
List
<
Int
>
=
generateSequence
(
2
,
::
nextPrime
)
.
takeWhile
{
it
<
max
}
.
toList
()
The takeWhile
function pulls values from the sequence as long as the supplied predicate returns true
.
Either of these approaches work, so which to use is a matter of personal preference.
Use the sequence
function along with the yield
suspend function.
Another function associated with sequences is sequence
, which has the signature shown in Example 6-10.
sequence
functionfun
<
T
>
sequence
(
block
:
suspend
SequenceScope
<
T
>.()
->
Unit
):
Sequence
<
T
>
The sequence
function produces a sequence by evaluating the given block. The block is a lambda function of no arguments that returns void
, acting on the receiver of type SequenceScope
.
That all sounds complicated until you see how it is used. Normally, to generate a sequence, you produce one from existing data using sequenceOf
, you transform a collection into a sequence using asSequence
, or you produce values with a function supplied to generateSequence
. In this case, however, you supply a lambda that produces a value, which you yield whenever you like.
A good example is generating Fibonacci numbers, as shown in Example 6-11, based on an example from the library documentation.
fun
fibonacciSequence
()
=
sequence
{
var
terms
=
Pair
(
0
,
1
)
while
(
true
)
{
yield
(
terms
.
first
)
terms
=
terms
.
second
to
terms
.
first
+
terms
.
second
}
}
The lambda provided to the sequence
function starts off with a Pair
containing the first two Fibonacci numbers, 0 and 1. Then it uses an infinite loop to produce the subsequent values. Each time a new element is produced, the yield
function returns the first
element of the resulting pair.
The yield
function is one of two similar functions that are part of SequenceScope
, the receiver of the lambda provided to the sequence
operation. The signatures of both yield
and yieldAll
are shown in Example 6-12, along with their overloaded versions.
yield
and yieldAll
functions from SequenceScope
abstract
suspend
fun
yield
(
value
:
T
)
abstract
suspend
fun
yieldAll
(
iterator
:
Iterator
<
T
>)
suspend
fun
yieldAll
(
elements
:
Iterable
<
T
>)
suspend
fun
yieldAll
(
sequence
:
Sequence
<
T
>)
The job of the yield
function is to provide a value to an iterator and suspend until the next value is requested. Therefore, in the sequence generated by the suspend
function, yield
is used to output individual values. The fact that yield
is a suspend
function means it plays nicely with coroutines. In other words, the Kotlin runtime can provide a value and then put the current coroutine on hold until the next value is requested. That’s why the infinite loop—the while(true)
loop in Example 6-11—provides values one by one when invoked by the take
operation in Example 6-13.
sequence
operation@Test
fun
`
first
10
Fibonacci
numbers
from
sequence
`
()
{
val
fibs
=
fibonacciSequence
()
.
take
(
10
)
.
toList
()
assertEquals
(
listOf
(
0
,
1
,
1
,
2
,
3
,
5
,
8
,
13
,
21
,
34
),
fibs
)
}
As you might expect, yieldAll
yields several values to the iterator. The example given in the Kotlin documentation is shown in Example 6-14.
yieldAll
function inside a sequence
vale
sequence
=
sequence
{
val
start
=
0
yield
(
start
)
yield
(
1.
.
5
step
2
)
yield
(
generateSequence
(
8
)
{
it
*
3
}
)
}
The result of this code is the sequence 0, 1, 3, 5, 8, 24, 72.… When the sequence
is accessed with a take
function, it returns as many elements as are requested, using the given pattern.
The combination of yield
and yieldAll
inside suspend
makes it easy to customize a sequence to any desired set of generated values.
The suspend
keyword is covered in more detail in the section on coroutines in Chapter 13.
1 The most famous example of an explanation like that is, “Monads are just monoids in the category of endofunctors,” which may be true but is almost, but not quite, a completely useless statement.
3.137.163.197