This chapter covers
Every computer program performs one or more computations, and these computations are usually performed sequentially. That is, the current computation has to complete before the next one starts. For example, consider a simple calculation, (2 + 2) x 4, in which the addition must be computed first, to give 4, followed by the multiplication, to give 16. In that example, the calculation is performed sequentially.
Concurrency allows more than one computation to make progress without waiting for all other computations to complete. This form of computing is useful in many situations, such as in an I/O application like the chat application you developed in chapter 3. If executed sequentially, such applications waste time waiting on input or output operations to complete. Concurrency allows this time to be used for another task, drastically reducing the execution time of the application. You learned about concurrency in chapter 3; in this chapter, you’ll learn about a related concept called parallelism.
Nim offers many built-in facilities for concurrency and parallelism including asynchronous I/O features in the form of futures and await, spawn for creating new threads, and more. You’ve already seen some of these used in chapter 3.
Parallelism in Nim is still evolving, which means that the features described in this chapter may change or be replaced by more-robust features. But the core concepts of parallelism in Nim should remain the same, and what you’ll learn in this chapter will be applicable to other programming languages as well.
In addition to showing you Nim’s parallelism features, this chapter will lead you through the implementation of a simple parser, which will show you different methods for creating parsers. Toward the end of the chapter, you’ll optimize the parser so it’s concurrent and can be run in parallel on multiple CPU cores.
Nowadays, almost all OSs support multitasking, the ability to perform multiple tasks over a certain period of time. A task is usually known as a process, which is an instance of a computer program being executed. Each CPU executes only a single process at a time, but multitasking allows the OS to change the process that’s currently being executed on the CPU without having to wait for the process to finish its execution. Figure 6.1 shows how two processes are executed concurrently on a multitasking OS.
Because CPUs are extremely fast, process A can be executed for 1 nanosecond, followed by process B for 2 nanoseconds, followed by process A for another nanosecond.[1] This gives the impression of multiple processes being executed at the same time, even though a CPU can only execute a single instruction at a time. This apparent simultaneous execution of multiple processes is called concurrency.
For simplicity, I’ll ignore the time taken for a context switch here.
In recent years, multicore CPUs have become popular. This kind of CPU consists of two or more independent units that can run multiple instructions simultaneously. This allows a multitasking OS to run two or more processes at the same time in parallel. Figure 6.2 shows how two processes are executed in parallel on a dual-core CPU.
Unlike a single-core CPU, a dual-core CPU can actually execute two processes at the same time. This type of execution is called parallelism, and it can only be achieved on multiple physical CPUs or via a simultaneous multithreading (SMT) technology such as Intel’s Hyper-Threading (HT) Technology. Remember that despite the apparent similarities between concurrency and parallelism, the two are not the same.
In addition to processes, the OS also manages the execution of threads. A thread is a component of a process, and more than one can exist within the same process. It can be executed concurrently or in parallel, just like a process, although unlike processes, threads share resources such as memory among each other.
To make use of the full power of a multicore CPU, CPU-intensive computations must be parallelized. This can be done by using multiple processes, although threads are more appropriate for computations that require a large amount of data to be shared.
The asynchronous await that you saw used in chapter 3 is strictly concurrent. Because the asynchronous code always runs on a single thread, it isn’t parallel, which means that it can’t currently use the full power of multicore CPUs.
It’s very likely that a future version of Nim will include an asynchronous await that’s parallel.
Unlike asynchronous await, spawn is parallel and has been designed specifically for CPU-intensive computations that can benefit from being executed on multicore CPUs.
Some programming languages, such as Python and Ruby, don’t support thread-level parallelism due to a global interpreter lock in their interpreter. This prevents applications that use threads from using the full power of multicore CPUs. There are ways around this limitation, but they require the use of processes that aren’t as flexible as threads.
Now that you’ve learned the difference between concurrency and parallelism, you’re ready to learn how to use threads in Nim.
In Nim, there are two modules for working with threads. The threads module (http://nim-lang.org/docs/threads.html) exposes the ability to create threads manually. Threads created this way immediately execute a specified procedure and run for the duration of that procedure’s runtime. There’s also the threadpool module (http://nim-lang.org/docs/threadpool.html), which implements a thread pool. It exposes spawn, which adds a specified procedure to the thread pool’s task queue. The act of spawning a procedure doesn’t mean it will be running in a separate thread immediately, though. The creation of threads is managed entirely by the thread pool.
The sections that follow will explain all about the two different threading modules, so don’t feel overwhelmed by the new terms I just introduced.
In this section, we’ll look at the threads module. But before we start, I should explain how threads work in Nim. In particular, you need to know what garbage collector safety (GC safety) is in Nim. There’s a very important distinction between the way threads work in Nim and in most other programming languages. Each of Nim’s threads has its own isolated memory heap. Sharing of memory between threads is restricted, which helps to prevent race conditions and improves efficiency.
Efficiency is also improved by each thread having its own garbage collector. Other implementations of threads that share memory need to pause all threads while the garbage collector does its business. This can add problematic pauses to the application.
Let’s look at how this threading model works in practice. The following listing shows a code sample that doesn’t compile.
var data = "Hello World"

proc showData() {.thread.} =
  echo(data)

var thread: Thread[void]
createThread[void](thread, showData)
joinThread(thread)
The threads module is a part of the implicitly imported system module, so you don’t need to import it explicitly.
This example illustrates what’s disallowed by the GC safety mechanism in Nim, and you’ll see later on how to fix this example so that it compiles.
Save the code in listing 6.1 as listing01.nim, and then execute nim c --threads:on listing01.nim to compile it. The --threads:on flag is necessary to enable thread support. You should see an error similar to this:
listing01.nim(3, 6) Error: 'showData' is not GC-safe as it accesses 'data' which is a global using GC'ed memory
This error describes the problem fairly well. The global variable data has been created in the main thread, so it belongs to the main thread’s memory. The showData thread can’t access another thread’s memory, and if it attempts to, it’s not considered GC safe by the compiler. The compiler refuses to execute threads that aren’t GC safe.
A procedure is considered GC safe by the compiler as long as it doesn't access any global variables that contain garbage-collected memory. An assignment or any other sort of mutation also counts as an access and is disallowed. Garbage-collected memory includes variables of types such as string, seq[T], and ref types.
There are other ways of sharing memory between threads that are GC safe. You may, for example, pass the contents of data as one of the parameters to showData. The following listing shows how to pass data as a parameter to a thread.
var data = "Hello World"

proc showData(param: string) {.thread.} =
  echo(param)

var thread: Thread[string]
createThread[string](thread, showData, data)
joinThread(thread)
Save the code in listing 6.2 as listing2.nim, and then compile it using nim c --threads:on listing2.nim. The compilation should be successful, and running the program should display "Hello World".
The createThread procedure can only pass one variable to the thread that it’s creating. In order to pass multiple separate pieces of data to the thread, you must define a new type to hold the data. The following listing shows how this can be done.
type
  ThreadData = tuple[param: string, param2: int]

var data = "Hello World"

proc showData(data: ThreadData) {.thread.} =
  echo(data.param, data.param2)

var thread: Thread[ThreadData]
createThread[ThreadData](thread, showData, (param: data, param2: 10))
joinThread(thread)
The threads created in the previous listings don’t do very much. Let’s examine the execution of these threads and see what happens when two threads are created at the same time and are instructed to display a few lines of text. In the following examples, two series of integers are displayed.
var data = "Hello World"

proc countData(param: string) {.thread.} =
  for i in 0 ..< param.len:
    stdout.write($i)
  echo()

var threads: array[2, Thread[string]]
createThread[string](threads[0], countData, data)
createThread[string](threads[1], countData, data)
joinThreads(threads)
Save the code in listing 6.4 as listing3.nim, and then compile and run it. Listing 6.5 shows what the output will look like in most cases, and listing 6.6 shows what it may sometimes look like instead.
001122334455667788991010
012345678910 012345678910
The execution of the threads depends entirely on the OS and computer used. On my machine, the output in listing 6.5 likely happens as a result of the two threads running in parallel on two CPU cores, whereas the output in listing 6.6 is a result of the first thread finishing before the second thread even starts. Your system may show different results. Figure 6.3 shows what the execution for both the first and second sets of results looks like.
Threads created using the threads module are fairly resource intensive. They consume a lot of memory, so creating a large number of them is inefficient. They're useful if you need full control over the threads that your application is using, but for most use cases the threadpool module is superior. Let's take a look at how the threadpool module works.
The main purpose of using multiple threads is to parallelize your code. CPU-intensive computations should make use of as much CPU power as possible, which includes using the power of all the cores in a multicore CPU.
A single thread can use the power of a single CPU core. To use the power of all the cores, you could simply create one thread per core. The biggest problem then is making sure that those threads are all busy. You might have 100 tasks that don’t all take the same amount of time to complete, and distributing them across the threads isn’t a trivial job.
Alternatively, one thread per task could be created. But this creates problems of its own, in part because thread creation is very expensive. A large number of threads will consume a lot of memory due to OS overhead.
The threadpool module implements an abstraction that manages the distribution of tasks over a number of threads. The threads themselves are also managed by the thread pool.
The spawn command allows tasks, in the form of procedure invocations, to be added to the thread pool, which then executes the tasks in one of the threads it manages. The thread pool ensures that the tasks keep all the threads busy so that the CPU’s power is utilized in the best way possible. Figure 6.4 shows how the thread pool manages tasks under the hood.
The spawn procedure accepts an expression, which in most cases is a procedure call. spawn returns a value of the type FlowVar[T] that holds the return value of the procedure that was called. This is an advantage in comparison to the threads module, where threads can’t return any values.
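As a minimal sketch of this (the double procedure is my own example, not from the book's listings), spawning a procedure that returns a value and reading the result looks like the following. Compile with the --threads:on flag.

```nim
import threadpool

proc double(x: int): int = x * 2

# spawn returns immediately with a FlowVar[int]; the ^ operator
# blocks until the spawned task has produced its result.
let answer: FlowVar[int] = spawn double(21)
let value = ^answer
echo(value)  # prints 42

sync()
```

The type annotation on answer is only for illustration; it can be inferred.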
The following listing shows the spawn equivalent of the code in listing 6.4.
import threadpool

var data = "Hello World"

proc countData(param: string) =
  for i in 0 ..< param.len:
    stdout.write($i)
  echo()

spawn countData(data)
spawn countData(data)
sync()
Save the code in listing 6.7 as listing4.nim, and then compile and run it. Keep in mind that the --threads:on flag still needs to be specified. The output should be mostly the same as the output shown in listings 6.5 and 6.6.
Procedures executed using spawn also have to be GC safe.
Let’s look at an example that shows how to retrieve the return values from a spawned procedure. This involves dealing with the FlowVar[T] type.
FlowVar[T] can be thought of as a container similar to the Future[T] type, which you used in chapter 3. At first, the container has nothing inside it. When the spawned procedure is executed in a separate thread, it returns a value sometime in the future. When that happens, the returned value is put into the FlowVar container.
The following listing shows the readLine procedure from chapter 3, which uses a while loop to read text from the terminal without blocking.
import threadpool, os

let lineFlowVar = spawn stdin.readLine()
while not lineFlowVar.isReady:
  echo("No input received.")
  echo("Will check again in 3 seconds.")
  sleep(3000)
echo("Input received: ", ^lineFlowVar)
Save listing 6.8 as listing5.nim, and then compile and run it. The application will wait until you enter some input into the terminal. It will check for input every 3 seconds.
Using the FlowVar type is straightforward. You can read the value inside it with the ^ operator, but this operator will block the thread it’s used in until the FlowVar it’s called on contains a value. You can check whether a FlowVar contains a value by using the isReady procedure. Listing 6.8 checks whether the lineFlowVar variable contains a value periodically, every 3 seconds.
Keep in mind that listing 6.8 is meant to demonstrate how the FlowVar[T] works. It’s not meant to be very practical, because the program will only check for input every 3 seconds.
In this example, you could just as well call readLine on the main thread, since there’s nothing else running on it. A more realistic example might replace the sleep(3000) statement with another procedure that does some useful work on the main thread. For example, you might draw your application’s user interface or call the asynchronous I/O event loop’s poll procedure, as in chapter 3.
The way exceptions behave in separate threads may surprise you. When a thread crashes with an unhandled exception, the application crashes with it, whether or not you read the value of the FlowVar.
This behavior will change in a future version of Nim, so that exceptions aren’t raised unless you read the value of the FlowVar.
The following listing shows this behavior in action.
import threadpool

proc crash(): string =
  raise newException(Exception, "Crash")

let lineFlowVar = spawn crash()
sync()
Save listing 6.9 as listing6.nim, and then compile and run it. You should see a traceback in the output pointing you to the raise statement in the crash procedure.
The raises pragma can be used to ensure that your threads handle all exceptions. To make use of it, you can define the crash procedure like so: proc crash(): string {.raises: [].} = .... This will mark the crash procedure as raising no exceptions. Exceptions that are allowed to be raised by the procedure can be specified in the square brackets.
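A small sketch of the pragma in action (the greet procedure is my own illustration, not part of the page-counts code):

```nim
# The empty raises list makes the compiler verify that greet
# cannot raise any exception.
proc greet(name: string): string {.raises: [].} =
  result = "Hello, " & name

echo(greet("Nim"))

# By contrast, a procedure that raises is rejected at compile time:
# proc crash(): string {.raises: [].} =
#   raise newException(Exception, "Crash")  # Error: can raise Exception
```

The commented-out crash procedure shows the failing case: with an empty raises list, the compiler reports an error at the raise statement.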
In summary, the simplicity of both passing arguments to a spawned procedure and receiving the procedure’s result makes spawn good for tasks that have a relatively short runtime. Such tasks typically produce results at the end of their execution, and as such don’t need to communicate with other threads until their execution stops.
For long-running tasks that need to communicate with other threads periodically, the createThread procedure defined in the threads module should be used instead.
Now that you know how to use threads in Nim, let’s look at a practical example of how they can be used. The example in this section involves parsers and shows a practical use case involving Nim’s concurrency and parallelism features.
A lot of data is generated every day from many different sources and intended for many different applications. Computers are very useful tools for processing this data, but in order for that data to be consumed, the computers must understand the format the data is stored in.
A parser is a software component that takes data as input and builds a data structure out of it. The input data is typically in the form of text. In chapter 3, you looked at the JSON data format and at how it was parsed, using the json module, into a data structure that could then be queried for specific information.
There often comes a time when you need to write a custom parser for a simple data format. There are many ways such a task can be tackled in Nim.
In this section, I’ll show you how to write a parser for Wikipedia’s page-view data.[2] This data is useful for many different applications, but in this section we’ll create an application that will find the most popular page in the English Wikipedia. In the process, you’ll do the following:
Wikipedia recently introduced a Pageview API (https://wikitech.wikimedia.org/wiki/Analytics/PageviewAPI) that supplements the raw page-view data and makes finding the most popular page in the English Wikipedia much easier. If you’re writing an application that needs to find the most popular pages on Wikipedia, you may want to use the API instead. Parsing the raw data manually is less efficient, but you’ll find the example applicable to other tasks.
At the end of this section, I’ll also show you how to parallelize the parser, allowing it to perform better on systems with multicore CPUs.
The raw page-count data can be downloaded from Wikipedia here: https://dumps.wikimedia.org/other/pagecounts-all-sites/.
The data files are organized into specific years and months. For example, the page-count data for January 2016 is available at https://dumps.wikimedia.org/other/pagecounts-all-sites/2016/2016-01/. The data is then further subdivided into days and hours. Each file at the preceding URL represents the visitors within a single hour. The files are all gzipped to reduce their size.
Download the following file and then extract it: https://dumps.wikimedia.org/other/pagecounts-all-sites/2016/2016-01/pagecounts-20160101-050000.gz.
On Windows, you may need to install 7-Zip or another application for extracting gzipped archives.
The file may take a while to download, depending on your internet speed. It’s around 92 MB before extraction, and around 428 MB after extraction, so it’s a fairly large file. The parser will need to be efficient to parse that file in a timely manner.
The file is filled with lines of text separated by newline characters. Each line of text consists of the following four fields separated by spaces:
domain_code page_title count_views total_response_size
domain_code contains an abbreviated domain name; for example, en.wikipedia.org is abbreviated as en. page_title contains the title of the page requested; for example, Dublin for http://en.wikipedia.org/wiki/Dublin. count_views contains the number of times the page has been viewed within the hour. Finally, total_response_size is the number of bytes that have been transferred due to requests for that page.
For example, consider the following line:
en Nim_(programming_language) 1 70231
This means that there was one request to http://en.wikipedia.org/wiki/Nim_(programming_language) that accounted in total for 70,231 response bytes.
The file I asked you to download is one of the smaller files from January 2016. It contains data about the Wikipedia pages visited from January 1, 2016, 4:00 a.m. UTC, to January 1, 2016, 5:00 a.m. UTC.
There are many different options when it comes to parsing the page-counts format. I’ll show you how to implement a parser using three different methods: regular expressions, the split procedure, and the parseutils module.
A common way to parse data is using regular expressions (regexes), and if you’ve ever dealt with string processing in any way, you’ve likely come across them. Regular expressions are very popular, and often when developers need to parse a string, they immediately jump to using regular expressions.
Regular expressions are by no means a magical solution to every parsing problem. For example, writing a regular expression to parse arbitrary HTML is virtually impossible.[3] But for parsing a simple data format like the Wikipedia page-counts format, regular expressions work well.
That doesn’t stop people from trying: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags.
Explaining regular expressions in depth is beyond the scope of this chapter. If you aren’t familiar with them, I encourage you to read up on them online.
Regular expressions are supported in Nim via the re module. It defines procedures and types for using regular expressions to parse and manipulate strings.
The re module is an impure module, which means it depends on an external C library. In re’s case, the C library is called PCRE, and it must be installed alongside your application for your application to function properly.
Let’s focus on parsing a single line first. The following listing shows how you can do that with the re module.
import re

let pattern = re"([^\s]+)\s([^\s]+)\s(\d+)\s(\d+)"
var line = "en Nim_(programming_language) 1 70231"
var matches: array[4, string]
let start = find(line, pattern, matches)
doAssert start == 0
doAssert matches[0] == "en"
doAssert matches[1] == "Nim_(programming_language)"
doAssert matches[2] == "1"
doAssert matches[3] == "70231"
echo("Parsed successfully!")
Constructing a regular expression is an expensive operation. When you’re performing multiple regex matches with the same regular expression, make sure you reuse the value returned by the re constructor.
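Reusing a compiled pattern might look like the following small sketch of mine, using the contains procedure from the re module:

```nim
import re

# Compile the regular expression once...
let digits = re"\d+"

# ...and reuse it across many matches, instead of rebuilding
# the pattern on every iteration.
for s in ["abc 123", "no digits here", "42"]:
  echo(s, ": ", s.contains(digits))
```

Moving the re constructor out of the loop is the key point; the loop body only performs the (cheap) match.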
Save listing 6.10 as listing7.nim, and then compile and run it. The program should compile and run successfully, displaying “Parsed successfully!”
If the program exits with an error similar to could not load: pcre.dll, you’re missing the PCRE library and must install it.
The code for parsing strings with regular expressions is straightforward. As long as you know how to create regular expressions, you should have no trouble using it.
The re module also includes other procedures for parsing and manipulating strings. For example, you can replace matched substrings using the replace procedure. Take a look at the documentation for the re module for more information (http://nim-lang.org/docs/re.html).
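For instance, a quick sketch using replace to strip the two numeric fields from a page-counts line (the pattern here is my own illustration):

```nim
import re

let line = "en Nim_(programming_language) 1 70231"
# Remove every space-prefixed run of digits from the line.
echo(line.replace(re" \d+", ""))  # prints: en Nim_(programming_language)
```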
You can also parse data manually in many different ways. This approach provides multiple advantages but also a few disadvantages. The biggest advantage over using regular expressions is that your application will have no dependency on the PCRE library. Manual parsing also makes it easier to control the parsing process. In some cases, the biggest disadvantage is that it takes more code to parse data manually.
For such a simple data format as the Wikipedia page-counts format, you can use the split procedure defined in the strutils module. The following listing shows how split can be used to parse “en Nim_(programming_language) 1 70231.”
import strutils

var line = "en Nim_(programming_language) 1 70231"
var matches = line.split()
doAssert matches[0] == "en"
doAssert matches[1] == "Nim_(programming_language)"
doAssert matches[2] == "1"
doAssert matches[3] == "70231"
This solution will work very well for this use case, but for more-complex data formats, you may want a solution that’s more flexible. The most flexible way to parse a string is to iterate over every character in that string using a while loop. This method of parsing is very verbose, but it’s useful in certain circumstances, such as when parsing more-complex data formats like HTML. Nim’s parseutils module defines procedures that make parsing using such methods much easier.
The following listing shows how the parseutils module can be used to parse “en Nim_(programming_language) 1 70231.”
import parseutils

var line = "en Nim_(programming_language) 1 70231"
var i = 0

var domainCode = ""
i.inc parseUntil(line, domainCode, {' '}, i)
i.inc

var pageTitle = ""
i.inc parseUntil(line, pageTitle, {' '}, i)
i.inc

var countViews = 0
i.inc parseInt(line, countViews, i)
i.inc

var totalSize = 0
i.inc parseInt(line, totalSize, i)

doAssert domainCode == "en"
doAssert pageTitle == "Nim_(programming_language)"
doAssert countViews == 1
doAssert totalSize == 70231
The code in listing 6.12 is far more complex than the previous listing, but it allows for far greater flexibility.
The parseutils module also defines many other procedures that are useful for parsing. They’re mostly convenience wrappers over a while loop. For example, the equivalent code for i.inc parseUntil(line, domainCode, {' '}, i) is the following:
while line[i] != ' ':
  domainCode.add(line[i])
  i.inc
Because of the flexibility of this parser, the code can parse the last two fields into integers in a single step. That’s instead of having to first separate the fields and then parse the integer, which is inefficient.
In summary, the split procedure is the simplest approach, but it’s slower than parseutils because it needs to create a sequence and new strings to hold the matches. In comparison, the parsing code that uses parseutils only needs to create two new strings and two new integers; there’s no overhead associated with the creation of a sequence.
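The difference can be measured with a rough micro-benchmark along these lines (a sketch of mine; absolute timings vary by machine and compiler flags, so treat the numbers only as a relative comparison, and compile with -d:release):

```nim
import times, strutils, parseutils

let line = "en Nim_(programming_language) 1 70231"

# Time the split-based approach.
var t = cpuTime()
for _ in 1 .. 100_000:
  discard line.split()
echo("split:      ", cpuTime() - t)

# Time the parseutils-based approach, reusing the same buffers
# on every iteration.
t = cpuTime()
var domainCode = ""
var pageTitle = ""
var countViews, totalSize = 0
for _ in 1 .. 100_000:
  var i = 0
  domainCode.setLen(0)
  i.inc parseUntil(line, domainCode, {' '}, i)
  i.inc
  pageTitle.setLen(0)
  i.inc parseUntil(line, pageTitle, {' '}, i)
  i.inc
  i.inc parseInt(line, countViews, i)
  i.inc
  i.inc parseInt(line, totalSize, i)
echo("parseutils: ", cpuTime() - t)
```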
The regex parsing code is also simpler than parseutils, but it suffers from the PCRE dependency and is also slower than the parseutils parser.
This makes the parseutils parser the best solution for this use case, even though it’s slightly more complex and significantly more verbose. Its speed will come in handy when parsing the 7,156,099 lines in the pagecounts-20160101-050000 file.
The Wikipedia page-counts files are large. Each measures around 500 MB and contains around 10 million lines of data. The pagecounts-20160101-050000 file that I asked you to download measures 428 MB and contains 7,156,099 lines of page-count data.
In order to parse this file efficiently, you’ll need to consume the file in fragments. Reading the full file into your program’s memory would consume at least 428 MB of RAM, and the actual consumption would likely be far larger due to various overheads. That’s why it’s a good idea to read large files by breaking them up into conveniently sized, smaller fragments, otherwise known as chunks.
Nim defines an iterator called lines that iterates over each line in a file. This iterator doesn’t need to copy the full file’s contents into the program’s memory, which makes it very efficient. The lines iterator is defined in the system module.
The following listing shows how the lines iterator can be used to read lines from the pagecounts-20160101-050000 file.
import os

proc readPageCounts(filename: string) =
  for line in filename.lines:
    echo(line)

when isMainModule:
  const file = "pagecounts-20160101-050000"
  let filename = getCurrentDir() / file
  readPageCounts(filename)
Save listing 6.13 as sequential_counts.nim, and then compile and run it. The program will take around a minute to execute because it will display each line of the page-counts file. You may terminate it by pressing Ctrl-C. As it runs, you can observe the memory usage, which should remain low.
You can now add the parsing code from listing 6.12 to the code in listing 6.13. Listing 6.14 shows how the parser can be integrated into the code from listing 6.13.
import os, parseutils

proc parse(line: string, domainCode, pageTitle: var string,
           countViews, totalSize: var int) =
  var i = 0
  domainCode.setLen(0)
  i.inc parseUntil(line, domainCode, {' '}, i)
  i.inc
  pageTitle.setLen(0)
  i.inc parseUntil(line, pageTitle, {' '}, i)
  i.inc
  countViews = 0
  i.inc parseInt(line, countViews, i)
  i.inc
  totalSize = 0
  i.inc parseInt(line, totalSize, i)

proc readPageCounts(filename: string) =
  var domainCode = ""
  var pageTitle = ""
  var countViews = 0
  var totalSize = 0
  for line in filename.lines:
    parse(line, domainCode, pageTitle, countViews, totalSize)
    echo("Title: ", pageTitle)

when isMainModule:
  const file = "pagecounts-20160101-050000"
  let filename = getCurrentDir() / file
  readPageCounts(filename)
Replace the code in sequential_counts.nim with the code in listing 6.14. The following listing shows what some of the output from sequential_counts.nim may look like.
... Title: List_of_digital_terrestrial_television_channels_(UK) Title: List_of_diglossic_regions Title: List_of_dignitaries_at_the_state_funeral_of_John_F._Kennedy Title: List_of_dimensionless_quantities Title: List_of_diners Title: List_of_dinosaur_genera Title: List_of_dinosaur_specimens_with_nicknames Title: List_of_dinosaurs ...
The code in listing 6.14 employs a number of optimizations. First, the biggest slowdowns in Nim applications are often caused by too many variables being allocated and deallocated. The parse procedure could return the parsed tokens, but that would result in a new string being allocated for each iteration. Instead, the parse procedure here accepts a mutable reference to two strings and two ints, which it then fills with the parsed tokens. A file that takes 9.3 seconds to parse without this optimization takes 7.8 seconds to parse with the optimization. That’s a difference of 1.5 seconds.
The use of setLen is another optimization. It ensures that the string isn’t reallocated but is instead reused. The parse procedure is executed at least 7 million times, so a tiny optimization creates massive gains in total execution speed.
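The effect of setLen can be seen in isolation in this small sketch (my own example, not part of the parser):

```nim
# buf keeps its allocated capacity across iterations; setLen(0)
# only resets the length, so add() reuses the same memory rather
# than allocating a fresh string each time.
var buf = ""
for line in ["ab cd", "ef gh"]:
  buf.setLen(0)
  for c in line:
    if c == ' ': break
    buf.add(c)
  echo(buf)  # prints "ab", then "ef"
```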
Now that the parsing code has been introduced, all that's left is to find the most popular article on the English Wikipedia. The following listing shows the finished sequential_counts application.
import os, parseutils

proc parse(line: string, domainCode, pageTitle: var string,
           countViews, totalSize: var int) =
  var i = 0
  domainCode.setLen(0)
  i.inc parseUntil(line, domainCode, {' '}, i)
  i.inc
  pageTitle.setLen(0)
  i.inc parseUntil(line, pageTitle, {' '}, i)
  i.inc
  countViews = 0
  i.inc parseInt(line, countViews, i)
  i.inc
  totalSize = 0
  i.inc parseInt(line, totalSize, i)

proc readPageCounts(filename: string) =
  var domainCode = ""
  var pageTitle = ""
  var countViews = 0
  var totalSize = 0
  var mostPopular = ("", "", 0, 0)
  for line in filename.lines:
    parse(line, domainCode, pageTitle, countViews, totalSize)
    if domainCode == "en" and countViews > mostPopular[2]:
      mostPopular = (domainCode, pageTitle, countViews, totalSize)
  echo("Most popular is: ", mostPopular)

when isMainModule:
  const file = "pagecounts-20160101-050000"
  let filename = getCurrentDir() / file
  readPageCounts(filename)
Ensure that you compile sequential_counts.nim in release mode by passing the -d:release flag to the Nim compiler. Without that flag, the execution time of the application will be significantly longer.
Replace the contents of sequential_counts.nim with the code in listing 6.16, and then compile it in release mode and run it. After a few seconds, you should see output similar to the following.
Most popular is: (Field0: en, Field1: Main_Page, Field2: 271165, Field3: 4791147476)
The most popular page in the English Wikipedia is in fact the main page! This makes a lot of sense, and although it’s obvious in hindsight, it’s trivial to edit the code you’ve written to find more-interesting statistics. I challenge you to edit sequential_counts.nim and play around with the data. You can try finding the top-10 most popular pages in the English Wikipedia, or you can download different page-counts files and compare the results.
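As a starting point for that challenge, here's one possible sketch of a top-N variant. The names topN and PageCount are my own, and the data is inlined rather than read from a page-counts file; in your version you'd collect the tuples while iterating over the file's lines.

```nim
import algorithm

type PageCount = tuple[pageTitle: string, countViews: int]

# Keep only the n entries with the highest view counts.
proc topN(pages: seq[PageCount], n: int): seq[PageCount] =
  result = pages
  result.sort(proc (a, b: PageCount): int = cmp(b.countViews, a.countViews))
  result.setLen(min(n, result.len))

let pages: seq[PageCount] = @[
  ("Main_Page", 271165),
  ("Nim_(programming_language)", 100),
  ("Concurrency", 512)
]
let top2 = topN(pages, 2)
doAssert top2[0].pageTitle == "Main_Page"
doAssert top2[1].pageTitle == "Concurrency"
```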
You should now have a good understanding of how you can parse data effectively. You’ve learned what bottlenecks you should look out for in your Nim applications and how to fix them. The next step is to parallelize this parser so that its execution time is even lower on multicore CPUs.
In order for the program to be parallel, it must make use of threads. As mentioned previously, there are two ways that threads can be created in Nim: using the threads module, or using the threadpool module. Both will work, but the threadpool module is more appropriate for this program.
Before we parallelize the code, let’s measure how long sequential_counts takes to execute.
This can be done very easily on UNIX-like OSs by using the time command. Executing time ./sequential_counts should output the execution time of sequential_counts.nim. On a MacBook Pro with an SSD and a dual-core 2.7 GHz Intel Core i5 CPU, which includes hyperthreading, the execution time is about 2.8 seconds.
On Windows, you’ll need to open a new Windows PowerShell window, and then use the Measure-Command command to measure the execution time. Executing Measure-Command {./sequential_counts.exe} should output the execution time.
The program currently runs in a single thread and is very CPU-intensive. This means its speed can be significantly improved by making it parallel.
Create a new parallel_counts.nim file. This is the file that we’ll populate with code from now on.
How can the threadpool module be used to parallelize this code? You may be tempted to spawn the parse procedure, but this won’t work because it needs var parameters that can’t safely be passed to a spawned procedure. It also wouldn’t help much, because a single call to parse is relatively quick.
Before you can parallelize this code, you must first change the way that the page-counts file is read. Instead of reading each line separately, you need to read the file in larger fragments. But what size fragment should you read?
Consider the following scenario. The page-counts file begins with the following lines:
en Main_Page 123 1234567
en Nim_(programming_language) 100 12415551
If the fragment size is so small that only "en Main_Page" is read, the program will fail, because the fragment doesn't contain a complete line: the view-count and size fields are missing.
Alternatively, a fragment might contain valid data at the start, but it may end with a line that was not fully read, such as "en Main_Page 123 1234567 en Nim_". This data will need to be split after every newline character ("\n"), and each line will need to be parsed separately. The last line in this example will lead to an error, because it's incomplete. A solution is to find where the last full line ends, and then defer parsing the partially read line until the next fragment of the file is read.
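The "find where the last full line ends" step can be sketched in isolation. The completeLen helper below is hypothetical; the final program inlines this logic rather than defining a separate procedure.

```nim
# Return the length of the fragment up to and including the last
# newline; everything after that point is an incomplete line.
proc completeLen(fragment: string): int =
  result = fragment.len
  while result > 0 and fragment[result - 1] != '\n':
    dec result

let fragment = "en Main_Page 123 1234567\nen Nim_"
doAssert fragment[0 ..< completeLen(fragment)] == "en Main_Page 123 1234567\n"
doAssert fragment[completeLen(fragment) .. ^1] == "en Nim_"
```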
Here’s how parallel_counts.nim should work:

1. Open the page-counts file.
2. Read a fragment of up to chunkSize characters from the file.
3. Find the last newline character in the fragment, and set aside everything after it for the next iteration.
4. Spawn a parseChunk thread, passing it the fragment up to and including the last newline.
5. Repeat from step 2 until the whole file has been read.
6. Retrieve the results from the spawned threads and display the most popular page.
The term chunk is synonymous with the term fragment, and throughout this chapter both will be used interchangeably. A slice means a subset of the full data, such as a substring.
Listings 6.18, 6.19, and 6.20 show different sections of a parallel_counts.nim file that implements this solution.
Listing 6.18 starts with the top section of the file, which is not much different from the sequential version. This section includes the import statement, some new type definitions, and the original parse procedure. A new Stats type is defined to store page-count statistics about a specific page; this type will be used to store the most popular page in each spawned procedure. The Stats type will be returned from the spawned procedure, so it must be a ref type because spawn currently can’t spawn procedures that return custom value types. A new procedure called newStats is also defined, which constructs a new empty Stats object. There’s also the definition of $, which converts a Stats type to a string.
import os, parseutils, threadpool, strutils

type
  Stats = ref object
    domainCode, pageTitle: string
    countViews, totalSize: int

proc newStats(): Stats =
  Stats(domainCode: "", pageTitle: "", countViews: 0, totalSize: 0)

proc `$`(stats: Stats): string =
  "(domainCode: $#, pageTitle: $#, countViews: $#, totalSize: $#)" % [
    stats.domainCode, stats.pageTitle, $stats.countViews, $stats.totalSize
  ]

proc parse(line: string, domainCode, pageTitle: var string,
    countViews, totalSize: var int) =
  if line.len == 0: return
  var i = 0

  domainCode.setLen(0)
  i.inc parseUntil(line, domainCode, {' '}, i)
  i.inc

  pageTitle.setLen(0)
  i.inc parseUntil(line, pageTitle, {' '}, i)
  i.inc

  countViews = 0
  i.inc parseInt(line, countViews, i)
  i.inc

  totalSize = 0
  i.inc parseInt(line, totalSize, i)
Listing 6.19 shows the middle section of the parallel_counts.nim file. It defines a new procedure called parseChunk, which takes a string parameter called chunk and returns the most popular English Wikipedia page in that fragment. The fragment consists of multiple lines of page-count data.
The procedure begins by initializing the result variable; the return type is a ref type that must be initialized so that it’s not nil. The rest of the procedure is similar to the readPageCounts procedure in the sequential_counts.nim file. It defines four variables to store the parsed tokens, and then it iterates through the lines in the chunk using the splitLines iterator, and parses each of the lines.
proc parseChunk(chunk: string): Stats =
  result = newStats()
  var domainCode = ""
  var pageTitle = ""
  var countViews = 0
  var totalSize = 0
  for line in splitLines(chunk):
    parse(line, domainCode, pageTitle, countViews, totalSize)
    if domainCode == "en" and countViews > result.countViews:
      result = Stats(domainCode: domainCode, pageTitle: pageTitle,
          countViews: countViews, totalSize: totalSize)
Listing 6.20 shows the readPageCounts procedure, which has been modified significantly since the last time you saw it in listing 6.16. It now takes an optional parameter called chunkSize that determines how many characters it should read each iteration. But the procedure’s implementation is what differs most. The file is opened manually using the open procedure, followed by definitions of variables required to properly store the results of the fragment-reading process.
The fragment-reading process is complicated by the fact that the code needs to keep track of unfinished lines. It does so by moving backwards through the contents of buffer, which stores the fragment temporarily, until it finds a newline character. The buffer string is then sliced from the start of the fragment to the end of the last full line in the fragment. The resulting slice is then passed to the parseChunk procedure, which is spawned in a new thread using spawn.
The end of the fragment that hasn’t yet been parsed is then moved to the beginning of the buffer. In the next iteration, the length of the characters that will be read will be chunkSize minus the length of the buffer that wasn’t parsed in the last iteration.
proc readPageCounts(filename: string, chunkSize = 1_000_000) =
  var file = open(filename)
  var responses = newSeq[FlowVar[Stats]]()
  var buffer = newString(chunkSize)
  var oldBufferLen = 0
  while not endOfFile(file):
    let reqSize = chunkSize - oldBufferLen
    let readSize = file.readChars(buffer, oldBufferLen, reqSize) + oldBufferLen
    var chunkLen = readSize
    while chunkLen > 0 and buffer[chunkLen - 1] notin NewLines:
      chunkLen.dec

    responses.add(spawn parseChunk(buffer[0 ..< chunkLen]))

    oldBufferLen = readSize - chunkLen
    buffer[0 ..< oldBufferLen] = buffer[readSize - oldBufferLen .. ^1]

  var mostPopular = newStats()
  for resp in responses:
    let statistic = ^resp
    if statistic.countViews > mostPopular.countViews:
      mostPopular = statistic

  echo("Most popular is: ", mostPopular)
  file.close()

when isMainModule:
  const file = "pagecounts-20160101-050000"
  let filename = getCurrentDir() / file
  readPageCounts(filename)
The parallel version is unfortunately more complex, but the complexity is mostly confined to the readPageCounts procedure, where the fragment-reading algorithm accounts for most of it. In terms of line count, though, the parallel version is only about twice as long as the sequential one.
Merge listings 6.18, 6.19, and 6.20 into a single parallel_counts.nim file. Then compile and run the program. Make sure you pass both the --threads:on flag as well as the -d:release flag to Nim when compiling. Measure the execution time using the techniques described in section 6.4.1.
On a MacBook Pro with an SSD and a dual core 2.7 GHz Intel Core i5 CPU that includes hyperthreading, the execution time is about 1.2 seconds, which is less than half of the 2.8 seconds that the sequential version took to execute. That’s a considerable difference!
On UNIX-like systems, the time command allows you to verify that the parallel version is in fact parallel by looking at its CPU usage. For example, the time command outputs ./parallel_counts 4.30s user 0.25s system 364% cpu 1.251 total, showing that parallel_counts was using 364% of the available CPU. In comparison, sequential_counts almost always shows around 99% CPU usage. This high CPU usage percentage proves that parallel_counts is using all cores together with hyperthreading.
Now that you’ve seen how to parallelize a parser, you should have a better idea about how to parallelize Nim code in general. The last sections of this chapter will teach you about race conditions and how to avoid them.
You don’t typically need to worry about race conditions when writing concurrent code in Nim because of the restriction that Nim puts on GC-safe procedures: memory belonging to another thread can’t be accessed in a spawned procedure or a procedure marked using the {.thread.} pragma.
A race condition occurs when two or more threads attempt to read and write to a shared resource at the same time. Such behavior can result in unpredictable results that often are difficult to debug. This is one of the reasons why Nim prevents the sharing of some resources between threads. Nim instead prefers data to be shared using alternative methods such as channels, which prevent race conditions.
Sometimes these methods aren’t appropriate for certain use cases, such as when lots of data needs to be modified by the thread. Because of this, Nim also supports shared memory. Sharing memory via global variables is easy as long as you only want to share value types. Sharing reference types is much harder because you must make use of Nim’s manual memory-management procedures.
Using shared memory is risky because it increases the chances for race conditions in your code. Also, you must manage the memory yourself. I advise you to only use shared memory if you’re certain that it’s required and if you know what you’re doing. In future versions of Nim, using shared memory will likely become safer and much easier.
Listing 6.21 implements a simple program that increments the value of a global variable inside two threads running in parallel. The result is a race condition.
import threadpool

var counter = 0

proc increment(x: int) =
  for i in 0 ..< x:
    var value = counter
    value.inc
    counter = value

spawn increment(10_000)
spawn increment(10_000)
sync()
echo(counter)
In this example, the increment procedure is GC safe because the global variable counter it accesses is of type int, which is a value type. The increment procedure increments the global counter variable x times. The procedure is spawned twice, which means that there will be two increment procedures executing at the same time. The fact that they’re both reading, incrementing, and then writing the incremented value to the global counter variable in discrete steps means that some increments may be missed.
Value types, such as integers, can exist on the stack (or in the executable’s data section if the value is stored in a global variable), but reference types such as string, seq[T] and ref T can’t. Nim supports the sharing of reference types, but it won’t manage the memory for you. This may change in a future version of Nim, but currently you must use a procedure called allocShared defined in the system module to allocate shared memory manually.
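As a taste of what's involved, here's a minimal single-threaded sketch of manual shared allocation. Note that you, not the garbage collector, are responsible for calling deallocShared when you're done.

```nim
# allocShared0 allocates zeroed memory from the shared heap;
# any thread may safely point at it, but we must free it ourselves.
var counter = cast[ptr int](allocShared0(sizeof(int)))
doAssert counter[] == 0     # allocShared0 zeroes the memory
counter[] = 42
doAssert counter[] == 42
deallocShared(counter)      # manual cleanup; no GC will do this for us
```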
Save listing 6.21 as race_condition.nim, and then compile it without the -d:release flag and run it. Run it a couple of times and note the results. The results should appear random, and should almost never display the expected value of 20,000. Figure 6.5 shows what the execution of listing 6.21 looks like.
Preventing race conditions is very important because whenever a bug occurs due to a race condition, it’s almost always nondeterministic. The bug will be very difficult to reproduce, and once it is reproduced, debugging it will be even harder because the mere act of doing so may cause the bug to disappear.
Now that you know what race conditions are, let’s look at ways to prevent them.
Just like most languages, Nim provides synchronization mechanisms to ensure that resources are only used by a single thread at a time.
One of these mechanisms is a lock. It enforces limits on access to a resource, and it’s usually paired with a single resource. Before that resource is accessed, the lock is acquired, and after the resource is accessed, it’s released. Other threads that try to access the same resource must attempt to acquire the same lock, and if the lock has already been acquired by another thread, the acquire operation will block the thread until the lock is released. This ensures that only one thread has access to the resource.
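The acquire/release pattern looks like this using Nim's locks module. This is a single-threaded sketch just to show the API shape (compile with --threads:on); with only one thread, acquire never has to wait.

```nim
import locks

var lock: Lock
initLock(lock)

var resource = 0

acquire(lock)     # blocks until no other thread holds the lock
resource.inc      # the shared resource may be safely accessed here
release(lock)     # let any waiting threads proceed

doAssert resource == 1
deinitLock(lock)
```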
Locks work very well, but they aren’t assigned to any variables by default. They can be assigned using a guard. When a variable is guarded with a specific lock, the compiler will ensure that the lock is locked before allowing access. Any other access will result in a compile-time error.
The following listing shows how a new Lock, together with a guard, can be defined.
import threadpool, locks

var counterLock: Lock
initLock(counterLock)
var counter {.guard: counterLock.} = 0

proc increment(x: int) =
  for i in 0 ..< x:
    var value = counter
    value.inc
    counter = value

spawn increment(10_000)
spawn increment(10_000)
sync()
echo(counter)
Save listing 6.22 as unguarded_access.nim, and then compile it. The compilation should fail with “unguarded_access.nim(9, 17) Error: unguarded access: counter.” This is because the counter variable is protected by the guard, which ensures that any access to counter must occur after the counterLock lock is locked. Let’s fix this error by locking the counterLock lock.
import threadpool, locks

var counterLock: Lock
initLock(counterLock)
var counter {.guard: counterLock.} = 0

proc increment(x: int) =
  for i in 0 ..< x:
    withLock counterLock:
      var value = counter
      value.inc
      counter = value

spawn increment(10_000)
spawn increment(10_000)
sync()
echo(counter)
Save the code in listing 6.23 as parallel_incrementer.nim, and then compile and run it. The file should compile successfully and its output should always be 20000, which means that the race condition is fixed! The fact that the compiler verifies that every guarded variable is locked properly ensures the safe execution of the code. It also helps prevent bugs from appearing accidentally in the future, when new code is added or existing code is changed.
Despite all Nim’s efforts to make locks as safe as possible, they may not always be the safest choice. And for some use cases, they may simply be inappropriate, such as when threads share very few resources. Channels offer an alternative form of synchronization that allows threads to send and receive messages between each other.
A channel is an implementation of a queue—a first-in-first-out (FIFO) data structure. This means that the first value to be added to the channel will be the first one to be removed. The best way to visualize such a data structure is to imagine yourself queuing for food at a cafeteria. The first person to queue is also the first person to get their food. Figure 6.6 shows a representation of a FIFO channel.
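The FIFO behavior is easy to verify with a small single-threaded sketch (channels still require compiling with --threads:on):

```nim
var chan: Channel[string]
open(chan)

# Messages come out in exactly the order they went in.
chan.send("first in line")
chan.send("second in line")
doAssert chan.recv() == "first in line"
doAssert chan.recv() == "second in line"
close(chan)
```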
Nim implements channels in the channels module of the standard library. This module is part of system, so it doesn’t need to be explicitly imported.
A channel is created as a global variable, allowing every thread to send and receive messages through it. Once a channel is defined, it must be initialized using the open procedure. Listing 6.24 defines and initializes a new chan variable of type Channel [string]. You can specify any type inside the square brackets, including your own custom types.
var chan: Channel[string]
open(chan)
Values can be sent using the send procedure and received using the recv procedure. The following listing shows how to use both procedures.
import os, threadpool

var chan: Channel[string]
open(chan)

proc sayHello() =
  sleep(1000)
  chan.send("Hello!")

spawn sayHello()
doAssert chan.recv() == "Hello!"
The recv procedure will block until a message is received. You can use the tryRecv procedure to get nonblocking behavior; it returns a tuple consisting of a Boolean, which indicates whether or not data was received, and the actual data.
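For example, here's a sketch using tryRecv on an empty and then a nonempty channel (again compiled with --threads:on). The returned tuple's fields are named dataAvailable and msg:

```nim
var chan: Channel[int]
open(chan)

# Nothing has been sent yet, so tryRecv returns immediately
# with dataAvailable set to false instead of blocking.
let empty = chan.tryRecv()
doAssert not empty.dataAvailable

chan.send(42)
let filled = chan.tryRecv()
doAssert filled.dataAvailable and filled.msg == 42
close(chan)
```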
To give you a better idea of how channels work, let’s implement listing 6.23 with channels instead of locks. The following listing shows parallel_incrementer.nim implemented using channels.
import threadpool

var resultChan: Channel[int]
open(resultChan)

proc increment(x: int) =
  var counter = 0
  for i in 0 ..< x:
    counter.inc
  resultChan.send(counter)

spawn increment(10_000)
spawn increment(10_000)
sync()

var total = 0
for i in 0 ..< resultChan.peek:
  total.inc resultChan.recv()
echo(total)
The global counter variable is replaced by a global resultChan channel. The increment procedure increments a local counter variable x times, and then it sends counter’s value through the channel. This is done in two different threads.
The main thread waits for the two threads to finish, at which point it reads the messages that have been sent to the resultChan. Figure 6.7 shows the execution of listing 6.26.