Chapter 4. A tour through the standard library

This chapter covers

  • Understanding the standard library
  • Examining modules in depth
  • Getting to know the modules in Nim’s standard library
  • Using Nim’s standard library modules

Every programming language supports the notion of a library. A library is a collection of prewritten software that implements a set of behaviors. These behaviors can be accessed by other libraries or applications via a library-defined interface.

For example, a music-playback library such as libogg might define play and stop procedures that start music playing and stop it. The libogg library’s interface can be said to consist of those two procedures.

A library such as libogg can be reused by multiple applications, so that the behaviors the library implements don’t have to be reimplemented for each application.

A standard library is one that’s always available as part of a programming language. A standard library typically includes definitions of common algorithms, data structures, and mechanisms for interacting with the OS.

The design of a standard library differs between languages. Python’s standard library rather famously follows the “batteries included” philosophy, embracing an inclusive design. C’s standard library, on the other hand, takes a more conservative approach. As such, in Python you’ll find packages that allow you to process XML, send email messages, and make use of the SQLite library, whereas in C, you won’t.

The Nim standard library also follows the “batteries included” philosophy. It’s similar to Python in that regard, because it also contains packages for processing XML, sending email messages, and making use of the SQLite library, amongst a wide range of other modules. This chapter is dedicated to Nim’s standard library and will show you some of its most useful parts. In addition to describing what each part of the standard library does, this chapter presents examples of how each module in the standard library can be used.

Figures 4.1 and 4.2 show some of the most useful modules in Nim’s standard library. The difference between pure and impure modules is explained in section 4.2.

Figure 4.1. The most useful pure modules

Figure 4.2. The most useful impure modules

Let’s begin by looking in more detail at what a module is and how modules can be imported.

4.1. A closer look at modules

The Nim standard library is made up of modules. A module in Nim is a file containing Nim code, and by default the code inside a module is isolated from all other code. This isolation restricts which types, procedures, variables, and other definitions are accessible to code defined in a different module.

When a new definition is made inside a module, it’s not visible to any other modules by default. It’s private. But a definition can be made public, which means that it’s visible to other modules, using the * character. The following example.nim module defines two variables, moduleVersion and randomNumber, that are both made public by the * character.

Listing 4.1. Module example.nim
var moduleVersion* = "0.12.0"
var randomNumber* = 42

You might remember the * character from the previous chapter, where I introduced the * access modifier and used it to export identifiers from the protocol module. Let’s now take a look at the different ways that modules can be imported.

You should remember the basic import keyword, which can be used to import the example.nim module like so.

Listing 4.2. Module main.nim
import example                    1
echo(moduleVersion)               2

  • 1 The .nim extension must not be specified.
  • 2 After importing the example module, you can access the moduleVersion variable because it’s public.

The import keyword does something very straightforward—it imports all the public definitions from a specified module. But what might not be immediately obvious is how it finds the specified module.

The Nim compiler has a configurable list of directories that it searches for modules. This list is configured in a configuration file normally named nim.cfg. The compiler may use multiple configuration files, but there’s one defined by the compiler that’s always used. It usually resides in $nimDir/config, where $nimDir is the path to the Nim compiler. Listing 4.3 shows what a small part of the default Nim configuration looks like. In the listing, each line specifies a directory that the Nim compiler will look at when searching for modules.

Listing 4.3. Some of the directories in Nim’s configuration file
path="$lib/pure"           1
path="$lib/impure"
path="$lib/arch"
path="$lib/core"
...                        2

  • 1 $lib is expanded by the Nim compiler to a full path that leads to the location where Nim’s standard library has been installed.
  • 2 The configuration file contains many more options. You may wish to take a look at it to see which bits of the compiler can be configured.

Project config files

You can create a configuration file that’s specific to your project and use it to customize the behavior of the compiler when compiling your project. Create a main.nims file, where main.nim is the name of the file you’re compiling. The config file must be placed beside your Nim source code file. You can then place any flags you’d pass on the command line verbatim in that file, such as --threads:on.

When a module is imported using the import statement, the Nim compiler searches for files alongside the module that’s doing the importing. If the module isn’t found there, it searches each of the directories defined in the configuration file. This means that for the main.nim module in listing 4.2 to compile, the example.nim module in listing 4.1 should be placed alongside the main.nim module. Figure 4.3 shows how the compiler searches for modules.

Figure 4.3. The compiler searches for modules starting in the project’s directory.

When compiling main.nim, the local example module and the standard library system module need to be compiled first, so the compiler will search for those modules first and compile them automatically.

Modules can also be placed in subdirectories. For example, consider the directory structure shown in figure 4.4.

Figure 4.4. The example.nim file has been moved into the misc directory.

With the example module in the misc directory, the main module needs to be modified as follows.

Listing 4.4. Importing from a subdirectory
import misc/example
echo(moduleVersion)

The misc directory simply needs to be added to the import statement.

4.1.1. Namespacing

Namespaces are common in many programming languages. They act as a context for identifiers, allowing the same identifier to be used in two different contexts. Language support for namespaces varies widely. C doesn’t support them, C++ contains an explicit keyword for defining them, and Python uses the module name as the namespace. Just like in Python, namespaces in Nim are defined by individual modules.

To get a better idea of what namespacing is used for, let’s look at an example use case. Assume that you wish to load images of two separate formats: PNG and BMP. Also assume that there are two libraries for reading the two types of files: one called libpng and the other called libbmp. Both libraries define a load procedure that loads the image for you, so if you want to use both libraries at the same time, how do you distinguish between the two load procedures?

If those libraries are written in C, they would need to emulate namespaces. They’d do this by prefixing the procedure names with the name of the library, so the procedures would be named png_load and bmp_load to avoid conflicts. C++ versions of those libraries might define namespaces such as png and bmp, and the load procedures could then be invoked via png::load and bmp::load. Python versions of those libraries don’t need to explicitly define a namespace—the module name is the namespace. In Python, if the PNG and BMP libraries define their load procedures in png and bmp modules, respectively, the load procedures can be invoked via png.load and bmp.load.

In Nim, when a module is imported, all of its public definitions are placed in the namespace of the importing module. You can still specify the fully qualified name, but doing so isn’t required. This is in contrast to how the Python module system works.

import example
echo(example.moduleVersion)           1

  • 1 Specify the module namespace explicitly by writing the module name followed by a dot character.

The module namespace only needs to be specified when the same definition has been imported from two different modules. Let’s say a new module called example2.nim was imported, and example2.nim also defines a public moduleVersion variable. In that case, the code will need to explicitly specify the module name.

Listing 4.5. Module example2.nim
var moduleVersion* = "10.23"

Listing 4.6. Disambiguating identifiers
import example, example2                               1
echo("Example's version: ", example.moduleVersion)
echo("Example 2's version: ", example2.moduleVersion)

  • 1 An import statement can import multiple modules. You just need to separate them with a comma.

Compiling and running the code in listing 4.6 will result in the following output:

Example's version: 0.12.0
Example 2's version: 10.23

But suppose you attempt to display the value of moduleVersion without qualifying it.

import example, example2
echo(moduleVersion)

In that case, you’ll receive an error:

main.nim(2,6) Error: ambiguous identifier: 'moduleVersion' -- use a qualifier

You can prevent all the definitions from being imported into the importing module’s namespace by using a special import syntax.

Listing 4.7. Importing modules into their own namespace
from example import nil            1
echo(moduleVersion)                2
echo(example.moduleVersion)        3

  • 1 Imports the example module without importing any of its definitions into this file’s namespace
  • 2 This will no longer work because moduleVersion is no longer in this file’s namespace.
  • 3 The moduleVersion variable can be accessed by explicitly writing the module namespace.

When you use the from statement, the specific definitions that you want imported can be listed after the import keyword.

Listing 4.8. Importing only some of the definitions from a module
from example import moduleVersion              1
echo(moduleVersion)                            2
echo(example.randomNumber)                     3

  • 1 Imports moduleVersion into this file’s namespace. All other public definitions need to be accessed via the example namespace.
  • 2 The moduleVersion variable can again be accessed without explicitly writing the module namespace.
  • 3 The randomNumber variable must be qualified.

Certain definitions can be excluded using the except keyword.

Listing 4.9. Excluding some definitions when importing
import example except moduleVersion
echo(example.moduleVersion)                    1
echo(moduleVersion)                            2
echo(randomNumber)                             3

  • 1 Accessing the moduleVersion variable via the module’s namespace still works.
  • 2 Accessing the moduleVersion variable without qualifying the name doesn’t work.
  • 3 Accessing the randomNumber variable without qualifying the name does work.

In Nim, it’s idiomatic to import all modules so that all identifiers end up in the importing module’s namespace, so you only need to explicitly specify the namespace when the name is ambiguous. This is different from Python, which requires every identifier that’s imported to be accessed via the module’s namespace unless the module is imported using the from x import * syntax.

Nim’s default import behavior allows flexible Uniform Function Call Syntax (UFCS) and operator overloading. Another benefit is that you don’t need to constantly retype the module names.

You might recall the discussion on UFCS in chapter 1. It allows any procedure to be called on an object as if the procedure were a method of the object’s class. The following listing shows UFCS in action.

Listing 4.10. Uniform Function Call Syntax
proc welcome(name: string) = echo("Hello ", name)

welcome("Malcolm")                                  1
"Malcolm".welcome()                                 1

  • 1 Both syntaxes are valid and perform the same action.
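UFCS also works with procedures imported from other modules, which is one reason why importing everything into the current namespace is convenient. The following is a minimal sketch; the shout procedure is a hypothetical helper defined only for this illustration, and toUpperAscii is assumed to come from the strutils module.

```nim
import strutils

# `shout` is a hypothetical helper used only for this illustration.
proc shout(text: string): string =
  # toUpperAscii is called with method-call syntax because of UFCS.
  result = text.toUpperAscii() & "!"

doAssert(shout("hi") == "HI!")       # Ordinary call syntax
doAssert("hi".shout() == "HI!")      # Method-call syntax
doAssert("hi".shout.len == 3)        # Calls can be chained; parentheses are optional
```

Because neither toUpperAscii nor len has to be qualified with a module name, the chained form reads left to right.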

You should now have a better understanding of Nim’s module system. Let’s go on to look at Nim’s standard library in greater detail.

4.2. Overview of the standard library

Nim’s standard library is split up into three major categories: pure, impure, and wrappers. This section will look at these categories in general. Later sections in this chapter explore a few specific modules from a couple of these categories.

4.2.1. Pure modules

A large proportion of Nim’s standard library is composed of pure modules. These modules are written completely in Nim and require no dependencies; you should prefer them because of this.

The pure modules themselves are further split up into multiple categories, including the following:

  • The core
  • Collections and algorithms
  • String handling
  • Generic OS services
  • Math libraries
  • Internet protocols
  • Parsers

4.2.2. Impure modules

Impure modules consist of Nim code that uses external C libraries. For example, the re module implements procedures and types for handling regular expressions. It’s an impure library because it depends on PCRE, which is an external C library. This means that if your application imports the re module, it won’t work unless the user installs the PCRE library on their system.

Shared libraries

Impure modules such as re use what’s known as a shared library, typically a C library that’s been compiled into a shared library file. On Windows, these files use the .dll extension, on Linux the .so extension, and on Mac OS the .dylib extension.[a]

a

See Wikipedia’s “Dynamic linker” article: https://en.wikipedia.org/wiki/Dynamic_linker#Implementations.

When you import an impure module, your application will need to be able to find these shared libraries. They’ll need to be installed via your OS’s package manager or bundled with your application. On Linux, it’s common to use a package manager; on Mac OS, both methods are fairly common; and on Windows, bundling the dependencies with your application is popular.

4.2.3. Wrappers

Wrappers are the modules that allow these external C libraries to be used. They provide an interface to these libraries that, in most cases, matches the C interface exactly. Impure modules build on top of wrappers to provide a more idiomatic interface.

You can use wrappers directly, but doing so isn’t easy because you’ll need to use some of Nim’s unsafe features, such as pointers and bit casts. This can lead to errors because in most cases you’ll need to manage memory manually.

Impure modules define abstractions to provide a memory-safe interface that you can easily use in your source code without worrying about the low-level details of C.

4.2.4. Online documentation

We’ll start looking at different modules in a moment, but I first want to mention that the Nim website contains documentation for the full standard library. A list of all the modules in the standard library can be found in the Nim documentation: http://nim-lang.org/docs/lib.html. This URL always shows the documentation for the latest release of Nim, and it contains links to documentation for each module.

The documentation for each module provides definitions and links to implementations of those definitions. It can, for example, link to a line of code where a procedure is implemented, showing you exactly how it functions.

Every part of Nim is open source, including its standard library, so you can look at the source of the standard library to see Nim code written by the Nim developers themselves. This allows you to truly understand the behavior of each part of the standard library, and you can even modify it to your liking.

Figure 4.5 shows what the documentation for the os module looks like.

Figure 4.5. The documentation for the os module

The Nim documentation also includes a Nimble section,[1] with links to community-created modules. Nimble is a Nim package manager that makes the installation of these packages easy. You’ll learn more about it in the next chapter.

1

The list of Nimble packages is split into official and unofficial lists. The official packages are ones that are officially supported by the core Nim developers, and as such they’re far more stable than some of the unofficial packages. The official packages include modules that used to be part of the standard library but which have been transferred out in order to make the standard library a bit more lean.

We’ll now look at the pure modules in a bit more detail. We’ll start with the core modules.

4.3. The core modules

The most important module in the core of the standard library is the system module. This is the only module that’s implicitly imported, so you don’t need to include import system at the top of each of your own modules. This module is imported automatically because it contains commonly used definitions.

The system module includes definitions for all the primitive types, such as int and string. Common procedures and operators are also defined in this module. Table 4.1 lists some examples.

Table 4.1. Some examples of definitions in the system module

Definitions | Purpose | Examples
+, -, *, / | Addition, subtraction, multiplication, and division of two numbers. | doAssert(5 + 5 == 10); doAssert(5 / 2 == 2.5)
==, !=, >, <, >=, <= | General comparison operators. | doAssert(5 == 5); doAssert(5 > 2)
and, not, or | Bitwise and Boolean operations. | doAssert(true and true); doAssert(not false); doAssert(true or false)
add | Adds a value to a string or sequence. | var text = "hi"; text.add('!'); doAssert(text == "hi!")
len | Returns the length of a string or sequence. | doAssert("hi".len == 2)
shl, shr | Bitwise shift left and shift right. | doAssert(0b0001 shl 1 == 0b0010)
& | Concatenation operator; joins two strings into one. | doAssert("Hi" & "!" == "Hi!")
quit | Terminates the application with a specified error code. | quit(QuitFailure)
$ | Converts the specified value into a string. This is defined in the system module for some common types. | doAssert($5 == "5")
repr | Takes any value and returns its string representation. This differs from $ because it works on any type; a custom repr doesn’t need to be defined. | doAssert(5.repr == "5")
substr | Returns a slice of the specified string. | doAssert("Hello".substr(0, 1) == "He")
echo | Displays the specified values in the terminal. | echo(2, 3.14, true, "a string")
items | An iterator that loops through the items of a sequence or string. | for i in items([1, 2]): echo(i)
doAssert, assert | Raises an exception if the value specified is false. (assert calls are removed when compiled with -d:release. doAssert calls are always present.) | doAssert(true)
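The difference between $ and repr is easiest to see with a custom type. The sketch below defines a hypothetical Point type (not part of the standard library) and gives it its own $ operator, while repr is used without any extra definition. The exact text that repr produces depends on the compiler version, so no output is shown for it.

```nim
type
  Point = object        # Hypothetical type used only for this illustration
    x, y: int

# A custom `$` controls how a Point is converted to a string.
proc `$`(p: Point): string =
  result = "(" & $p.x & ", " & $p.y & ")"

let p = Point(x: 1, y: 2)
doAssert($p == "(1, 2)")
echo(p)          # echo calls `$` on each argument
echo(p.repr)     # repr needs no custom definition
```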

In addition to the definitions in table 4.1, the system module also contains types that map directly to C types. Remember that Nim compiles to C by default and that these types are necessary to interface with C libraries. Interfacing with C is an advanced topic; I’ll go into it in more detail in chapter 8.

Whenever the --threads:on flag is specified when compiling, the system module includes the threads and channels modules. This means that all the definitions found in those modules are available through the system module. These modules implement threads that provide a useful abstraction for concurrent execution. Concurrency will be touched on in more detail in chapter 6.

Other modules in the core category include threadpool and locks, both of which implement different threading abstractions, and macros, which implements an API for metaprogramming.

The main module in the core that you’ll be interested in is the system module. The others aren’t as important, and you’ll be using them only for specialized tasks like concurrency.

You should now have a basic idea of what some of the core modules implement, particularly the procedures and types defined in the implicitly imported system module. Next, let’s look at the modules that implement data structures and common algorithms, and how they can be used.

4.4. Data structures and algorithms

A large proportion of data structures are defined in the system module, including ones you’ve already seen in chapter 2: seq, array, and set.

Other data structures are implemented as separate modules in the standard library. These modules are listed under the “Collections and algorithms” category in the standard library documentation. They include the tables, sets, lists, queues, intsets, and critbits modules.

Many of those modules have niche use cases, so we won’t go into much detail about them, but we will talk about the tables and sets modules. We’ll also look at some modules that implement different algorithms to deal with these data structures.

4.4.1. The tables module

Assume that you’re writing an application that stores the average life expectancy of different kinds of animals. After adding all the data, you may wish to look up the average life expectancy of a specific animal. The data can be stored in many different data structures to accommodate the lookup.

One data structure that can be used to store the data is a sequence. The sequence type seq[T] defines a list of elements of type T. It can be used to store a dynamic list of elements of any type; dynamic refers to the fact that a sequence can grow to hold more items at runtime.

The following listing shows one way that the data describing the average life expectancy of different animals could be stored.

Listing 4.11. Defining a list of integers and strings
var numbers = @[3, 8, 1, 10]                            1
numbers.add(12)                                         2
var animals = @["Dog", "Raccoon", "Sloth", "Cat"]       3
animals.add("Red Panda")                                4

  • 1 Defines a new variable of type seq[int] that holds some numbers
  • 2 Adds the number 12 to the numbers sequence
  • 3 Defines a new variable of type seq[string] that holds some animals
  • 4 Adds the animal “Red Panda” to the animals sequence

In listing 4.11, the numbers variable holds the average life expectancy of each of the animals. The animals’ names are stored in the animals sequence. Each value stored in the numbers sequence has the same position as the animal it corresponds to in animals, but that’s not intuitive and raises many issues. For example, it’s possible to add an animal’s average life expectancy to numbers without adding the corresponding animal’s name to animals, and vice versa. A better approach is to use a data structure called a hash table.
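To make the drawback of the two-sequence approach concrete, here is a rough sketch of what a lookup could look like. The lookupAge procedure and its -1 "not found" convention are assumptions made for this illustration; find is the linear search procedure from the system module.

```nim
let animals = @["Dog", "Raccoon", "Sloth", "Cat"]
let numbers = @[3, 8, 1, 10]

# Hypothetical helper: returns -1 when the animal isn't known.
proc lookupAge(name: string): int =
  let index = animals.find(name)   # Linear search through the names
  if index == -1:
    return -1
  result = numbers[index]          # Relies on both sequences staying in sync

doAssert(lookupAge("Sloth") == 1)
doAssert(lookupAge("Red Panda") == -1)
```

Every lookup walks the animals sequence, and nothing stops the two sequences from drifting out of sync.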

A hash table is a data structure that maps keys to values. It stores a collection of (key, value) pairs, and each key appears only once in the collection. You can add, remove, and modify these pairs as well as look up values based on a key. Hash tables typically support keys of any type, and looking up a key takes roughly constant time on average regardless of how many pairs are stored, which makes their use popular. Figure 4.6 shows how data about animals can be retrieved from a hash table by performing a lookup based on a key.

Figure 4.6. Looking up the value of the key "Dog" in the animalAges hash table

The tables module implements a hash table, allowing you to write the following.

Listing 4.12. Creating a hash table
import tables                                  1
var animalAges = toTable[string, int](         2
  {                                            3
    "Dog": 3,
    "Raccoon": 8,
    "Sloth": 1,
    "Cat": 10
  })

animalAges["Red Panda"] = 12                   4

  • 1 Hash tables are in the tables module, so it needs to be imported.
  • 2 Creates a new Table[string, int] out of the mapping defined in listing 4.11. The key and value types need to be specified because the compiler can’t infer them in all cases.
  • 3 Uses the {:} syntax to define a mapping from string to int
  • 4 Adds “Red Panda” to the animalAges hash table
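Once the table exists, values can be read back by key. The following sketch shows a few common operations, assuming a reasonably recent Nim release in which getOrDefault accepts an explicit default value.

```nim
import tables

var animalAges = {"Dog": 3, "Raccoon": 8, "Sloth": 1, "Cat": 10}.toTable

doAssert(animalAges["Dog"] == 3)                     # Look up a value by key
doAssert(animalAges.hasKey("Raccoon"))               # Check whether a key exists
doAssert(not animalAges.hasKey("Red Panda"))
doAssert(animalAges.getOrDefault("Red Panda", -1) == -1)  # Fallback for missing keys

animalAges["Sloth"] = 2                              # Overwrites the existing value
animalAges.del("Cat")                                # Removes a (key, value) pair
doAssert(animalAges.len == 3)
```

Accessing a missing key with the [] operator raises an exception, so hasKey or getOrDefault is the safer choice when the key may be absent.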

Several different types of hash tables are defined in the tables module: the generic version defined as Table[A, B]; the OrderedTable[A, B], which remembers the insertion order; and the CountTable[A], which simply counts the number of each key. The ordered and count tables are used far less often than the generic table because their use cases are more specific.
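The two specialized variants can be sketched as follows. This is only an illustration of the behavior described above, using the initOrderedTable and initCountTable procedures from the tables module.

```nim
import tables

# OrderedTable remembers the order in which keys were inserted.
var ordered = initOrderedTable[string, int]()
ordered["Sloth"] = 1
ordered["Dog"] = 3
ordered["Cat"] = 10

var keysInOrder: seq[string] = @[]
for key in ordered.keys:
  keysInOrder.add(key)
doAssert(keysInOrder == @["Sloth", "Dog", "Cat"])

# CountTable counts how many times each key has been seen.
var counts = initCountTable[string]()
for animal in ["Dog", "Cat", "Dog"]:
  counts.inc(animal)
doAssert(counts["Dog"] == 2)
doAssert(counts["Cat"] == 1)
```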

The Table[A, B] type is a generic type. In its definition, A refers to the type of the hash table’s key, and B refers to the type of the hash table’s value. There are no restrictions on the types of the key or the value, as long as there’s a definition of a hash procedure for the type specified as the key. You won’t run into this limitation until you attempt to use a custom type as a key, because a hash procedure is defined for most types in the standard library.

Listing 4.13. Using a custom type as a key in a hash table
import tables
type                                              1
  Dog = object                                    2
    name: string

var dogOwners = initTable[Dog, string]()          3
dogOwners[Dog(name: "Charlie")] = "John"          4

  • 1 The type keyword begins a section of code where types can be defined.
  • 2 Defines a new Dog object with a name field of type string
  • 3 The initTable procedure can be used to initialize a new empty hash table.
  • 4 Creates a new instance of the Dog object and uses that as the key. Sets the value of that key in the dogOwners hash table to “John”.

Compiling listing 4.13 will result in the following output:

file.nim(7, 10) template/generic instantiation from here          1
lib/pure/collections/tableimpl.nim(92, 21)
template/generic instantiation from here                      2
lib/pure/collections/tableimpl.nim(43, 12)
Error: type mismatch: got (Dog)                               2
but expected one of:                                              3
hashes.hash(x: T)
hashes.hash(x: pointer)
hashes.hash(x: T)
hashes.hash(x: float)
hashes.hash(x: set[A])
hashes.hash(x: T)
hashes.hash(x: string)
hashes.hash(x: int)
hashes.hash(aBuf: openarray[A], sPos: int, ePos: int)
hashes.hash(x: int64)
hashes.hash(x: char)
hashes.hash(sBuf: string, sPos: int, ePos: int)
hashes.hash(x: openarray[A])

  • 1 This refers to dogOwners[Dog(name: "Charlie")] = "John", where you’re trying to use the Dog as the key.
  • 2 These errors are inside the standard library because that’s where the call to hash(key) is made.
  • 3 Lists all the available definitions of the hash procedure. As you can see, there’s no definition for the Dog type present in that list.

The compiler rejects the code with the excuse that it can’t find the definition of a hash procedure for the Dog type. Thankfully, it’s easy to define a hash procedure for custom types.

Listing 4.14. Defining a hash procedure for custom types
import tables, hashes                          1
type
  Dog = object
    name: string

proc hash(x: Dog): Hash =                      2
  result = x.name.hash                         3
  result = !$result                            4

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie")] = "John"

  • 1 Imports the hashes module, which defines procedures for computing hashes
  • 2 Defines a hash procedure for the Dog type
  • 3 Uses the Dog’s name field to compute a hash
  • 4 Uses the !$ operator to finalize the computed hash

The code in listing 4.14 shows in bold the additions that make the example compile. The hashes module is necessary to aid in computing a hash in the hash procedure. It defines the Hash type, the hash procedure for many common types including string, and the !$ operator. The !$ operator finalizes the computed hash, which is necessary when writing a custom hash procedure. Finalizing mixes the bits of the intermediate value so that the resulting hashes are well distributed; it doesn’t guarantee that they’re unique, so two different keys can still produce the same hash.
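When a key type has more than one field, the hashes of the individual fields can be combined before finalizing. The sketch below extends the Dog type with a hypothetical age field and uses the !& operator from the hashes module to mix each field’s hash into a running value.

```nim
import tables, hashes

type
  Dog = object
    name: string
    age: int        # Hypothetical extra field for this illustration

proc hash(x: Dog): Hash =
  var h: Hash = 0
  h = h !& hash(x.name)   # Mix in the hash of each field
  h = h !& hash(x.age)
  result = !$h            # Finalize the combined hash

var dogOwners = initTable[Dog, string]()
dogOwners[Dog(name: "Charlie", age: 3)] = "John"

# Equal keys produce equal hashes, so the lookup finds the entry.
doAssert(dogOwners[Dog(name: "Charlie", age: 3)] == "John")
doAssert(not dogOwners.hasKey(Dog(name: "Charlie", age: 4)))
```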

4.4.2. The sets module

Now let’s have a quick look at another data structure: the set. The basic set type, introduced in chapter 2, is defined in the system module. This set type has a limitation—its base type is limited to an ordinal type of a certain size, specifically one of the following:

  • int8, int16
  • uint8/byte, uint16
  • char
  • enum

Attempting to define a set with any other base type, such as set[int64], will result in an error.
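For comparison, here is a short sketch of the built-in set type with a base type that it does support, char. The variable names are chosen only for this illustration.

```nim
let vowels = {'a', 'e', 'i', 'o', 'u'}   # set[char]
let digits = {'0'..'9'}                  # A range can be used inside a set literal

doAssert('e' in vowels)
doAssert('z' notin vowels)
doAssert(card(digits) == 10)             # card returns the number of elements
doAssert('7' in (vowels + digits))       # + computes the union of two sets
```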

The sets module defines a HashSet[A] type that doesn’t have this limitation. Just like the Table[A,B] type, the HashSet[A] type requires a hash procedure for the type A to be defined. The following listing creates a new HashSet[string] variable.

Listing 4.15. Modeling an access list using a HashSet
import sets                                                  1
var accessSet = toSet(["Jack", "Hurley", "Desmond"])         2
if "John" notin accessSet:                                   3
  echo("Access Denied")
else:                                                        4
  echo("Access Granted")

  • 1 Imports the sets module where the toSet procedure is defined
  • 2 Defines a new HashSet[string] with a list of names
  • 3 Checks if John is in the access set, and if he’s not, displays the “Access Denied” message
  • 4 If John is in the access set, displays the “Access Granted” message

Determining whether an element is within a hash set is much more efficient than checking whether it’s within a sequence or array, because a hash set doesn’t need to compare the element against each of the items it stores. This makes a very big difference when the list of elements grows, because the average time complexity of a lookup is O(1) for hash sets and O(n) for sequences.[2]

2

For more info on time complexity, see the Wikipedia article: https://en.wikipedia.org/wiki/Time_complexity.

In addition to the HashSet[A] type, the sets module also defines an OrderedSet[A] type that remembers the insertion order.
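The sketch below shows how elements can be added to and removed from both set types. It assumes a Nim release that provides the initHashSet and initOrderedSet procedures; older releases may use different names for the HashSet constructor.

```nim
import sets

var accessSet = initHashSet[string]()
accessSet.incl("Jack")
accessSet.incl("Hurley")
accessSet.incl("Jack")            # Adding a duplicate has no effect
doAssert(accessSet.len == 2)

accessSet.excl("Hurley")          # Removes an element
doAssert("Hurley" notin accessSet)

# OrderedSet yields its elements in insertion order.
var visitors = initOrderedSet[string]()
for name in ["Desmond", "Jack", "Desmond"]:
  visitors.incl(name)

var order: seq[string] = @[]
for name in visitors:
  order.add(name)
doAssert(order == @["Desmond", "Jack"])
```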

4.4.3. The algorithms

Nim’s standard library also includes an algorithm module defining a selection of algorithms that work on some of the data structures mentioned so far, particularly sequences and arrays.

Among the most useful algorithms in the algorithm module is a sorting algorithm defined in the sort procedure. The procedure takes either an array or a sequence of values and sorts them according to a specified compare procedure.

Let’s jump straight to an example that sorts a list of names, allowing you to display it to the user in alphabetical order, thereby making the process of searching the list much easier.

Listing 4.16. Sorting using the algorithm module
import algorithm                                                    1
var numbers = @[3, 8, 67, 23, 1, 2]                                 2
numbers.sort(system.cmp[int])                                       3
doAssert(numbers == @[1, 2, 3, 8, 23, 67])                          4
var names = ["Dexter", "Anghel", "Rita", "Debra"]                   5
let sorted = names.sorted(system.cmp[string])                       6
doAssert(sorted == @["Anghel", "Debra", "Dexter", "Rita"])          7
doAssert(names == ["Dexter", "Anghel", "Rita", "Debra"])            8

  • 1 Imports the algorithm module, which defines the sort and sorted procedures
  • 2 Defines a numbers variable of type seq[int] with some values
  • 3 Sorts the numbers sequence in place. This uses a standard cmp procedure for integers defined in system when sorting.
  • 4 The numbers sequence is now sorted in ascending order.
  • 5 Defines a new names variable of type array[4, string] with some values
  • 6 Returns a copy of the names array as a sequence with the elements sorted. This uses the standard cmp procedure for strings defined in system when sorting.
  • 7 The sorted sequence contains the elements in ascending alphabetical order.
  • 8 The names array has not been modified.

The code in listing 4.16 shows two different ways that both sequences and arrays can be sorted: using the sort procedure, which sorts the list in place, and using the sorted procedure, which returns a copy of the original list with the elements sorted. The former is more efficient because no copy of the original list needs to be made.

Note that the sorted procedure returns a seq[T] type, no matter what the input type is. This is why the sorted comparison must be done against a sequence literal.
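The return type can be checked directly. The following is a minimal sketch, assuming only the algorithm module; the letters and ordered names are made up for illustration.

```nim
import algorithm

let letters = ['c', 'a', 'b']                     # An array[3, char]
let ordered = letters.sorted(system.cmp[char])    # sorted returns a seq[char]

doAssert(ordered is seq[char])                    # The result is a sequence, not an array
doAssert(ordered == @['a', 'b', 'c'])             # Hence the comparison against a sequence literal
```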

Consider the system.cmp[int] procedure used in the sort call. Notice the lack of parentheses, (). Without them the procedure isn’t called but is instead passed as a value to the sort procedure. The definition of the system.cmp procedure is actually pretty simple.

Listing 4.17. The definition of the generic cmp procedure
proc cmp*[T](x, y: T): int =            1
  if x == y: return 0
  if x < y: return -1
  else: return 1

doAssert(cmp(6, 5) == 1)                2
doAssert(cmp(5, 5) == 0)                3
doAssert(cmp(5, 6) == -1)               4

  • 1 Defines a new generic cmp procedure taking two parameters and returning an integer
  • 2 The sort procedure expects the specified cmp procedure to return a value that’s larger than 0 when x > y.
  • 3 Whereas when x == y, sort expects cmp to return exactly 0.
  • 4 When x < y, sort expects cmp to return a value less than 0.

The cmp procedure is generic. It takes two parameters, x and y, both of type T. In listing 4.16, when the cmp procedure is passed to the sort procedure the first time, the T is bound to int because int is specified in the square brackets. In listing 4.17, the compiler can infer the T type for you, so there’s no need to specify the types explicitly. You’ll learn more about generics in chapter 8.
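As a small sketch of the two forms described above, the first call below names T explicitly, whereas the other two let the compiler infer it from the arguments. The string comparison is only checked for its sign, because the exact negative value isn’t something to rely on.

```nim
doAssert(cmp[int](6, 5) == 1)          # T is given explicitly in square brackets
doAssert(cmp(6, 5) == 1)               # T is inferred as int from the arguments
doAssert(cmp("apple", "banana") < 0)   # Works for strings too; only the sign matters here
```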

The cmp procedure will work for any type T as long as both the == and < operators are defined for it. The predefined cmp should be enough for most of your use cases, but you can also write your own cmp procedures and pass them to sort.
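Here’s a minimal sketch of a custom comparison procedure. The cmpByLength name is hypothetical and isn’t part of the standard library; it orders strings by their length rather than alphabetically.

```nim
import algorithm

# Hypothetical comparison procedure: orders strings by length, shortest first.
proc cmpByLength(x, y: string): int =
  cmp(x.len, y.len)                    # Reuses the predefined cmp on the two lengths

var names = @["Dexter", "Rita", "Debra", "Harrison"]
names.sort(cmpByLength)                # Passed as a value, without parentheses
doAssert(names == @["Rita", "Debra", "Dexter", "Harrison"])
```

Any procedure that takes two values of the same type and returns a negative number, zero, or a positive number can be passed to sort in the same way.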

The algorithm module includes many other definitions that work on both arrays and sequences. For example, there’s a reverse procedure that reverses the order of the elements of a sequence or array and a fill procedure that fills every position in an array with the specified value. For a full list of procedures, take a look at the algorithm module documentation: http://nim-lang.org/docs/algorithm.html.
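The reverse and fill procedures mentioned above can be sketched as follows. The variable names and values are made up for illustration.

```nim
import algorithm

var numbers = @[1, 2, 3, 4]
numbers.reverse()                      # Reverses the sequence in place
doAssert(numbers == @[4, 3, 2, 1])

var slots: array[3, int]               # Starts out as [0, 0, 0]
slots.fill(7)                          # Every position now holds 7
doAssert(slots == [7, 7, 7])
```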

4.4.4. Other modules

There are many other modules that implement data structures in Nim’s standard library. Before you decide to implement a data structure yourself, take a look at the list of modules in Nim’s standard library (http://nim-lang.org/docs/lib.html). It includes linked lists, queues, ropes, and much more.

There are also many more modules dedicated to manipulating data structures, such as the sequtils module, which includes many useful procedures for manipulating sequences and other lists. These procedures should be familiar to you if you have any previous experience with functional programming. For example, apply allows you to apply a procedure to each element of a sequence, filter returns a new list with elements that have fulfilled a specified predicate, and so on. To learn more about the sequtils module, take a look at its documentation: http://nim-lang.org/docs/sequtils.html.
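A brief sketch of filter and apply follows. It assumes the overload of apply that takes a procedure returning the new value for each element; the numbers and evens names are illustrative only.

```nim
import sequtils

var numbers = @[1, 2, 3, 4, 5, 6]

# filter returns a new sequence containing only the elements that satisfy the predicate.
let evens = numbers.filter(proc (x: int): bool = x mod 2 == 0)
doAssert(evens == @[2, 4, 6])

# apply modifies the sequence in place, replacing each element with the procedure's result.
numbers.apply(proc (x: int): int = x * 2)
doAssert(numbers == @[2, 4, 6, 8, 10, 12])
```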

This section provided some examples of the most useful data structures and algorithms in Nim’s standard library. Let’s now look at modules that allow us to make use of the services an OS provides.

4.5. Interfacing with the operating system

The programs that you create will usually require an OS to function. The OS manages your computer’s hardware and software and provides common services for computer programs.

These services are available through a number of OS APIs, and many of the modules in Nim’s standard library abstract these APIs to provide a single cross-platform Nim API that’s easy to use in Nim code. Almost all of the modules that do so are listed under the “Generic Operating System Services” category in the standard library module list (https://nim-lang.org/docs/lib.html). These modules implement a range of OS services, including the following:

  • Accessing the filesystem
  • Manipulating file and folder paths
  • Retrieving environment variables
  • Reading command-line arguments
  • Executing external processes
  • Accessing the current system time and date
  • Manipulating the time and date

Many of these services are essential to successfully implementing some applications. In the previous chapter, I showed you how to read command-line arguments and communicate with applications over a network. Both of these are services provided by the OS, but communicating with applications over a network isn’t in the preceding list because it has its own category in the standard library. I’ll talk about modules that deal with networks and internet protocols in section 4.7.
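To give a taste of one of the services in the preceding list, here’s a small sketch that sets and retrieves an environment variable using the os module. NIM_TOUR_DEMO is a made-up variable name used only for this example.

```nim
import os

# NIM_TOUR_DEMO is a hypothetical variable name chosen for this sketch.
putEnv("NIM_TOUR_DEMO", "hello")                 # Sets the variable for this process
doAssert(existsEnv("NIM_TOUR_DEMO"))             # The variable is now defined
doAssert(getEnv("NIM_TOUR_DEMO") == "hello")     # And it holds the value just set
echo("Current directory: ", getCurrentDir())     # Output depends on where the program runs
```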

4.5.1. Working with the filesystem

A typical filesystem consists primarily of files and folders. This is something that the three major OSs thankfully agree on, but you don’t need to look far to start seeing differences. Even something as simple as a file path isn’t consistent. Take a look at table 4.2, which shows the file path to a file.txt file in the user’s home directory.

Table 4.2. File paths on different operating systems

Operating system    Path to file in home directory
Windows             C:\Users\user\file.txt
Mac OS              /Users/user/file.txt
Linux               /home/user/file.txt

Note both the different directory separators and the different locations of what’s known as the home directory. This inconsistency proves problematic when you want to write software that works on all three of these OSs.

The os module defines constants and procedures that allow you to write cross-platform code. The following example shows how to create and write to a new file at each of the file paths defined in table 4.2, without having to write different code for each of the OSs.

Listing 4.18. Write "Some Data" to file.txt in the home directory
import os                                    1
let path = getHomeDir() / "file.txt"         2
writeFile(path, "Some Data")                 3

  • 1 The os module defines the getHomeDir procedure as well as the / operator used on the second line.
  • 2 The getHomeDir proc returns the appropriate path to the home directory, depending on the current OS. The / operator is like the & concatenation operator, but it adds a path separator between the home directory and file.txt.
  • 3 The writeFile procedure is defined in the system module. It simply writes the specified data to the file at the path specified.

To give you a better idea of how a path is computed, take a look at table 4.3.

Table 4.3. The results of path-manipulation procedures

Expression                   Operating system    Result
getHomeDir()                 Windows             C:\Users\username\
getHomeDir()                 Mac OS              /Users/username/
getHomeDir()                 Linux               /home/username/
getHomeDir() / "file.txt"    Windows             C:\Users\username\file.txt
getHomeDir() / "file.txt"    Mac OS              /Users/username/file.txt
getHomeDir() / "file.txt"    Linux               /home/username/file.txt

The joinPath procedure

You can use the equivalent joinPath instead of the / operator if you prefer; for example, joinPath(getHomeDir(), "file.txt").
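The equivalence can be checked with a short sketch. It uses relative paths and the DirSep constant from the os module so that the assertions hold regardless of the OS.

```nim
import os

# joinPath and the / operator produce the same result.
doAssert(joinPath("usr", "local") == "usr" / "local")

# Both insert the OS-specific directory separator between the two parts.
doAssert("usr" / "local" == "usr" & DirSep & "local")
```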

The os module includes other procedures for working with file paths including splitPath, parentDir, tailDir, isRootDir, splitFile, and others. The code in listing 4.19 shows how some of them can be used. In each doAssert line, the right side of the == shows the expected result.

Listing 4.19. Path-manipulation procedures
import os                                                                   1
doAssert(splitPath("usr/local/bin") == ("usr/local", "bin"))                2
doAssert(parentDir("/Users/user") == "/Users")                              3
doAssert(tailDir("usr/local/bin") == "local/bin")                           4
doAssert(isRootDir("/"))                                                    5
doAssert(splitFile("/home/user/file.txt") == ("/home/user", "file", ".txt")) 6

  • 1 Imports the os module to access the procedures used next.
  • 2 Splits the path into a tuple containing a head and a tail
  • 3 Returns the path to the parent directory of the path specified
  • 4 Removes the first directory specified in the path and returns the rest
  • 5 Returns true if the specified directory is a root directory
  • 6 Splits the specified file path into a tuple containing the directory, filename, and file extension

The os module also defines the existsDir and existsFile procedures for determining whether a specified directory or file exists (newer Nim releases name these dirExists and fileExists). There are also a number of iterators that allow you to iterate over the files and directories in a specified directory path.

Listing 4.20. Displaying the contents of the home directory
import os                                                        1
for kind, path in walkDir(getHomeDir()):                         2
  case kind                                                      3
  of pcFile: echo("Found file: ", path)                          4
  of pcDir: echo("Found directory: ", path)                      5
  of pcLinkToFile, pcLinkToDir: echo("Found link: ", path)       6

  • 1 Imports the os module to access the walkDir iterator and the getHomeDir procedure
  • 2 Uses the walkDir iterator to go through each of the files in your home directory. The iterator will yield a value whenever a new file, directory, or link is found.
  • 3 Checks what the path variable references: a file, a directory, or a link
  • 4 When the path references a file, displays the message “Found file: “ together with the file path
  • 5 When the path references a directory, displays the message “Found directory: “ together with the directory path
  • 6 When the path references either a link to a file or a link to a directory, displays the message “Found link: “ together with the link path
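The existence checks mentioned before listing 4.20 can be sketched in a similar way. This example assumes the fileExists and dirExists spellings, and it writes to a made-up file name inside the temporary directory so that nothing in your home directory is touched.

```nim
import os

# nim_tour_exists_demo.txt is a hypothetical file name used only for this sketch.
let path = getTempDir() / "nim_tour_exists_demo.txt"

writeFile(path, "Some Data")
doAssert(fileExists(path))             # The file was just created
doAssert(dirExists(getTempDir()))      # The temporary directory itself exists

removeFile(path)                       # Clean up after the check
doAssert(not fileExists(path))
```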

The os module also implements many more procedures, iterators, and types for dealing with the filesystem. The Nim developers have ensured that the implementation is flexible and that it works on all OSs and platforms. The amount of functionality implemented in this module is too large to fully explore in this chapter, so I strongly recommend that you look at the os module’s documentation yourself (http://nim-lang.org/docs/os.html). The documentation includes a list of all the procedures defined in the module, together with examples and explanations of how those procedures can be used effectively.

4.5.2. Executing an external process

You may occasionally want your application to start up another program. For example, you may wish to open your website in the user’s default browser. One important thing to keep in mind when doing this is that the execution of your application will be blocked until the execution of the external program finishes. Executing processes is currently completely synchronous, just like reading standard input, as discussed in the previous chapter.

The osproc module defines multiple procedures for executing a process, and some of them are simpler than others. The simpler procedures are very convenient, but they don’t always allow much customization regarding how the external process should be executed, whereas the more complex procedures do provide this.

The simplest way to execute an external process is using the execCmd procedure. It takes a command as a parameter and executes it. After the command completes executing, execCmd returns the exit code of that command. The standard output, standard error, and standard input are all inherited from your application’s process, so you have no way of capturing the output from the process.
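As a minimal sketch of execCmd, the following runs an echo command and displays the exit code. The command itself is only an example; on Windows it’s run through cmd /C because echo is built into the command interpreter.

```nim
import osproc

when defined(windows):
  let exitCode = execCmd("cmd /C echo Hello from a child process")
else:
  let exitCode = execCmd("echo Hello from a child process")

# The child's output went straight to your terminal; only the exit code is returned.
echo("Exit code: ", exitCode)
```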

The execCmdEx procedure is almost identical to the execCmd procedure, but it returns both the exit code of the process and the output. The following listing shows how it can be used.

Listing 4.21. Using execCmdEx to determine some information about the OS
import osproc                               1
when defined(windows):                      2
  let (ver, _) = execCmdEx("cmd /C ver")    3
else:
  let (ver, _) = execCmdEx("uname -sr")     4
echo("My operating system is: ", ver)       5

  • 1 Imports the osproc module where the execCmdEx proc is defined
  • 2 Checks whether this Nim code is being compiled on Windows
  • 3 If this Nim code is being compiled on Windows, executes cmd /C ver using execCmdEx and unpacks the tuple it returns into two variables
  • 4 If this Nim code is not being compiled on Windows, executes uname -sr using execCmdEx and unpacks the tuple it returns into two variables
  • 5 Displays the output from the executed command

You can compile and run this application and see what’s displayed. Figure 4.7 shows the output of listing 4.21 on my MacBook.

Figure 4.7. The output of listing 4.21

Keep in mind that this probably isn’t the best way to determine the current OS version.

Getting the current OS

There’s an osinfo package available online that uses the OS API directly to get OS information (https://github.com/nim-lang/osinfo).

Listing 4.21 also shows the use of an underscore as one of the identifiers in the unpacked tuple; it tells the compiler that you’re not interested in a part of the tuple. This is useful because it removes warnings the compiler makes about unused variables.

That’s the basics of executing processes using the osproc module, together with a bit of new Nim syntax and semantics. The osproc module contains other procedures that allow for more control of processes, including writing to the process’s standard input and running more than one process at a time. Be sure to look at the documentation for the osproc module to learn more.

The compile-time if statement

In Nim, the when statement (introduced in chapter 2) is similar to an if statement, with the main difference being that it’s evaluated at compile time instead of at runtime.

In listing 4.21, the when statement is used to determine the OS for which the current module is being compiled. The defined procedure checks at compile time whether the specified symbol is defined. When the code is being compiled for Windows, the windows symbol is defined, so the code immediately under the when statement is compiled, whereas the code in the else branch is not. On other OSs, the code in the else branch is compiled and the preceding code is ignored.

The scope rules for when are also a bit different from those for if. A when statement doesn’t create a new scope, which is why it’s possible to access the ver variable outside it.
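The scope behavior can be seen in isolation in the following sketch. The newline constant is a made-up name; hostOS is a constant from the system module that holds the name of the OS the code is being compiled for.

```nim
when defined(windows):
  const newline = "\r\n"     # Only this branch is compiled on Windows
else:
  const newline = "\n"       # Only this branch is compiled elsewhere

# newline is visible here because when doesn't open a new scope.
echo("Newline length on ", hostOS, ": ", newline.len)
```

Writing the same code with an if statement wouldn’t compile, because the constant would be confined to the branch that declares it.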

4.5.3. Other operating system services

There are many other modules that allow you to use the services provided by OSs, and they’re part of the “Generic Operating System Services” category of the standard library. Some of them will be used in later chapters; others, you can explore on your own. The documentation for these modules is a good resource for learning more: http://nim-lang.org/docs/lib.html#pure-libraries-generic-operating-system-services

4.6. Understanding and manipulating data

Every program deals with data, so understanding and manipulating it is crucial. You’ve already learned some ways to represent data in Nim, both in chapter 2 and earlier in this chapter.

The most-used type for representing data is the string type, because it can represent just about any piece of data. An integer can be represented as "46", a date as "June 26th", and a list of values as "2, Bill, King, Programmer".

Your programs need a way to understand and manipulate this data, and parsers can help with this. A parser will look at a value, in many cases a text value of type string, and build a data structure out of it. There is the possibility of the value being incorrect, so a parser will check for syntax errors while parsing the value.
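The simplest parser you’ve already met is parseInt from the strutils module. The following sketch shows both outcomes: a well-formed value is turned into an int, whereas a malformed one raises a ValueError. The failed variable is made up for illustration.

```nim
import strutils

doAssert(parseInt("46") == 46)         # A valid integer string is parsed successfully

var failed = false
try:
  discard parseInt("June 26th")        # Not an integer, so parsing is rejected
except ValueError:
  failed = true
doAssert(failed)
```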

The Nim standard library is full of parsers. There are so many of them that there’s a full category named “Parsers.” The parsers available in the standard library can parse the following: command-line arguments, configuration files in the .ini format, XML, JSON, HTML, CSV, SQL, and much more. You saw how to use the JSON parser in chapter 3; in this section, I’ll show you how to use some of the other parsers.

The names of many of the modules that implement parsers begin with the word parse, such as parseopt and parsexml. Some of them have modules that implement a more intuitive API on top of them, such as these XML parsers: xmldom, xmltree, xmldomparser, and xmlparser. The latter two modules create a tree-like data structure out of the parsexml module’s output. The former two modules are then used to manipulate the tree-like data structures. The xmldom module provides a web DOM–like API, whereas the xmltree module provides a more idiomatic Nim API. The json module defines both a high-level API for dealing with JSON objects and a low-level parser that parses JSON and emits objects that represent the current data being parsed.

4.6.1. Parsing command-line arguments

Describing how each of these modules can be used for parsing would require its own chapter. Instead, I’ll present a specific data-parsing problem and show you some ways that this problem can be solved using the modules available in Nim’s standard library.

The problem we’ll look at is the parsing of command-line arguments. In chapter 3, you retrieved command-line arguments using the paramStr() procedure, and you used the returned string value directly. This worked well because the application didn’t support any options or flags.

Let’s say you want the application to support an optional port flag on the command line—one that expects a port number to follow. You may, for example, be writing a server application and want to give the user the option to select the port on which the server will run. Executing an application called parsingex with such an argument would look like this: ./parsingex --port=1234. The --port=1234 part can be accessed with a paramStr() procedure call, as follows.

Listing 4.22. Retrieving command-line arguments using paramStr
import os                             1

let param1 = paramStr(1)              2

  • 1 Imports the os module, which defines the paramStr procedure
  • 2 The command-line argument at index 1 will be equal to “--port=1234”, assuming the application is executed as in the preceding discussion.

Now you’ve got a string value in the param1 variable that contains both the flag name and the value associated with it. How do you extract those and separate them?

There are many ways, some less valid than others. I’ll show you a couple of ways, and in doing so I’ll show you many different ways that the string type can be manipulated and understood by your program.

Let’s start by taking a substring of the original string value with the substr procedure defined in the system module. It takes a string value, a start index, and an end index, with both indexes represented as integers. It then returns a new copy of the string, starting at the first index specified and ending at the end index.

More ways to manipulate strings

Nim strings can be modified at runtime because they’re mutable, which means they can be modified in place, without the need to allocate a new copy of the string. You can use the add procedure to append characters and other strings to them, and delete (defined in the strutils module) to delete characters from them.
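A minimal sketch of in-place modification using add follows. The flag variable is made up for illustration, and it must be declared with var so that it can be modified.

```nim
var flag = "--port"
flag.add('=')                          # Appends a single character
flag.add("1234")                       # Appends another string
doAssert(flag == "--port=1234")
```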

Listing 4.23. Parsing the flag using substr
import os

let param1 = paramStr(1)
let flagName = param1.substr(2, 5)          1
let flagValue = param1.substr(7)            2

  • 1 Gets the substring of param1 from index 2 to index 5. This will result in “port”.
  • 2 Gets the substring of param1 from index 7 to the end of the string. This will result in “1234”.

Figure 4.8 shows how the indexes passed to substr determine which substrings are returned.

Figure 4.8. The substr procedure

The slice operator

A series of two dots, otherwise known as the .. operator, can be used to create a Slice object. A Slice can then be fed into the [] operator, which will return a substring. This is similar to the substr procedure, but it supports reverse indexes using the ^ operator.

doAssert("--port=1234"[2 .. 5] == "port")      1
doAssert("--port=1234"[7 .. ^1] == "1234")     2
doAssert("--port=1234"[7 .. ^3] == "12")       3

  • 1 Same as using substr(2, 5); returns a substring from index 2 to index 5
  • 2 Returns a substring from index 7 to the end of the string. The ^ operator counts back from the end of the string.
  • 3 Returns a substring from index 7 to the end of the string minus 2 characters

The code in listing 4.23 will work, but it is not very flexible. You might wish to support other flags, and to do that you will need to duplicate the code and change the indices.

In order to improve this, you can use the strutils module, which contains many definitions for working with strings. For example, toUpperAscii and toLowerAscii convert each character in a string to upper- or lowercase, respectively.[3] parseInt converts a string into an integer, startsWith determines whether a string starts with another string, and there are many others.

[3] The procedures are named this way because they don’t support Unicode characters. To get Unicode support, you should use the toUpper and toLower procedures defined in the unicode module.
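The strutils procedures named above can be tried out with a few assertions. This is only a sketch of typical calls; the input strings are made up.

```nim
import strutils

doAssert("port".toUpperAscii() == "PORT")       # Converts each character to uppercase
doAssert("PORT".toLowerAscii() == "port")       # Converts each character to lowercase
doAssert(parseInt("1234") == 1234)              # Converts a string into an integer
doAssert("--port=1234".startsWith("--"))        # Checks the beginning of a string
```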

There’s a specific procedure that can help you split up the flag string properly, and it’s called split.

Listing 4.24. Parsing the flag using split
import os, strutils                        1

let param1 = paramStr(1)
let flagSplit = param1.split('=')          2
let flagName = flagSplit[0].substr(2)      3
let flagValue = flagSplit[1]               4

  • 1 Imports the strutils module, where the split procedure is defined
  • 2 Separates the param1 string value into multiple different strings at the location where an “=” character occurs. The split procedure returns a sequence of strings, in this case @[“--port”, “1234”].
  • 3 Grabs the first string in the sequence returned by split and removes the first two characters
  • 4 Grabs the second string in the sequence returned by split

This is still poor-man’s parsing, but it does work. There’s no error handling, but the code should work for many different flags. But what happens when requirements change? Say, for example, one of your users prefers to separate the flag name from the value using the : symbol. This change is easy to implement because the split procedure accepts a set[char], so you can specify {'=', ':'} and the string will be split on both = and :.
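The change just described can be sketched as follows, using literal strings in place of paramStr(1) so that the result is easy to check.

```nim
import strutils

# Passing a set[char] splits the string on any of the characters in the set.
doAssert("--port=1234".split({'=', ':'}) == @["--port", "1234"])
doAssert("--port:1234".split({'=', ':'}) == @["--port", "1234"])
```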

The split procedure works very well for parsing something as simple as this example, but I’m sure you can imagine cases where it wouldn’t be a good choice. For example, if your requirements change so that the flag name can now contain the = character, you’ll run into trouble.
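To illustrate the kind of trouble an extra = causes, the following sketch splits a made-up --define flag whose value contains the separator. The maxsplit parameter offers a partial workaround for values, although it doesn’t help when the = appears in the flag name itself.

```nim
import strutils

# The hypothetical --define flag is split into three pieces instead of two.
let parts = "--define=key=value".split('=')
doAssert(parts == @["--define", "key", "value"])

# Limiting the number of splits keeps everything after the first = together.
let limited = "--define=key=value".split('=', maxsplit = 1)
doAssert(limited == @["--define", "key=value"])
```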

We’ll stop here for now. You’ll learn more about parsing in chapter 6, where you’ll see how to use the parseutils module to perform more-advanced parsing.

Thankfully, you don’t need to parse command-line arguments like this yourself. As I mentioned previously, the Nim standard library contains a parseopt module that does this for you. The following listing shows how it can be used to parse command-line arguments.

Listing 4.25. Parsing the flag using parseopt
import parseopt                                                        1

for kind, key, val in getOpt():                                        2
  case kind                                                            3
  of cmdArgument:                                                      4
    echo("Got a command argument: ", key)
  of cmdLongOption, cmdShortOption:                                    5
    case key
    of "port": echo("Got port: ", val)
    else: echo("Got another flag --", key, " with value: ", val)
  of cmdEnd: discard                                                   6

  • 1 Imports the parseopt module, which defines the getOpt iterator
  • 2 Iterates over each command-line argument. The getOpt iterator yields three values: the kind of argument that was parsed, the key, and the value.
  • 3 Checks the kind of argument that was parsed
  • 4 If a plain argument (one that isn’t a flag) was parsed, displays it
  • 5 If a long or short flag was parsed, checks if it’s --port and displays a specific message if it is, showing the port value. Otherwise, displays a generic message showing the flag name and value.
  • 6 The command-argument parsing has ended, so this line does nothing.

This code is a bit more verbose, but it handles errors, supports other types of flags, and goes through each command-line argument. This parser is quite tedious, and, unfortunately, the standard library doesn’t contain any modules that build on top of it. There are many third-party modules that make the job of parsing and retrieving command-line arguments much easier, and these are available through the Nimble package manager, which I’ll introduce in the next chapter.

Compile and run the code in listing 4.25. Try to pass different command-line arguments to the program and see what it outputs.
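If you’d rather experiment without retyping arguments in the terminal, the parser can also be given a command line as a string. The following is a sketch under that assumption; the port and files variables and the input.txt argument are made up, and parseInt is used to turn the flag’s value into a number.

```nim
import parseopt, strutils

var port = 0
var files: seq[string] = @[]

# A fixed command line is parsed here instead of the program's real arguments.
var parser = initOptParser("--port=1234 input.txt")
for kind, key, val in parser.getopt():
  case kind
  of cmdLongOption, cmdShortOption:
    if key == "port": port = parseInt(val)   # The value arrives as a string
  of cmdArgument:
    files.add(key)                           # Plain arguments arrive in key
  of cmdEnd: discard

doAssert(port == 1234)
doAssert(files == @["input.txt"])
```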

This section should have given you some idea of how you can manipulate the most common and versatile type: the string. I’ve talked about the different parsing modules available in Nim’s standard library and showed you how one of them can be used to parse command-line arguments. I also introduced you to the strutils module, which contains many useful procedures for manipulating strings. Be sure to check out its documentation and the documentation for the other modules later.

4.7. Networking and the internet

The Nim standard library offers a large selection of modules that can be used for networking. You’ve already been introduced to the asynchronous event loop and the asynchronous sockets defined in the asyncdispatch and asyncnet modules, respectively. These modules provide the building blocks for many of the modules in the standard library’s “Internet Protocols and Support” category.

The standard library also includes the net module, which is the synchronous equivalent of the asyncnet module. It contains some procedures that can be used for both asynchronous and synchronous sockets.

The more interesting modules are the ones that implement certain internet protocols, such as HTTP, SMTP, and FTP.[4] The modules that implement these protocols are called httpclient, smtp, and asyncftpclient, respectively. There’s also an asynchttpserver module that implements a high-performance HTTP server, allowing your Nim application to serve web pages to clients such as your web browser.

[4] For details on HTTP, SMTP, and FTP, be sure to view their respective Wikipedia articles.

The main purpose of the httpclient module is to request resources from the internet. For example, the Nim website can be retrieved as follows.

Listing 4.26. Requesting the Nim website using the httpclient module
import asyncdispatch                                         1
import httpclient                                            2

let client = newAsyncHttpClient()                            3
let response = waitFor client.get("http://nim-lang.org")     4
echo(response.version)                                       5
echo(response.status)                                        6
echo(waitFor response.body)                                  7

  • 1 The asyncdispatch module defines an asynchronous event loop that’s necessary to use the asynchronous HTTP client. It defines the waitFor procedure, which runs the event loop.
  • 2 The httpclient module defines the asynchronous HTTP client and related procedures.
  • 3 Creates a new instance of the AsyncHttpClient type
  • 4 Requests the Nim website using HTTP GET, which retrieves the website. The waitFor procedure runs the event loop until the get procedure is finished.
  • 5 Displays the HTTP version that the server responded with (likely, “1.1”)
  • 6 Displays the HTTP status that the server responded with. If the request is successful, it will be “200 OK”.
  • 7 Displays the body of the response. If the request is successful, this will be the HTML of the Nim website.

The code in listing 4.26 will work for any resource or website. Because the Nim website is now served over SSL, you’ll need to compile listing 4.26 with the -d:ssl flag in order to enable SSL support.

These modules are all fairly simple to use. Be sure to check out their documentation for details about the procedures they define and how those procedures can be used.

There may be protocols that the standard library misses, or custom protocols that you’d like to implement yourself. A wide range of networking protocols has been implemented as libraries outside the standard library by other Nim developers. They can be found using the Nimble package manager, which you’ll learn about in the next chapter.

4.8. Summary

  • A library is a collection of modules; modules, in turn, implement a variety of behaviors.
  • Identifiers in Nim are private by default and can be exported using *.
  • Modules are imported into the importing module’s global namespace by default.
  • The from module import x syntax can be used to selectively import identifiers from a module.
  • The standard library is organized into pure, impure, and wrapper categories.
  • The system module is imported implicitly and contains many commonly used definitions.
  • The tables module implements a hash table that can be used to store a mapping between keys and values.
  • The algorithm module defines a sort procedure that can be used for sorting arrays and sequences.
  • The os module contains many procedures for accessing the computer’s filesystem.
  • Web pages can be retrieved using the httpclient module.