The previous ten chapters cover enough of D, its standard library, and its ecosystem to serve as a guide for implementing a variety of applications and libraries in D. The language and library features covered were either fundamental, such as those discussed in Chapter 2, Building a Foundation with D Fundamentals, and Chapter 3, Programming Objects the D Way, or used so frequently that they are encountered on a regular basis in D libraries, tutorials, and example code. A number of features were not covered, either because they do not fit into the categories of fundamental and frequently used, or because they aren't quite ready for prime time.
This chapter introduces several of the language and library features that were not covered elsewhere in the book. None of the features here are given in-depth coverage, only enough to provide a general overview of each. Consider this chapter a platform from which to launch further exploration of the D programming language to improve your knowledge and experience. The topics we'll be looking at in this chapter are organized in no particular order.
Once upon a time, multithreaded programming was the exclusive realm of people with pointy hats who uttered strange incantations. Mere mortals fell victim to the evils of race conditions and deadlocks too easily. Yet, in this age of multi-core processors, the arcane is on the verge of becoming the mundane. D's multifaceted support of concurrency is oriented toward giving programmers the tools to make it so.
The traditional model of multithreaded programming, lock-based synchronization and data sharing, began to fall out of favor even before multicore processors came along. Such code is difficult to properly implement, test, and maintain. Other models, such as thread-per-system, thread-per-task, and message passing, improved the situation, making it easier to design frameworks that hide the nasty details behind an interface that appears single-threaded. Recently, it has become easier to implement loops that operate on data in parallel in some languages through built-in support, libraries, and compiler extensions. As software gains access to more and more cores, both on the CPU and the GPU, this latter model becomes more important. D comes with support for each of these models, spread across the language, the runtime, and the standard library. This section presents a brief introduction to all of the support for concurrent programming in D, with suggestions on where to go to learn more.
The heart of any concurrent programming model in D is the Thread class found in the DRuntime package core.thread. D's threads are heavyweight, meaning they map to kernel threads managed by the operating system. They carry all the baggage that comes from each thread having its own context that needs to be activated when a thread is given its time slice. A more lightweight option is the Fiber class. Not only do fibers carry around less baggage, but their execution is also managed by the program rather than the operating system.
You can use the Thread class to spawn new threads. Even in single-threaded programs, its static methods, such as sleep or yield, can be called to affect the execution of the current thread. New threads can be created either by subclassing Thread or by creating a Thread instance directly. In both cases, a function that returns void and takes no arguments can be passed to the Thread constructor in the form of a delegate or function pointer.
import core.thread;
import std.stdio;

class MyThread : Thread {
    this() { super(&run); }

    private void run() {
        writeln("MyThread is running.");
    }
}

void myThreadFunc() {
    writeln("myThreadFunc is running.");
}

void main() {
    auto myThread1 = new MyThread;
    auto myThread2 = new Thread(&myThreadFunc);
    myThread1.start();
    myThread2.start();
}
There are C libraries and frameworks that provide a platform-agnostic way to create threads. Sometimes, such as when the C library requires the use of a custom thread handle for certain functions, it is necessary to use the foreign API to create new threads. This usually requires a pointer to a function that the new thread will call when it is executed. Any such threads should usually be registered with DRuntime inside the thread function by calling thread_attachThis.
extern(C) void threadFunc(void* data) {
    import core.thread : thread_attachThis, thread_detachThis;
    thread_attachThis();
    scope(exit) thread_detachThis();
}
Registering foreign threads with DRuntime is necessary to ensure that all required thread-local initialization is done. It's also important if the thread touches GC-managed memory. Before the GC scans any particular block of memory, it pauses the execution of all active threads. If a thread has not been registered with DRuntime, then the GC can't pause it. For this reason, you should always prefer to use the Thread class to create new threads in D, even when using C libraries. Threads should be created through third-party APIs only in the rare cases when it is unavoidable. Foreign threads should always be registered with DRuntime if they touch anything on the D side.
A Fiber, an implementation of a coroutine, can be spawned in the same manner as a Thread: by subclassing or by instantiation with a function pointer. The delegate or function pointer associated with a fiber is executed when the call member function is invoked. Execution happens in the calling thread, which is blocked until the yield function is called, as shown in the following example:
import core.thread;
import std.stdio;

void myFiberFunc() {
    writeln("Execution begun.");
    Fiber.yield();
    writeln("Execution resumed.");
}

void main() {
    auto fiber = new Fiber(&myFiberFunc);
    fiber.call();
    writeln("Execution paused.");
    fiber.call();
}
Always keep in mind the difference between a fiber and a thread. A Thread instance represents a system resource. Each system thread has its own copies of thread-local data, so any mutations of such data through the run member function of a Thread instance happen on local copies and will not be visible in other threads. Non-thread-local mutable data should be protected through synchronization primitives. A Fiber instance does not represent a system resource, meaning it does not have its own copies of thread-local data. If there is any possibility that multiple threads can run the call function on a Fiber instance, then care must be taken to synchronize access to all data that can be accessed through that function. As long as the same thread executes the function every time, synchronization is only an issue with data that is not thread-local. We'll see a bit about synchronization in D shortly.
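To make the distinction concrete, here's a minimal sketch (the variable name tlsCounter is my own, not from the standard library) showing that a fiber executes on whichever thread invokes call, and so mutates that thread's copy of thread-local data directly:

```d
import core.thread : Fiber;
import std.stdio : writeln;

// Thread-local by default; the fiber below mutates the copy
// belonging to whichever thread calls fiber.call().
int tlsCounter;

void main() {
    auto fiber = new Fiber({
        // Runs in the calling thread, so this touches
        // that thread's tlsCounter directly.
        ++tlsCounter;
        Fiber.yield();
        ++tlsCounter;
    });

    fiber.call();
    assert(tlsCounter == 1); // first call ran up to the yield
    fiber.call();
    assert(tlsCounter == 2); // second call resumed and finished
    writeln("tlsCounter: ", tlsCounter);
}
```

Because main's thread makes both calls, both increments land on main's tlsCounter; had a different thread made the second call, that thread's copy would have been incremented instead.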
As we know from Chapter 2, Building a Foundation with D Fundamentals, all variables declared in D are thread-local by default, meaning each thread has its own copy of each variable. We've also seen brief mentions of the shared and __gshared attributes. Fundamentally, they both achieve the same end in that they flag a variable as being outside of thread-local storage, meaning it is shared by all threads. Other than that, they are quite different, each coming with its own guarantees and consequences.
Applying __gshared to a module-scope variable in D is essentially the same as declaring any variable in C. It is entirely up to the programmer to ensure that access to the variable by multiple threads is properly guarded. The same holds true for member variables of aggregate types, with the added side effect that such variables are also static. For example, the declarations of shared1 and shared2 in the following snippet are equivalent:
class SharedMembers {
    __gshared static int shared1;
    __gshared int shared2;
}
__gshared is a necessity when declaring variables in C library bindings, but it should otherwise be a rarity in normal D code.
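As a sketch of why __gshared matters for bindings, consider a hypothetical C global named c_error_code (an invented name, shown only in a comment); without __gshared, D would place the D-side declaration in thread-local storage, and it would no longer match the single global the C library defines. The runnable part below demonstrates the same property with a D-only variable:

```d
// Hypothetical C header declares: extern int c_error_code;
// A D binding would need both C linkage and __gshared so the
// variable lives outside thread-local storage, just as in C:
// extern(C) extern __gshared int c_error_code;

import core.thread : Thread;

// Without __gshared, each thread would see its own copy of this
// variable; with it, there is exactly one instance.
__gshared int globalCounter;

void main() {
    auto t = new Thread({ globalCounter = 42; });
    t.start();
    t.join();
    // The write from the other thread is visible here because
    // globalCounter is outside thread-local storage.
    assert(globalCounter == 42);
}
```

Note that nothing here is synchronized; as the text says, with __gshared the programmer is entirely responsible for guarding concurrent access (the join call is what makes this particular read safe).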
There are a few things to be aware of when applying the shared attribute to a variable. First, it must be understood that shared modifies the type.
int tlsVar;           // type == int
shared int sharedVar; // type == shared(int)
This has consequences in how shared variables are used as function arguments and assigned to other variables. While value types convert just fine, this does not hold for reference types or pointers; for example, a shared(int)* does not implicitly convert to int*.
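A quick sketch of the pointer case (the function names are my own):

```d
void takesPlain(int* p) {}
void takesShared(shared(int)* p) {}

void main() {
    shared int sharedVar;

    takesShared(&sharedVar);   // fine: &sharedVar is shared(int)*
    // takesPlain(&sharedVar); // Error: cannot implicitly convert
                               // shared(int)* to int*

    // Value types copy just fine; the data itself is duplicated,
    // so nothing ends up shared by accident.
    int copy = sharedVar;
    assert(copy == 0);
}
```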
Second, shared is transitive. Applying shared to an instance of an aggregate type means all of its members are also shared.
struct ShareMe {
    int* intPtr;
}

void main() {
    shared ShareMe sm;
    int x;
    sm.intPtr = &x; // Error!
}
Here, sm.intPtr = &x fails because &x yields int*, not shared(int*), which is the type of sm.intPtr thanks to the declaration of sm as shared.
Third, the compiler prohibits any unprotected, non-atomic modification of a shared variable. In the following snippet, the second line is illegal:
shared int sharedInt;
++sharedInt;
In this case, the error can be avoided by using a template function from the core.atomic module in DRuntime.
import core.atomic : atomicOp;
atomicOp!"+="(sharedInt, 1);
As I write, ++sharedInt does not result in a compiler error. Instead, the compiler outputs the following message:

Deprecation: read-modify-write operations are not allowed for shared variables. Use core.atomic.atomicOp!"+="(sharedInt, 1) instead.

The code will still compile and the program will execute, but there's a good chance for a race condition to appear. At some point, this will become a compiler error. For now, it's necessary to pay attention to the compiler output to ensure that this sort of thing doesn't slip into any code using shared variables.
Synchronization goes hand-in-hand with data sharing. Without the means to protect a variable from simultaneous access by multiple threads, strange things can happen (note that there is no need to protect data from multiple fibers; Fiber instances are no different from any other class instance in that regard). Another option, as seen in the previous section, is to perform modifications of variables atomically, in one step, where possible. D has support for synchronization both in the language and in the runtime, and for atomic operations in the runtime.
The synchronized statement creates a scope in which all variable accesses are protected by a mutex. When the scope is entered, the mutex is acquired (locked). When the scope is exited, the mutex is released.
private int _someInt;

void setSomeInt(int newVal) {
    synchronized {
        _someInt = newVal;
    }
}
The compiler will allocate a new mutex object specifically for each synchronized block. This behavior can be overridden by providing any expression that yields a class or interface instance for the synchronized statement to use. Every class instance has its own mutex, which the compiler will use instead of allocating a new one. That said, it's considered good practice to use an instance of core.sync.mutex.Mutex.
import core.sync.mutex;

auto mutex = new Mutex;
synchronized(mutex) { ... }
synchronized can be applied to class (but never struct) declarations. Doing so makes every member function of that class synchronized and causes the mutex associated with each instance of the class to be used as the monitor, meaning that it's equivalent to adding a synchronized(this) statement inside every member function of the class. With this, only shared instances of the class can be instantiated and all member function calls will be serialized.
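Here's a minimal sketch of a synchronized class (the Counter type is my own example, not a library type):

```d
import std.stdio : writeln;

synchronized class Counter {
    private int _count;

    void increment() {
        // Written as a separate read and write rather than ++_count
        // to sidestep the read-modify-write restriction on shared
        // data; the instance's mutex is already held in here.
        _count = _count + 1;
    }

    int count() { return _count; }
}

void main() {
    // Only shared instances of a synchronized class can be created.
    auto counter = new shared(Counter);
    counter.increment();
    counter.increment();
    writeln(counter.count()); // 2
}
```

Each call to increment or count acquires the instance's mutex on entry and releases it on exit, so concurrent callers are serialized automatically.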
As I write, there are two issues to watch out for regarding synchronized classes. One is public member variables. Right now, it's possible to declare them in a synchronized class, but this can be problematic if they are mutable as it allows for non-synchronized mutation. It is expected that this will be deprecated at some point.
The second is that the documentation at http://dlang.org/class.html#synchronized-classes says the following:
"Member functions of non-synchronized classes cannot be individually marked as synchronized. The synchronized attribute must be applied to the class declaration itself."
In practice, the compiler actually does allow synchronized to be applied to individual member functions. Again, instances of the class must be declared as shared. It is unlikely that this will change, as it is certain to break code in active projects. One such project is DWT, a port to D of the SWT library for Java (see https://github.com/d-widget-toolkit/dwt).
The DRuntime package core.sync contains several modules that expose primitives that can be used to manually implement synchronization for different behaviors. The package includes two types of mutexes: a generic recursive mutex in the mutex module, and a mutex that allows for shared read access and exclusive write access in the rwmutex module. Additionally, the modules condition, semaphore, and barrier provide eponymous primitives. If you're looking to implement lock-based data sharing yourself, this is a good place to start.
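As a small taste of the package, here's a sketch using the semaphore module to make one thread wait on another (the variable names and messages are my own):

```d
import core.sync.semaphore : Semaphore;
import core.thread : Thread;
import std.stdio : writeln;

__gshared bool setupDone;

void main() {
    auto sem = new Semaphore(0); // start with zero permits

    auto worker = new Thread({
        setupDone = true; // pretend some setup work happens here
        sem.notify();     // release one permit
    });
    worker.start();

    sem.wait(); // blocks until the worker calls notify
    assert(setupDone);
    writeln("Main: worker setup complete.");
    worker.join();
}
```

The semaphore's notify/wait pairing is what makes the worker's write to setupDone visible and ordered before the main thread's read.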
An atomic operation is one that appears to happen instantaneously. Such operations are safe in multithreaded programming because there is an inherent guarantee that only one thread can perform the operation at a time, meaning that no locks are required. The core.atomic module in DRuntime provides a handful of functions that allow for lock-free concurrency. Earlier, we observed how to use the template function atomicOp to convert the non-atomic operation of adding 1 to a shared(int) into an atomic one. Other functions in the module allow for atomic loads and stores, atomic compare-and-swap (cas), and atomic memory barriers (memory fences).
When using atomic operations, it's important to have a good grasp of memory ordering. Members of the enumeration core.atomic.MemoryOrder can be used with the atomicLoad and atomicStore functions to specify the type of memory barrier instruction the CPU should use in carrying out the operation. Although it's a talk related to C++, a good place to start is Herb Sutter's two-part talk from C++ and Beyond 2012, titled atomic<> Weapons: The C++ Memory Model and Modern Hardware, at https://isocpp.org/blog/2013/02/atomic-weapons-the-c-memory-model-and-modern-hardware-herb-sutter.
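A brief sketch of these functions in action (the values are arbitrary):

```d
import core.atomic : atomicLoad, atomicStore, cas, MemoryOrder;

void main() {
    shared int value;

    // Atomic store and load; the MemoryOrder template argument
    // selects the barrier semantics, defaulting to sequential
    // consistency when omitted.
    atomicStore(value, 10);
    assert(atomicLoad!(MemoryOrder.acq)(value) == 10);

    // Compare-and-swap: writes 20 only if value currently holds 10.
    assert(cas(&value, 10, 20));
    assert(atomicLoad(value) == 20);

    // This attempt fails because value no longer holds 10.
    assert(!cas(&value, 10, 30));
}
```

The cas return value tells you whether the swap happened, which is the building block for lock-free retry loops.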
Phobos provides foundational support for the message passing model of concurrent programming in the std.concurrency module. This is the preferred way of handling concurrency in D; you should only turn to other models if std.concurrency doesn't meet your needs. This module hides most of the raw details of concurrent programming behind a simplified API; rather than manipulating the Thread class directly, programs call std.concurrency.spawn and get a Tid (thread ID) in return that is then used as a marker to identify messages sent and received between threads.
import std.concurrency;
import std.stdio;

void myThreadFunc(Tid owner) {
    receive(
        (string s) { writefln("Message to thread %s: %s", owner, s); }
    );
}

void main() {
    auto child1 = spawn(&myThreadFunc, thisTid);
    auto child2 = spawn(&myThreadFunc, thisTid);
    send(child1, "Message for child1.");
    send(child2, "Message for child2.");
}
Here, two new threads are created by passing a pointer to myThreadFunc and the result of thisTid, which returns the Tid of the current thread, to spawn. Then the parent thread sends a message to each child. The send function takes a Tid followed by any number of parameters of any type. The receive function is a template that takes any number of delegates as parameters, each of which can itself have different parameters and return types. The delegates are registered with the owning thread as message handlers; when a message is received, all of the registered delegates are searched to see if any has a parameter list that matches the parameters sent via the send function. In this example, one handler that accepts a string is registered for each child thread.
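To see the handler matching in action, here's a sketch (the function and message contents are my own) where one worker registers two handlers and the parent sends messages of two different types; each incoming message dispatches to whichever handler's parameter list matches its types:

```d
import std.concurrency;
import std.stdio;

void worker() {
    // Two messages are expected; for each one, receive invokes
    // the handler whose parameter types match the message.
    foreach(i; 0 .. 2) {
        receive(
            (int n)    { writeln("Got an int: ", n); },
            (string s) { writeln("Got a string: ", s); }
        );
    }
}

void main() {
    auto child = spawn(&worker);
    send(child, 42);
    send(child, "hello");
}
```

Messages are delivered in the order they were sent, so the int handler fires first here, then the string handler.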
It's notable that std.concurrency deals in logical threads. In other words, a Tid may represent an actual Thread, or it may represent a Fiber. By default, spawn creates new kernel threads, but it's possible to implement a Scheduler, such as the example std.concurrency.FiberScheduler, that causes spawn to create new fibers instead.
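A minimal sketch of swapping in the FiberScheduler (the fiberWorker name and message are my own); once the scheduler is installed, the same spawn/send/receive calls operate on cooperatively scheduled fibers within the current thread:

```d
import std.concurrency;
import std.stdio;

void fiberWorker() {
    // Blocks this fiber, yielding control back to the scheduler
    // until a string message arrives.
    auto msg = receiveOnly!string();
    writeln("Fiber got: ", msg);
}

void main() {
    // Install the fiber-based scheduler, then run a delegate under
    // it; spawn now creates fibers instead of kernel threads.
    scheduler = new FiberScheduler;
    scheduler.start({
        auto child = spawn(&fiberWorker);
        send(child, "hello");
    });
}
```

start returns once all of the fibers it is multiplexing have completed, so no explicit join is needed.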
std.concurrency contains variations of spawn, send, and receive, as well as utility functions and types, which can be used as a foundation for a higher-level message passing API.
When processing large amounts of data, one way to utilize the power of multi-core processors is to break the data into chunks and process each chunk in parallel. D has support for this in the form of the Phobos module std.parallelism.
The module is built around the Task and TaskPool types, with a few helper functions to make things more convenient. A Task represents a unit of work. A TaskPool maintains a queue of tasks and a number of worker threads. Member functions of TaskPool can be called to process and apply algorithms to the data in the task queue. For example, the member functions map and reduce perform the same operations as their std.algorithm counterparts, but do so across multiple threads in parallel. Another interesting member function of TaskPool is parallel, which allows the execution of a parallel foreach loop. There is a convenience function, also called parallel, which uses the default TaskPool instance. The following example scales 100 million two-dimensional vectors. When compiled with -version=SingleThread, it all happens on one thread.
struct Vec2 {
    float x = 1.0f, y = 2.0f;
}

void main() {
    import std.stdio : writeln;
    import std.datetime : MonoTime;

    auto vecs = new Vec2[](100_000_000);
    auto before = MonoTime.currTime;

    version(SingleThread) {
        foreach(ref vec; vecs) {
            vec.x *= 2.0f;
            vec.y *= 2.0f;
        }
    }
    else {
        import std.parallelism : parallel;
        foreach(ref vec; parallel(vecs)) {
            vec.x *= 2.0f;
            vec.y *= 2.0f;
        }
    }

    writeln(MonoTime.currTime - before);
}
Given that there are only two multiplications and assignments per vector, there isn't enough work for a parallel foreach loop to be beneficial with lower numbers of vector instances. Change the 100_000_000 to 100_000, for example, and you may find that the parallel version is slower. It was for me. But with 100 million instances, the parallel version won out in multiple runs on my machine. If you need to process large datasets, particularly by performing complex operations, std.parallelism makes it quite simple to take advantage of multiple cores and process the data in parallel.
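Beyond parallel foreach, the same pool can run std.algorithm-style operations; here's a sketch summing a large range with taskPool.reduce (the range size is arbitrary):

```d
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writeln;

void main() {
    // reduce splits the range into chunks, sums each chunk on a
    // worker thread, then combines the per-thread partial sums.
    // iota(10_000_001L) yields 0 through 10_000_000 as longs.
    auto sum = taskPool.reduce!"a + b"(iota(10_000_001L));
    writeln(sum); // 50000005000000
}
```

As with parallel foreach, the per-element work here is tiny, so the speedup only shows up with large ranges; the same caveat about small datasets applies.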
The Phobos documentation at http://dlang.org/phobos/index.html is a source of more detailed information for most of the topics we've covered in this section. Additionally, the concurrency chapter of Andrei Alexandrescu's book The D Programming Language is available online at http://www.informit.com/articles/printerfriendly/1609144. The article Getting More Fiber in Your Diet at http://octarineparrot.com/article/view/getting-more-fiber-in-your-diet contains a more complex fiber example that is compared against an implementation using threads.