In this section, we're going to look at several compile-time features that don't fit snugly in the preceding sections.
static assert
is a compile-time assertion that is always enabled. It can't be turned off with -release
like a normal assert
can. It is not affected by runtime conditionals. When its condition is evaluated to false, an error is generated and compilation halted. The following static assert
always errors out because its condition is 0
.
void main() { if(0) static assert(0); }
Like a normal assert
, it's possible to give a static assert
a message to be printed on failure. The following example is a good use of that feature:
version(Windows)
    enum saveDirectory = "Application Name";
else version(Posix)
    enum saveDirectory = ".applicationName";
else
    static assert(0, "saveDirectory not implemented.");
The line number at which the assert
was encountered is always included in the output, but adding a custom message makes it easy to tell at a glance what caused the problem.
The examples above use 0
as the assert condition, but any Boolean expression that can be evaluated at compile time is eligible to fill that role. Consider the case where you want to restrict compilation to 64-bit. One approach would be to use a version
condition like so.
version(D_LP64) {
    // implement code
}
else static assert(0, "32-bit not supported.");
D_LP64
is predefined when compiling with the -m64
command-line switch. It means that pointers are 64 bits wide. The following is a single-line alternative:
static assert((void*).sizeof == 8, "32-bit not supported.");
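As a further sketch of the kinds of conditions that qualify, the following assumes a hypothetical manifest constant, maxPlayers, that does not appear anywhere else in this chapter. Any expression the compiler can evaluate on its own, including properties of built-in types, can serve as the condition.

```d
// Hypothetical configuration value, used only for illustration.
enum maxPlayers = 8;

// Any Boolean expression the compiler can evaluate works as a condition.
static assert(maxPlayers > 0 && maxPlayers <= 16,
              "maxPlayers must be in the range 1-16.");

// Properties of built-in types are compile-time constants, too.
static assert(int.sizeof == 4, "int is always four bytes in D.");

void main() {}
```

Change maxPlayers to 0 or 17 and compilation should halt with the custom message.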
The previous chapter introduced the is
operator, which performs an identity test on two variables. D also has a compile-time is
expression, which can be used in a few different ways. First, let's look at the most basic form.
enum alwaysTrue = is(int);
This checks that the argument is a well-formed type. If so, it returns true
. Directly pass a value or an expression, and the result is false
. However, both can be used with typeof
.
enum alwaysTrueToo = is(typeof(1+1));
The ==
operator can be used to test for a specific type.
enum isFloat = is(typeof(1+1) == float);
As written, this is going to set isFloat
to false
since the type of 1+1
is int
. An alternative is to test if one type is convertible to another.
enum canBeFloat = is(typeof(1+1) : float);
This evaluates to true
because int
is implicitly convertible to float
. Let's see a more complex example:
struct AType {
    int x;
    int addXTo(double d) {
        return x + cast(int)d;
    }
}
void main() {
    static if(is(typeof(AType.addXTo(30.0)))) {
        import std.stdio : writeln;
        AType t;
        writeln(t.addXTo(30.0));
    }
}
Accessing members at compile time
Don't let AType.addXTo
confuse you. It may look like we're calling a static member function with no static
in sight, but this isn't what's happening. This is the syntax used to access any member of a struct
or class
that can be known at compile time. Nothing is actually called here; typeof only inspects the type of the expression, and it does so at compile time.
The is(typeof())
operator doesn't check if code will compile; it only determines if the code produces a well-formed type. This is something that's often misunderstood, especially when considering that it's possible to put complex code in an is
expression.
static if(is(typeof({ AType t; t.addXTo(30.0); }))) { ... }
Notice the opening and closing braces inside the parentheses. With this syntax, any syntactically valid D code can be used with is(typeof())
, though it need not be semantically valid. This is an important distinction.
In addition to checking for types, the is
expression can be used to check against certain language constructs. For example, this static if
block determines if AType
is a struct
or a class
.
static if(is(AType == struct)) {
    writeln("It's a struct!");
}
else static if(is(AType == class)) {
    writeln("It's a class!");
}
With the preceding declaration of AType
, the first writeln
is compiled into the executable. Change the declaration of AType
so that it's a class, and the second line is compiled in. Other specifiers can be used, such as function
, delegate
, const
, immutable
, and so on. For more on is
, refer to http://dlang.org/expression.html#IsExpression.
It may be difficult to see exactly how is
can be useful, given that in each of the above examples the type is already known. Where it really comes in handy, and where it's most often used, is when working with generic types. In the next chapter, where we cover templates, we'll start putting the is
expression to use.
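The forms covered so far can be gathered into a single sketch and checked with static assert. AType is cut down to one member here and BType is a hypothetical class added only for contrast; neither is meant as the definitive version of the earlier example.

```d
struct AType { int x; }
class BType {}   // hypothetical, for contrast with AType

// Basic form: true for any well-formed type.
static assert(is(int));
static assert(is(typeof(1 + 1)));

// Exact match versus implicit conversion.
static assert(is(typeof(1 + 1) == int));
static assert(!is(typeof(1 + 1) == float));
static assert(is(typeof(1 + 1) : float));

// Checking against language constructs.
static assert(is(AType == struct));
static assert(is(BType == class));
static assert(!is(AType == class));

// An expression that has no valid type makes the result false
// instead of causing a compilation error.
static assert(is(typeof(AType.init.x) == int));
static assert(!is(typeof(AType.init.y)));

void main() {}
```

If every assertion holds, the module compiles silently; flipping any one of them should stop compilation at that line.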
Many C and C++ programmers will be familiar with data alignment and data structure padding given the emphasis placed on cache-friendly code these days. For those who aren't familiar with the topics, a good introduction for a broader understanding can be found at http://en.wikipedia.org/wiki/Data_structure_alignment. Here, we're going to focus on what it looks like in D. Consider the following example:
struct Packed {
    double x;
    float y;
    byte z;
    byte w;
}
void main() {
    import std.stdio : writeln;
    writeln(Packed.x.offsetof);
    writeln(Packed.y.offsetof);
    writeln(Packed.z.offsetof);
    writeln(Packed.w.offsetof);
}
The Packed
data structure has its members ordered from largest to smallest. The result is that the compiler is able to align each member on byte boundaries that are tightly packed. This can be verified by querying the .offsetof
property available on all struct
and class
member variables. The preceding example results in the following output:
0
8
12
13
The first member of a struct
will always have an offset of 0
. Since x
is a double
and double.sizeof
is 8
, then the address of the variable y
is going to be the address of x
plus eight bytes. The following snippet shows this clearly.
Packed p; writeln(cast(void*)&p.y - cast(void*)&p.x);
Note that the casts to void*
are needed here, as float*
and double*
are incompatible types. This will print 8
to the console, matching the value of Packed.y.offsetof
.
Continuing on this line, since y
is a four-byte value, the offset of z
is four bytes past y
, or 12
. Since z
is a one-byte value, the offset of w
is one byte past z
, or 13
. But this is not really the full story. Change the type of w
from byte
to int
and the output becomes the following:
0
8
12
16
Even though z
is only a one-byte value, w
now begins four bytes past it rather than one. A more dramatic demonstration of this behavior can be seen in the following example, where the NotPacked
data structure has a different set of types.
struct NotPacked {
    int x;
    long y;
    byte z;
    double w;
}
void main() {
    import std.stdio : writeln;
    writeln(NotPacked.x.offsetof);
    writeln(NotPacked.y.offsetof);
    writeln(NotPacked.z.offsetof);
    writeln(NotPacked.w.offsetof);
}
For this, the output is the following:
0
8
16
24
Ultimately, the offset of a member depends not only on the size of the preceding type, but on the default alignment of the member's type. In this example, x
is only four bytes in size but, more importantly, the default alignment of long
is 8
. We can see this by querying its .alignof
property (available on all types).
writeln(long.alignof);
This will print 8
. What it means is that the memory address of any long
variable must be a multiple of 8
. When the compiler generates the code for any instance of NotPacked
, the address four bytes past x
is not going to be divisible by 8
, so it puts y
another four bytes further on to satisfy that requirement. byte.alignof
gives us 1
, so it can always immediately follow whatever is in front of it. Therefore, the offset of z
is 16
, which is at the very end of y
. Finally, double.alignof
is 8
, so it's impossible for the offset of w
in NotPacked
to be 17
. Instead, the compiler moves it to the next address that is a multiple of 8
, so its offset becomes 24
.
The result of all of this is that there are four unused bytes between x
and y
, with a further seven unused bytes between z
and w
. These unused bytes are called padding and, though they inflate the size of the data structure, they make it much more efficient for the CPU to access members in memory. NotPacked.sizeof
is 32
, but reorder things so that the long
and double
are at the top, followed by the int
, then the byte
last, and the size becomes 24
(there will still be three padding bytes following the byte
at the end). Then it can be called a packed data structure, meaning there is no padding in the interior.
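The arithmetic behind these offsets can be sketched with a small helper. alignUp is a hypothetical function written for this illustration, not something provided by the language or Phobos, and Reordered is simply NotPacked with its members rearranged as described. The struct checks are guarded because they assume a target on which long and double both have an alignment of 8.

```d
// Hypothetical helper: rounds offset up to the next multiple of alignment.
size_t alignUp(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}

static assert(alignUp(4, 8) == 8);    // y cannot start right after x
static assert(alignUp(17, 8) == 24);  // w cannot start right after z
static assert(alignUp(21, 8) == 24);  // trailing padding after reordering

struct NotPacked { int x; long y; byte z; double w; }
struct Reordered { long y; double w; int x; byte z; }

// Assumption: eight-byte alignment for long and double.
static if(long.alignof == 8 && double.alignof == 8)
{
    static assert(NotPacked.y.offsetof == 8);
    static assert(NotPacked.w.offsetof == 24);
    static assert(NotPacked.sizeof == 32);
    static assert(Reordered.z.offsetof == 20);
    static assert(Reordered.sizeof == 24);
}

void main() {}
```

The helper runs at compile time here only because it is called from static assert; it is an ordinary function otherwise.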
There are times when it may be desirable to pack a data structure without reordering the members. One common example is when reading and writing chunks from or to a file that follows a predefined format. Rather than reading or writing each member one at a time, it can be more efficient to transfer the entire structure in one go. Imagine a file format that specifies a header consisting of a byte value, followed by a four-byte value and ending with another byte, for a total of six bytes. As a struct
, it looks like:
struct FileHeader {
    byte fmtVersion;
    int magic;
    byte id;
}
Try to read or write an instance of FileHeader
directly and there's a problem. You can see it clearly in the following image:
There are three padding bytes between fmtVersion
and magic
, with another three padding bytes past id
, yielding a total of twelve bytes instead of six. Writing the entire structure directly means that the file header no longer follows the predefined format, as all the padding bytes will be written, too. Conversely, reading a properly formatted file will cause the first three bytes of magic
to go into the three padding bytes after fmtVersion
and id
to go into the second byte of magic
, so both magic and id end up with incorrect values (only fmtVersion is read correctly).
Given that a struct
is a type, it has an .alignof
property like any other built-in type. However, the property has no predefined value; it takes on the alignment value of its largest member type. In FileHeader
, the largest member type is int
, with an alignment of 4
, so FileHeader
also has an alignment of 4
. The alignment of a struct
type does not just affect how instances of that type are aligned in relation to other variables, but also how memory is allocated for the instance itself. This is why FileHeader.sizeof
is 12
: its alignment is 4
, so members must be allocated on four-byte boundaries.
As the earlier image demonstrates, the number of four-byte blocks that need to be allocated in this case is three. fmtVersion
takes up only one of the bytes it was allocated. The next member is an int
, which requires four bytes and, therefore, another block of memory; its alignment of 4
dictates that it can't use the three free bytes of the first block. Finally, because magic
fills its entire block, a third block is allocated for id
, which again only needs one byte, leaving three unused. If a short
is added between fmtVersion
and magic, the size of FileHeader
does not change. Since short
has an alignment requirement of two bytes and also a size of two bytes, it fits snugly in the last two bytes of the first memory block. Similarly, another short
could be tacked on to the end and the size would remain twelve bytes.
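These claims are easy to check with static assert. WithShorts is a hypothetical variant of FileHeader, and extra1 and extra2 are made-up member names used only for this sketch. It assumes the usual alignments of 4 for int and 2 for short.

```d
struct FileHeader
{
    byte fmtVersion;
    int magic;
    byte id;
}

// Hypothetical variant with a short in each of the two gaps.
struct WithShorts
{
    byte fmtVersion;
    short extra1;   // lands in the last two bytes of the first block
    int magic;
    byte id;
    short extra2;   // lands in the last two bytes of the third block
}

static assert(FileHeader.alignof == 4);
static assert(FileHeader.sizeof == 12);

static assert(WithShorts.extra1.offsetof == 2);
static assert(WithShorts.magic.offsetof == 4);
static assert(WithShorts.extra2.offsetof == 10);
static assert(WithShorts.sizeof == 12);

void main() {}
```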
In C or C++, the padding bytes could be eliminated by packing the data structure via a compiler-specific #pragma
preprocessor directive. In D, we get something that's defined by the language instead: align
. It can apply to one declaration or, using a colon or braces, multiple declarations. Let's make all of the FileHeader
members byte-aligned.
struct FileHeader {
align(1):
    byte fmtVersion;
    int magic;
    byte id;
}
Now all of the members directly follow each other in memory, with no padding between.
However, notice that there are still two empty bytes at the end. This is because FileHeader.alignof
is still 4
, so memory for its members is still allocated on four-byte boundaries. To completely eliminate the padding bytes, it's necessary to add an align attribute to the FileHeader
type as well.
align(1) struct FileHeader {
align(1):
byte fmtVersion;
int magic;
byte id;
}
FileHeader.sizeof
now gives us 6
instead of 8
or 12
, meaning we have eliminated the padding bytes completely. We can now safely read and write entire arrays of FileHeader
instances in one go without fear of data becoming corrupt because of padding. Note, however, that FileHeader.alignof
is still 4
; we have not changed how instances of the type are aligned, only how memory is allocated for its members. As with any other variable, the alignment of struct
instances is changed by including an align
attribute in the variable declaration.
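As a rough sketch of what the packed layout makes possible, the following copies six bytes directly over a FileHeader instance. The contents of raw are made up for the illustration and stand in for bytes read from a file; a real program would get them from whatever I/O API it uses.

```d
import std.stdio : writeln;

align(1) struct FileHeader
{
align(1):
    byte fmtVersion;
    int magic;
    byte id;
}

// With both attributes in place, the layout matches the file format.
static assert(FileHeader.sizeof == 6);
static assert(FileHeader.magic.offsetof == 1);
static assert(FileHeader.id.offsetof == 5);

void main()
{
    // Made-up bytes standing in for six bytes read from a file.
    ubyte[6] raw = [1, 0x78, 0x56, 0x34, 0x12, 7];

    FileHeader header;
    // Copy the raw bytes straight over the memory of the struct.
    (cast(ubyte*)&header)[0 .. FileHeader.sizeof] = raw[];

    assert(header.fmtVersion == 1);
    assert(header.id == 7);

    // The value of magic depends on the byte order of the machine.
    writeln(header.magic);
}
```

Without the align attributes the same copy would scatter the bytes across the padding, as described earlier.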
There's a big, giant caveat to go with align
. Data structures with erratic alignment are going to be more expensive to manipulate. Accessing a member at an odd boundary in memory can cause more work for the CPU. Packing is great for improving the speed of I/O, but the alignment of any data structures that are to be manipulated frequently throughout the life of a program should only be changed with great care, if at all. It's better to manually pack things by changing the order of declaration. Finally, align
is not exclusively for use with data structures; it can be prefixed to any variable declaration in any scope. It's rare to do so, however, and using it with data structures is more common.
Several existing languages, particularly those that run on a virtual machine, have support for runtime reflection. If you aren't familiar with the concept, see http://en.wikipedia.org/wiki/Reflection_(computer_programming) for an introduction. D has some support for runtime reflection now and there are plans afoot to expand upon it. More interesting for many D users is its support for compile-time reflection. This enables a number of possibilities for both generative and generic programming that wouldn't otherwise be possible.
A language feature that exists exclusively for compile-time reflection is the __traits
expression. It takes at least two arguments. The first is a keyword indicating the type of trait to query, followed by one or more types or expressions. Some examples:
enum a = __traits(isUnsigned, uint);
enum b = __traits(isUnsigned, 10 + 11);
enum c = __traits(isUnsigned, 10u + 11u);
enum d = __traits(isUnsigned, uint, 10u + 11u, 10.0 - 9.0);
In the declaration of a
, the second argument to __traits
is a type. Since uint
is unsigned, a
is initialized to true
. The initialization of b
uses an expression. The literals 10
and 11
are both of type int
, so the result of the expression is also an int
. That means b
is set to false
. c
is initialized to true
, as the result of the expression is uint
, thanks to the u
suffix on the literals. Finally, in the declaration of d
, multiple arguments are passed after isUnsigned
. In this case, all of the arguments must pass the test in order for the entire expression to return true
. Since the last argument, 10.0 - 9.0
, results in a double
, d
is set to false
.
There are a number of Boolean traits, but there are others that return something completely different. Take, for example, the getMember
trait. This can be used to indirectly set or get a member variable in a struct
or class
.
struct Point {
    int x, y;
}
void main() {
    import std.stdio : writeln;
    auto p = Point(10, 20);
    writeln(__traits(getMember, p, "x"));
    __traits(getMember, p, "y") = 33;
    writeln(p.y);
}
Remember, __traits
is evaluated at compile time, so the two lines that use getMember ultimately cause the same code to be generated as when writeln(p.x)
and p.y = 33
are used. Some traits return a set of values; for example, allMembers
returns a set of string
s containing the names of each member of a struct
, class
, or enum
.
pragma(msg, __traits(allMembers, Point));
Compiling this with the Point
type prints:
tuple("x", "y")
There is a Phobos module, std.traits
, providing alternative implementations of several built-in traits. Generally, it's encouraged to use these over the built-ins. We'll take a look at std.traits
in the next chapter (and one convenience function before the end of this chapter). For more on built-in traits, refer to http://dlang.org/traits.html.
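The two traits shown above combine naturally. The following sketch loops over the names returned by allMembers and feeds each one to getMember; because the names are compile-time values, the loop is unrolled by the compiler into one writeln call per member.

```d
import std.stdio : writeln;

struct Point
{
    int x, y;
}

void main()
{
    auto p = Point(10, 20);

    // name is a compile-time string on each pass, so it can be
    // handed directly to getMember.
    foreach(name; __traits(allMembers, Point))
    {
        writeln(name, " = ", __traits(getMember, p, name));
    }
}
```

For the Point type, this prints x = 10 and y = 20 on separate lines.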
User-defined attributes, or UDAs, allow you to associate metadata with your variables and functions. The attributes can be examined at compile time to generate different code paths. There are different ways to implement a UDA. The simplest is to use a literal, such as the integer literal in this example.
@(1) int myVal;
To determine at compile time what attributes myVal
has, use the getAttributes
trait.
pragma(msg, __traits(getAttributes, myVal));
This will print the following:
tuple(1)
Integer literals aren't really the best option for implementing UDAs. There's absolutely no scoping for a literal and two libraries may interpret 1
quite differently. It's more appropriate to declare UDAs that have a name with some special meaning. Here's one possible approach:
enum NoPrint;
struct Foo {
    int x;
    @NoPrint int y;
}
A function (preferably a template) could be implemented that only prints data that isn't annotated with @NoPrint
. Literals and manifest constants can become UDAs because they are known at compile time. Aggregate types work as well.
struct NoPrint {}
struct NoSave {}
enum Decoration {
    none,
    italics,
    bold,
}
struct Decorated {
    Decoration decoration;
}
struct Data {
    @Decorated(Decoration.bold) string name;
    @Decorated(Decoration.italics) string occupation;
    @NoSave @NoPrint int temporary;
}
All the attributes of a single member of a data structure can be examined with __traits
.
pragma(msg, __traits(getAttributes, Data.temporary));
Compile-time reflection can be used to grab the attributes of every member.
foreach(member; __traits(allMembers, Data)) {
    enum name = "Data." ~ member;
    writef("Attributes of %s: ", name);
    foreach(attr; __traits(getAttributes, mixin(name))) {
        static if(is(typeof(attr) == Decorated)) {
            Decoration dec = __traits(getMember, attr, "decoration");
            writef("Decoration.%s", dec);
        }
        else {
            writef(" %s", attr.stringof);
        }
    }
    writeln();
}
First up, a foreach
loop is being run on the members of Data
. It's worth noting that, because the return value from __traits
is a compile-time value, the loop is actually being unrolled at compile time. The same is true for the inner loop. In order to get the attributes of any specific member of Data
, the qualified form of the name must be used, such as Data.occupation
. The names returned by allMembers
are not qualified, so the qualified names have to be constructed manually: "Data." ~ member
.
When it's time to fetch the attributes, the qualified member name must be passed to __traits
as an identifier, and not as a string. It needs the name of the member as it is written in code, e.g. Data.occupation
and not "Data.occupation"
. A string mixin is used to generate an identifier from the string value of name
. As the attributes are iterated, a test is performed on each with is(typeof(attr) == Decorated)
.
Attributes can be types or values. This has consequences when a struct
is used to define the attribute. @NoSave
is a type attribute, whereas @NoSave()
is a value attribute. The former can be tested with is(attr == NoSave)
. The latter will fail that test, since a value can't be compared with a type. Therefore, the test in that case must be is(typeof(attr) == NoSave)
. Notice that Decorated
is used as a value attribute, initialized in each case with a member of the Decoration
enumeration. The loop uses static if
to determine if the current attribute is an instance of Decorated
and, if so, uses the getMember
trait to fetch the value of its decoration
member.
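The type/value distinction can be seen in isolation in the following sketch. The variables typeAnnotated and valueAnnotated and the classify helper are all hypothetical names invented for this illustration; the two is tests inside the loop are the ones described above.

```d
struct NoSave {}

@NoSave   int typeAnnotated;   // attribute is the type itself
@NoSave() int valueAnnotated;  // attribute is an instance of the type
int plain;                     // no attribute at all

// Hypothetical helper: reports how symbol was annotated with NoSave.
string classify(alias symbol)()
{
    string result = "not annotated";
    foreach(attr; __traits(getAttributes, symbol))
    {
        static if(is(attr == NoSave))
            result = "type attribute";
        else static if(is(typeof(attr) == NoSave))
            result = "value attribute";
    }
    return result;
}

static assert(classify!typeAnnotated() == "type attribute");
static assert(classify!valueAnnotated() == "value attribute");
static assert(classify!plain() == "not annotated");

void main() {}
```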
If the attribute is not a Decorated
instance, then the .stringof
property is used to get the name of the attribute as a string. A type can't be printed at runtime (though it can be at compile time with a msg
pragma), so each type must be converted to a string via .stringof
. This is the opposite of the problem solved by string mixins, where strings are converted to symbols.
When writing a function that looks for multiple attributes, the type/value dichotomy can make for some convoluted code. Thankfully, std.traits
provides a template that hides all of the complexity.
import std.traits : hasUDA;
static if(hasUDA!(Data.temporary, NoSave))
    writeln("Data.temporary can't be saved!");
This evaluates to true regardless of whether Data.temporary
was annotated with @NoSave
or @NoSave()
. As of DMD 2.069, std.traits
also includes the templates getUDAs
and getSymbolsByUDA
.
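To tie this back to the earlier idea of a function that skips members annotated with @NoPrint, here is one possible sketch built on hasUDA. The printFields template is a hypothetical helper, and this Foo gains a third member, z, purely to show that the type and value forms of the attribute are treated alike. It assumes a compiler new enough to ship hasUDA.

```d
import std.stdio : writeln;
import std.traits : hasUDA;

struct NoPrint {}

struct Foo
{
    int x;
    @NoPrint int y;     // type attribute
    @NoPrint() int z;   // value attribute
}

// Hypothetical helper: prints every member not annotated with NoPrint.
void printFields(T)(T value)
{
    foreach(name; __traits(allMembers, T))
    {
        static if(!hasUDA!(__traits(getMember, T, name), NoPrint))
            writeln(name, " = ", __traits(getMember, value, name));
    }
}

void main()
{
    printFields(Foo(1, 2, 3)); // only x is printed
}
```

Since the static if is resolved at compile time, no code at all is generated for the annotated members.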