Odds and ends

In this section, we're going to look at several compile-time features that don't fit snugly in the preceding sections.

static assert

static assert is a compile-time assertion that is always enabled. It can't be turned off with -release like a normal assert can. It is not affected by runtime conditionals. When its condition is evaluated to false, an error is generated and compilation halted. The following static assert always errors out because its condition is 0.

void main() {
    if(0) static assert(0);
}

Like a normal assert, it's possible to give a static assert a message to be printed on failure. The following example is a good use of that feature:

version(Windows)
  enum saveDirectory = "Application Name";
else version(Posix)
  enum saveDirectory = ".applicationName";
else
  static assert(0, "saveDirectory not implemented.");

The line number at which the assert was encountered is always included in the output, but adding a custom message makes it easy to tell at a glance what caused the problem.

The examples above use 0 as the assert condition, but any Boolean expression that can be evaluated at compile time is eligible to fill that role. Consider the case where you want to restrict compilation to 64-bit. One approach would be to use a version condition like so.

version(D_LP64) {
  // implement code
} else static assert(0, "32-bit not supported.");

D_LP64 is predefined when compiling with the -m64 command-line switch. It means that pointers are 64 bits wide. The following is a single-line alternative:

static assert((void*).sizeof == 8, "32-bit not supported.");
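As a further sketch of the point that any Boolean expression known at compile time can serve as the condition, the following checks the size of a struct and the result of a function call that the compiler can evaluate during compilation. The Pixel type and the square function are hypothetical names invented for this illustration.

```d
// Hypothetical type, used only for illustration.
struct Pixel {
    ubyte r, g, b, a;
}

// Hypothetical function; simple enough to be evaluated at compile time.
int square(int n) {
    return n * n;
}

// Both conditions are evaluated during compilation, never at runtime.
static assert(Pixel.sizeof == 4, "Pixel must be exactly four bytes.");
static assert(square(4) == 16, "square() returned an unexpected value.");

void main() {}
```

If a field were later added to Pixel, the first assert would stop the build and print its message.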

The is expression

The previous chapter introduced the is operator, which performs an identity test on two variables. D also has a compile-time is expression, which can be used in a few different ways. First, let's look at the most basic form.

enum alwaysTrue = is(int);

This checks that the argument is a well-formed type. If so, it returns true. Directly pass a value or an expression, and the result is false. However, both can be used with typeof.

enum alwaysTrueToo = is(typeof(1+1));

The == operator can be used to test for a specific type.

enum isFloat = is(typeof(1+1) == float);

As written, this is going to set isFloat to false since the type of 1+1 is int. An alternative is to test if one type is convertible to another.

enum canBeFloat = is(typeof(1+1) : float);

This evaluates to true because int is implicitly convertible to float. Let's see a more complex example:

struct AType {
  int x;
  int addXTo(double d) {
    return x + cast(int)d;
  }
}
void main() {
  static if(is(typeof(AType.addXTo(30.0)))) {
    import std.stdio : writeln;
    AType t;
    writeln(t.addXTo(30.0));
  }
}

Tip

Accessing members at compile time

Don't let AType.addXTo confuse you. It may look like we're calling a static member function with no static in sight, but this isn't what's happening. This is the syntax used to refer to any member of a struct or class in a context that is evaluated at compile time. No function is actually called here; the compiler only works out the type of the expression.

The is(typeof()) operator doesn't check if code will compile; it only determines if the code produces a well-formed type. This is something that's often misunderstood, especially when considering that it's possible to put complex code in an is expression.

static if(is(typeof({
        AType t;
        t.addXTo(30.0);
    }))) {  ...  }

Notice the opening and closing braces inside the parentheses. With this syntax, any syntactically valid D code can be used with is(typeof()), though it need not be semantically valid. This is an important distinction.
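A minimal sketch of the distinction, assuming the AType declaration from earlier: both blocks below are syntactically valid, but only the first one type-checks. The member name subtractXFrom is hypothetical and deliberately does not exist.

```d
struct AType {
    int x;
    int addXTo(double d) {
        return x + cast(int)d;
    }
}

// The block type-checks, so typeof yields a type and is() gives true.
enum hasAddXTo = is(typeof({
        AType t;
        t.addXTo(30.0);
    }));

// subtractXFrom is a made-up member that AType does not have. The block
// is still valid syntax, so there is no compile error; is() gives false.
enum hasSubtractXFrom = is(typeof({
        AType t;
        t.subtractXFrom(30.0);
    }));

static assert(hasAddXTo);
static assert(!hasSubtractXFrom);

void main() {}
```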

In addition to checking for types, the is expression can be used to check against certain language constructs. For example, this static if block determines if AType is a struct or a class.

static if(is(AType == struct)) {
    writeln("It's a struct!");
} else static if(is(AType == class)) {
    writeln("It's a class!");
}

With the preceding declaration of AType, the first writeln is compiled into the executable. Change the declaration of AType so that it's a class, and the second line is compiled in. Other specifiers can be used, such as function, delegate, const, immutable, and so on. For more on is, refer to http://dlang.org/expression.html#IsExpression.
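The forms covered so far can be collected into one short sketch. The AStruct and AClass declarations are hypothetical placeholders; every check is a static assert, so the file only compiles if each claim holds.

```d
// Hypothetical types, used only for illustration.
struct AStruct {}
class AClass {}

// Basic form: is the argument a well-formed type?
static assert(is(int));

// Exact type match with ==.
static assert(is(typeof(1 + 1) == int));
static assert(!is(typeof(1 + 1) == float));

// Implicit convertibility with :.
static assert(is(typeof(1 + 1) : float));
static assert(!is(float : int));

// Testing against language constructs.
static assert(is(AStruct == struct));
static assert(!is(AStruct == class));
static assert(is(AClass == class));
static assert(is(const(int) == const));
static assert(!is(int == const));

void main() {}
```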

It may be difficult to see exactly how is can be useful, given that in each of the above examples the type is already known. Where it really comes in handy, and where it's most often used, is when working with generic types. In the next chapter, where we cover templates, we'll start putting the is expression to use.

Alignment

Many C and C++ programmers will be familiar with data alignment and data structure padding given the emphasis placed on cache-friendly code these days. For those who aren't familiar with the topics, a good introduction for a broader understanding can be found at http://en.wikipedia.org/wiki/Data_structure_alignment. Here, we're going to focus on what it looks like in D. Consider the following example:

struct Packed {
  double x;
  float y;
  byte z;
  byte w;
}
void main() {
  import std.stdio : writeln;
  writeln(Packed.x.offsetof);
  writeln(Packed.y.offsetof);
  writeln(Packed.z.offsetof);
  writeln(Packed.w.offsetof);
}

The Packed data structure has its members ordered from largest to smallest. The result is that the compiler is able to align each member on byte boundaries that are tightly packed. This can be verified by querying the .offsetof property available on all struct and class member variables. The preceding example results in the following output:

0
8
12
13

The first member of a struct will always have an offset of 0. Since x is a double and double.sizeof is 8, then the address of the variable y is going to be the address of x plus eight bytes. The following snippet shows this clearly.

Packed p;
writeln(cast(void*)&p.y - cast(void*)&p.x);

Note that the casts to void* are needed here, as float* and double* are incompatible types. This will print 8 to the console, matching the value of Packed.y.offsetof.

Continuing on this line, since y is a four-byte value, the offset of z is four bytes past y, or 12. Since z is a one-byte value, the offset of w is one byte past z, or 13. But this is not really the full story. Change the type of w from byte to int and the output becomes the following:

0
8
12
16

Even though z is only a one-byte value, w now begins four bytes past it rather than one. A more dramatic demonstration of this behavior can be seen in the following example, where the NotPacked data structure has a different set of types.

struct NotPacked {
  int x;
  long y;
  byte z;
  double w;
}
void main() {
  import std.stdio : writeln;
  writeln(NotPacked.x.offsetof);
  writeln(NotPacked.y.offsetof);
  writeln(NotPacked.z.offsetof);
  writeln(NotPacked.w.offsetof);
}

For this, the output is the following:

0
8
16
24

Ultimately, the offset of a member depends not only on the size of the preceding type, but on the default alignment of the member's type. In this example, x is still four bytes in size but, more importantly, the default alignment of long is 8. We can see this by querying its .alignof property (available on all types).

writeln(long.alignof);

This will print 8. What it means is that the memory address of any long variable must be a multiple of 8. When the compiler generates the code for any instance of NotPacked, the address four bytes past x is not going to be divisible by 8, so it puts y another four bytes further on to satisfy that requirement. byte.alignof gives us 1, so it can always immediately follow whatever is in front of it. Therefore, the offset of z is 16, the first byte past the end of y. Finally, double.alignof is 8, so it's impossible for the offset of w in NotPacked to be 17. Instead, the compiler moves it to the next address that is a multiple of 8, so its offset becomes 24.

The result of all of this is that there are four unused bytes between x and y, with a further seven unused bytes between z and w. These unused bytes are called padding and, though they inflate the size of the data structure, they make it much more efficient for the CPU to access members in memory. NotPacked.sizeof is 32, but reorder things so that the long and double are at the top, followed by the int, then the byte last, and the size becomes 24 (there will still be three padding bytes following the byte at the end). Then it can be called a packed data structure, meaning there is no padding in the interior.
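The reordering described above can be sketched as follows. The Reordered name is made up for this illustration, and the specific numbers assume a target where long.alignof and double.alignof are both 8, which is why the asserts are guarded.

```d
import std.stdio : writeln;

struct NotPacked {
    int x;
    long y;
    byte z;
    double w;
}

// Hypothetical reordering: the eight-byte members first, then the
// int, then the byte.
struct Reordered {
    long y;
    double w;
    int x;
    byte z;
}

// Only check the exact numbers where the eight-byte types are
// eight-byte aligned (an assumption about the target platform).
static if (long.alignof == 8 && double.alignof == 8) {
    static assert(NotPacked.sizeof == 32);
    static assert(Reordered.x.offsetof == 16);
    static assert(Reordered.z.offsetof == 20);
    static assert(Reordered.sizeof == 24);   // 21 bytes used + 3 of padding
}

void main() {
    writeln(NotPacked.sizeof, " ", Reordered.sizeof);
}
```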

There are times when it may be desirable to pack a data structure without reordering the members. One common example is when reading and writing chunks from or to a file that follows a predefined format. Rather than reading or writing each member one at a time, it can be more efficient to transfer the entire structure in one go. Imagine a file format that specifies a header consisting of a byte value, followed by a four-byte value and ending with another byte, for a total of six bytes. As a struct, it looks like:

struct FileHeader {
  byte fmtVersion;
  int magic;
  byte id;
}

Try to read or write an instance of FileHeader directly and there's a problem. You can see it clearly in the following image:

[Image: Alignment]

There are three padding bytes between fmtVersion and magic, with another three padding bytes past id, yielding a total of twelve bytes instead of six. Writing the entire structure directly means that the file header no longer follows the predefined format, as all the padding bytes will be written, too. Conversely, reading a properly formatted six-byte header directly into the structure puts the first three bytes of magic into the three padding bytes after fmtVersion, the last byte of magic into the first byte of the magic field, and id into the second byte of magic. Only fmtVersion ends up with the correct value.

Given that a struct is a type, it has an .alignof property just as the built-in types do. However, the property has no predefined value; it takes on the largest alignment value among its member types. In FileHeader, that member type is int, with an alignment of 4, so FileHeader also has an alignment of 4. The alignment of a struct type does not just affect how instances of that type are aligned in relation to other variables, but also how memory is allocated for the instance itself. This is why FileHeader.sizeof is 12: its alignment is 4, so members must be allocated on four-byte boundaries.

As the earlier image demonstrates, the number of four-byte blocks that need to be allocated in this case is three. fmtVersion takes up only one of the bytes it was allocated. The next member is an int, which requires four bytes and, therefore, another block of memory; its alignment of 4 dictates that it can't use the three free bytes of the first block. Finally, because magic fills its entire block, a third block is allocated for id, which again only needs one byte, leaving three unused. If a short is added between fmtVersion and magic, the size of FileHeader does not change. Since short has an alignment requirement of two bytes and also a size of two bytes, it fits snugly in the last two bytes of the first memory block. Similarly, another short could be tacked on to the end and the size would remain twelve bytes.
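The block-by-block description above can be checked with a short sketch. The WithShorts type and its extraA and extraB members are hypothetical additions used only to illustrate the point about short.

```d
import std.stdio : writeln;

struct FileHeader {
    byte fmtVersion;
    int magic;
    byte id;
}

// Hypothetical variant with a short after fmtVersion and another
// after id. Both land in bytes that were padding before.
struct WithShorts {
    byte fmtVersion;
    short extraA;
    int magic;
    byte id;
    short extraB;
}

static assert(FileHeader.magic.offsetof == 4);
static assert(FileHeader.id.offsetof == 8);
static assert(FileHeader.sizeof == 12);

static assert(WithShorts.extraA.offsetof == 2);
static assert(WithShorts.extraB.offsetof == 10);
static assert(WithShorts.sizeof == 12);

void main() {
    writeln(FileHeader.sizeof, " ", WithShorts.sizeof);
}
```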

In C or C++, the padding bytes could be eliminated by packing the data structure via a compiler-specific #pragma preprocessor directive. In D, we get something that's defined by the language instead: align. It can apply to one declaration or, using a colon or braces, multiple declarations. Let's make all of the FileHeader members byte-aligned.

struct FileHeader {
align(1):
  byte fmtVersion;
  int magic;
  byte id;
}

Now all of the members directly follow each other in memory, with no padding between.

[Image: Alignment]

However, notice that there are still two empty bytes at the end. This is because FileHeader.alignof is still 4, so memory for its members is still allocated on four-byte boundaries. To completely eliminate the padding bytes, it's necessary to add an align attribute to the FileHeader type as well.

align(1) struct FileHeader {
align(1):
    byte fmtVersion;
    int magic;
    byte id;
}

FileHeader.sizeof now gives us 6 instead of 8 or 12, meaning we have eliminated the padding bytes completely. We can now safely read and write entire arrays of FileHeader instances in one go without fear of data becoming corrupt because of padding. Note, however, that FileHeader.alignof is still 4; we have not changed how instances of the type are aligned, only how memory is allocated for its members. As with any other variable, the alignment of struct instances is changed by including an align attribute in the variable declaration.
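As a rough sketch of what the packed layout makes possible, the following copies six raw bytes straight over a FileHeader instance. The byte values are invented for this illustration, and a real program would read them from a file rather than from a literal.

```d
import std.stdio : writeln;

align(1) struct FileHeader {
align(1):
    byte fmtVersion;
    int magic;
    byte id;
}

// With no padding, the members sit back to back.
static assert(FileHeader.magic.offsetof == 1);
static assert(FileHeader.id.offsetof == 5);
static assert(FileHeader.sizeof == 6);

void main() {
    // Made-up header bytes standing in for data read from a file.
    ubyte[6] raw = [2, 0x78, 0x56, 0x34, 0x12, 7];

    FileHeader header;
    // Copy all six bytes over the struct in one go.
    (cast(ubyte*)&header)[0 .. FileHeader.sizeof] = raw[];

    assert(header.fmtVersion == 2);
    assert(header.id == 7);
    // The value of magic depends on the byte order of the machine.
    writeln(header.fmtVersion, " ", header.magic, " ", header.id);
}
```

The single-byte fields come out the same on any machine, while the interpretation of magic depends on whether the file's byte order matches the CPU's.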

There's a big, giant caveat to go with align. Data structures with erratic alignment are going to be more expensive to manipulate. Accessing a member at an odd boundary in memory can cause more work for the CPU. Packing with align is great for improving the speed of I/O, but the alignment of any data structures that are to be manipulated frequently throughout the life of a program should only be changed with great care, if at all. For those, it's better to pack things manually by changing the order of declaration. Finally, align is not exclusively for use with data structures; it can be prefixed to any variable declaration in any scope. It's rare to do so, however, and using it with data structures is more common.

Tip

Classes and .alignof

Remember that querying properties on a class type or instance is returning values for a class reference, not an entire data structure. .alignof for a class is always 4 in 32-bit and 8 in 64-bit, no matter the alignment of the members.

Compile-time reflection

Several existing languages, particularly those that run on a virtual machine, have support for runtime reflection. If you aren't familiar with the concept, see http://en.wikipedia.org/wiki/Reflection_(computer_programming) for an introduction. D has some support for runtime reflection now and there are plans afoot to expand upon it. More interesting for many D users is its support for compile-time reflection. This enables a number of possibilities for both generative and generic programming that wouldn't otherwise be possible.

A language feature that exists exclusively for compile-time reflection is the __traits expression. It takes at least two arguments. The first is a keyword indicating the type of trait to query, followed by one or more types or expressions. Some examples:

enum a = __traits(isUnsigned, uint);
enum b = __traits(isUnsigned, 10 + 11);
enum c = __traits(isUnsigned, 10u + 11u);
enum d = __traits(isUnsigned, uint, 10u + 11u, 10.0 - 9.0);

In the declaration of a, the second argument to __traits is a type. Since uint is unsigned, a is initialized to true. The initialization of b uses an expression. The literals 10 and 11 are both of type int, so the result of the expression is also an int. That means b is set to false. c is initialized to true, as the result of the expression is uint, thanks to the u suffix on the literals. Finally, in the declaration of d, multiple arguments are passed after isUnsigned. In this case, all of the arguments must pass the test in order for the entire expression to return true. Since the last argument, 10.0 - 9.0, results in a double, d is set to false.

There are a number of Boolean traits, but there are others that return something completely different. Take, for example, the getMember trait. This can be used to indirectly set or get a member variable in a struct or class.

struct Point {
  int x, y;
}
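void main() {
  import std.stdio : writeln;
  auto p = Point(10, 20);
  // Read p.x through its name as a string.
  writeln(__traits(getMember, p, "x"));
  // Assign to p.y the same way.
  __traits(getMember, p, "y") = 33;
  writeln(p.y);
}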

Remember, __traits is evaluated at compile time, so the two lines that use getMember ultimately cause code to be generated that is the same as is generated when writeln(p.x) and p.y = 33 are used. Some traits return a set of values; for example, allMembers returns a set of strings containing the names of each member of a struct, class, or enum.

pragma(msg, __traits(allMembers, Point));

Compiling this with the Point type prints:

tuple("x", "y")
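The two traits can be combined. As a minimal sketch, the following walks the names returned by allMembers and feeds each one to getMember, printing every field of a Point without naming any of them in the loop.

```d
import std.stdio : writeln;

struct Point {
    int x, y;
}

void main() {
    auto p = Point(10, 20);

    // allMembers yields compile-time strings, so each name can be
    // passed straight to getMember.
    foreach (name; __traits(allMembers, Point)) {
        writeln(name, " = ", __traits(getMember, p, name));
    }
}
```

For this Point, the loop prints x = 10 followed by y = 20.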

There is a Phobos module, std.traits, providing alternative implementations of several built-in traits. Generally, it's encouraged to use these over the built-ins. We'll take a look at std.traits in the next chapter (and one convenience function before the end of this chapter). For more on built-in traits, refer to http://dlang.org/traits.html.

User-defined attributes

User-defined attributes, or UDAs, allow you to associate metadata with your variables and functions. The attributes can be examined at compile time to generate different code paths. There are different ways to implement a UDA. The simplest is to use a literal, such as the integer literal in this example.

@(1) int myVal;

To determine at compile time what attributes myVal has, use the getAttributes trait.

pragma(msg, __traits(getAttributes, myVal));

This will print the following:

tuple(1)

Integer literals aren't really the best option for implementing UDAs. There's absolutely no scoping for a literal and two libraries may interpret 1 quite differently. It's more appropriate to declare UDAs that have a name with some special meaning. Here's one possible approach:

enum NoPrint;
struct Foo {
    int x;
    @NoPrint int y;
}

A function (preferably a template) could be implemented that only prints data that isn't annotated with @NoPrint. Literals and manifest constants can become UDAs because they are known at compile time. Aggregate types work as well.

struct NoPrint {}
struct NoSave {}
enum Decoration {
  none,
  italics,
  bold,
}
struct Decorated {
  Decoration decoration;
}
struct Data {
  @Decorated(Decoration.bold) string name;
  @Decorated(Decoration.italics) string occupation;
  @NoSave @NoPrint int temporary;
}

All the attributes of a single member of a data structure can be examined with __traits.

pragma(msg, __traits(getAttributes, Data.temporary));

Compile time reflection can be used to grab the attributes of every member.

foreach(member; __traits(allMembers, Data)) {
  enum name = "Data." ~ member;
  writef("Attributes of %s: ", name);

  foreach(attr; __traits(getAttributes, mixin(name))) {
    static if(is(typeof(attr) == Decorated)) {
      Decoration dec = __traits(getMember, attr, "decoration");
      writef("Decoration.%s", dec);
    } else {
      writef(" %s", attr.stringof);
    }
  }
  writeln();
}

First up, a foreach loop is being run on the members of Data. It's worth noting that, because the return value from __traits is a compile-time value, the loop is actually being unrolled at compile time. The same is true for the inner loop. In order to get the attributes of any specific member of Data, the qualified form of the name must be used, such as Data.occupation. The names returned by allMembers are not qualified, so the qualified names have to be constructed manually: "Data." ~ member.

When it's time to fetch the attributes, the qualified member name must be passed to __traits as an identifier, and not as a string. It needs the name of the member as it is written in code, e.g. Data.occupation and not "Data.occupation". A string mixin is used to generate an identifier from the string value of name. As the attributes are iterated, a test is performed on each with is(typeof(attr) == Decorated).

Attributes can be types or values. This has consequences when a struct is used to define the attribute. @NoSave is a type attribute, whereas @NoSave() is a value attribute. The former can be tested with is(attr == NoSave). The latter will fail that test, since a value can't be compared with a type. Therefore, the test in that case must be is(typeof(attr) == NoSave). Notice that Decorated is used as a value attribute, initialized in each case with a member of the Decoration enumeration. The loop uses static if to determine if the current attribute is an instance of Decorated and, if so, uses the getMember trait to fetch the value of its decoration member.

If the attribute is not a Decorated instance, then the .stringof property is used to get the name of the attribute as a string. A type can't be printed at runtime (though it can be at compile time with a msg pragma), so each type must be converted to a string via .stringof. This is the opposite of the problem solved by string mixins, where strings are converted to symbols.
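The difference between type attributes and value attributes can be sketched in isolation. The Sample struct and its members a and b are hypothetical; a carries NoSave as a type and b carries an instance of it.

```d
struct NoSave {}

// Hypothetical type, used only for illustration.
struct Sample {
    @NoSave   int a;   // type attribute
    @NoSave() int b;   // value attribute
}

void main() {
    foreach (attr; __traits(getAttributes, Sample.a)) {
        // attr is the type NoSave itself.
        static assert(is(attr == NoSave));
        static assert(!is(typeof(attr) == NoSave));
    }
    foreach (attr; __traits(getAttributes, Sample.b)) {
        // attr is a value of type NoSave.
        static assert(!is(attr == NoSave));
        static assert(is(typeof(attr) == NoSave));
    }
}
```

Code that must accept both spellings needs both tests, which is where the complexity mentioned next comes from.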

When writing a function that looks for multiple attributes, the type/value dichotomy can make for some convoluted code. Thankfully, std.traits provides a template to hide all of the complexity.

import std.traits : hasUDA;
static if(hasUDA!(Data.temporary, NoSave))
    writeln("Data.temporary can't be saved!");

This evaluates to true whether Data.temporary was annotated with @NoSave or @NoSave(). As of DMD 2.069, std.traits also includes the templates getUDAs and getSymbolsByUDA.
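As a sketch of the printing idea mentioned earlier, hasUDA can be combined with allMembers to skip annotated fields. The printFoo function is a hypothetical helper written for this illustration, and it is deliberately not a template so that it stays within what has been covered so far.

```d
import std.stdio : writeln;
import std.traits : hasUDA;

struct NoPrint {}

struct Foo {
    int x;
    @NoPrint int y;
}

// Hypothetical helper: prints every member of Foo that is not
// annotated with @NoPrint.
void printFoo(Foo value) {
    foreach (name; __traits(allMembers, Foo)) {
        // The test runs at compile time, so no code at all is
        // generated for the skipped members.
        static if (!hasUDA!(__traits(getMember, Foo, name), NoPrint)) {
            writeln(name, " = ", __traits(getMember, value, name));
        }
    }
}

void main() {
    printFoo(Foo(1, 2));   // only x is printed
}
```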
