Chapter 2. Building a Foundation with D Fundamentals

In this chapter and the next, we're going to look at the fundamental building blocks of D programming. There's a lot of information to cover, so our focus in both chapters will primarily be on the syntax, differences from other C-family languages, and how to avoid common beginner mistakes.

If you enter the code snippets into a text editor and try to compile them as you work through this chapter and the rest of the book, please keep the following in mind. Many of the snippets make use of one or more functions from std.stdio. In order to be successfully compiled, they all require a main function. However, both declarations are often missing from the snippets listed in the book in the interest of saving space. Use the following as a template to implement any such snippets yourself:

import std.stdio;
void main() {
    // Insert snippet here
}

Here's how this chapter is going to play out:

  • The very basics: Identifiers, scope, modules, comments, variable declarations, and initialization
  • Basic types: Integral and floating-point types, aliases, properties, and operators
  • Derived data types: Pointers, arrays, strings, and associative arrays
  • Control flow statements: Loops, conditionals, scope, and goto statements
  • Type qualifiers: Immutable and const
  • Functions: Everything to do with functions
  • MovieMan: The first steps

The very basics

With the exception of source code comments, everything in this section is required knowledge for anyone who intends to successfully compile a D program.

Identifiers

The names of variables, functions, user-defined types, and so on, are all identifiers. Identifiers are case-sensitive and can consist of any combination and number of Universal Character Names (UCN), underscores, and digits. D does not itself define what constitutes a valid UCN. Instead, it refers to the list of valid UCNs specified in Annex D of the C99 standard. Aside from the English alphabet, characters from several languages are valid UCNs. Henceforth, I will refer to UCNs as letters. Identifiers in this book will be constrained to the English alphabet.

There are a few rules to follow when choosing identifiers:

  • The first character in an identifier can only be a letter or an underscore.
  • The use of two leading underscores is reserved for the compiler implementation. This is currently not enforced by the compiler; barring any conflicts, it will happily compile any symbols that begin with two underscores. However, this raises the chance that such code will stop compiling with future compiler releases.
  • Certain keywords are reserved by the language and attempting to use them as identifiers will cause compilation to fail. A list of all reserved identifiers can be found at http://dlang.org/lex.html under the Keywords section.
  • The standard library defines several identifiers in the global namespace, which precludes using them as custom identifiers.
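As a quick illustration of the first few rules, here is a minimal sketch. The variable names are made up for this example, and the two commented-out lines show declarations that the compiler would reject:

```d
import std.stdio;

void main() {
    int total = 1;      // letters only
    int _total2 = 2;    // leading underscore, then letters and a digit
    int Total = 3;      // a different identifier from total: case matters
    // int 2total;      // would not compile: begins with a digit
    // int while;       // would not compile: while is a reserved keyword
    writeln(total, " ", _total2, " ", Total); // prints "1 2 3"
}
```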

    Tip

    Source file encoding

    D source files can be in any one of the ASCII, UTF-8, UTF-16, or UTF-32 encodings. Both big- and little-endian versions of the latter two are accepted.

A note about scope

As we work through this chapter and the next, we'll see several types of declaration. We saw one already in the first chapter: import declarations, where module scope was also mentioned. There are a couple of things to keep in mind about scope when making any declaration.

First, anything declared in module scope is visible anywhere in that module no matter at what point in the module it is declared. Consider the following:

void main() {
  writeln("Scope this!");
}
import std.stdio;

Putting the import declaration before or after main makes no difference. The writeln function is visible inside main either way. This also applies to type and variable declarations.
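To see the same rule applied to types and variables, consider this small sketch. Point and offset are hypothetical names chosen for illustration; both are declared after main, yet main can use them:

```d
import std.stdio;

void main() {
    // Both symbols are declared further down in the module.
    Point p = Point(3, 4);
    writeln(p.x + p.y + offset); // prints 17
}

struct Point { int x, y; } // hypothetical type, declared after its use
int offset = 10;           // hypothetical module scope variable
```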

Note

Some with C or C++ experience may sometimes refer to D's module scope as global scope. Others will argue that this is inaccurate. A myVar variable declared in the mymod module can be accessed as mymod.myVar, so the module name (and package name) can serve as a namespace. However, this syntax is not generally enforced, so the variable can also be accessed simply as myVar. Just understand that, when you see someone use the term global scope in D, they are probably using it as a synonym for module scope.

Drilling down to a lower level, every function has an associated function scope. Then there are block scopes, which are automatically created by some statements, such as foreach, and can also be created manually inside a function scope or another block scope using braces: { and }. Almost any declaration that can be made in module scope can also be made in a local scope, but the declaration must come before the point of use.

void main() {
  writeln("Scope this!");
  import std.stdio;
}

Try to compile this and the compiler will complain that writeln is not defined. Declarations can be made at any point in a local scope—top, middle, or bottom—as long as they are not used before the point of declaration. We'll revisit scope when we go through aggregate types in Chapter 3, Programming Objects the D Way.
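For comparison, the following sketch should compile, because each local declaration comes before its first use. The variables n and m are only there to illustrate a nested block scope:

```d
void main() {
    import std.stdio;        // declared first...
    writeln("Scope this!");  // ...so writeln is visible here

    int n = 2;
    {
        int m = n * 2;       // n was declared earlier in the enclosing scope
        writeln(m);          // prints 4
    }
    // writeln(m);           // would not compile: m is out of scope here
}
```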

More on modules

D's module system is very simple to understand once you've taken the time to do so. Unfortunately, new D users often expect D modules to behave like C headers, or imports to be Java-like. The purpose of this section is to disabuse you of any such notions. This is not the last we'll see of modules. We'll revisit them in the next chapter.

Module declarations

In the first chapter, we saw that a D source file saved as hello.d automatically becomes a module named hello when the compiler parses it. We can explicitly name a module using a module declaration. If present, the module declaration must be the first declaration in the file. There can be only one module declaration per file. By convention, module names are all lowercase.

Module declarations are optional, but they become a necessity when packages are involved. Let's experiment with D's package feature now. Create a subdirectory in $LEARNINGD/Chapter02 called mypack. Then save the following as mypack/greeting.d:

module mypack.greeting;
import std.stdio;
void sayHello() {
  writeln("Hello!");
}

Now save the following as Chapter02/hello2.d:

import mypack.greeting;
void main() {
  sayHello();
}

Open a command prompt, cd to $LEARNINGD/Chapter02, and try to compile hello2.d. Here's what it looks like for me on Windows:

[Screenshot: OPTLINK linker output reporting Symbol Undefined for sayHello]

The very first word of the output starts with OPTLINK, indicating a linker error. The phrase Symbol Undefined is another hint that it's a linker error. This tells us that the compile stage was successful; both hello2.d and, because it was imported, greeting.d, were parsed just fine, and then hello2.d was compiled into an object file. The compiler then passed the object file to the linker, but the linker could not find a symbol. Specifically, the missing symbol is sayHello, which is the function in greeting.d.

This is a common mistake made by many new D programmers, especially those who are well acquainted with Java. There's a misconception that simply importing a module will cause it to automatically be compiled and linked into the final executable. Import declarations are solely for the compiler to know which symbols are available for use in the module it is currently compiling. The compiler does not automatically compile imported modules. In order for imported modules to be compiled and linked, they should be passed to the compiler as well.

dmd hello2.d mypack/greeting.d

So how did the compiler find greeting.d when we didn't pass it on the command line? The compiler works on the assumption that package names correspond to directory names and module names are filenames. By default, it will search the current working directory for any packages and modules it encounters in import statements. Since the current working directory in our example is $LEARNINGD/Chapter02 and it has a subdirectory, mypack, which matches the package name in the import declaration mypack.greeting, the compiler easily finds greeting.d in the mypack subdirectory. If you change the name of greeting.d to say.d and compile only hello2.d again, you'll get a compiler error instead of a linker error:

hello2.d(1): Error: module greeting is in file 'mypack\greeting.d' which cannot be read

Again, passing both modules to the compiler will eliminate the error. Before checking the file system, the compiler will first check all of the modules passed on the command line to see if any of them match the name in an import declaration. In this case, as long as there is a module declaration, the name of the file plays no role. This breaks when you are compiling source files individually (with the -c command-line option), or using third-party libraries, so when using packages it's best to always put module declarations in every source file and match the module names to the filenames. It doesn't matter in which order source files are fed to the compiler, but by default the name of the first file will be used as the name of the executable. This can be overridden with the -of command line switch.

More about import declarations

We introduced standard import declarations, selective imports, and local imports in the first chapter. There are other options for import declarations. First up: public imports.

hello2 imports mypack.greeting, which imports std.stdio. Imports are private by default, so nothing from std.stdio is visible inside hello2. You can verify this by adding a call to writeln in main in hello2.d. Compiling will yield the following error:

hello2.d(4): Error: 'writeln' is not defined, perhaps you need to import std.stdio; ?

Note

The compiler will often make recommendations for specific symbols when it encounters one that it doesn't know about—which is a good way to catch spelling errors—but it doesn't generally recommend imports. writeln and friends are a special case.

Make one small modification to mypack/greeting.d: put public in front of the import.

public import std.stdio;

Now, when hello2 imports mypack.greeting, it also makes all the symbols from std.stdio visible in hello2. There are three syntax options for public imports, which are shown in the following code:

public import std.stdio;
public {
  import std.stdio;
}
public:
  import std.stdio;

In the first form, public applies only to that declaration. In the second, it applies to everything between the braces. In the last, it applies to everything that follows in the module until a new protection attribute is encountered. In any of these forms, you can replace public with private to explicitly replicate the default behavior. Note that public imports can only be declared in module scope.

Tip

Not just for protection

The three different syntaxes seen here with protection attributes can also be used with other D attributes that we will see later in this and subsequent chapters, even if it is not explicitly mentioned. Regarding the colon syntax, there is one key point to be aware of. If a public: is followed by a private:, all subsequent declarations will be private, that is, the public is "turned off." This is not the case with all attributes, as many are independent and do not have a counter attribute to "turn them off." In those cases, the colon syntax makes the attribute valid until the end of the scope in which it is declared.
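Here is a minimal single-module sketch of the colon and brace forms applied to functions rather than imports. The function names are hypothetical. Everything is callable from main because it all lives in the same module; the attributes only affect what other modules would see:

```d
import std.stdio;

void main() {
    writeln(first() + second() + third()); // prints 6
}

public:               // colon form: applies to the declarations that follow
int first() { return 1; }

private:              // replaces the earlier public: from here on
int second() { return 2; }

public {              // brace form: applies only inside the braces
    int third() { return 3; }
}
```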

Now let's change hello2.d:

import mypack.greeting;
void main() {
  mypack.greeting.sayHello();
}

Here, we're calling sayHello with its Fully Qualified Name (FQN). This is always possible, but it's only a requirement when two imported modules contain conflicting symbols. The FQN can be used to specify the symbol that is desired. One way to minimize the chance of conflict is to use static imports.

static import mypack.greeting;

Static imports force the use of the FQN on symbols. Calling sayHello without the FQN will fail to compile unless there is another symbol with the same name in scope. Another approach is to use named imports.

import greet = mypack.greeting;

Essentially, this works the same as the static import, except that now symbols in the imported module are accessed through an identifier of your choosing. In this example, calls to sayHello must now be made as greet.sayHello.
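To try both forms without the mypack example, the following sketch uses two Phobos modules instead. The name conv is an arbitrary choice for the named import:

```d
static import std.stdio;   // symbols must be reached through the FQN
import conv = std.conv;    // symbols are reached through conv

void main() {
    // writeln("hi");      // would not compile: the static import requires the FQN
    std.stdio.writeln("static import");
    int n = conv.to!int("42");   // std.conv.to, accessed via the chosen name
    std.stdio.writeln(n + 1);    // prints 43
}
```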

Tip

Public imports and FQNs

When a module that contains public imports is imported, symbols in the publicly imported module gain an alternate FQN. When mypack.greeting publicly imports std.stdio, then mypack.greeting.writeln becomes an alias for std.stdio.writeln. Both can be used in modules that import mypack.greeting.

Finally, all import declarations except selective imports support multiple, comma-separated module names. A single selective import can follow other imports, but it must be at the end. Multiple modules in a static import are all static. Standard imports and named imports can be mixed freely.

import std.stdio, std.file, // OK
       std.conv;
import IO = std.stdio, std.file, // OK: Two named imports and
       Conv = std.conv;          // one standard import
import std.stdio : writeln, std.conv; // Error: selective import
                                      // in front
import std.file, std.stdio : writeln; // OK: selective import
                                      // at the end
import std.stdio : writeln, writefln; // OK: Selective import with
                                      // multiple symbols
static import std.stdio, std.file, // OK: All three imports
              std.conv;            // are static

The special package module

package.d is D's approach to importing multiple modules from a package in one go. In it, the package maintainer can publicly import modules from the package as desired. Users may then import all of them at once using the package name. The compiler will load package.d as if it were any other module. Given a package somepack, with the modules one, two, and three, we can save the following as somepack/package.d:

module somepack;
public import somepack.one, somepack.two, somepack.three;

With this in place, all three modules can be imported at once with the following declaration:

import somepack;

Any modules that are not imported in package.d are not visible unless imported explicitly. package.d is still a D source module, so any valid D code is allowed.

Comments

D supports the same types of comment syntax found in other C-family languages. C-style block comments open with /* and close with */, with any number of lines in between. They are not allowed to nest. Single-line comments open with // and terminate with the end of the line.

/* Hi! I'm a C-style,
 block comment. */
// I'm a single-line comment.
/* Nested block comments… /* like this one */ are illegal */

D also has a block comment syntax that supports nesting. This is great for quickly commenting out multiple lines of code that might already contain block comments. This syntax opens with /+ and closes with +/. Any number of lines and any number of comments can fall in between. The first +/ encountered matches the most recent /+.

/+ This is the first line of the comment.
   /* This is a nested comment. */
   /+ This is another nested comment. +/
This is the last line +/
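As a sketch of the commenting-out use case, the nesting comment below disables two statements, one of which carries its own block comment. The function name compute is hypothetical:

```d
import std.stdio;

int compute() {
    int total = 1;
    /+ Temporarily disabled:
    total += 10; /* a block comment inside the disabled code */
    /+ a nested comment +/
    total += 100;
    +/
    total += 2; // the only addition that still runs
    return total;
}

void main() {
    writeln(compute()); // prints 3
}
```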

Finally, D allows documentation comments, or Ddoc comments. This type of comment opens with /** and closes with */, or the equivalent /++ and +/. The single-line version is ///. Ddoc is used to generate source code documentation. When placed before any declaration in module scope, the comment becomes the documentation for that declaration. The Ddoc output, which is an HTML file by default, can be generated with DMD by passing -D on the command line.

/**
The function to which control is passed from DRuntime.

This implementation prints to stdout the command used to execute this program. It ignores errors.

Params:
    args = the command line arguments passed from DRuntime.
*/
void main(string[] args) {
   writeln(args[0]);
}
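The other two Ddoc forms look like this in practice. This is only a sketch with made-up functions, add and twice, to show the /// and /++ styles side by side:

```d
import std.stdio;

/// Adds two integers. This is a single-line Ddoc comment.
int add(int a, int b) { return a + b; }

/++
Doubles a value.

Params:
    n = the value to double.

Returns: twice the value of n.
+/
int twice(int n) { return n * 2; }

void main() {
    writeln(add(2, 3), " ", twice(4)); // prints "5 8"
}
```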

Note

You can head over to http://dlang.org/ddoc.html when you're ready to begin writing source code documentation. It covers everything you need to know to get started. Refer to the Phobos source for examples.

Variable declaration and initialization

Variables in D must be declared somewhere before they can be used. Variable declarations generally take the same form as in other C-family languages: the name of a type followed by an identifier. The following two lines declare four variables named someNumber, a, b, and c, all of which are of type int:

int someNumber;
int a, b, c;

Earlier in this chapter, we looked at how scopes affect declarations. With variable declarations, one thing that must be kept in mind is shadowing. This occurs when a variable in a given scope is named the same as a variable in a parent scope. Shadowing module scope variables is legal; every reference to the symbol before the local declaration refers to the module scope variable and every reference after the local declaration (within the same local scope) refers to the local variable. It is an error for a local variable to shadow a variable in any outer scope except module scope.

int x;
void main() {
    writeln(x); // OK: refers to module scope x
    int x; // OK: shadowing module scope variables allowed
    writeln(x); // OK: refers to local x
    int y;      // Function scope
    // Opening a new scope
    {       
        int y; // Error: local y is shadowing main.y
        int z;
    }
    // Opening a new scope
    {
        // OK: This scope and the one above are independent of
        // each other.
        int z;
    }
}

Variables in any function or block scope can be declared static, causing the value of the variable to persist beyond the lifetime of the scope. Such variables are only accessible within the scope in which they are declared. Applying static to a variable in module scope has no meaning.
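A small sketch of a static local variable follows. The function nextId is hypothetical; its counter is initialized once and keeps its value between calls:

```d
import std.stdio;

int nextId() {
    static int counter; // default-initialized to 0 once, persists across calls
    return ++counter;
}

void main() {
    nextId();          // returns 1
    nextId();          // returns 2
    writeln(nextId()); // prints 3
    // writeln(counter); // would not compile: counter is only visible in nextId
}
```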

Static variables and variables in module scope can be explicitly initialized to any value that can be known at compile time. This includes literals and constant values (along with any expression that can be evaluated at compile time, as you'll learn in Chapter 4, Running Code at Compile Time). Variables in function and block scopes can additionally be initialized to runtime values. Variables that are not explicitly initialized are default-initialized to a type-specific value. In explicit initialization, the type name can be replaced with auto to trigger type inference. Consider the following example:

auto theAnswer = 42;
int noMeaning, confused = void;

Here, theAnswer is explicitly initialized to the integer literal 42 and the compiler infers it to be of type int. The variable noMeaning is initialized to the default value for the int type, which is 0. Poor confused is not initialized at all. The keyword void instructs the compiler to turn off default initialization for this variable. However, it's still going to refer to whatever happens to be living at its address when the program starts up. Forgetting to initialize variables is a common source of bugs in many C programs. With default initialization, D makes it easy to either avoid such bugs completely, or track them down when they do appear.
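The following sketch prints the inferred types and default values so you can check them yourself. The double examples go slightly beyond the text above: a floating-point literal is inferred as double, and the default value of double is NaN rather than 0:

```d
import std.stdio;

void main() {
    auto theAnswer = 42;  // inferred as int
    auto ratio = 1.5;     // inferred as double
    int noMeaning;        // default-initialized to int.init, which is 0
    double notYet;        // default-initialized to double.init, which is NaN

    writeln(typeof(theAnswer).stringof); // prints int
    writeln(typeof(ratio).stringof);     // prints double
    writeln(noMeaning);                  // prints 0
    writeln(notYet);                     // prints nan
}
```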

One last thing to keep in mind about variable declarations is that all module scope variables are thread-local by default. This is very, very different from how other C-family languages do things. This means that, if you are using multiple threads, every thread gets its own copy of the variable. One thing that will certainly surprise C and C++ programmers is that even static variables in a function or block scope are thread-local. Typically in C, functions intended to be used in multiple threads have to avoid static variables like the plague. This is not the case in D. We'll come back to static declarations when we go through functions later in this chapter.
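To observe thread-local storage directly, here is a minimal sketch using Thread from core.thread in DRuntime. The names counter and seenByNewThread are hypothetical. The new thread reads its own copy of counter, which is untouched by the assignment made in the main thread:

```d
import core.thread : Thread;
import std.stdio;

int counter; // module scope: every thread gets its own copy

// Starts a thread and reports the value of counter that the thread sees.
int seenByNewThread() {
    int seen = -1;
    auto t = new Thread({ seen = counter; }); // reads the new thread's copy
    t.start();
    t.join();   // wait for the thread to finish before reading seen
    return seen;
}

void main() {
    counter = 5;                // changes only the main thread's copy
    writeln(seenByNewThread()); // prints 0
    writeln(counter);           // prints 5
}
```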
