Operator overloading

The time has come to return to user-defined types and see how to implement operator overloads. It's important to note before we get started that D does not support operator overloads as free functions; they must be part of a class or struct declaration. We'll be turning away from templates for part of this discussion; some operator overloads are required to be templates, but others can either be templates or normal member functions. We'll look at the latter group first. For the official documentation on operator overloading, pay a visit to http://dlang.org/operatoroverloading.html.

Non-templated operator overloads

There are a handful of operator overloads that are not required to be templates. These cover the comparison operators, the function call operator, and the assignment operator, as well as the index, index-assign, and dollar operators. We'll visit them each in that order.

Comparison overloads – opEquals and opCmp

The equality operators == and != are overloaded with opEquals. The comparison operators >, <, >= and <= are all overloaded with opCmp. There are some important considerations to keep in mind when implementing these overloads, but before we dig into that, let's look at the syntax and usage of each.

opEquals

The signature of opEquals is going to differ, depending on whether it's being implemented for a class or a struct. For classes, it's an override of a default implementation in Object. It should look like this:

class EqualClass {
  override bool opEquals(Object o) {...}
}

In a class, the very first thing any opEquals implementation ought to do is to test whether the argument can be cast to the enclosing type, in this case EqualClass:

if(auto ec = cast(EqualClass)o) {
  // Return true if both refer to the same instance.
  if(ec is this) return true;
  // Now test any members here.
}
return false;

There are multiple possible signatures for opEquals on a struct. Some possibilities:

struct EqualStruct {
  bool opEquals(const(EqualStruct) es) {...}
  bool opEquals(ref const(EqualStruct) es) {...}
  bool opEquals(const(EqualStruct) es) const {...}
}

Note that, if the type is intended to be used in an associative array, one of the first two versions must be used. All of these could be replaced with a template form that takes no template parameters and uses auto ref on the function parameter:

struct EqualStruct {
  bool opEquals()(auto ref const(EqualStruct) es) const {...}
}

Given two instances a and b, when either a == b or a != b is encountered, the following sequence is initiated:

  • If the expression is a != b, it is rewritten as !(a == b).
  • If both operands are class instances, the expression is rewritten as .object.opEquals(a, b), which has the following implementation:
    bool opEquals(Object a, Object b) {
        if (a is b) return true;
        if (a is null || b is null) return false;
        if (typeid(a) == typeid(b)) return a.opEquals(b);
        return a.opEquals(b) && b.opEquals(a);
    }
  • For non-class instances, a.opEquals(b) and b.opEquals(a) are both attempted. If both resolve to the same opEquals implementation, then a.opEquals(b) is selected; if one is a better match than the other, it is selected; if only one compiles, it is selected.
  • Otherwise, no match has been found and an error is emitted.
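To make the sequence above concrete, here is a minimal sketch with one class and one struct. The Point and Release types are hypothetical and exist only for illustration; Release deliberately ignores one of its members when comparing.

```d
import std.stdio : writeln;

// Hypothetical class: two points are equal when their coordinates match.
class Point {
  int x, y;
  this(int x, int y) { this.x = x; this.y = y; }

  override bool opEquals(Object o) {
    // The cast yields null when o is not a Point.
    if(auto p = cast(Point)o) {
      if(p is this) return true;
      return x == p.x && y == p.y;
    }
    return false;
  }
}

// Hypothetical struct: the build tag is ignored when comparing.
struct Release {
  int number, patch;
  string buildTag;
  bool opEquals()(auto ref const(Release) rhs) const {
    return number == rhs.number && patch == rhs.patch;
  }
}

void main() {
  auto a = new Point(1, 2);
  auto b = new Point(1, 2);
  writeln(a == b);  // true: rewritten as .object.opEquals(a, b)
  writeln(a != b);  // false: rewritten as !(a == b)
  writeln(a is b);  // false: two distinct instances
  writeln(Release(1, 0, "dev") == Release(1, 0, "final"));  // true
}
```

Note that a == b and a is b give different answers here, which is the whole point of overriding opEquals on a class.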

opCmp

opCmp should be declared like this for a class:

class CmpClass {
  override int opCmp(Object o) {...}
}

This, again, overrides a default implementation in Object. As with opEquals, a struct can have a number of possible overloads, such as:

struct CmpStruct {
  int opCmp(const(CmpStruct) cs) {...}
  int opCmp(ref const(CmpStruct) cs) {...}
  int opCmp(ref const(CmpStruct) cs) const {...}
}

Alternatively, the condensed template form:

struct CmpStruct {
  int opCmp()(auto ref const(CmpStruct) cs) const {...}
}

opCmp should return a negative value if the ordering of this is lower than that of the argument, a positive number if it is higher, and 0 if they are equal. Given objects a and b, when an expression containing one of the comparison operators is encountered, each is rewritten twice as shown in the following table:

Expression    Rewrite 1          Rewrite 2
a < b         a.opCmp(b) < 0     b.opCmp(a) > 0
a <= b        a.opCmp(b) <= 0    b.opCmp(a) >= 0
a > b         a.opCmp(b) > 0     b.opCmp(a) < 0
a >= b        a.opCmp(b) >= 0    b.opCmp(a) <= 0

Both rewrites are tried and:

  • If only one compiles, it is selected
  • If both resolve to the same function, the first rewrite is selected
  • If they resolve to different functions, the best match is selected
  • Otherwise, an error is emitted
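As a rough illustration of those rewrites, the following sketch orders a hypothetical SemVer struct by its first member and then its second. The type and its field names are assumptions made for this example; the by-value const signature is just one of the valid options.

```d
import std.algorithm.sorting : sort;
import std.stdio : writeln;

// Hypothetical type, ordered by release number first, then by patch.
struct SemVer {
  int release, patch;

  int opCmp(const SemVer rhs) const {
    if(release != rhs.release) return release < rhs.release ? -1 : 1;
    if(patch != rhs.patch) return patch < rhs.patch ? -1 : 1;
    return 0;
  }
}

void main() {
  auto a = SemVer(1, 2);
  auto b = SemVer(1, 10);
  writeln(a < b);   // true: a.opCmp(b) < 0
  writeln(a >= b);  // false: a.opCmp(b) >= 0

  // The default predicate of sort is "a < b", so it goes through opCmp.
  auto list = [SemVer(2, 0), b, a];
  sort(list);
  writeln(list);
}
```

Because SemVer does not declare opEquals, equality falls back to the default member-wise comparison, which agrees with this opCmp returning 0 only when both members match.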

Considerations

When an object does not have an opEquals, a default implementation is used for any equality comparisons. For structs, this implementation does a member-wise comparison on each instance; for classes, it's a simple identity comparison, that is, a is b. Attempting an ordering comparison on any struct instance for which opCmp is not defined results in a compiler error; it's a runtime error for classes.

Often, the behavior of the default opEquals is exactly what is required for a struct type. Consider a 2D point object, or an RGB color object. Both are types where it makes sense for equality to mean member-wise comparison. More importantly, neither type has any standard concept of ordering. For classes, an identity comparison is rarely the desired behavior, so opEquals should usually be implemented for any class that requires comparison.

When ordering is necessary, it's important to ensure that opCmp and opEquals are consistent. For example, if a.opCmp(b) returns 0, then a.opEquals(b) should return true. If not, this can introduce subtle bugs that can be difficult to track down. Best practice dictates that, when implementing one, you should implement the other.

Function call overloads

opCall allows a user-defined type to be callable like a function. It can be declared to have any return type, and any number and combination of parameters. It can also be static. Here's an example:

struct PrintAction {
  void opCall(string arg1, int arg2) {
    import std.stdio : writefln;
    writefln(`Taking action on "%s" and %s`, arg1, arg2);
  }
}
void main() {
  PrintAction print;
  print("A Number", 42);
}

Imagine a function template that accepts, and calls in certain circumstances, anything that is callable: a function pointer, a delegate, or a struct or class with opCall. Such a template opens up many options in how you design your program; you aren't restricted to only using delegates, or only using classes that extend an interface. Note that implementing opCall on a struct disables all struct literals for that type.
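Such a template might look something like the following sketch. The invokeTwice helper and freeFunc are hypothetical names invented for this example; the constraint uses isCallable from std.traits, which accepts function pointers, delegates, and types with opCall.

```d
import std.stdio : writefln;
import std.traits : isCallable;

// Hypothetical helper: calls whatever it is given, twice.
void invokeTwice(F)(F action) if(isCallable!F) {
  action("first", 1);
  action("second", 2);
}

struct PrintAction {
  void opCall(string arg1, int arg2) {
    writefln(`Taking action on "%s" and %s`, arg1, arg2);
  }
}

void freeFunc(string s, int n) {
  writefln("function: %s %s", s, n);
}

void main() {
  int total;
  PrintAction print;
  invokeTwice(&freeFunc);                          // function pointer
  invokeTwice((string s, int n) { total += n; });  // delegate
  invokeTwice(print);                              // struct with opCall
  writefln("total = %s", total);                   // total = 3
}
```

The struct is passed by value here, so any state it modified inside opCall would be lost; a real design might take the callable by reference instead.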

Assignment overloads

The assignment operator is overloadable with opAssign. Generally, it can take any sort of parameter, with one restriction: on classes, identity assignment is prohibited. That is, given a class C, it is illegal to declare an opAssign that accepts another C or any type that is implicitly convertible to C. This is because classes have reference semantics, so the reference on the left-hand side would rebind to the reference on the right-hand side. In myC = yourC, the original instance referred to by myC would have its opAssign run, but myC would no longer refer to it; myC and yourC would now refer to the same instance. Structs, being value types, have no such restriction:

class C {
  private int _x;
  void opAssign(int x) { _x = x; }  // OK
  // Error: Identity assignment overload is illegal
  // void opAssign(C c) { _x = c._x; }
}
struct S {
  private int _x;
  void opAssign(int x) { _x = x; }  // OK
  void opAssign(S s) { _x = s._x; }  // OK
}
void main() {
  import std.stdio : writeln;
  S s1, s2;
  s1 = 10;
  s2 = s1;
  writeln(s2);
  auto c = new C;
  c = 10;
  writeln(c._x);
}

In this example, the opAssign declarations all return void, but it's often a good idea to return this in structs to enable assignment chaining: a = b = c.
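Here is a small sketch of what chaining looks like when opAssign returns this by reference. Meters is a hypothetical type invented for the example.

```d
import std.stdio : writeln;

// Hypothetical value type wrapping a double.
struct Meters {
  private double _value = 0;

  // Returning this by reference lets the result feed the next assignment.
  ref Meters opAssign(double v) { _value = v; return this; }
  ref Meters opAssign(Meters rhs) { _value = rhs._value; return this; }

  double value() const { return _value; }
}

void main() {
  Meters a, b, c;
  // Evaluated right to left: c = 2.5 first, then b = c, then a = b.
  a = b = c = 2.5;
  writeln(a.value, " ", b.value, " ", c.value);  // 2.5 2.5 2.5
}
```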

Index overloads

When a user-defined type needs to behave like an array, there are a handful of overloads that can be implemented. We'll look at two of them here: opIndex and opIndexAssign.

opIndex

There are different ways to use opIndex, two of which we'll cover here. First, we'll consider the form that takes one or more integral parameters, preferably of type size_t. When there is only one parameter, it corresponds to the index of a one-dimensional array, while two parameters are the indexes of a two-dimensional array, and so on. As we saw in Chapter 2, Building a Foundation with D Fundamentals, D does not have built-in support for multi-dimensional arrays, but opIndex allows adding multi-dimensional access to user-defined types. The syntax is [m,n] rather than [m][n]. The function can return whatever type is appropriate, preferably by reference to allow direct modification:

struct Matrix3 {
  double[3][3] values;
  ref double opIndex(size_t i, size_t j) {
    return values[i][j];
  }
}

The second use case of opIndex is to add support for the empty slice operator. Given a type T that needs to support slicing, the empty slice operator can be overloaded by implementing opIndex with no arguments. The following example does just that:

struct Numbers(T) {
  T[] _values;
  T[] opIndex() {
    return _values[];
  }
}

The following snippet slices a Numbers instance in order to iterate over it:

auto nums = Numbers!int([10, 20, 30, 40]);
foreach(n; nums[]) {
  writeln(n);
}

Of course, there's more to slicing than just the empty slice. For that, we have the opSlice function. When we cover it later in the chapter, we'll also see a third use case for opIndex.

opIndexAssign

When a user-defined type needs to accept assignment to an index, it can implement opIndexAssign. This allows assignments of the form t[i] = x. Like opIndex, multiple indexes are supported, but the first parameter is the assigned value. Revisiting the Matrix3 example, here's an implementation that takes two indexes:

double opIndexAssign(double val, size_t i, size_t j) {
  return values[i][j] = val;
}

With this, it's now possible to assign a value to a matrix such as m[0, 1] = 10.0. We're going to revisit both opIndex and opIndexAssign later in the chapter when we discuss opSlice.
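To see the two index overloads working together, consider this self-contained sketch. It is a variation on Matrix3 in which opIndex returns by value, and a hypothetical writes counter is added purely to show which overload the compiler picks for each expression.

```d
import std.stdio : writeln;

struct Matrix3 {
  double[3][3] values;
  size_t writes;  // hypothetical counter, for demonstration only

  double opIndex(size_t i, size_t j) const {
    return values[i][j];
  }
  double opIndexAssign(double val, size_t i, size_t j) {
    ++writes;
    return values[i][j] = val;
  }
}

void main() {
  Matrix3 m;
  m[0, 1] = 10.0;      // m.opIndexAssign(10.0, 0, 1)
  m[2, 2] = m[0, 1];   // read via opIndex, write via opIndexAssign
  writeln(m[2, 2] == 10.0);  // true
  writeln(m.writes);         // 2
}
```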

opDollar

This is not an index overload, but it's closely related. Recall that inside the array and slice operators, $ is a shortcut for the .length property of the current array. User-defined types can override this with opDollar:

struct Numbers(T) {
  T[] _values;
  T[] opIndex() {
    return _values[];
  }
  size_t opDollar() { return _values.length; }
}
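For $ to be useful inside the brackets, the type also needs an opIndex that takes an index. The following sketch assumes a single-index opIndex is added to Numbers, which the version above does not have:

```d
import std.stdio : writeln;

struct Numbers(T) {
  T[] _values;
  // Assumed addition: a single-index overload so that nums[i] compiles.
  T opIndex(size_t i) { return _values[i]; }
  size_t opDollar() { return _values.length; }
}

void main() {
  auto nums = Numbers!int([10, 20, 30, 40]);
  writeln(nums[$ - 1]);  // 40: $ becomes nums.opDollar()
  writeln(nums[$ / 2]);  // 30
}
```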

Templated operator overloads

With the power of templates, it's possible to configure a single function at compile time to overload multiple operators, or to take different code paths for different operators. We're going to cover unary operators, binary operators, the cast operator, the op-assign operators, and the slice operator.

Unary overloads

In any expression with a unary operator applied to an object a, the expression is rewritten as a.opUnary!"op"(), where op is one of -, +, ~, *, ++, and --. This takes no function parameters and can return any value (even void, but that diminishes its usefulness). It requires one template value parameter, a string representing the overloaded operator.

A common approach to implement this is to use template constraints on the value. This example does just that to implement everything but the pointer dereference operator, *:

struct Number(T) {
  T value;
  T opUnary(string op)() if(op != "*") {
    mixin("return " ~ op ~ "value;");
  }
}

Notice the string mixin in the function body. This is used in order to generate the actual code for the correct expression. Without that, it would be necessary to use a static if chain to compare op against each supported operator and manually implement the expression for each. To verify it works as expected:

auto num = Number!int(10);
writeln(-num);
writeln(++num);
writeln(--num);
writeln(+num);
writeln(~num);

As opUnary is a template, all of the template options are at your disposal. Don't like having a single implementation for all of those operators? No problem. Go ahead and implement multiple versions of opUnary with different constraints. Or maybe forgo constraints altogether and use static if inside the body, or use specialization instead: opUnary(string op : "*")(). There's no one right way to do it. Note that the compiler uses opUnary for both prefix and postfix increment and decrement operators. It's not possible, nor is there a need, to distinguish between them inside opUnary.
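As one possible take on the "multiple versions with different constraints" approach, this sketch splits the operators into two groups. Counter is a hypothetical type; the mutating operators return the instance by reference, while the others are const and return a new value.

```d
import std.stdio : writeln;

// Hypothetical type used only for illustration.
struct Counter {
  int value;

  // ++ and -- modify the instance and return it by reference.
  ref Counter opUnary(string op)() if(op == "++" || op == "--") {
    mixin(op ~ "value;");
    return this;
  }

  // - and ~ leave the instance untouched and return a new Counter.
  Counter opUnary(string op)() const if(op == "-" || op == "~") {
    return Counter(mixin(op ~ "value"));
  }
}

void main() {
  auto c = Counter(5);
  ++c;
  writeln(c.value);     // 6
  auto old = c++;       // the compiler saves a copy, then calls opUnary!"++"
  writeln(old.value);   // 6
  writeln(c.value);     // 7
  writeln((-c).value);  // -7
}
```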

Binary overloads

Given two objects in an expression a op b, where op is one of +, -, *, /, %, ^^, &, |, ^, <<, >>, >>>, ~, or in, the expression is rewritten as both a.opBinary!"op"(b) and b.opBinaryRight!"op"(a), and the best match is selected. If both match equally, there is an error. They can return any value and the function parameter can be any type. As they are templates, everything that held true in the discussion of opUnary applies here as well. Consider this partial implementation of a 3D vector:

struct Vector3 {
  float x, y, z;
  Vector3 opBinary(string op)(auto ref const(Vector3) rhs)
  if(op == "+" || op == "-")
  {
    mixin(`return Vector3(
      x` ~ op ~ `rhs.x,
      y` ~ op ~ `rhs.y,
      z` ~ op ~ `rhs.z);`
    );
  }
  Vector3 opBinary(string op : "/")(float scalar) {
    return this * (1.0f/scalar);
  }
  Vector3 opBinary(string op : "*")(float scalar) {
    return Vector3(x*scalar, y*scalar, z*scalar);
  }
  Vector3 opBinaryRight(string op : "*")(float scalar) {
    return this * scalar;
  }
}

The first opBinary handles both addition and subtraction with Vector3. For this case, it doesn't make sense to implement opBinaryRight. That would actually cause both rewrites to match equally and lead to a compiler error. The body is implemented using a WYSIWYG string in a simple string mixin. The second and third implementations handle division and multiplication by scalars. A single implementation could have handled both operators using a static if block, but that is more verbose. Finally, opBinaryRight is implemented only for the scalar multiplication. It's reasonable to accept 2.0f * vec to be the same as vec * 2.0f. The same does not hold for division. The following verifies that all works as expected:

auto vec1 = Vector3(1.0f, 20f, 3.0f);
auto vec2 = Vector3(4.0f, 2.0f, 5.0f);
writeln(vec1 + vec2);
writeln(vec1 - vec2);
writeln(vec2 * 2.0f);
writeln(2.0f * vec2);
writeln(vec1 / 2.0f);

Cast overloads

Given a cast of any user-defined type a to any type T, the compiler rewrites the expression to a.opCast!(T). Additionally, given any circumstance where a user-defined type can be evaluated to bool, such as if(a) or if(!a), the compiler will attempt to cast the type to bool with a.opCast!(bool) and !a.opCast!(bool). Implementations of opCast should take no function parameters and return a value of a type that matches that of the template parameter. The following is a simple Number type that supports casting to bool and any numeric type:

struct Number(T) {
  import std.traits : isNumeric;
  T value;
  bool opCast(C)() const if(is(C == bool)) {
    return value != 0;
  }
  C opCast(C)() const if(isNumeric!C) {
    return cast(C)value;
  }
}

The following snippet shows opCast in action:

auto num1 = Number!int(10);
Number!int num2;
writeln(cast(bool)num1);
writeln(cast(bool)num2);
writeln(cast(byte)num1);

Operator assignment overloads

Given two objects in an expression a op= b, where op is one of +, -, *, /, %, ^^, &, |, ^, <<, >>, >>>, or ~, the expression is rewritten as a.opOpAssign!"op"(b). As an example, let's add support for += and -= to the previous Vector3:

struct Vector3 {
  float x, y, z;
  ref Vector3 opOpAssign(string op)(auto ref Vector3 rhs)
  if(op == "+" || op == "-")
  {
    mixin("x" ~ op ~ "= rhs.x;
    y" ~ op ~ "= rhs.y;
    z" ~ op ~ "= rhs.z;");
    return this;
  }
}
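Scalar versions of the op-assign operators can be written the same way. The following is a sketch of how *= and /= with a float might look, shown on a stripped-down Vector3 so that it compiles on its own; it is one possible approach rather than the only one.

```d
import std.stdio : writeln;

struct Vector3 {
  float x, y, z;

  // Handles both vec *= scalar and vec /= scalar.
  ref Vector3 opOpAssign(string op)(float scalar)
  if(op == "*" || op == "/")
  {
    mixin("x " ~ op ~ "= scalar;");
    mixin("y " ~ op ~ "= scalar;");
    mixin("z " ~ op ~ "= scalar;");
    return this;
  }
}

void main() {
  auto v = Vector3(1.0f, 2.0f, 3.0f);
  v *= 2.0f;   // v.opOpAssign!"*"(2.0f)
  v /= 4.0f;   // v.opOpAssign!"/"(4.0f)
  writeln(v);  // Vector3(0.5, 1, 1.5)
}
```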

Tip

opAssign versus opOpAssign

That these two operator overloads have such similar names makes it easy to mix them up. More than once I have unintentionally implemented opAssign when I really wanted opOpAssign. I even did it while implementing the Vector3 example. When your opOpAssign isn't working properly, the first thing to check is that you didn't type opAssign by mistake.

Slice operator overloads

Overloading the slice operator in D requires two steps: add an opSlice implementation, and a special version of opIndex. Before describing how the functions should be implemented, it will help to show how they are used. The following lines show both a one-dimensional slice and a two-dimensional slice:

auto slice1 = oneD[1 .. 3];
auto slice2 = twoD[0 .. 2, 2 .. 5];

The compiler will rewrite them to look like this:

oneD.opIndex(oneD.opSlice!0(1, 3));
twoD.opIndex(twoD.opSlice!0(0, 2), twoD.opSlice!1(2, 5));

opSlice must be a template. The single template parameter is a value representing the dimension that is currently being sliced. The two function parameters represent the boundaries of the slice. It can return anything the implementation requires in order to perform the slice.

opIndex is a normal function as before and should be declared to accept one parameter per supported dimension. What's different now is that the type of the parameter no longer needs to be an integral; it can be any type required to produce a slice. Additionally, the return value should be whatever type is produced from slicing this type.

Let's look at a one-dimensional array wrapper as a simple example:

struct MyArray(T) {
  struct SliceInfo {
    size_t start, end;
  }
  private T[] _vals;
  T opIndex(size_t i) {
    return _vals[i];
  }
  T[] opIndex(SliceInfo info) {
    return _vals[info.start .. info.end];
  }
  SliceInfo opSlice(size_t dim)(size_t start, size_t end) {
    return SliceInfo(start, end);
  }
}

The internally declared SliceInfo is the key to making the slice work. opSlice simply returns an instance initialized with the beginning and end indexes it's given. The slice overload of opIndex then takes that data and produces a slice:

auto ma = MyArray!int([10, 20, 30, 40, 50]);
writeln(ma[1 .. 3]);

This prints [20, 30] as expected. Support for multidimensional arrays works the same way, just with extra dimensions. Here's a custom two-dimensional array to demonstrate:

struct My2DArray(T) {
  struct SliceInfo {
    size_t start, end;
  }
  private T[][] _vals;
  this(T[] dim1, T[] dim2) {
    _vals ~= dim1;
    _vals ~= dim2;
  }
  T opIndex(size_t i, size_t j) {
    return _vals[i][j];
  }
  auto opIndex(SliceInfo info1, SliceInfo info2) {
    return My2DArray(
      _vals[0][info1.start .. info1.end],
      _vals[1][info2.start .. info2.end]
    );
  }
  SliceInfo opSlice(size_t dim)(size_t start, size_t end) {
    return SliceInfo(start, end);
  }
}

Notice that the template parameter in opSlice is never used at all; it's just not needed in this simple case. Also notice that opIndex is defined to return a My2DArray instance containing the sliced array data. That is likely the best return type to use in this specific case (after all, slicing T[] returns T[]), but there is enough flexibility to tailor the behavior for specific circumstances. We could just as easily implement it like this:

auto opIndex(SliceInfo info1, SliceInfo info2) {
  return _vals[0][info1.start .. info1.end] ~ _vals[1][info2.start .. info2.end];
}

This concatenates the two slices into a single slice, which it then returns. It could also return a range (which we will get to in the next chapter), or any other type that we need a slice to represent.
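For a case where the dimension parameter does matter, consider the following sketch. Grid and Span are hypothetical names; here the first slice selects rows and the second selects columns, and opSlice uses dim to validate each range against the right bound.

```d
import std.stdio : writeln;

// Hypothetical fixed-size grid with two rows and three columns.
struct Grid {
  struct Span { size_t start, end; }
  enum size_t rows = 2;
  enum size_t cols = 3;
  int[cols][rows] cells = [[1, 2, 3], [4, 5, 6]];

  // dim is 0 for the row slice and 1 for the column slice.
  Span opSlice(size_t dim)(size_t start, size_t end) {
    enum size_t limit = dim == 0 ? rows : cols;
    assert(start <= end && end <= limit, "slice out of range");
    return Span(start, end);
  }

  // Copies the selected block into a new array of arrays.
  int[][] opIndex(Span r, Span c) {
    int[][] result;
    foreach(i; r.start .. r.end)
      result ~= cells[i][c.start .. c.end].dup;
    return result;
  }
}

void main() {
  Grid g;
  writeln(g[0 .. 2, 1 .. 3]);  // [[2, 3], [5, 6]]
}
```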

Other overloads

The aforementioned overloads are called every time a specific symbol in the source code is encountered, such as * or cast or (). This subsection covers overloads that are called in more narrow circumstances.

opDispatch

Given a variable t of type T and a call to a member function t.func, the compiler will report an error if T does not implement func. The templated opDispatch acts as a catch-all when an attempt is made to access any member that doesn't exist on a type. The name of the member is passed as a template value parameter in a process called forwarding:

struct NoMembers {
  void opDispatch(string s)() {
    import std.stdio : writeln;
    writeln("Attempted to access member ", s);
  }
}
void main() {
  NoMembers nm;
  nm.doSomething();
  nm.someProperty;
}

This gives the following output:

Attempted to access member doSomething
Attempted to access member someProperty

With a good mix of D's compile-time features, some creative things can be done with this. Take a look at the HodgePodge type:

struct HodgePodge {
  void printTwoInts(int a, int b) {
    import std.stdio : writefln;
    writefln("I like the ints %s and %s!", a, b);
  }
  int addThreeInts(int x, int y, int z) {
    return x + y + z;
  }
}

The following snippet has an opDispatch implementation that can take any number of arguments and return any type. It uses compile-time reflection to determine whether the member function in the template argument exists in HodgePodge, ensures the number of function arguments match, and calls the function if they do:

struct Dispatcher {
  private HodgePodge _podge;
  auto opDispatch(string s, Args...)(Args args) {
    static if(__traits(hasMember, HodgePodge, s)) {
      import std.traits : ParameterTypeTuple;
      alias params = ParameterTypeTuple!(mixin("HodgePodge." ~ s));
      static if(params.length == args.length)
        mixin("return _podge." ~ s ~ "(args);");
    }
  }
}

auto allows for any type to be returned. The first template parameter s is bound to the missing member name. If the function call includes arguments, they will be passed after s. In this case, a tuple parameter is declared to catch all of them. In the body, static if and __traits are used to determine whether HodgePodge has a member named s. If so, ParameterTypeTuple from std.traits is used to get a tuple containing the types of all of the function parameters. A string mixin generates HodgePodge.memberName for the template instantiation. It's not the types we're interested in, but the number of them, so it checks whether the number of function arguments matches the number given to opDispatch. If so, a string mixin generates both the function call and return.

Note that this implementation doesn't support member variables or variadic member functions. Trying to access any of these through this implementation of opDispatch leads to an error message saying that Dispatcher doesn't have that missing member. Fixing that is left as an exercise for the reader.
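The same forwarding idea can be packaged as a generic wrapper. The sketch below is a hypothetical variation, not the Dispatcher above: Logged forwards any member function call to the wrapped value and counts how many calls pass through. Like Dispatcher, it makes no attempt to support member variables.

```d
import std.stdio : writeln;

// Hypothetical wrapper that forwards calls to T and counts them.
struct Logged(T) {
  private T _target;
  size_t calls;

  auto opDispatch(string name, Args...)(Args args) {
    static assert(__traits(hasMember, T, name),
                  T.stringof ~ " has no member " ~ name);
    ++calls;
    // Generates, for example, return _target.add(args);
    return mixin("_target." ~ name ~ "(args)");
  }
}

// Hypothetical type to wrap.
struct Adder {
  int add(int a, int b) { return a + b; }
}

void main() {
  Logged!Adder logged;
  writeln(logged.add(2, 3));  // 5, via opDispatch!"add"
  writeln(logged.add(4, 4));  // 8
  writeln(logged.calls);      // 2: calls is a real member, so no dispatch
}
```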

opApply

In order to directly iterate a user-defined type in a foreach loop, it must either implement the opApply function or a range interface (something we'll see in the next chapter). opApply must be declared to return int and to take a delegate as its only parameter. The delegate should also return int, but can have multiple parameters of any type. The delegate is generated by the compiler from the body of the foreach loop and passed to the function. The implementation of opApply should do whatever internal iteration it needs, call the delegate at each step of iteration, and if the delegate returns non-zero, immediately return that value. If the internal iteration runs its course, the function should return 0:

struct IterateMe {
  enum len = 10;
  int[len] values;
  void initialize() {
    foreach(i; 0..len) {
      values[i] = i;
    }
  }
  int opApply(int delegate(ref int) dg) {
    int result;
    foreach(ref v; values) {
      result = dg(v);
      if(result)
        break;
    }
    return result;
  }
}
void main() {
  import std.stdio : writeln;
  IterateMe im;
  im.initialize();
  foreach(i; im)
    writeln(i);
}

Here's an example using multiple parameters with the delegate:

struct AA {
  int[string] aa;
  void initialize() {
    aa = ["One": 1, "Two":2, "Three": 3];
  }
  int opApply(int delegate(string, ref int) dg) {
    int result;
    foreach(key, ref val; aa) {
      result = dg(key, val);
      if(result)
        break;
    }
    return result;
  }
}
void main() {
  import std.stdio : writefln;
  AA aa;
  aa.initialize();
  foreach(k, v; aa)
    writefln("%s: %s", k, v);
}

To iterate the type with foreach_reverse, implement opApplyReverse in the same manner, but iterate over the internal array in the opposite direction.
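A minimal sketch of a type supporting both directions might look like this. Countdown is a hypothetical name; the two functions differ only in the direction of the inner loop.

```d
import std.stdio : writeln;

// Hypothetical type that can be iterated forward and backward.
struct Countdown {
  int[] values;

  int opApply(int delegate(ref int) dg) {
    foreach(ref v; values)
      if(auto result = dg(v)) return result;
    return 0;
  }

  int opApplyReverse(int delegate(ref int) dg) {
    foreach_reverse(ref v; values)
      if(auto result = dg(v)) return result;
    return 0;
  }
}

void main() {
  auto c = Countdown([1, 2, 3]);
  foreach_reverse(v; c)
    writeln(v);  // 3, then 2, then 1
  foreach(v; c) {
    if(v == 2) break;  // the delegate returns non-zero, ending opApply early
    writeln(v);        // 1
  }
}
```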

toHash

toHash isn't an operator overload, but it's a function that any user-defined type can implement and it's tightly connected with opEquals. It's called on any type that is used as an associative array key, takes no parameters, and must return size_t. The signature for classes is:

override size_t toHash() @trusted nothrow;

And for structs or unions:

size_t toHash() const @safe pure nothrow;

The only requirement is that toHash and opEquals be consistent. Given objects a and b, if calling opEquals on them returns true, then their toHash functions must return the same value. If this requirement is not met, the object will not behave properly as an associative array key. All objects, even structs and unions, have a default toHash that is used when a custom version is not implemented; however, when overloading opEquals, it's best to also implement a custom toHash to ensure that they remain consistent.
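Here is a sketch of a struct whose opEquals and toHash agree with each other. Tag is a hypothetical type, and the hash is a simple multiply-and-add over the characters of the name, chosen for brevity rather than quality. Both functions ignore the revision member, so two tags with the same name land on the same associative array entry.

```d
import std.stdio : writeln;

// Hypothetical key type: only the name takes part in equality and hashing.
struct Tag {
  string name;
  int revision;

  bool opEquals(const Tag rhs) const @safe pure nothrow {
    return name == rhs.name;
  }

  size_t toHash() const @safe pure nothrow {
    size_t h = 0;
    foreach(char c; name)
      h = h * 31 + c;
    return h;
  }
}

void main() {
  int[Tag] counts;
  counts[Tag("alpha", 1)] = 1;
  counts[Tag("alpha", 2)] += 1;  // same key as far as the array is concerned
  writeln(counts.length);        // 1
}
```

If toHash also mixed in revision while opEquals ignored it, the two keys could hash to different buckets and the second assignment would fail to find the existing entry.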
