CHAPTER 18


Extensible Visitor Pattern Case Study

Playing games is one possible road that leads to the learning of API design skills. Another is performing a different kind of mental exercise: finding an interesting problem, analyzing it, finding potential solutions, and deciding which is better and why. While doing this kind of exercise, you can often find yourself discovering surprising results that deepen your insight into the problem.

Let’s go through one interesting example here. It started when I first heard about the JDK 1.6 proposal for a new API to model Java sources. An initial version of the API was included in JDK 1.5 in the apt tool, which could postprocess annotations attached to classes, methods, and fields. It had to provide a model of the sources, which was more or less a reflection-like API, but focused on sources. The apt tool kept its API in the com.sun.mirror packages, which were not fully official. However, for JDK 1.6, the compiler team started a Java Specification Request (JSR) to define the official API, placed in the javax.lang.model package. At that time, I found out about it and about its peer API, which was about to define the model for the internals of method bodies.

What raised my interest in this topic was my awareness that the proposal for the evolution of the API claimed to be binary compatible, but source incompatible. Indeed, keeping source compatibility in Java is almost impossible: nearly any change in the API can break compilation of existing sources, as discussed in the section “Source Compatibility” in Chapter 4. However, this was a much bigger threat to compatibility: the plan was to add new methods into interfaces. This is binary compatible according to the virtual machine specification. However, for practical purposes it’s not compatible at all. The system will link after you add a method to an interface, but it will throw runtime errors when someone calls a method on an interface that is not implemented. Needless to say, I don’t like this kind of compatibility at all.

That is when I started my mental exercise to find a better solution. The compiler developers showed me a paper proposing a solution that would keep all kinds of compatibility based on generics; see “The Expression Problem Revisited.”1 However, they didn’t want to use it, and looking at it, I understood why. The solution is heavily based on generics and difficult for normal developers to understand. You would need to study type theory at university for several years to understand what it’s all about. This is an example of having a too-correct API: one that you don’t want to use because it might scare your users. No wonder the compiler team preferred the solution involving adding methods into interfaces. For practical reasons it’s easier, but as a mental exercise it’s insufficient. So, let’s look at other ways of solving the problem.

Although the javax.lang.model package initiated the problem, it’s not restricted to that domain. In fact, it appears whenever you use a visitor in an API. Evolving a visitor is difficult. The methods in the visitor interface are defined once. However, the data structures it operates on might need to evolve. The question is what to do in this situation. Adding methods to the visitor interface is kind of a solution. However, it’s a bit “bulldozer-like”: not something a rational mind would be proud of. Meanwhile, mental exercises exist to tease the mind and stimulate it to search for beautiful, truthful, and elegant solutions!

The visitor pattern is well known. It separates operations, which are usually on hierarchical data structures, from the representation of data. It allows one team to provide the definition of data structures and a way to traverse them, while other teams can construct the data. Still other teams write the operations to be executed on them. The pattern is simple and easy to use. On the other hand, it contains hidden issues related to the evolution of the data interfaces, as well as the traversal interfaces.

First let’s start with a simple example, slowly dig into its evolution problems, and change the solution to address them. In the end, a simple and elegant solution will address all the issues. However, the destination is not our only target, because the journey will be interesting as well. It will illustrate our chain of reasoning. It will also show valuable ideas that can be sufficient if you are not very strict and don’t need a “truly” extensible visitor, but just something that is good enough.

The first step in thinking about the problem is to create an example case that motivates you to solve the problem. Visitors can be used in many situations, but the most useful situations relate to traversing hierarchies of heterogeneous data structures. One of the most profound hierarchies is the one representing a tree of expressions, such as (1 + 2) * 3 - 8. Let’s imagine that you want to represent a model for this simple expression language:

public abstract class Expression {
   Expression() {}
   public abstract void visit(Visitor v);
}
public final class Plus extends Expression {
   private final Expression first;
   private final Expression second;

   public Plus(Expression first, Expression second) {
       this.first = first;
       this.second = second;
   }
   public Expression getFirst() { return first; }
   public Expression getSecond() { return second; }
   @Override
   public void visit(Visitor v) { v.visitPlus(this); }
}
public final class Number extends Expression {
   private final int value;
   public Number(int value) { this.value = value; }
   public int getValue() { return value; }
   @Override
   public void visit(Visitor v) { v.visitNumber(this); }
}

public interface Visitor {
   public void visitPlus(Plus s);
   public void visitNumber(Number n);
}

This gives you a model that can only represent numbers and the “plus” operation. However, it’s powerful enough to create a structure for expressions such as 1 + 1 or 1 + 2 + 3. It also lets you easily write operations on such models. For example, you can print any expression by writing the following:

public class PrintVisitor implements Visitor {
   StringBuffer sb = new StringBuffer();

   public void visitPlus(Plus s) {
       s.getFirst().visit(this);
       sb.append(" + ");
       s.getSecond().visit(this);
   }

   public void visitNumber(Number n) {
       sb.append (n.getValue());
   }
}

@Test public void printOnePlusOne() {
   Number one = new Number(1);
   Expression plus = new Plus(one, one);

   PrintVisitor print = new PrintVisitor();
   plus.visit(print);

   assertEquals("1 + 1", print.sb.toString());
}

These are all well-known basics from any programming book. However, let’s simulate some problems now. Let’s evolve the language by improving it to support yet another construct. Imagine you want to add -. It’s natural to then introduce a new class:

/** @since 2.0 */
public final class Minus extends Expression {
   private final Expression first;
   private final Expression second;

   public Minus(Expression first, Expression second) {
       this.first = first;
       this.second = second;
   }
   public Expression getFirst() { return first; }
   public Expression getSecond() { return second; }
   public void visit(Visitor v) {
       /* now what? add new method to an interface!? */
       v.visitMinus(this);
   }
}

However, simply adding the data class is not enough. You also need to enhance the Visitor with a new visitMinus method. However, this is easier said than done because adding the method into an interface is not compatible evolution. It might not fail during linkage, although the old code is no longer compilable against the new version of the Visitor interface. The new version adds a new unimplemented method. Also, execution will fail with java.lang.AbstractMethodError as soon as someone tries to feed the old visitor with new data structures:

Number one = new Number(1);
Number two = new Number(2);
Expression minus = new Minus(one, two);

PrintVisitor print = new PrintVisitor();
minus.visit(print); // fails with AbstractMethodError

assertEquals("1 - 2", print.sb.toString());

It appears that an inevitable evolution problem is hidden in this popular design pattern. It might work fine for the design of in-house systems, but as soon as it appears in a publicly available API, it almost immediately renders itself nonextensible. That is why this pattern is not suitable for designing a universe, at least not in its current form. You need to find a clever reincarnation of a visitor that will be appropriate for the API design pattern constraints. The reincarnation will be discussed as part of the mental exercise that you’ll perform in the following sections.

Abstract Class

First, a simple and more or less workable solution is to turn the Visitor into an abstract class. Indeed, abstract classes are said to be useless in the section “Are Abstract Classes Useful?” in Chapter 6. However, if you need a trivial form of evolution, you cannot stop thinking about them. They are the most instantly available solutions in Java if you need subclassability and yet a relatively low risk of having new methods added. Instead of using an interface in the first version of the API, if you decide to use an abstract class with all methods abstract, you have the following opportunity to enhance the visitor in version 2.0, after introducing the Minus class:

/** @since 2.0 */
public final class Minus extends Expression {
   private final Expression first;
   private final Expression second;

   public Minus(Expression first, Expression second) {
       this.first = first;
       this.second = second;
   }
   public Expression getFirst() { return first; }
   public Expression getSecond() { return second; }
   public void visit(Visitor v) {
       v.visitMinus(this);
   }
}

public abstract class Visitor {
   public abstract void visitPlus(Plus s);
   public abstract void visitNumber(Number n);
   /** @since 2.0 */
   public void visitMinus(Minus s) {
       throw new IllegalStateException(
           "Old visitor used on new expressions"
       );
   }
}

The Minus.visit(Visitor) calls the visitMinus method. The exception thrown in its default body is useful for developers who have written their own visitor in version 1.0 and want to use it with 2.0 data structures. As they have not overridden the method because they didn’t know it was about to be added, it’s relatively natural to get an exception, as this is an unexpected situation for the visitor. It’s better to terminate the execution than to silently continue without any warning. Otherwise it would be hard to find potential behavior problems.

You might argue that adding a method into a class is not fully compatible. Existing sub-classes might already have the method defined. The method might not be public—or even if it is, it can do something completely different from what it’s supposed to do from version 2.0 onwards. However, with visitors, this is not the case. Along with adding a new method, you also add a new class. Because the bytecode of the virtual machine encodes the parameter types into the method identification, the method that would match the newly defined method could not exist, even if it had the same name.
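The point about method identification can be checked with a small sketch. It assumes a hypothetical subclass, LegacyVisitor, written against version 1.0, that happened to declare its own method named visitMinus with a different parameter type. The class names OverloadDemo and LegacyVisitor are illustrative only, and the model is trimmed to the single Minus element:

```java
// Sketch: a pre-existing method named visitMinus in a subclass does not
// collide with the visitMinus(Minus) added in 2.0, because the parameter
// type is part of the method's identity. LegacyVisitor is hypothetical.
class OverloadDemo {
    static abstract class Expression {
        abstract void visit(Visitor v);
    }

    /** @since 2.0 */
    static final class Minus extends Expression {
        @Override
        void visit(Visitor v) { v.visitMinus(this); }
    }

    static abstract class Visitor {
        /** @since 2.0 */
        public void visitMinus(Minus m) {
            throw new IllegalStateException("Old visitor used on new expressions");
        }
    }

    // Written before Minus existed: same name, different parameter type.
    // This is an overload, not an override of visitMinus(Minus).
    static class LegacyVisitor extends Visitor {
        boolean legacyCalled;
        public void visitMinus(String text) { legacyCalled = true; }
    }

    /** Returns true when the 2.0 default body ran and the legacy method did not. */
    static boolean defaultBodyRuns() {
        LegacyVisitor legacy = new LegacyVisitor();
        try {
            new Minus().visit(legacy);
            return false; // not reached: the default body throws
        } catch (IllegalStateException expected) {
            return !legacy.legacyCalled;
        }
    }
}
```

The call inside Minus.visit is bound to visitMinus(Minus), so the unrelated visitMinus(String) is never selected.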

This new “abstract class”–based version of the visitor only minimally differs from the classical visitor pattern with an interface. It promises to solve most of the evolution problems. It guarantees enough source compatibility. It’s binary compatible. Of course, it throws exceptions when mixing old visitors with new language elements. However, that is just a small drawback. We’ve made significant progress; the visitor pattern is again in the game for helping us design our API universe.

Preparing for Evolution

There might be cases when throwing an exception is not appropriate default behavior. For example, when you want to verify whether the tree is valid for language version 1.0, you have to do the following:

private class Valid10Language extends Visitor/*version1.0*/ {
   public void visitPlus(Plus s) {
       s.getFirst().visit(this);
       s.getSecond().visit(this);
   }
   public void visitNumber(Number n) {
   }
}

public static boolean isValid10Language(Expression expression) {
   Valid10Language valid = new Valid10Language();
   try {
       expression.visit(valid);
       return true; // yes, no unknown elements
   } catch (IllegalStateException ex) {
       return false; // no, probably from visitMinus of Visitor/*2.0*/
   }
}

Writing this code is possible, but unusual. First of all, it uses exceptions for regular execution control and not for exceptional states. The second problem is that you have to know, at the time of writing, when only Visitor version 1.0 is available, that all future methods are going to throw an exception. This can indeed be mentioned in documentation, but only if the author of the Visitor API is aware of its evolution problems. However, when the author is aware of them, there is a better approach. Moreover, there is also a third problem. You need to recognize that this exception is the right one, signaling a missing handler for the new element in the language. This might be difficult, as similar exceptions can also be thrown from other places. In short, this API is ugly, at least for this task.

Because you already know that the first version of an API is never perfect, it’s better to prepare for evolution right from the start. You should look for a solution that will allow developers to handle unknown elements themselves. They can choose to throw an exception or they can do something more appropriate, as in the case of language validation. That is why it’s much better if the first version of a Visitor immediately defines a fallback method:

public abstract class Visitor/*1.0*/ {
   public void visitUnknown(Expression exp) {
       throw new IllegalStateException("Unknown element faced: " + exp);
   }
   public abstract void visitPlus(Plus s);
   public abstract void visitNumber(Number n);
}

The visitUnknown method isn’t called in the first version of the interface. However, it’s there as an easily discoverable warning that there will be evolution. Then it will be the default fallback for all newly added expression constructs:

public abstract class Visitor/*2.0*/ {
   public void visitUnknown(Expression exp) {
       throw new IllegalStateException("Unknown element faced: " + exp);
   }
   public abstract void visitPlus(Plus s);
   public abstract void visitNumber(Number n);
   /** @since 2.0 */
   public void visitMinus(Minus s) {
       visitUnknown(s);
   }
}

Although the change is minimal and the default behavior remains the same—when an unknown element is faced, IllegalStateException is raised—this version is more polite to the API users who want to validate the expression tree. In this version you need no exception acrobatics:

private class Valid10Language extends Visitor/*version1.0*/ {
   boolean invalid;

   @Override
   public void visitUnknown(Expression exp) {
       invalid = true;
   }
   public void visitPlus(Plus s) {
       s.getFirst().visit(this);
       s.getSecond().visit(this);
   }
   public void visitNumber(Number n) {
   }
}

public static boolean isValid10Language(Expression expression) {
   Valid10Language valid = new Valid10Language();
   expression.visit(valid);
   return !valid.invalid;
}

By thinking in advance about evolution, you’ve allowed writers of the visitors to use the API in a much cleaner way, as the handling of all the unknown elements has been redirected into the visitUnknown method. This is where common handling can now be provided. This has greatly increased the capabilities of the API in this particular use case, while it has not negatively influenced any other existing usages. Only two abstract methods still need to be implemented, as before, and code that worked in the previous section continues to work without modifications.
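To see the pieces working together, here is a self-contained sketch that runs a validator written against version 1.0 on trees built with the 2.0 model. It only reassembles the listings from this section. The wrapper class ValidationDemo and the two helper methods at the end are assumptions added for illustration:

```java
// Sketch: a validator written against Visitor 1.0 meets a 2.0 tree.
class ValidationDemo {
    static abstract class Expression {
        public abstract void visit(Visitor v);
    }
    static final class Number extends Expression {
        private final int value;
        Number(int value) { this.value = value; }
        int getValue() { return value; }
        @Override public void visit(Visitor v) { v.visitNumber(this); }
    }
    static final class Plus extends Expression {
        private final Expression first, second;
        Plus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) { v.visitPlus(this); }
    }
    /** @since 2.0 */
    static final class Minus extends Expression {
        private final Expression first, second;
        Minus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) { v.visitMinus(this); }
    }

    static abstract class Visitor /* 2.0 */ {
        public void visitUnknown(Expression exp) {
            throw new IllegalStateException("Unknown element faced: " + exp);
        }
        public abstract void visitPlus(Plus s);
        public abstract void visitNumber(Number n);
        /** @since 2.0 */
        public void visitMinus(Minus s) { visitUnknown(s); }
    }

    // Written against 1.0: it never mentions Minus.
    static class Valid10Language extends Visitor {
        boolean invalid;
        @Override public void visitUnknown(Expression exp) { invalid = true; }
        @Override public void visitPlus(Plus s) {
            s.getFirst().visit(this);
            s.getSecond().visit(this);
        }
        @Override public void visitNumber(Number n) { }
    }

    static boolean isValid10Language(Expression expression) {
        Valid10Language valid = new Valid10Language();
        expression.visit(valid);
        return !valid.invalid;
    }

    // Hypothetical helpers: 1 + 1 uses only 1.0 elements, 1 + (3 - 4) does not.
    static boolean onePlusOneIsValid() {
        return isValid10Language(new Plus(new Number(1), new Number(1)));
    }
    static boolean treeWithMinusIsValid() {
        return isValid10Language(
            new Plus(new Number(1), new Minus(new Number(3), new Number(4))));
    }
}
```

The validator needs no changes between versions: the Minus node reaches visitUnknown through the default body of visitMinus, and no exception is involved.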

Default Traversal

Remember that API users are always trying to stretch the design of your API to its limit, and that they always want to do more than you thought. It’s fine to tell them that these are not valid use cases and that the API was not intended for their corner cases at all. However, the more use cases you support without overcomplicating the API, the more users of your API will be satisfied. The more satisfied users are, the more useful your API will be. That is why it makes sense to ask what is beyond the horizon of the current API and to determine what is genuinely impossible. A simple problem that the visitUnknown method doesn’t support is the traversal of the whole structure of an expression while paying attention to selected nodes only. Admittedly, this is not a pure visitor pattern; it’s more of a scanner. However, this little terminology issue won’t prevent you from trying to find an optimal API to solve such a problem. Such a scanner can easily be written for a static language. However, if the language evolves and the set of elements grows, you’ll have a problem.

The obvious solution via visitors looks like the following:

private class CountNumbers extends Visitor/*version1.0*/ {
   int cnt;

   @Override
   public void visitUnknown(Expression exp) {
       // not a number
   }
   public void visitPlus(Plus s) {
       s.getFirst().visit(this);
       s.getSecond().visit(this);
   }
   public void visitNumber(Number n) {
       cnt++;
   }
}

public static int countNumbers(Expression expression) {
   CountNumbers counter = new CountNumbers();
   expression.visit(counter);
   return counter.cnt;
}

That solution might work well with language version 1.0, but as soon as you get language constructs containing Minus, everything goes wrong. You could override the visitUnknown method, but the question would be what to do in it? Probably nothing, but that is wrong, as the subtrees under Minus would be completely ignored and an expression such as 1 + (3 - 4) would be said to contain just one number:

Number one = new Number(1);
Number three = new Number(3);
Number four = new Number(4);
Expression minus = new Plus(one, new Minus(three, four));

assertEquals(
    "Should have three numbers, but visitor does not " +
    "know how to go through minus",
    3, CountNumbersTest.countNumbers(minus)
);

It’s fine to know that you’ve reached an unknown element, but for a visitor searching for specific elements (as CountNumbers does), you would like to have the ability to do a “default visit.” That would, in the case of Minus, only invoke getFirst().visit(this) and getSecond().visit(this). If you could use this “default visit” in visitUnknown, the count of numbers in 1 + (3 - 4) would correctly be 3, even if you used the old visitor written for the expression language version 1.0. The only question is how to invoke the “default visit.”

A naive idea would be to put this as the default implementation of every Visitor method. You can do that because Visitor is now a class. However, this won’t work, because you could no longer write Valid10Language as discussed previously. There would be no place or situation where the visitUnknown method would be called. All the methods would have the default behavior, which is not what you want. You want to give users of the API a proper tool for various situations. They should be notified that there is an unexpected element in the expression tree. Only then should they decide what to do with it: whether to report it as an error, use it as an indication that their code should return false, or crawl through it and see what’s underneath it. All these possibilities are valid use cases and should be supported, preferably in a way that allows the users of the API to make conscious decisions at the time of writing their code, decisions that remain valid even if the API is enhanced with new elements and functionality. A possible enhancement is to allow the visitor implementation to decide whether unknown elements should be traversed or not by a return value of the visitUnknown method:

public abstract class Visitor/*1.0*/ {
    public boolean visitUnknown(Expression e) {
        throw new IllegalStateException("Unknown element faced: " + e);
    }
    public void visitPlus(Plus s) {
        if (visitUnknown(s)) {
            s.getFirst().visit(this);
            s.getSecond().visit(this);
        }
    }
    public void visitNumber(Number n) {
        visitUnknown(n);
    }
}

This way you get the default behavior for unknown elements, but can also decide to do nothing or even perform a “deep” default visit. With an API like this, you can perform every task I’ve asked for so far, even counting the numbers in the expression tree:

private class CountNumbers extends Visitor/*version1.0*/ {
   int cnt;

   @Override
   public boolean visitUnknown(Expression exp) {
       return true;
   }
   @Override
   public void visitNumber(Number n) {
       cnt++;
   }
}

public static int countNumbers(Expression expression) {
   CountNumbers counter = new CountNumbers();
   expression.visit(counter);
   return counter.cnt;
}

This code not only works with one version of the API, but is ready to work forever. If the tree grows with new elements, the return true provided from the default method guarantees that all the elements will be processed. The actual traversal is left up to the provider of the API. The provider is also in control of all the API elements, which guarantees the consistency and proper behavior of this task.
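The text does not show what version 2.0 of this Visitor looks like. The following sketch assumes that visitMinus is added in the same style as visitPlus: it asks visitUnknown and traverses the children when the answer is true. Under that assumption, the CountNumbers visitor written for 1.0 counts three numbers in 1 + (3 - 4). The wrapper class DefaultTraversalDemo and the helper countInOnePlusThreeMinusFour are illustrative names:

```java
// Sketch: CountNumbers from version 1.0 running on a tree that contains Minus.
// The shape of visitMinus in Visitor 2.0 is an assumption, modeled on visitPlus.
class DefaultTraversalDemo {
    static abstract class Expression {
        public abstract void visit(Visitor v);
    }
    static final class Number extends Expression {
        private final int value;
        Number(int value) { this.value = value; }
        int getValue() { return value; }
        @Override public void visit(Visitor v) { v.visitNumber(this); }
    }
    static final class Plus extends Expression {
        private final Expression first, second;
        Plus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) { v.visitPlus(this); }
    }
    /** @since 2.0 */
    static final class Minus extends Expression {
        private final Expression first, second;
        Minus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) { v.visitMinus(this); }
    }

    static abstract class Visitor /* 2.0 */ {
        public boolean visitUnknown(Expression e) {
            throw new IllegalStateException("Unknown element faced: " + e);
        }
        public void visitPlus(Plus s) {
            if (visitUnknown(s)) {
                s.getFirst().visit(this);
                s.getSecond().visit(this);
            }
        }
        public void visitNumber(Number n) {
            visitUnknown(n);
        }
        /** @since 2.0 (assumed default body) */
        public void visitMinus(Minus s) {
            if (visitUnknown(s)) {
                s.getFirst().visit(this);
                s.getSecond().visit(this);
            }
        }
    }

    // Unchanged since 1.0: it knows nothing about Minus.
    static class CountNumbers extends Visitor {
        int cnt;
        @Override public boolean visitUnknown(Expression exp) { return true; }
        @Override public void visitNumber(Number n) { cnt++; }
    }

    static int countNumbers(Expression expression) {
        CountNumbers counter = new CountNumbers();
        expression.visit(counter);
        return counter.cnt;
    }

    // Hypothetical helper: counts the numbers in 1 + (3 - 4).
    static int countInOnePlusThreeMinusFour() {
        return countNumbers(
            new Plus(new Number(1), new Minus(new Number(3), new Number(4))));
    }
}
```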

Clean Definition of a Version

So far I’ve discussed what can happen when the Visitor is turned into an abstract class. Though everything has worked, notice that our original simple abstract class changed from having just a few abstract methods into a combination of methods providing relatively complicated default behavior. The set of methods will grow with each new language version. Each new method will carry an @since description. It will be increasingly difficult to accept only language 7.0 and ignore expression elements from any other version of the language. Yet this seems like a reasonable API requirement. If I were to write a tool to process elements of a language, I would want to have a view of the structure of the elements as created for that particular version. Instead of subclassing one large class with visit methods accumulated over multiple releases and searching its Javadoc to identify which methods have been added for a particular version, I would want a clean interface that has only the methods I am interested in. This way, the compiler would confirm at compile time that I’ve implemented all the necessary methods. This is a much easier, and in fact a more clueless, approach than overriding some methods and running the program, only to discover that some other method needs to be implemented as well, overriding it, and so on indefinitely. From this point of view, it would be much better if you had a separate interface for each version of the language. So you would have the following:

public interface Visitor {
   public void visitUnknown(Expression e);
}
public interface Visitor10 extends Visitor {
   public void visitPlus(Plus s);
   public void visitNumber(Number n);
}
/** @since 2.0 */
public interface Visitor20 extends Visitor10 {
   public void visitMinus(Minus s);
}

The first version of the API includes one visitor only, while the second version needs to add a new element, as well as a new type of visitor. The subsequent changes to the API would increasingly add more. As a result, there would be many other interfaces, each for the traversal over expressions supported by the actual version of the language. However, this has a clear advantage: when you decide to implement visiting for version 7.0, you only implement Visitor70. That’s just like when you want to print elements of language version 2.0: you can implement the right interface and it’s guaranteed that all necessary visit methods for version 2.0 are provided:

class PrintVisitor20 implements Visitor20 {
    StringBuffer sb = new StringBuffer();

    public void visitUnknown(Expression exp) {
        sb.append("unknown");
    }

    public void visitPlus(Plus s) {
        s.getFirst().visit(this);
        sb.append(" + ");
        s.getSecond().visit(this);
    }

    public void visitNumber(Number n) {
        sb.append (n.getValue());
    }
    public void visitMinus(Minus m) {
        m.getFirst().visit(this);
        sb.append(" - ");
        m.getSecond().visit(this);
    }
}

Number one = new Number(1);
Number two = new Number(2);
Expression minus = new Minus(one, two);

PrintVisitor20 print = new PrintVisitor20();
minus.visit(print);

assertEquals("1 - 2", print.sb.toString());

The API is clearer, which is good. However, you are forced to complicate the implementation slightly. Now each expression element needs to contain the logic for correct dispatch to the correct visit method:

public void visit(Visitor v) {
    if (v instanceof Visitor20) {
        ((Visitor20)v).visitMinus(this);
    } else {
        v.visitUnknown(this);
    }
}

You’ve had to use runtime inspection to determine what you need to call. Again, you are giving up on compiler checks. You won’t determine whether you’ve made a mistake until the appropriate code is run. Of course, this code is part of the implementation of the language structure, which in this example is written only once, by you. That is why it’s possible to selectively pay enough attention to it to make it correct. Still, these ugly and dangerous pieces of code will be spread throughout many classes in their visit methods. As a result, it becomes difficult to keep track of where you are in your code. Yet another problem is that the length of the code can grow with each new language version, especially if you allow nonmonotonic evolution.
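The following sketch shows how the runtime checks behave when an old visitor meets a new element. The text shows only the visit method of Minus. The dispatch code in Plus and Number, the Print10 visitor, and the wrapper class VersionedVisitorDemo are assumptions written in the same style:

```java
// Sketch: per-version visitor interfaces with instanceof dispatch.
// Only Minus.visit comes from the text; Plus.visit and Number.visit are
// assumed to follow the same pattern with Visitor10.
class VersionedVisitorDemo {
    interface Visitor { void visitUnknown(Expression e); }
    interface Visitor10 extends Visitor {
        void visitPlus(Plus s);
        void visitNumber(Number n);
    }
    /** @since 2.0 */
    interface Visitor20 extends Visitor10 { void visitMinus(Minus s); }

    static abstract class Expression {
        public abstract void visit(Visitor v);
    }
    static final class Number extends Expression {
        private final int value;
        Number(int value) { this.value = value; }
        int getValue() { return value; }
        @Override public void visit(Visitor v) {
            if (v instanceof Visitor10) {
                ((Visitor10) v).visitNumber(this);
            } else {
                v.visitUnknown(this);
            }
        }
    }
    static final class Plus extends Expression {
        private final Expression first, second;
        Plus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) {
            if (v instanceof Visitor10) {
                ((Visitor10) v).visitPlus(this);
            } else {
                v.visitUnknown(this);
            }
        }
    }
    /** @since 2.0 */
    static final class Minus extends Expression {
        private final Expression first, second;
        Minus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) {
            if (v instanceof Visitor20) {
                ((Visitor20) v).visitMinus(this);
            } else {
                v.visitUnknown(this);
            }
        }
    }

    // A printer that implements only the 1.0 interface.
    static class Print10 implements Visitor10 {
        final StringBuilder sb = new StringBuilder();
        @Override public void visitUnknown(Expression e) { sb.append("unknown"); }
        @Override public void visitPlus(Plus s) {
            s.getFirst().visit(this);
            sb.append(" + ");
            s.getSecond().visit(this);
        }
        @Override public void visitNumber(Number n) { sb.append(n.getValue()); }
    }

    // Hypothetical helper: prints 1 + (3 - 4) with the 1.0 printer.
    static String printWithOldVisitor() {
        Print10 print = new Print10();
        new Plus(new Number(1), new Minus(new Number(3), new Number(4))).visit(print);
        return print.sb.toString();
    }
}
```

The old printer produces "1 + unknown": the Minus node finds no Visitor20 and falls back to visitUnknown, exactly as the if/else chain prescribes.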

Nonmonotonic Evolution

Usually the evolution goes in just one direction: it’s monotonic. However, imagine a situation where you create a new version of your expression language that considers integers useless and treats all numbers as double. Of course, you could still support integers, but let’s suppose that your code assumes no visitor needs to use integers anymore. As a result, the Number class has no place in the data structures representing a model of version 3.0:

/** @since 3.0 */
public final class Real extends Expression {
   private final double value;
   public Real(double value) {
       this.value = value;
   }
   public double getValue() {
       return value;
   }
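   public void visit(Visitor v) {
       // Body reconstructed by analogy with the dispatch shown for Minus.
       if (v instanceof Visitor30) {
           ((Visitor30)v).visitReal(this);
       } else {
           v.visitUnknown(this);
       }
   }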
}

/** @since 3.0 */
public interface Visitor30 extends Visitor {
   public void visitPlus(Plus s);
   public void visitMinus(Minus s);
   public void visitReal(Real r);
}

Note that not only have you defined a new element in the model, but you’ve also introduced a new visitor. The visitor is special: it doesn’t extend any of the previous visitors defined for versions 1.0 and 2.0 of the language. Subclassing doesn’t make sense: Visitor30 doesn’t define visitNumber, because there are no integer numbers in version 3.0 of the expression language. Only real numbers are supported.

This works fine and satisfies our previous requirement to clearly define the language. As soon as you decide to accept the 3.0 version, you have to implement Visitor30. However, by doing this you’ve complicated the implementation of the visit methods even more. Now all the data elements need to perform runtime checks for yet another visitor type. For example, the already complicated implementation in Minus would get even more complicated:

/** @since 2.0 */
public final class Minus/*3.0*/ extends Expression {
   private final Expression first;
   private final Expression second;

   public Minus(Expression first, Expression second) {
       this.first = first;
       this.second = second;
   }
   public Expression getFirst() { return first; }
   public Expression getSecond() { return second; }

   public void visit(Visitor v) {
       if (v instanceof Visitor20) {
           ((Visitor20)v).visitMinus(this);
       } else if (v instanceof Visitor30) {
           ((Visitor30)v).visitMinus(this);
       } else {
           v.visitUnknown(this);
       }
   }
}

The internal implementation of version 3.0 is certainly uglier than version 2.0. With every new nonmonotonic version, this is going to become even worse.
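The same growth affects elements that exist in every version. The sketch below assumes how Plus might dispatch in version 3.0, where it has to recognize both Visitor10 (and through it Visitor20) and the unrelated Visitor30. Minus and Visitor20 are left out for brevity, so this Visitor30 lacks visitMinus. The Eval30 visitor and the helper methods are hypothetical:

```java
// Sketch: dispatch in 3.0 for an element known to all versions.
// Minus and Visitor20 are omitted to keep the example short.
class NonmonotonicDemo {
    interface Visitor { void visitUnknown(Expression e); }
    interface Visitor10 extends Visitor {
        void visitPlus(Plus s);
        void visitNumber(Number n);
    }
    /** @since 3.0 */
    interface Visitor30 extends Visitor {
        void visitPlus(Plus s);
        void visitReal(Real r);
    }

    static abstract class Expression {
        public abstract void visit(Visitor v);
    }
    static final class Plus extends Expression {
        private final Expression first, second;
        Plus(Expression first, Expression second) { this.first = first; this.second = second; }
        Expression getFirst() { return first; }
        Expression getSecond() { return second; }
        @Override public void visit(Visitor v) {
            // Two unrelated visitor families now have to be checked.
            if (v instanceof Visitor10) {
                ((Visitor10) v).visitPlus(this);
            } else if (v instanceof Visitor30) {
                ((Visitor30) v).visitPlus(this);
            } else {
                v.visitUnknown(this);
            }
        }
    }
    static final class Number extends Expression {
        private final int value;
        Number(int value) { this.value = value; }
        int getValue() { return value; }
        @Override public void visit(Visitor v) {
            if (v instanceof Visitor10) {
                ((Visitor10) v).visitNumber(this);
            } else {
                v.visitUnknown(this); // Visitor30 has no visitNumber
            }
        }
    }
    /** @since 3.0 */
    static final class Real extends Expression {
        private final double value;
        Real(double value) { this.value = value; }
        double getValue() { return value; }
        @Override public void visit(Visitor v) {
            if (v instanceof Visitor30) {
                ((Visitor30) v).visitReal(this);
            } else {
                v.visitUnknown(this);
            }
        }
    }

    // Hypothetical 3.0 client: evaluates a tree of Plus and Real.
    static class Eval30 implements Visitor30 {
        double result;
        boolean sawUnknown;
        @Override public void visitUnknown(Expression e) { sawUnknown = true; }
        @Override public void visitPlus(Plus s) {
            s.getFirst().visit(this);
            double left = result;
            s.getSecond().visit(this);
            result = left + result;
        }
        @Override public void visitReal(Real r) { result = r.getValue(); }
    }

    static double evalRealSum() {
        Eval30 eval = new Eval30();
        new Plus(new Real(1.5), new Real(2.5)).visit(eval);
        return eval.result;
    }
    static boolean integerIsUnknownTo30() {
        Eval30 eval = new Eval30();
        new Plus(new Real(1.0), new Number(2)).visit(eval);
        return eval.sawUnknown;
    }
}
```

A 3.0 visitor sees the old Number element only through visitUnknown, which matches the intent that integers have no place in the 3.0 model.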

Data Structure Using Interfaces

Throughout the discussion of visitors, it has been assumed that the data structure is represented using classes that were, as far as possible, final. As a result, there would always be only one implementation of each element and the usage of runtime instanceof checks would be limited to one library.

However, sometimes it’s better not to use classes at all. The main reason for this is related to performance. Imagine that you create thousands or even millions of instances. Every byte counts and having one object instead of two can make a significant difference. For example, compilers and other language processors usually need to do more than create instances of model classes such as Number. They also need to keep additional associated information, such as the offset of the element in the text. To let the existing types be enhanced with new data without the overhead of delegation, you need to allow subclassing. This can also be useful if the implementor wants to provide an implementation of additional interfaces using multiple inheritance. If you need to optimize memory consumption and prevent the creation of additional model object instances, using interfaces for the model definition makes sense.
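The memory argument can be made concrete with a small sketch. All names here are hypothetical, and the visit method is left out to keep the focus on object counts. With delegation, each node costs two objects: the model element and a wrapper holding the extra data. When the model type is an interface, a single object can carry both:

```java
// Sketch: one object per node instead of two. All names are hypothetical,
// and the visit method is omitted for brevity.
class OffsetDemo {
    interface Expression { }
    interface Number extends Expression { int getValue(); }

    // Delegation: the model element is one object...
    static final class PlainNumber implements Number {
        private final int value;
        PlainNumber(int value) { this.value = value; }
        @Override public int getValue() { return value; }
    }
    // ...and the wrapper with the text offset is a second object.
    static final class NumberAtOffset {
        final Number number;
        final int offset;
        NumberAtOffset(Number number, int offset) {
            this.number = number;
            this.offset = offset;
        }
    }

    // Implementing the interface directly: one object holds value and offset.
    static final class OffsetNumber implements Number {
        private final int value;
        private final int offset;
        OffsetNumber(int value, int offset) {
            this.value = value;
            this.offset = offset;
        }
        @Override public int getValue() { return value; }
        int getOffset() { return offset; }
    }
}
```

An OffsetNumber can be passed anywhere a Number is expected, while the language processor that created it can still read the offset.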

However, as the model represents a client API, turning classes into something that developers can implement might seem to be against all the recommendations put forward in this chapter. Yes, if you use interfaces for the model, you must get them right the first time. After that, there is no way of adding additional methods without breaking all the existing implementations. Surprisingly, this doesn’t mean that there is no way to evolve the APIs. The visitor pattern makes this possible nicely.

First, you can create new model elements in new releases of the API, like you created Minus in version 2.0. Adding new classes into the API is fully binary compatible, and almost source compatible, so this is an absolutely correct evolution strategy. If the new classes extend the Expression base class, you can always pass them into at least visitUnknown(Expression). Then it’s simply a question of adding a new visit method into the visitor interface. You’ve already seen this is possible in both scenarios. With abstract classes, you can add new methods, while with interfaces you can define a new visitor interface for the new language version. Moreover, there is always the option of using nonmonotonic evolution. This approach can be useful when you need to recover from a wrong model interface containing elements that are no longer needed, exactly as you did with Real and Visitor30, which don’t mention Number at all.

It’s always better to get the interface right when releasing the first version. However, with visitors there seems to be a way to recover from possible design mistakes. Even using interfaces for representing the expression nodes seems to be an acceptable style. As evolution is allowed, this can be classified as a proper API design pattern.

Client and Provider Visitors

There is one rather ugly problem associated with the use of interfaces: you’ve seen how ugly the visit method can become after a few language revisions. This was said to be acceptable, as there is only one implementation and you can get it right. However, with interfaces, when there is no default implementation, the ugly code needs to be present in every implementation of each interface. Also, these implementations need to be kept in sync. With every new version of the API that defines a new visitor, all the implementations need to be updated to have proper checks for v instanceof VisitorXY. This is bad, as the likelihood that things will get out of sync is significant.

To solve this inconvenient situation you need an API improvement trick. This trick is essential for this chapter and is based on separating the client and provider APIs as outlined in Chapter 8. All the previous examples used the VisitorXY types for two different purposes. Some developers implement them to walk through the model data classes or interfaces. In other cases, one or more implementations of the model classes use them to dispatch correctly from inside their implementations of the visit methods. This dual role is the root cause of the evolution problems you’ve seen with the visitor pattern so far. Luckily the solution is simple: split the class in two. Here is the improved version of the expression language 1.0:

public interface Expression {
   public abstract void visit(Visitor v);
}
public interface Plus extends Expression {
   public Expression getFirst();
   public Expression getSecond();
}
public interface Number extends Expression {
   public int getValue();
}

public abstract class Visitor {
   Visitor() {}

   public static Visitor create(Version10 v) {
       return create10(v);
   }

   public interface Version10 {
       public boolean visitUnknown(Expression e);
       public void visitPlus(Plus s);
       public void visitNumber(Number n);
   }

   public abstract void dispatchPlus(Plus p);
   public abstract void dispatchNumber(Number n);
}

In this example, the model classes are represented by interfaces, but final classes would work as well if there is no need for performance optimizations. Expression defines the visit method that takes the “client visitor,” which maybe should be called “dispatcher” instead. This is the visitor that no external module implements but on which the correct dispatch methods can be called. To write a visitor, users of the API implement the Visitor.Version10 “provider visitor” interface and convert it to a Visitor using Visitor.create. They can then use the result for expression.visit(v) calls. As a result, it’s easy to accommodate evolution requirements. The evolution to language version 2.0 that adds the Minus expression would make the following additions:

/** @since 2.0 */
public interface Minus extends Expression {
    public Expression getFirst();
    public Expression getSecond();
}

public abstract class Visitor {
    Visitor() {}

    /** @since 2.0 */
    public static Visitor create(Version20 v) {
        return create20(v);
    }

    /** @since 2.0 */
    public interface Version20 extends Version10 {
        public void visitMinus(Minus m);
    }

    /** @since 2.0 */
    public abstract void dispatchMinus(Minus m);
}

All the other methods and classes would remain. A new “provider visitor” interface would extend the old one. A new factory method would convert it to a “client visitor,” which gains a new dispatch method that implementations of Minus.visit(Visitor) can call to do their dispatching.
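The listings so far leave out how the model elements themselves call the dispatcher. The following is a minimal sketch of that missing piece, not the definitive implementation. It assumes everything is nested in one hypothetical class, ExprSketch20, so that it fits in a single file. The factory helpers num, plus, and minus and the render method are invented for illustration, and Version10 is folded into Version20 for brevity.

```java
// Sketch only: ExprSketch20 and the helpers num/plus/minus/render are hypothetical names.
class ExprSketch20 {
    interface Expression { void visit(Visitor v); }
    interface Plus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Minus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Number extends Expression { int getValue(); }

    /** Client visitor (dispatcher): instances are created only by the API. */
    static abstract class Visitor {
        Visitor() {}

        /** Provider visitor implemented by API users (1.0 methods merged in for brevity). */
        interface Version20 {
            boolean visitUnknown(Expression e);
            void visitPlus(Plus p);
            void visitNumber(Number n);
            void visitMinus(Minus m);
        }

        static Visitor create(final Version20 v) {
            return new Visitor() {
                @Override public void dispatchPlus(Plus p) { v.visitPlus(p); }
                @Override public void dispatchNumber(Number n) { v.visitNumber(n); }
                @Override public void dispatchMinus(Minus m) { v.visitMinus(m); }
            };
        }

        public abstract void dispatchPlus(Plus p);
        public abstract void dispatchNumber(Number n);
        public abstract void dispatchMinus(Minus m);
    }

    // Each element implementation calls exactly one dispatch method on the client visitor.
    static Number num(final int value) {
        return new Number() {
            public int getValue() { return value; }
            public void visit(Visitor v) { v.dispatchNumber(this); }
        };
    }

    static Plus plus(final Expression a, final Expression b) {
        return new Plus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchPlus(this); }
        };
    }

    static Minus minus(final Expression a, final Expression b) {
        return new Minus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchMinus(this); }
        };
    }

    /** A provider visitor that prints the expression in infix form. */
    static String render(Expression e) {
        final StringBuilder sb = new StringBuilder();
        class Print implements Visitor.Version20 {
            final Visitor dispatch = Visitor.create(this);
            public void visitPlus(Plus p) {
                p.getFirst().visit(dispatch); sb.append(" + "); p.getSecond().visit(dispatch);
            }
            public void visitMinus(Minus m) {
                m.getFirst().visit(dispatch); sb.append(" - "); m.getSecond().visit(dispatch);
            }
            public void visitNumber(Number n) { sb.append(n.getValue()); }
            public boolean visitUnknown(Expression x) { sb.append("unknown"); return true; }
        }
        e.visit(new Print().dispatch);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(minus(plus(num(1), num(2)), num(3)))); // prints 1 + 2 - 3
    }
}
```

Because the elements only ever see the abstract Visitor, adding a dispatch method in a later release doesn’t force anyone who implemented Version20 to change their code.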

Similar changes would happen to the 3.0 version that replaces integers with reals. However, the Version30 interface would not extend anything that exists, because 3.0 is a nonmonotonic change in the model and you don’t need all the previous methods, just a few of them. See the following:

/** @since 3.0 */
public interface Real extends Expression {
    public double getValue();
}

public abstract class Visitor {
    Visitor() {}

    /** @since 3.0 */
    public static Visitor create(Version30 v) {
        return create30(v);
    }

    /** @since 3.0 */
    public interface Version30 {
        public boolean visitUnknown(Expression e);
        public void visitPlus(Plus s);
        public void visitMinus(Minus s);
        public void visitReal(Real r);
    }

    /** @since 3.0 */
    public abstract void dispatchReal(Real r);
}

Triple Dispatch

The visitor pattern is often called “double dispatch” because the actual method in the visitor that is called when expression.visit(visitor) is invoked depends on the dispatch of the call to the actual expression subtype, as well as the dispatch to the visitor. The “client and provider visitor” pattern described here could also be called “triple dispatch” because the actual method that gets called in the provider visitor is a function of the expression, the language version, and the implementation of the visitor. Let’s look at how the Visitor.create methods are implemented for version 3.0. Here is the method for visitors written against version 1.0:

static Visitor create10(final Visitor.Version10 v) {
    return new Visitor() {
        @Override
        public void dispatchPlus(Plus p) {
            v.visitPlus(p);
        }

        @Override
        public void dispatchNumber(Number n) {
            v.visitNumber(n);
        }

        @Override
        public void dispatchMinus(Minus m) {
            if (v.visitUnknown(m)) {
                m.getFirst().visit(this);
                m.getSecond().visit(this);
            }
        }

        @Override
        public void dispatchReal(Real r) {
            v.visitUnknown(r);
        }
    };
}

Only Plus and Number elements were present in language 1.0, so only these are dispatched directly, while all other elements are routed to the visitUnknown method. Moreover, the handling of the Minus element checks the returned value, and if true, performs a “deep” visit to support the CountNumbers visitor that I discussed earlier. The handling of visitors written for the 2.0 language model is a bit simpler, because there is no need to provide a fallback for the Minus handling code:

static Visitor create20(final Visitor.Version20 v) {
    return new Visitor() {
        @Override
        public void dispatchPlus(Plus p) {
            v.visitPlus(p);
        }

        @Override
        public void dispatchNumber(Number n) {
            v.visitNumber(n);
        }

        @Override
        public void dispatchMinus(Minus m) {
            v.visitMinus(m);
        }

        @Override
        public void dispatchReal(Real r) {
            v.visitUnknown(r);
        }
    };
}
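To see the visitUnknown fallback of create10 in action, consider a counting visitor written purely against version 1.0 and run over a model that contains a Minus element. This is a minimal sketch under stated assumptions: the enclosing class DeepVisitSketch, the factory helpers num, plus, and minus, the countNumbers method, and the deep flag are all hypothetical, and the CountNumbers class here only approximates the visitor of that name discussed earlier.

```java
// Sketch only: DeepVisitSketch, num, plus, minus and countNumbers are hypothetical names.
class DeepVisitSketch {
    interface Expression { void visit(Visitor v); }
    interface Plus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Minus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Number extends Expression { int getValue(); }

    static abstract class Visitor {
        Visitor() {}

        /** The 1.0 provider interface: it knows nothing about Minus. */
        interface Version10 {
            boolean visitUnknown(Expression e);
            void visitPlus(Plus p);
            void visitNumber(Number n);
        }

        static Visitor create(final Version10 v) {
            return new Visitor() {
                @Override public void dispatchPlus(Plus p) { v.visitPlus(p); }
                @Override public void dispatchNumber(Number n) { v.visitNumber(n); }
                @Override public void dispatchMinus(Minus m) {
                    // Minus is unknown to a 1.0 visitor; descend only if it asks for that.
                    if (v.visitUnknown(m)) {
                        m.getFirst().visit(this);
                        m.getSecond().visit(this);
                    }
                }
            };
        }

        public abstract void dispatchPlus(Plus p);
        public abstract void dispatchNumber(Number n);
        public abstract void dispatchMinus(Minus m);
    }

    /** A counting visitor written against 1.0 only. */
    static class CountNumbers implements Visitor.Version10 {
        final Visitor dispatch = Visitor.create(this);
        final boolean deep;   // hypothetical switch: what visitUnknown returns
        int numbers;

        CountNumbers(boolean deep) { this.deep = deep; }

        public void visitPlus(Plus p) {
            p.getFirst().visit(dispatch);
            p.getSecond().visit(dispatch);
        }
        public void visitNumber(Number n) { numbers++; }
        public boolean visitUnknown(Expression e) { return deep; }
    }

    static int countNumbers(Expression e, boolean deep) {
        CountNumbers c = new CountNumbers(deep);
        e.visit(c.dispatch);
        return c.numbers;
    }

    static Number num(final int value) {
        return new Number() {
            public int getValue() { return value; }
            public void visit(Visitor v) { v.dispatchNumber(this); }
        };
    }

    static Plus plus(final Expression a, final Expression b) {
        return new Plus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchPlus(this); }
        };
    }

    static Minus minus(final Expression a, final Expression b) {
        return new Minus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchMinus(this); }
        };
    }

    public static void main(String[] args) {
        Expression e = plus(minus(num(1), num(2)), num(3)); // (1 - 2) + 3
        System.out.println(countNumbers(e, true));  // prints 3: Minus is traversed deeply
        System.out.println(countNumbers(e, false)); // prints 1: Minus is skipped
    }
}
```

The old visitor never mentions Minus, yet it still reaches the numbers nested inside it, because the traversal lives in the dispatcher that the API creates.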

The support for visitors of the 3.0 language model is more complicated, but still possible. The reason for this is that it’s reasonable to convert integers in old models to reals, which requires a bit of additional work:

static Visitor create30(final Visitor.Version30 v) {
    return new Visitor() {
        @Override
        public void dispatchReal(Real r) {
            v.visitReal(r);
        }

        @Override
        public void dispatchNumber(final Number n) {
            class RealWrapper implements Real {
                public double getValue() {
                    return n.getValue();
                }
                public void visit(Visitor v) {
                    n.visit(v);
                }
            }
            v.visitReal(new RealWrapper());
        }

        @Override
        public void dispatchPlus(Plus p) {
            v.visitPlus(p);
        }

        @Override
        public void dispatchMinus(Minus m) {
            v.visitMinus(m);
        }
    };
}

The new element in this solution is the RealWrapper class that decorates an object representing an integer and allows it to be seen as a real. In this way, a Version30 visitor can traverse older versions of models that provide integers.
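The effect of the wrapper is easiest to see with a 3.0 visitor that only understands reals, applied to a model that mixes old integer Number nodes with new Real nodes. The following is a minimal sketch, assuming a hypothetical enclosing class RealWrapperSketch, hypothetical factory helpers num, real, and plus, and an invented sum method. Minus is omitted to keep the example short, and the wrapper is written as an anonymous class instead of the named RealWrapper.

```java
// Sketch only: RealWrapperSketch, num, real, plus and sum are hypothetical names.
class RealWrapperSketch {
    interface Expression { void visit(Visitor v); }
    interface Plus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Number extends Expression { int getValue(); }
    interface Real extends Expression { double getValue(); }

    static abstract class Visitor {
        Visitor() {}

        /** The 3.0 provider interface: no visitNumber, only visitReal (Minus omitted). */
        interface Version30 {
            boolean visitUnknown(Expression e);
            void visitPlus(Plus p);
            void visitReal(Real r);
        }

        static Visitor create(final Version30 v) {
            return new Visitor() {
                @Override public void dispatchPlus(Plus p) { v.visitPlus(p); }
                @Override public void dispatchReal(Real r) { v.visitReal(r); }
                @Override public void dispatchNumber(final Number n) {
                    // Present the old integer node to the 3.0 visitor as a Real.
                    v.visitReal(new Real() {
                        public double getValue() { return n.getValue(); }
                        public void visit(Visitor d) { n.visit(d); }
                    });
                }
            };
        }

        public abstract void dispatchPlus(Plus p);
        public abstract void dispatchNumber(Number n);
        public abstract void dispatchReal(Real r);
    }

    static Number num(final int value) {
        return new Number() {
            public int getValue() { return value; }
            public void visit(Visitor v) { v.dispatchNumber(this); }
        };
    }

    static Real real(final double value) {
        return new Real() {
            public double getValue() { return value; }
            public void visit(Visitor v) { v.dispatchReal(this); }
        };
    }

    static Plus plus(final Expression a, final Expression b) {
        return new Plus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchPlus(this); }
        };
    }

    /** Adds up every leaf value using a visitor written against 3.0 only. */
    static double sum(Expression e) {
        final double[] total = {0};
        class Sum implements Visitor.Version30 {
            final Visitor dispatch = Visitor.create(this);
            public void visitPlus(Plus p) {
                p.getFirst().visit(dispatch);
                p.getSecond().visit(dispatch);
            }
            public void visitReal(Real r) { total[0] += r.getValue(); }
            public boolean visitUnknown(Expression x) { return true; }
        }
        e.visit(new Sum().dispatch);
        return total[0];
    }

    public static void main(String[] args) {
        System.out.println(sum(plus(num(1), real(2.5)))); // prints 3.5
    }
}
```

The Sum visitor never handles an integer node directly. The dispatcher created for Version30 performs the conversion, so the knowledge of old model versions stays inside the API.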

A Happy End for Visitors

The “client and provider visitor” pattern is a good answer to any questions or problems faced so far. It’s also the real “extensible visitor”:

  • It’s possible to add new elements into the model.
  • It supports the visitUnknown method.
  • It supports default deep traversal over unknown elements.
  • Language model versions are clearly separated as each has its own visitor interface, while it’s possible to freely mix models and visitor versions and iterate with any visitor over any model.
  • There is support for nonmonotonic evolution.
  • The solution is type safe. It doesn’t need to use any reflection or introspection.
  • Usage of interfaces for model classes is possible and doesn’t prevent type-safe evolution.

You couldn’t ask for much more! You’ve reached a happy end. All this came about just because you followed one important rule: You separated the client and provider interface into two parts as advised by Chapter 8. Old wisdom claims that “any problem in computer science can be solved with another layer of indirection.” This appears to be true in the API design world as well.

Syntactic Sugar

In spite of our happy end, more needs to be discussed. When rewriting the Print visitor example to the new style, you find that traditional code doesn’t compile in the visitPlus method:

public class PrintVisitor implements Visitor.Version10 {
    StringBuffer sb = new StringBuffer();

    final Visitor dispatch = Visitor.create(this);

    public void visitPlus(Plus s) {
        // s.getFirst().visit(this); // does not compile, we need:
        s.getFirst().visit(dispatch);
        sb.append(" + ");
        s.getSecond().visit(dispatch);
    }

    public void visitNumber(Number n) {
        sb.append(n.getValue());
    }

    public boolean visitUnknown(Expression e) {
        sb.append("unknown");
        return true;
    }
}

The Expression.visit method expects Visitor and the PrintVisitor class only implements the Version10 interface. To satisfy the type checker, you need to always create new “dispatchers,” or keep a reference to one such dispatcher, exactly as you did with the final Visitor dispatch variable. This variable is needed anyway to start the dispatch, so this change simply moves it into the visitor implementation, so that it’s accessible from the visit methods. However, this requires a double reference to get the Visitor object, as it’s an instance variable of the PrintVisitor class. Also, finding this trick might be difficult. On the other hand, a note in the Javadoc and a snippet of sample usage should solve these problems. However, this is a traditional fix without any syntactic sugar, without any new improvements to the visitor pattern.

The other solution enhances the interfaces so that the correct way of writing a visitor is easy to find. Each method in the VersionXY interface gets a new parameter, Visitor self:

public abstract class Visitor {
    Visitor() {}

    public static Visitor create(Version10 v) {
        return create10(v);
    }

    public interface Version10 {
        public boolean visitUnknown(Expression e, Visitor self);
        public void visitPlus(Plus s, Visitor self);
        public void visitNumber(Number n, Visitor self);
    }

    public abstract void dispatchPlus(Plus p);
    public abstract void dispatchNumber(Number n);
}

self is always the Visitor that was passed to the Expression.visit(Visitor) method and can be used instead of this in the old Print visitor example:

public class PrintVisitor implements Visitor.Version10 {
    StringBuffer sb = new StringBuffer();

    public void visitPlus(Plus s, Visitor self) {
        s.getFirst().visit(self);
        sb.append(" + ");
        s.getSecond().visit(self);
    }

    public void visitNumber(Number n, Visitor self) {
        sb.append(n.getValue());
    }

    public boolean visitUnknown(Expression e, Visitor self) {
        sb.append("unknown");
        return true;
    }
}

@Test public void printOnePlusOne() {
   Number one = newNumber(1);
   Expression plus = newPlus(one, one);

   PrintVisitor print = new PrintVisitor();
   plus.visit(Visitor.create(print));

   assertEquals("1 + 1", print.sb.toString());
}
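The listings above don’t show how the factory supplies the self argument. One plausible way, sketched here under stated assumptions, is for the dispatcher created by Visitor.create to pass itself to every provider method. The enclosing class SelfVisitorSketch and the print method are hypothetical, and the bodies of newNumber and newPlus are guesses at what the helpers used in the test might look like.

```java
// Sketch only: SelfVisitorSketch and print are hypothetical; newNumber/newPlus are assumed helpers.
class SelfVisitorSketch {
    interface Expression { void visit(Visitor v); }
    interface Plus extends Expression { Expression getFirst(); Expression getSecond(); }
    interface Number extends Expression { int getValue(); }

    static abstract class Visitor {
        Visitor() {}

        interface Version10 {
            boolean visitUnknown(Expression e, Visitor self);
            void visitPlus(Plus s, Visitor self);
            void visitNumber(Number n, Visitor self);
        }

        static Visitor create(final Version10 v) {
            return new Visitor() {
                // "this" is the dispatcher itself, handed back to the provider as self.
                @Override public void dispatchPlus(Plus p) { v.visitPlus(p, this); }
                @Override public void dispatchNumber(Number n) { v.visitNumber(n, this); }
            };
        }

        public abstract void dispatchPlus(Plus p);
        public abstract void dispatchNumber(Number n);
    }

    static Number newNumber(final int value) {
        return new Number() {
            public int getValue() { return value; }
            public void visit(Visitor v) { v.dispatchNumber(this); }
        };
    }

    static Plus newPlus(final Expression a, final Expression b) {
        return new Plus() {
            public Expression getFirst() { return a; }
            public Expression getSecond() { return b; }
            public void visit(Visitor v) { v.dispatchPlus(this); }
        };
    }

    /** Prints an expression; the visitor holds no reference to its own dispatcher. */
    static String print(Expression e) {
        final StringBuilder sb = new StringBuilder();
        e.visit(Visitor.create(new Visitor.Version10() {
            public void visitPlus(Plus s, Visitor self) {
                s.getFirst().visit(self);
                sb.append(" + ");
                s.getSecond().visit(self);
            }
            public void visitNumber(Number n, Visitor self) { sb.append(n.getValue()); }
            public boolean visitUnknown(Expression x, Visitor self) {
                sb.append("unknown");
                return true;
            }
        }));
        return sb.toString();
    }

    public static void main(String[] args) {
        Number one = newNumber(1);
        System.out.println(print(newPlus(one, one))); // prints 1 + 1
    }
}
```

With this arrangement the visitor can even be an anonymous class, because it no longer needs a field in which to keep the dispatcher.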

Which of these styles to use is up to the user of the pattern. The first is harder to discover but otherwise simple. The second is more complex. However, the second is needed by anyone trying to use the visitor recursively. Anyway, regardless of the style chosen, it’s clear that a properly implemented visitor is a good API design pattern that can serve as a building block in our API universe.

________________

1.    Mads Torgersen, “The Expression Problem Revisited” in ECOOP 2004 – Object Oriented Programming: 18th European Conference Oslo, Norway, June 14–18 2004, Proceedings, ed. Martin Odersky, 123–146 (Berlin: Springer-Verlag, 2004).
