Chapter 3. The drive toward DSLs

In this chapter

Understanding DSL types
Building a DSL for scheduling tasks
Fleshing out a DSL syntax
Taking a DSL apart to see what makes it tick
Implementing the Scheduling DSL

In this chapter, we’re going to look at the different types of DSLs that we can build and at how and when we should use them. We’ll then take a problem—the need for a scheduling system—and begin solving it with a DSL—the Scheduling DSL. We’ll start by looking at the problem, follow the reasoning that leads us to solving it using a DSL, and then decide on the syntax and implementation.

We’ll build several implementations of the Scheduling DSL along the way and go through the entire process of building a DSL for real use, although it will be a rather basic one.

3.1. Choosing the DSL type to build

The first step in building a DSL is deciding what type of a DSL to build. Our sample scenario will be scheduling tasks, so we’ll take a look at several approaches to building a scheduling DSL, which will help us compare fluent interfaces and DSLs.

First, we need to consider the steps involved in scheduling tasks, from the usage perspective: defining named tasks, defining the repeatability of the task, and defining the set of actions that will occur when the task is executed. Here are some examples of each:

Define named tasks

+	Crawl site
+	Back up database
+	Check that service is online

Define when task should be executed

+	Once a day
+	Once a week
+	Every hour

Define actions to occur when task is executed

+	Check URL response
+	Send email
+	Generate report

When we get around to writing the engine responsible for taking all three parts and making them into an executable, we’ll focus on how to schedule tasks, how to handle errors, how to verify that tasks are completed, and so on. But from the point of view of the domain we’re working on, those details aren’t meaningful. When we define a new task, we should just deal with the scheduling semantics, not with implementation mechanics.

We could get much of this abstraction—separating the scheduling semantics from the scheduling implementation—by building facades and hiding the implementation details under the facades’ abstraction, and we could get a fairly good syntax by using a fluent interface. Why build a DSL?

3.1.1. The difference between fluent interfaces and DSLs

If you have a fluent interface, you already have a DSL. It’s a limited one, admittedly, but it’s still a DSL for all intents and purposes. There are significant differences in readability between a fluent interface and a DSL, because you have a lot of freedom when you define the language for the DSL, whereas a fluent interface has to work within the limits of a typically rigid host language.¹

¹ This, of course, assumes that we’re talking about common static languages. Fluent interfaces in dynamic languages are a different matter, and are much closer to a DSL.

Because of those limitations, I tend to use DSLs and fluent interfaces for different tasks. I use a fluent interface when I need a touch of language-oriented programming, and I go with a DSL when I need something with a bit more flexibility.

We’ll take a look at some code, and that should demonstrate the differences between fluent interfaces and DSLs. Please don’t concern yourself with the implementation details for now; just look at the syntax.

First, listing 3.1 shows an example of a fluent interface.

Listing 3.1. Using a fluent interface to define a task to execute

new FluentTask("warn if website is down")
    .Every( TimeSpan.FromMinutes(3) )
    .StartingFrom( DateTime.Now )
    .When(() => new WebSite("http://example.org").IsNotResponding )
    .Execute(() => Notify("[email protected]", "server down!"))
    .Schedule();

Listing 3.2 shows the DSL equivalent.

Listing 3.2. Using a DSL to define a task to execute

task "warn if website is down":
    every 3.Minutes()
    starting now
    when WebSite("http://example.org").IsNotResponding
    then:
        notify "[email protected]", "server down!"

As you can see, the DSL code doesn’t have to work to make the compiler happy, nor does it have the ugly lambda declaration in the middle of the code or all the syntactic baggage (parentheses). It would be difficult to take anything away from the DSL example without losing some meaning. There’s little noise there, and noise reduction is important when we’re talking about language-oriented programming. The less noise we have, the clearer the code is.

Now, remember that a fluent interface is also a DSL. This means that we can make the fluent interface example clearer. Tim Wilde was kind enough to do that for us on his blog (http://www.midnightcoder.net/Blog/viewpost.rails?postId=38), reaching the syntax outlined in listing 3.3.

Listing 3.3. A better fluent interface for scheduling tasks

Schedule.Task( "warn if website is down" ).
      Repeat.Every( 3 ).Minutes.
      Starting( DateTime.Now ).
      If( Web.Site( "http://example.org" ).IsNotResponding() ).
      Notify( "[email protected]", "Site down!" );

But the catch, and there is always a catch, is that the complexity of the fluent interface implementation grows significantly as we try to express richer and richer concepts. In this case, the backend of the implementation got to eight classes and six interfaces, all for five lines of code, whereas the DSL implementation is considerably simpler. Toward the end of this chapter, we’ll look at how to implement the Scheduling DSL, and the whole thing will be less than 150 lines, most of which are dedicated to enabling testing and running the backend engine.
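To make that cost concrete, here is a minimal sketch of a fluent interface, written in Python rather than the C# of the listings. It is not Tim Wilde's implementation, and the names `FluentTask`, `every`, `when`, `execute`, and `run_once` are hypothetical. Each method returns `self`, which is what allows the calls to be chained.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import Callable, List, Optional


@dataclass
class FluentTask:
    """Hypothetical fluent builder for a scheduled task."""
    name: str
    interval: Optional[timedelta] = None
    condition: Optional[Callable[[], bool]] = None
    actions: List[Callable[[], None]] = field(default_factory=list)

    def every(self, interval: timedelta) -> "FluentTask":
        self.interval = interval
        return self  # returning self is what makes chaining possible

    def when(self, condition: Callable[[], bool]) -> "FluentTask":
        self.condition = condition
        return self

    def execute(self, action: Callable[[], None]) -> "FluentTask":
        self.actions.append(action)
        return self

    def run_once(self) -> bool:
        """Run the actions if the condition holds; report whether they ran."""
        if self.condition is not None and not self.condition():
            return False
        for action in self.actions:
            action()
        return True


sent = []  # stands in for a real notification channel
task = (
    FluentTask("warn if website is down")
    .every(timedelta(minutes=3))
    .when(lambda: True)  # stand-in for a real availability check
    .execute(lambda: sent.append("server down!"))
)
```

A single class is enough here because every method is legal at every point in the chain. Enforcing an order of calls, or supporting a shape like `Repeat.Every( 3 ).Minutes`, typically needs a separate type for each step, and that is where the class and interface count starts to climb.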

Fluent interfaces tend to be harder to scale than DSLs. We haven’t actually built a DSL yet, but we’ll do that later in this chapter and you’ll be able to judge for yourself. Does this mean that we should stop using fluent interfaces altogether? As usual, the answer is that it depends, but in general the answer is no.

3.1.2. Choosing between a fluent interface and a DSL

I’ll admit that I tend to favor building DSLs over fluent interfaces, precisely because I leaned the other way in the past and got badly burned by trying to maintain something of that complexity in a rigid language. But there is a time and place for everything.

I tend to ask the following questions when I’m deciding which to choose:

When will this DSL or fluent interface be used?
Who will use the DSL or fluent interface?
How flexible does the syntax need to be?

Fluent interfaces are usually useful only during the development process. A fluent interface is probably a good choice if you intend to use it while you write the code, because you won’t have to switch between two languages and can take advantage of the IDE tooling. In contrast, if you want to allow modifications outside development (for example, in production), a DSL tends to be a much better choice.

This leads to the question of who will use the DSL or fluent interface. If domain experts are going to be the users, you’ll probably want a full-blown DSL in place, because that will make it easier to work with the concepts of the domain. If the target audience is programmers, and the expected usage is during normal development, a fluent interface would be appropriate.

Last, but certainly not least, is the issue of the syntax that you want to use. DSLs are more expressive. Getting a fluent interface to be expressive can be prohibitive in terms of time and complexity.

One of the most important differences between a DSL and a fluent interface shows up when you want to perform actions outside the direct development cycle. A fluent interface is strongly tied to the development cycle, if only because it requires an IDE, a build process, and pushing binaries to production. With a DSL, we are dealing with standalone scripts. It is very easy to treat them as such: edit them using a simple text editor (or using a dedicated tool, as discussed in chapter 10), and simply upload them to the production environment.

I am not a fan of treating DSL changes as mere configuration changes and thus skipping the test cycle, but in many organizations the ability to push such changes rapidly is a major selling point for a DSL over a fluent interface. Chapter 12 discusses pushing DSL changes to production in a disciplined, auditable, and controlled manner.

How to deal with language hopping

Some people have a theoretical issue with using more than one language at a time. I say “theoretical” because it rarely turns out to be a problem in practice.

In the web world, people have little problem hopping between HTML, JavaScript, and server-side programming, so this disconnect isn’t a big problem. In many enterprise applications, important parts of applications are written in SQL (stored procedures, triggers, complex queries, and the like), and here too we need to frequently move between code in C#, Java, or VB.NET and SQL queries. In my experience, language hopping has rarely caused any confusion (it has other issues, but this isn’t the place to talk about them).

If you remain consistent in your domain terminology in both languages, and keep your language short and to the point, you shouldn’t need to worry about problems as a result of language hopping.

Finally, there is a hard limit to how much you can get from a fluent interface in terms of language-oriented programming. A DSL is much more flexible in this regard, and that is a major reason to favor a DSL over a fluent interface. On the plus side for fluent interfaces, you can take advantage of your existing tool chain (IDE, IntelliSense, and so on) instead of having to build your own.

Coming back to DSLs, let’s explore the reasons and motives for building different types of DSLs, and how that affects the DSLs we’re building.

3.2. Building different types of DSLs

As I mentioned in chapter 1, different types of DSLs are written for different purposes:

Technical DSLs— Created to solve technical issues. You might use such DSLs to create build scripts, set up system configuration, or to provide a natural syntax for a technical domain (a DSL for behavior-driven design or for creating state machines).
Business DSLs— Created to provide a common, executable language for a team (including domain experts) in a specific domain.
Extensibility DSLs— Created to allow an application to be extended externally without modifying the application itself but by providing external scripts to add additional functionality. Usages include scripting, customer modifications, and so on.

These DSL types have different constraints and requirements that affect how you need to approach building them and where and how they are used in applications. We’ll look at each type in depth, starting with technical DSLs.

3.2.1. Building technical DSLs

Technical DSLs are generally used to solve problems of clarity and complexity.

A build tool is an obvious example. Compiling all the code is a single step in the build process; copying all referenced assemblies to the output location is another step. We arrange those steps into groups that are called targets, which have dependency relationships between them. The job of the build script is to define the targets, the steps within the targets, and the dependencies between the different targets.
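The relationship between targets and their dependencies can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular build tool works; the target names and the `build_order` helper are hypothetical. The standard library's `graphlib.TopologicalSorter` produces an order in which every target comes after the targets it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical targets: each one maps to the set of targets it depends on.
targets = {
    "compile": set(),
    "copy_assemblies": {"compile"},
    "test": {"compile", "copy_assemblies"},
    "package": {"test"},
}


def build_order(targets):
    """Return the targets in an order that respects their dependencies."""
    return list(TopologicalSorter(targets).static_order())


order = build_order(targets)
```

The build script only declares the targets and their dependencies; working out a valid execution order is left to the tool.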

The process of building software is a complex one. Component dependency management, compiler options, platform choices, and many other issues need to be dealt with. In large projects needing a clear build process, the build scripts are a black art, feared by many and understood by few. This makes them an excellent target for a technical DSL, because you can use the DSL to encapsulate the complexity and give even those who aren’t familiar with the build the ability to understand and modify it.

Technical DSLs are built for technical people, usually developers. Having technical people as the audience greatly simplifies the task of building a good DSL, but creating a clear and expressive language is just as important for a technical DSL as for any other.

As a case in point, we can look at Apache Ant, a popular build tool in the Java world. When James Duncan Davidson created Apache Ant, it was perfectly logical to use an XML file to specify the syntax for the build script. After all, that’s what XML is for, and using it avoided the need to create yet another parser. The problem is that this works for a while, and then you realize that you need conditionals, iteration, and the like. The result can look like listing 3.4.

Listing 3.4. An Ant XML script with a `foreach` loop and not the best syntax

<foreach item="File" property="filename">
    <in>
        <items>
            <include name="${finished_spec.dir}*.dll" />
        </items>
    </in>
    <do>
        <exec program="${build.dir}document.exe"
            commandline="${filename} ${build.dir}docs"/>
    </do>
</foreach>

James has reconsidered the decision to go with XML (http://weblogs.java.net/blog/duncan/archive/2003/06/ant_dotnext.html).

For comparison, listing 3.5 has the same functionality, but expressed using the Bake DSL.

Listing 3.5. A Bake script with a `for` loop

for file in Directory.GetFiles(finished_spec, "*.dll"):
      exec(Path: "${buildDir}\document.exe", CmdLine: "${file} ${buildDir}\docs")

Listing 3.4 and listing 3.5 perform the same action, executing a program for each DLL in a specific directory. The Ant script takes several times as many lines to express the same thing as the Bake script, and the actual meaning of the script is lost in the XML noise.

This difference between the two styles can seem minor in such a small example, but consider how it affects your ability to read and understand what the script is doing in a real-world scenario.

Tip

Look at what we’re iterating on in listing 3.5. We’re using the standard CLR API in the DSL. This means that we have tremendous power in our hands; we don’t need to supply everything to the DSL. We can make use of anything that already exists on the CLR, and, even more importantly, making use of the available API requires no additional effort.

As another example of a technical DSL where clarity is critical, consider listing 3.6, taken from a Binsor configuration file. (Binsor is a Boo DSL for configuring the Castle Windsor container.) This piece of code scans a set of assemblies and finds all the types that implement the IController interface.

Listing 3.6. Binsor script to scan a set of assemblies

for type in AllTypesIn("MyApp.Controllers", "MyApp.Helpers"):
          continue unless typeof(IController).IsAssignableFrom(type)
          component type

You can probably tell what the code in listing 3.6 is doing, even without a good understanding of how Binsor works. The combination of just enough DSL syntax and the ability to use standard programming constructs makes for a very powerful approach when building a technical DSL.

You aren’t limited to just using loops—you have the full power of a programming language in your hands, which means that you can execute logic as part of configuration scripts. Even for simple configuration tasks, the ability to execute logic is invaluable. Consider the case of selecting a connection string for test or development. You could have several connection strings and select from them manually in the application code, or you could have an if statement in your configuration make the decision automatically.
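As a rough illustration of logic inside a configuration script, here is a sketch in Python. The connection strings, the `APP_ENV` environment variable, and the `select_connection_string` helper are all assumptions made for the example; they are not part of Binsor or of any real system.

```python
import os

# Hypothetical connection strings; none of these come from a real system.
CONNECTION_STRINGS = {
    "test": "Server=localhost;Database=app_test",
    "development": "Server=localhost;Database=app_dev",
}


def select_connection_string(environment: str) -> str:
    """Pick the connection string for the given environment name."""
    if environment == "test":
        return CONNECTION_STRINGS["test"]
    return CONNECTION_STRINGS["development"]


# The script makes the decision when it is loaded, based on an assumed
# APP_ENV variable, so the application code never has to choose.
connection_string = select_connection_string(
    os.environ.get("APP_ENV", "development")
)
```

The application asks for `connection_string` and gets the right one; the `if` statement lives in the configuration, where the decision belongs.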

The advantages of building a technical language are that you get the benefits of a DSL (clearer semantics and a higher focus on the task) and keep most of the power that you’re used to having.

The disadvantage is that technical DSLs are still pretty close to programming languages, and as such tend to require programming skills to use. This is fine if you’re targeting developers, but not if you want a DSL that a business expert can use. For those scenarios, you need a business DSL.

3.2.2. Building business DSLs

When you’re building a DSL for a business scenario, rather than a technical one, you need to ask yourself a few questions. Who is going to write scripts using this DSL, and who is going to read those scripts?

Talking with domain experts

In one of my projects, I had an ex-developer as a business analyst. The domain was a complex one, and we often ran into subtleties. Being able to ping the analyst and go over code with him was invaluable. He could tell me if I was doing it right, because he could understand what the code was doing.

This was an extremely rare situation, but it made my work much easier, because we could communicate on a level that both of us understood.

Having a shared language and terminology with the domain experts is invaluable, but we can take it a few steps further by making that shared language be directly executable by the machine. With a business DSL, there is no translation gap between what the domain expert has told the developers and what the computer is executing.

Not only that, but we can have the domain experts review the executable instructions (because they are written in a DSL that they can read) and tell us whether this is good or bad. Many domain experts already do some level of programming in the form of VBA or Microsoft Office programming. If they can work with that, they should be able to write using a DSL.

A business DSL doesn’t necessarily have to be writable by the domain experts (business users, analysts, and the like). Building a DSL doesn’t mean that you can offload all the work to the domain experts and leave it at that. The main purpose of a DSL is to facilitate better communication between the developers and the businesspeople.

Examples of business DSLs can be seen in rules engines of various dispositions. Usually those tools cost quite a bit of money and come with “easy to use and intuitive” designers and wizards.

Usually, business rules are simple condition and action statements. Imagine that we have a store and we want to calculate the final pricing on an order. The pricing business rules change frequently, and we’d like to have the businesspeople’s direct input on those. Listing 3.7 shows how we can specify the rules for calculating the final pricing of an order using a DSL syntax.

Listing 3.7. A sample DSL for defining order-processing rules

when User.IsPreferred and Order.TotalCost > 1000:
    AddDiscountPercentage 5
    ApplyFreeShipping
when not User.IsPreferred and Order.TotalCost > 1000:
    SuggestUpgradeToPreferred
    ApplyFreeShipping
when User.IsNotPreferred and Order.TotalCost > 500:
    ApplyFreeShipping

Any businessperson could read and understand these rules. Getting this type of DSL to work takes about 10 minutes (the backend code for this DSL is only 68 lines long). The ability to easily define such a DSL means that you get a lot of flexibility for little cost.

Note

Look at the difference between the second and third conditions in listing 3.7. The second uses not User.IsPreferred and the third uses User.IsNotPreferred. When building a DSL, you need to put aside some of your notions about good API design. What works for developers doesn’t necessarily work well for language-oriented programming. Reading not User.IsPreferred is awkward for some people, so User.IsNotPreferred is better from a readability standpoint.
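To give a feel for what sits behind rules like those in listing 3.7, here is a small sketch in Python. It is not the 68-line backend mentioned above, which targets Boo; the `Order` and `Rule` classes and the helper functions are hypothetical stand-ins. Each rule pairs a condition with a list of actions, and the engine applies every rule whose condition holds.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Order:
    """Hypothetical order, flattened to the facts the rules need."""
    total_cost: float
    user_is_preferred: bool
    discount_percentage: int = 0
    free_shipping: bool = False
    suggestions: List[str] = field(default_factory=list)


@dataclass
class Rule:
    condition: Callable[[Order], bool]
    actions: List[Callable[[Order], None]]


def add_discount_percentage(percent: int) -> Callable[[Order], None]:
    def action(order: Order) -> None:
        order.discount_percentage += percent
    return action


def apply_free_shipping(order: Order) -> None:
    order.free_shipping = True


def suggest_upgrade_to_preferred(order: Order) -> None:
    order.suggestions.append("upgrade to preferred")


# The three rules of listing 3.7, written out by hand.
RULES = [
    Rule(lambda o: o.user_is_preferred and o.total_cost > 1000,
         [add_discount_percentage(5), apply_free_shipping]),
    Rule(lambda o: not o.user_is_preferred and o.total_cost > 1000,
         [suggest_upgrade_to_preferred, apply_free_shipping]),
    Rule(lambda o: not o.user_is_preferred and o.total_cost > 500,
         [apply_free_shipping]),
]


def apply_rules(order: Order, rules: List[Rule] = RULES) -> Order:
    """Run the actions of every rule whose condition matches the order."""
    for rule in rules:
        if rule.condition(order):
            for action in rule.actions:
                action(order)
    return order
```

Compared with listing 3.7, the lambdas and lists are exactly the kind of noise that the DSL syntax removes, while the engine underneath can stay this simple.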

3.2.3. Building extensibility DSLs

You can use a DSL to extend an application. Most big applications have something of that sort: Visual Studio has macros, Emacs has LISP, Office has VBA, many games use scripting for the “game logic,” and so on.

Most of the approaches that we’ll look at in this book could be called extensibility mechanisms, but true extensibility DSLs usually focus on enabling as much as possible, whereas in most DSLs we want a language that’s expressive in a narrow domain. We’ll talk about the implications of extensibility DSLs for an application in chapter 5.

Once you’ve decided to create a DSL, what’s next? How do you go from the wish to be clear and expressive to having a DSL in hand?

3.3. Fleshing out the syntax

Let’s imagine we haven’t already seen the DSL syntax for the Scheduling DSL, and that we need to start building such a thing from scratch. Before we begin the actual implementation, we need to know what we want to do in our DSL:

Define named tasks
Specify what happens when a task is executed
Define when a task should execute
Describe the conditions for executing the task
Define the recurrence pattern (how often we will repeat the task)

We also need to look at those goals from the appropriate perspective—the end user’s. The client will pay for the DSL, but the end users are the people who will end up using the DSL. There is a distinct difference between the two. Identifying who the end user is can be a chore, but it’s important to accurately identify who the users of the DSL will be.

One of the major reasons to build a DSL is to hide the complexities of the implementation with a language that makes sense to the domain experts. If you get the wrong idea about who is going to use the DSL, you will create something that is harder to use. The budgeting people generally have a much fuzzier notion about what their company is doing than the people actually doing the work. Once you have some idea about what the end users want, you can start the design and implementation.

I try to start with a declarative approach. It makes it easier to abstract all the details that aren’t important for the users of the DSL when they are writing scripts. That means deciding what the DSL should do. After I have identified what I want to use the DSL for, I can start working on the syntax. It’s usually easier to go from an example of how you want to specify things to the syntax than it is to go the other way.

One technique that I have found useful is to pretend that I have a program that can perfectly understand intent in plain English and execute it. For the Scheduling DSL, the input for that program might look like the following:

Define a task named: "warn if website is down", starting from now, running
     every 3 minutes. When website "http://example.org" is not alive, then
     notify "[email protected]" that the server is down.

This syntax should cover a single scenario of using the Scheduling DSL, not all scenarios. The scenario should also be very specific. Notice that I’ve included the URL and email address in the scenario, to make it more detailed.

You should flesh out the DSL in small stages, to make it easier to implement and to discover the right language semantics. You should also make it clear that you’re talking about a specific usage instance, and not the general syntax definition.

Once you have the scenario description, you can start breaking it into lines and indenting by action groups. This allows you to see the language syntax more clearly:

Define a task named: "warn if website is down",
        starting from now,
        running every 3 minutes.
        When website "http://example.org" is not alive
        then notify "[email protected]" that the server is down.

Now it looks a lot more structured, doesn’t it? After this step, it’s a matter of turning the natural language into something that you can build an internal DSL on. This requires some level of expertise, but mostly it requires knowing the syntax and what you can get away with.

We’ve already become somewhat familiar with the syntax of the Boo language and all the ways we can work with it in chapter 2. We’ll look at more advanced options in chapter 6, and the syntax reference in appendix B can help you get familiar with what types of syntactic sugar you can build into your DSL.

3.4. Choosing between imperative and declarative DSLs

There are two main styles for building DSLs: imperative and declarative. These styles are independent of the DSL types we discussed in chapter 1 (external DSLs, graphical DSLs, and internal DSLs). Each of the three DSL types can be implemented using either style, although there is a tendency to use a more imperative approach for technical DSLs and a more declarative approach for business DSLs.

An imperative DSL specifies a list of steps to execute (to output text using a templating DSL, for example). With this style, you specify what should happen.
A declarative DSL is a specification of a goal. This specification is then executed by the supporting infrastructure. With this style, you specify the intended result.

The difference is really in the intention. Imperative DSLs usually specify what to do, and declarative DSLs specify what you want done.

SQL and regular expressions are examples of declarative DSLs. They both describe what you want done, but not how to do it. Build scripts are a great example of imperative DSLs. It doesn’t matter which build engine you use (NAnt, Rake, Make); the build script lists actions that need to be executed in a specified order. There are also hybrid DSLs, which are a mix of the two. They are DSLs that specify what you want done, but they also have some explicit actions to execute.
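A quick way to feel the difference is to solve one small problem both ways. The sketch below, in Python, pulls times such as `02:30` out of a string. The sample text and the `find_times` helper are made up for the illustration.

```python
import re

text = "backup at 02:30, crawl at 14:05"

# Declarative: the pattern states what a time looks like, and the
# regular expression engine decides how to search for it.
declarative = re.findall(r"\d{2}:\d{2}", text)


# Imperative: we spell out every step of the scan ourselves.
def find_times(s: str) -> list:
    found = []
    i = 0
    while i + 5 <= len(s):
        chunk = s[i:i + 5]
        if chunk[:2].isdigit() and chunk[2] == ":" and chunk[3:].isdigit():
            found.append(chunk)
            i += 5  # skip past the match we just recorded
        else:
            i += 1
    return found


imperative = find_times(text)
```

Both produce the same result for this input, but only the imperative version commits to a particular scanning strategy.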

Usually, with declarative DSLs, there are several steps along the way to the final execution. For example, SQL is a DSL that uses the declarative style. With SQL you can specify what properties you want to select and according to what criteria. You then let the database engine handle the loading of the data.

When you use an imperative DSL, the DSL directly dictates what will happen, as illustrated in figure 3.1.

Figure 3.1. Standard operating procedure for imperative DSLs

When you use a declarative DSL, the DSL specifies the desired output, and there is an engine that takes any actions required to make it so. There isn’t necessarily a one-to-one mapping between the output that the DSL requests and the actions that the engine takes, as illustrated in figure 3.2.

Figure 3.2. Standard operating procedure for declarative DSLs

You have to decide which type of DSL you want to build. Imperative DSLs are good if you want a simple-to-understand but open-ended solution. Declarative DSLs work well when the problem itself is complex, but you can express the specification for the solution in a clear manner.

Regardless of which type of DSL you decide to build, you need to be careful not to leak implementation details into the DSL syntax. Doing so will generally make it harder to modify the DSL in the long run, and likely will confuse the users. DSLs should deal with the abstract concepts, such as applying free shipping, or suggesting registration as a preferred customer, and leave the implementation of those concepts to the application itself. This is an important concept that we’ll come back to when we talk about unit testing in chapter 8.

Sometimes I build declarative DSLs, and more often hybrid DSLs (more on them in a minute). Usually the result of my DSLs is an object graph describing the intent of the user that I can feed into an engine that knows how to deal with it. The DSL portion is responsible for setting this up, and not much more.
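Here is a minimal sketch of that split, in Python. The `TaskDescription` class and the `next_runs` function are hypothetical; a real engine would also handle conditions, errors, and persistence. The point is only that the DSL's output is plain data describing intent, and that a separate engine decides what to do with it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class TaskDescription:
    """What a script would produce: a description, not behavior."""
    name: str
    every: timedelta
    starting: datetime


def next_runs(task: TaskDescription, count: int) -> List[datetime]:
    """The engine's side: turn the description into concrete run times."""
    return [task.starting + task.every * i for i in range(count)]


task = TaskDescription(
    name="warn if website is down",
    every=timedelta(minutes=3),
    starting=datetime(2024, 1, 1, 9, 0),  # fixed date so the result is repeatable
)
runs = next_runs(task, 3)
```

Because the description carries no behavior of its own, the same object could be fed to a scheduler, a validator, or a test without changing the script that produced it.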

I rarely find a use for imperative DSLs. When I use them, it’s usually in some sort of helper functionality: text generation, file processing, and the like. A declarative DSL is more interesting, because it’s usually used to express complex scenarios.

I don’t write a lot of purely declarative DSLs. While those are quite interesting in the abstract, getting them to work in real-world scenarios can be hard. But mixing the styles, creating a hybrid DSL, is a powerful combination.

A hybrid DSL is a declarative DSL that uses imperative programming approaches to reach the final state that’s passed to the backend engine for processing. For example, consider this rule: “All preferred customers get 2 percent additional discount on large orders on Sunday.” That rule is expressed in listing 3.8 using a hybrid of declarative and imperative styles (look at the third line):

Listing 3.8. A hybrid DSL, using both imperative and declarative concepts

when User.IsPreferred and Order.TotalCost > 1000:
    AddDiscountPercentage  5
    AddDiscountPercentage  2 if today is sunday
    ApplyFreeShipping

Note that this example uses the same syntax as before, but we’re adding additional conditionals to the mix—we’re mixing both styles. This is a silly example of the power of hybrid DSLs, but the ability to express control flow (loops and if constructs) and to have access to declarative concepts makes a hybrid DSL a natural for specifying behavior in more complex scenarios, and it can do so coherently.
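The same hybrid idea can be sketched in Python, under the assumption that the engine consumes a simple list of named actions. The function name and the tuple format are hypothetical. Ordinary control flow decides which declarative actions end up in the result.

```python
from datetime import date


def large_preferred_order_actions(today: date) -> list:
    """Return the declared actions for a preferred user's large order."""
    actions = [("AddDiscountPercentage", 5)]
    if today.weekday() == 6:  # Monday is 0, so Sunday is 6
        actions.append(("AddDiscountPercentage", 2))
    actions.append(("ApplyFreeShipping", None))
    return actions


sunday_actions = large_preferred_order_actions(date(2024, 1, 7))   # a Sunday
monday_actions = large_preferred_order_actions(date(2024, 1, 8))   # a Monday
```

The `if` is imperative, but what comes out is still a declaration of what should happen, which the engine is free to carry out in its own way.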

Before we move on, listing 3.9 shows another approach, arguably a more declarative one, for the same problem.

Listing 3.9. A more declarative approach to specifying rules

applyDiscount 5.percent:
    when User.IsPreferred and Order.TotalCost > 1000
suggestPreferred:
     when not User.IsPreferred and Order.TotalCost > 1000
freeShipping:
     when Order.TotalCost > 500 and User.IsNotPreferred
     when Order.TotalCost > 1000 and User.IsPreferred

I find the example in listing 3.9 to be more expressive, because it explicitly breaks away from the developer mentality of ifs and branches and forces you to think about actions and triggers, which is probably a better model for this particular problem.

The importance of clarity

In the initial draft of this book, one of the reviewers pointed out an inconsistency between listings 3.7 and 3.9. I’ve left the inconsistency in place to show how different syntaxes can change the way we understand the system.

If you look at the rules for free shipping, you can see that there’s an interesting inconsistency. Preferred users get free shipping for orders above $1,000, whereas non-preferred users get free shipping for orders above $500.

In listing 3.7, you have to look at all the rules in order to understand what is going on. In listing 3.9, this inconsistency is explicit. In chapter 13, we’ll talk extensively about how to make such concepts explicit.

I have been in situations where laying out the existing business rules close to one another (in a format like listing 3.9) has highlighted logical problems in what the business was doing, though sometimes they went ahead with the inconsistency. I try to avoid using the term business logic, because I rarely find any sort of logic in it.

Nevertheless, both examples perform the exact same operations, and are equivalent in terms of complexity and usage. In fact, there is a one-to-one mapping between the two.
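The mapping between the two layouts can be sketched mechanically. In the Python below, the rules of listing 3.7 are written as plain strings (a simplification; nothing here evaluates them), and the hypothetical `group_by_action` helper regroups them by action, which is the layout of listing 3.9.

```python
from collections import defaultdict

# Listing 3.7 style: each condition lists its actions.
rules_by_condition = [
    ("preferred and total > 1000",
     ["AddDiscountPercentage 5", "ApplyFreeShipping"]),
    ("not preferred and total > 1000",
     ["SuggestUpgradeToPreferred", "ApplyFreeShipping"]),
    ("not preferred and total > 500",
     ["ApplyFreeShipping"]),
]


def group_by_action(rules):
    """Regroup condition -> actions into action -> conditions."""
    grouped = defaultdict(list)
    for condition, actions in rules:
        for action in actions:
            grouped[action].append(condition)
    return dict(grouped)


rules_by_action = group_by_action(rules_by_condition)
```

A purely mechanical regrouping lists three triggers for free shipping, whereas listing 3.9 shows two. The difference is only that a non-preferred order above 1,000 is already covered by the trigger for orders above 500.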

That’s enough theory; let’s pull the concepts of a DSL apart, and see how it works.

3.5. Taking a DSL apart—what makes it tick?

We’ve looked at building DSLs from the point of view of the outward syntax—how we use them. What we haven’t done is cover how they’re structured internally—how we build and integrate them into our applications.

In general, a DSL is composed of the building blocks shown in figure 3.3.

Figure 3.3. A typical DSL structure

A typical DSL is usually split into several distinct parts:

Syntax— This is the core language or the syntax extensions that you create.
API— This is the API used in the DSL; it is usually built specifically to support the DSL and its needs.
Model— This is the existing code base we reuse in our DSL (usually using a facade). The difference between the API and the model is that the model usually represents the notions in our application (such as Customer, Discount, and so on), whereas the API focuses on providing the DSL with convenient ways to access and manipulate the model.
Engine— This is the runtime engine that executes the DSL and processes its results.

The language and the API can be intrinsically tied together, but there is a fine line separating the two. The API exposes the operations that DSL users will use in the application. Usually you’ll expose the domain operations to the DSL. You express those operations through the language, but the API is focused on enabling a good syntax for the operations, not on providing the operations themselves.
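To make the four parts concrete, here is a minimal sketch in Python (used here only because Boo borrows much of its syntax from it). Every name in it (Order, OrderRulesApi, run_rules, and the script text) is a hypothetical stand-in, not part of Boo or Rhino DSL, and the "syntax" is nothing more than plain calls.

```python
from dataclasses import dataclass, field


# Model: a domain object that already exists in the application (hypothetical).
@dataclass
class Order:
    total_cost: float
    discounts: list = field(default_factory=list)
    free_shipping: bool = False


# API: thin, DSL-friendly operations layered over the model.
class OrderRulesApi:
    def __init__(self, order):
        self.order = order

    def add_discount_percentage(self, percent):
        self.order.discounts.append(percent)

    def apply_free_shipping(self):
        self.order.free_shipping = True


# Syntax: the text a rule author writes.
SCRIPT = """
if order.total_cost > 1000:
    add_discount_percentage(5)
    apply_free_shipping()
"""


# Engine: compiles the script and executes it against one API instance.
def run_rules(script, order):
    api = OrderRulesApi(order)
    names = {
        "order": order,
        "add_discount_percentage": api.add_discount_percentage,
        "apply_free_shipping": api.apply_free_shipping,
    }
    exec(compile(script, "<rules>", "exec"), names)
    return order
```

Note that the script never touches the discounts list directly. It only sees the operations that the API chooses to expose, which is the separation between API and model described above.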

We’ll deal with language construction in the next two chapters, and we’ll see an example of it in the next section. Broadly, we need to understand what features of the language we can use and what modifications we’re able to make to the language to better express our intent. Often, this is directly related to the API that we expose to the DSL. As I mentioned earlier, if you’re working in a domain-driven design manner, you’re in a good position to reuse the same domain objects in your DSL (although that causes problems, such as versioning, which we’ll look at in chapter 9). Often, though, the API will be composed of facades over the application, to provide the DSL with coarse-grained access into the application (fine-grained control is often too fine grained and is rarely useful in a DSL).

Keeping the layers separated

Several times in the past I have tried to combine different parts of the DSL—typically the syntax and the API—usually to my regret. It’s important to keep each layer to itself, because that brings several advantages.

It means you can work on each layer independently. Enhancing your API doesn’t break the syntax, and adding a method call doesn’t require dealing with the internals of the compiler.

You can use the DSL infrastructure from other languages, as well. Why would you want to do that? Because this will avoid tying your investment in the DSL to a single implementation of the syntax, and that’s important. You may want to have several dialects of a single DSL working against a single infrastructure, or you may decide that you have hit the limits of the host language and you need to build an external DSL (or one using a different host language). You’ll still want to use the same infrastructure across all of them. Having an infrastructure that is not tied to a specific language implementation also means that you can use this infrastructure without any DSL, directly from your application.

A typical example is an infrastructure that the application itself drives through a fluent interface, while the same infrastructure is also exposed through a DSL for external extensibility.
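As a rough illustration of one infrastructure serving two front ends, consider the following Python sketch. TaskRegistry, FluentTask, and run_script are invented for this example; the point is only that neither front end knows about the other.

```python
class TaskRegistry:
    """Infrastructure shared by every front end (hypothetical)."""

    def __init__(self):
        self.tasks = {}

    def add_task(self, name, interval_minutes):
        self.tasks[name] = interval_minutes


class FluentTask:
    """Front end 1: a fluent interface called directly from application code."""

    def __init__(self, registry, name):
        self._registry = registry
        self._name = name

    def every_minutes(self, minutes):
        self._registry.add_task(self._name, minutes)
        return self


def run_script(registry, script):
    """Front end 2: a script-based dialect over the same registry."""
    exec(script, {"task": registry.add_task})


registry = TaskRegistry()
FluentTask(registry, "crawl site").every_minutes(60)
run_script(registry, 'task("back up database", 1440)')
```

Both tasks end up in the same registry, so replacing the script dialect later would not affect the fluent callers.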

The execution engine is responsible for the entire process of selecting a DSL script and executing it, from setting up the compiler to executing the compiled code, from setting up the execution environment to executing the secondary stages in the engine after the DSL has finished running (assuming you have a declarative DSL).

Extending the Boo language itself is probably the most powerful way to add additional functionality to a DSL, but it’s also the most difficult. You need to understand how the compiler works, to some extent. Boo was built to allow that, but it’s usually easier to extend a DSL by adding to the API than by extending the Boo language. When you need to extend Boo to enrich your DSL, those extensions will also reside in the engine and will be managed by it.

The API is part of the DSL. Repeat that a few times in your head. The API is part of the DSL because it composes a significant part of the language that you use to communicate intent.

Having a clear API, one that reflects the domain you’re working in, will make building a DSL much easier. In fact, the process of writing a DSL is similar to the process of fleshing out a domain model or ubiquitous language in domain-driven design. Like the domain itself, the DSL should evolve with your understanding of the domain and the requirements of the application.

DSLs and domain-driven design are often seen together, for that matter.

Use iterative design for your DSLs

When sitting down to design a DSL, I take one of two approaches. Either I let it grow organically, as new needs arise, or I try to think about the core scenarios that I need to handle, and decide what I want the language to look like.

There are advantages to both approaches. The first approach is the one I generally use when I am building a language for myself, because I already have a fairly good idea what kind of a language I want.

I use the second approach if I’m building a DSL for general consumption, particularly to be used by non-developers. This isn’t to say you need to spend weeks and months designing a DSL. I still very much favor the iterative approach, but you should seek additional input before you start committing to a language’s syntax. Hopefully, this input will come from the expected audience of the DSL, which can help guide you toward a language that’s well suited for their needs. Then, once you start, assume that you’ll not be able to deliver the best result in the first few tries.

We’ll tackle the problem of DSL maintenance and versioning in chapter 9, and the techniques described there will help you build DSLs that can be modified in response to your increasing understanding of the domain and the requirements that you place on the DSL.

If you build a DSL when you’re just starting to understand the domain, and you neglect to maintain it as your understanding of the domain and its needs grows, it will sulk and refuse to cooperate. It will no longer allow you to easily express your intent, but rather will force you to awkwardly specify your intentions.

3.6. Combining domain-driven design and DSLs

Domain-driven design (DDD) is an approach to software design that’s based on the premise that the primary focus should be on the domain and the domain logic (as opposed to focusing on technological concerns) and that complex domain designs should be based on a model.

If you aren’t familiar with DDD, you may want to skip this section, because it focuses specifically on the use of DSLs in DDD applications.

Tip

If you’re interested in DDD, I highly recommend that you read Domain-Driven Design by Eric Evans and Applying Domain-Driven Design and Patterns by Jimmy Nilsson. Those books do an excellent job of describing how to flesh out and maintain a domain model.

3.6.1. Language-oriented programming in DDD

The reason for using language-oriented programming is that humans are good at expressing ideas using a spoken language. While spoken language is generally very imprecise, people usually settle on a set of terms and phrases that have specific meanings in a particular context.

Ubiquitous language and DSLs

Ubiquitous language is a term used in DDD to describe the way we talk about the software. The ubiquitous language is a spoken language that’s structured around the domain model and is used by all members of the team when talking about the domain.

A ubiquitous language isn’t a DSL, and a DSL isn’t a ubiquitous language. A ubiquitous language is used to make communication clearer. Terms from the ubiquitous language are then used in the code of the system.

A DSL, on the other hand, can be seen as taking the ubiquitous language and turning it into an executable language. A DSL isn’t always about a business domain, but when it is, and when you’re practicing DDD, it’s almost certain that your DSL will reflect the ubiquitous language closely.

In short, the ubiquitous language is the language of communication inside a team, whereas a DSL is a way to express intent. The two can (and hopefully will) be merged in many scenarios.

In some fields, the domain terms are very explicit. In a Sarbanes-Oxley tracking system, the domain terms are defined in the law itself. In many fields, some of the terms are well defined (such as in accounting) but other terms are often more loosely defined and can vary in different businesses or even different departments. The term customer is probably the quintessential example of a loosely defined term. I once sat in a meeting with two department heads, watching them fight for 3 hours over how the system would define a customer, without any satisfactory result.

When you’re building software, you usually need to talk to the domain experts. They can help clarify what the domain terms are, and from there you can build the ubiquitous language that you’ll use in the project.

Once you have the ubiquitous language, you can start looking at what you want to express in the DSL, and how you can use the ubiquitous language to express that. From there, you follow the same path we outlined in section 3.3: break it up according to the semantics, and then see what the language will allow you to get away with.

We’ll spend chapters 4 and 5 mostly dealing with how much we can get away with. But before we get into that, let’s look at the result of combining DSLs and DDD. You may have heard that the whole is greater than the sum of its parts.

3.6.2. Applying a DSL in a DDD application

It seems natural, when thinking about DSLs, to add DDD to the mix, doesn’t it?

Figure 3.4 shows a set of DSLs in a domain-driven application. In most applications, you’ll have a set of DSLs, each of them targeted at one specific goal. You’ll also usually have a DSL facade of some kind that will translate the code-driven API to a more language-oriented API.

Figure 3.4. DSLs used in a DDD context

There are quite a few domains where DDD doesn’t make sense. In fact, most of the DSLs that I use daily aren’t tied to a DDD implementation. They’re technical DSLs, used for such things as templating, configuration, ETL (extract, transform, and load), and so on.

Technical DSLs rarely require a full-fledged domain model or a ubiquitous language because the model used is usually implicit in the assumptions that we have as software developers. A templating DSL doesn’t need anything beyond text-processing instructions, for example. A configuration DSL needs little beyond knowing what it configures.

But when it comes to business DSLs, we’re in a much more interesting position. Let’s look at an example and start by assuming that we’ve defined a domain using the techniques that Eric Evans suggests in his book, Domain-Driven Design. Assuming that we have a CLR application (written in C#, VB.NET, or Boo) and assuming we’re writing the DSL in Boo, we have immediate and unlimited access to the domain. This means that, by default, our DSL can immediately take advantage of all the work that went into building the ubiquitous language and the domain model.

All the ideas about the domain model and ubiquitous language are directly applicable and exposed to the DSL. Think back to the business DSL example in listing 3.7, repeated here in listing 3.10.

What if I don’t know DDD already?

If you haven’t read Evans’ book or are not familiar with the terminology used, DDD calls for creating a ubiquitous language shared by all the stakeholders in the project (which explicitly includes the developers and the businesspeople).

The ubiquitous language is not used solely for communication with the businesspeople; it is part and parcel of the actual structure of the code. The more closely the language matches the way the businesspeople think about the processes to be performed, the more closely the software will meet the needs of the business.

Listing 3.10. A DSL that uses an existing DDD-based domain model

when User.IsPreferred and Order.TotalCost > 1000:
    AddDiscountPercentage 5
    ApplyFreeShipping
when User.IsNotPreferred and Order.TotalCost > 500:
    ApplyFreeShipping

Notice that we’re using both IsPreferred and IsNotPreferred—having both of them means that you get better readability. But consider the actions that are being performed when the condition specified in the when clause is matched. We aren’t modifying state, like this:

Order.TotalCost = Order.TotalCost - (Order.TotalCost * 0.05) #apply discount

That would probably work, but it’s a bad way to do it. It’s completely opaque, for one thing. The code is clear about what it does, but there is no hint about the business logic and reasoning behind it. There is a distinct difference between applying a discount for a particular sale offer and applying a discount because of a coupon, for example, and this code doesn’t explain that. It’s also probably wrong from the domain perspective, because you will almost certainly want to keep track of your discounts.

In the domain, we probably would have something like this:

Order.ApplyDiscountPercentage(5)

That would be valid code that we could put into action as well. But in the DSL, because we already know what the applicable operations are, we can make it even more explicit by specifying the discount as an operation with a known context. This makes those operations into part of the language that we use when writing functionality with the DSL.
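The difference between overwriting the total and applying a named domain operation can be sketched as follows. This is a Python illustration under assumptions: the Order class, its base_cost field, and the reason parameter are all hypothetical and are not taken from the book’s domain model.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    base_cost: float
    discounts: list = field(default_factory=list)  # (percent, reason) pairs

    def apply_discount_percentage(self, percent, reason):
        # Record the discount instead of overwriting the total, so the
        # business reason stays available for reporting and auditing.
        self.discounts.append((percent, reason))

    @property
    def total_cost(self):
        total = self.base_cost
        for percent, _reason in self.discounts:
            total -= total * percent / 100
        return total


order = Order(base_cost=2000)
order.apply_discount_percentage(5, "preferred customer")
```

The total is derived from the recorded discounts, so the original price and the reason for each reduction are never lost.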

Now, let’s get back to the Scheduling DSL that we started to build at the beginning of this chapter. Let’s dive into the implementation details.

3.7. Implementing the Scheduling DSL

Listing 3.11 will refresh your memory about what the Scheduling DSL looks like.

Listing 3.11. Sample code from the Scheduling DSL

task "warn if website is down":
    every 3.Minutes()
    starting now
    when WebSite("http://example.org").IsNotResponding
    then:
        notify "[email protected]", "server down!"

It doesn’t look much like code, right? But take a look at the class diagram in figure 3.5.

Figure 3.5. Class diagram of `BaseScheduler`, the implicit base class for the Scheduling DSL

This is the implicit base class for the Scheduling DSL. An implicit base class is one of the more common ways to define and work with a DSL. We’ll spend some time talking about this in chapter 4.

For now, please assume that the DSL code you see is being magically placed in the Prepare() method of a derived class. This means that you have full access to all the methods that the BaseScheduler exposes, because those are exposed by the base class.

What this means, in turn, is that you can now look at the DSL and the class diagram and suddenly understand that most of what goes on here involves plain old method calls. Nothing fancy or hard to understand—we’re merely using a slightly different syntax to call them than you usually do.
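To see why this is just method calls, here is a hand-written Python analogue of what the script amounts to once it sits inside a derived class. The BaseScheduler below is a simplified, hypothetical stand-in for the class in figure 3.5, with only three of its methods.

```python
class BaseScheduler:
    """Hypothetical, simplified stand-in for the implicit base class."""

    def __init__(self):
        self.name = None
        self.interval_minutes = None
        self.action = None

    def task(self, name, body):
        self.name = name
        body()  # run the nested declarations

    def every(self, minutes):
        self.interval_minutes = minutes

    def then(self, action):
        self.action = action

    def prepare(self):
        raise NotImplementedError


class WarnIfWebsiteIsDown(BaseScheduler):
    # The DSL text, written out by hand as the body of prepare().
    def prepare(self):
        def body():
            self.every(3)
            self.then(lambda: "server down!")

        self.task("warn if website is down", body)
```

Calling prepare() on an instance does nothing except invoke task, every, and then on the base class, which record what the script declared.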

We’re adding a minor extension to the language here. Two methods in the BaseScheduler aren’t part of the API, but rather are part of the language extension:

Minutes()—This is a simple extension method that allows us to specify 3.Minutes(), which reads better than TimeSpan.FromMinutes(3), which is how we would usually perform the same task.
when(Expression)—This is a meta-method, which is a method that can modify the language. It specifies that the expression that’s passed to it will be wrapped in a delegate and stored in an instance variable. We’ll see exactly how this works in chapter 4.

That doesn’t make much sense right now, I know, so let’s start taking this DSL apart. We’ll use the exact opposite approach from what we do when we’re building the DSL. We’ll add the programming concepts to the existing DSL until we fully understand how this works.

Let’s start by adding parentheses and removing some compiler syntactic sugar. Listing 3.12 shows the results of that.

Listing 3.12. The Scheduling DSL after removing most of the syntactic sugar

task("warn if website is down", do() :
    self.every( self.Minutes(3) )
    self.starting ( self.now )
    self.when( WebSite("http://example.org").IsNotResponding)
    self.then( do():
        notify( "[email protected]", "server down!")
    )
)

A couple of notes about this before we continue:

self in Boo is the equivalent of this in C# or Java or of Me in VB.NET.
do(): is the syntax for anonymous delegates in Boo.

That looks a lot more like code now (and a lot less like a normal language). But we’re not done yet. We still need to resolve the when meta-method. When we run that, we’ll get the result shown in listing 3.13.

Listing 3.13. The Scheduling DSL after resolving the `when` meta-method

task("warn if website is down", do() :
    self.every( self.Minutes(3) )
    self.starting ( self.now )
    self.condition = do():
        return WebSite("http://example.org").IsNotResponding
    self.then( do():
        notify( "[email protected]", "server down!")
    )
)

As you can see, we completely removed the when method, replacing it with an assignment of an anonymous delegate for the instance variable. This is the only piece of compiler magic we’ve performed. Everything else is already in the Boo language.
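The same kind of compile-time rewrite can be imitated in Python with the standard ast module. This is only an analogy for what a Boo meta-method does, not Boo’s mechanism: the sketch below rewrites a statement of the form when(expression) into an assignment of a deferred function, and the names WhenToCondition, compile_script, and site_is_down are hypothetical.

```python
import ast


class WhenToCondition(ast.NodeTransformer):
    """Rewrite the statement `when(<expr>)` into `condition = lambda: <expr>`."""

    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Name)
                and call.func.id == "when"
                and len(call.args) == 1):
            deferred = ast.Lambda(
                args=ast.arguments(posonlyargs=[], args=[], vararg=None,
                                   kwonlyargs=[], kw_defaults=[],
                                   kwarg=None, defaults=[]),
                body=call.args[0])
            assign = ast.Assign(
                targets=[ast.Name(id="condition", ctx=ast.Store())],
                value=deferred)
            return ast.copy_location(assign, node)
        return node


def compile_script(source):
    tree = WhenToCondition().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return compile(tree, "<dsl>", "exec")


scope = {"site_is_down": True}
exec(compile_script("when(site_is_down)"), scope)
```

No function named when exists at run time; the call was replaced before execution. The expression is evaluated only when scope["condition"]() is called, so it reflects the state of the site at that moment rather than at the moment the script was loaded.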

Meta-methods and anonymous blocks

Take a look at the when and then methods. Both of them end up with a similar syntax, but they’re implemented in drastically different ways. The when method is a meta-method. It changes the code at compilation time. The then method uses an anonymous block as a way to pass the delegate to execute.

The reason we have two different approaches that end up with nearly the same end result (passing a delegate to a method) has to do with the syntax we want to achieve.

With the when method, we want to achieve a keyword-like behavior, so the when method accepts an expression and transforms that to a delegate. The then keyword has a different syntax that accepts a block of code, so we use Boo’s anonymous blocks to help us out there.

We’ll talk about those things extensively in chapters 4 and 6.

Now we can take the code in listing 3.13 and make a direct translation to C#, which will give us the code in listing 3.14.

Listing 3.14. The Scheduling DSL code, translated to C#

task("warn if website is down", delegate
{
    this.every( this.Minutes(3) );
    this.starting ( this.now );
    this.condition = delegate
    {
        return new WebSite("http://example.org").IsNotResponding;
    };
    this.then( delegate
    {
        this.notify( "[email protected]", "server down!");
    });
});

Take a look back at the original DSL text in listing 3.11, and compare it to listing 3.14. In terms of functionality, they’re the same, but the syntactic differences between them are huge, and we want a good syntax for our DSL.

We’ve skipped one important part: we haven’t yet talked about what the implicit base class will do. The class that results from applying the implicit base class to the script is shown in listing 3.15, and the details of how it is applied are discussed in section 3.8.

Listing 3.15. The full class that was generated using the implicit base class

public class MyDemoTask ( BaseScheduler ):
    override def Prepare():
        task("warn if website is down", do():
            # the rest of the code
        )

Now that we have a firm grasp of what code we’re getting out of the DSL, we need to get to grips with how we can run this code.

3.8. Running the Scheduling DSL

So far we’ve focused on the transformations we’re putting the code through, but we haven’t talked yet about how to compile and execute a DSL. Remember, we aren’t dealing with scripts in the strict sense of the word; we have no interpreter to run. We’re going to compile our DSL to IL, and then execute this IL. The code that it takes to do this isn’t difficult, just annoying to write time after time, so I wrapped it up in a common project called Rhino DSL.[2]

[2] Rhino [Project Name] is a naming convention that I use for most of my projects. You may be familiar with Rhino Mocks, for example, which is part of the same group of projects as Rhino DSL. There is no connection to Mozilla’s Rhino project, which is a JavaScript implementation in Java.

The Rhino DSL project

The Rhino DSL project is a set of components that turned out to be useful across many DSL implementations. It contains classes to aid in building a DSL engine, implicit base classes, multifile DSLs, and so on.

We’re going to use Rhino DSL throughout this book; it’s an open source project, licensed under the BSD license, which means that you can use it freely in any type of application or scenario. We’re also going to spend chapter 7 dissecting Rhino DSL, to ensure that you understand how it works, so you could implement it on your own, if you ever need to.

Compilation is expensive, and once we load an assembly in the CLR, we have no way of freeing the occupied memory short of unloading the entire AppDomain. To deal with these two problems, we need to do at least some caching up front. Doing this on a DSL-by-DSL basis is annoying, and it would be nice to get the cost of creating a DSL down as much as possible.

For all of those reasons, Rhino DSL provides the DslFactory class, which takes care of all of that. It works closely with the DslEngine, which is the class we derive from to specify how we want the compilation of the DSL to behave.

Again, none of this is strictly necessary. You can do it yourself easily, if you choose to, but using Rhino DSL makes it easier and allows us to focus on the DSL implementation instead of the compiler mechanics.
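If you did want to handle the caching yourself, the core idea is small. The following Python sketch shows only that idea, compiling each script once and reusing the result; it says nothing about how DslFactory is actually implemented, and ScriptCache and the script path are made up for the example.

```python
class ScriptCache:
    """Compile each script once and reuse the result (hypothetical sketch)."""

    def __init__(self, load_source):
        self._load_source = load_source  # callable: path -> source text
        self._compiled = {}
        self.compile_count = 0

    def get(self, path):
        if path not in self._compiled:
            source = self._load_source(path)
            self._compiled[path] = compile(source, path, "exec")
            self.compile_count += 1
        return self._compiled[path]


# An in-memory "filesystem" keeps the example self-contained.
sources = {"tasks/ping.dsl": "result = 1 + 1"}
cache = ScriptCache(sources.__getitem__)

first = cache.get("tasks/ping.dsl")
second = cache.get("tasks/ping.dsl")
```

The second request returns the object produced by the first, so the compiler runs once per script. A real cache would also need to notice when a script changes, which this sketch leaves out.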

We’ve already looked at the BaseScheduler class. Now let’s take a peek at the SchedulingDslEngine class. Listing 3.16 shows the full source code of the class.

Listing 3.16. The implementation of `SchedulingDslEngine`

public class SchedulingDslEngine : DslEngine
{
    protected override void CustomizeCompiler(
        BooCompiler compiler,
        CompilerPipeline pipeline,
        string[] urls)
    {
        pipeline.Insert(1,
            new ImplicitBaseClassCompilerStep(
                typeof (BaseScheduler),
                "Prepare",
                // default namespace imports
                "Rhino.DSL.Tests.SchedulingDSL"));
    }
}

As you can see, it doesn’t do much, but what it does do is interesting. The method is called CustomizeCompiler, and you’re going to learn a whole lot more about customizing the compiler in chapter 4. For now, keep in mind that Boo allows you to move code around during compilation, and the ImplicitBaseClassCompilerStep does that.

The ImplicitBaseClassCompilerStep will create an implicit class that will derive from BaseScheduler. All the code in the file will be placed in the overridden Prepare method. We can also specify default namespace imports. In listing 3.16, you can see that we add the Rhino.DSL.Tests.SchedulingDSL namespace. This namespace will be imported to all the DSL scripts, so we don’t have to explicitly import it. VB.NET users are familiar with this feature, using the project imports.
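The effect of such a step can be approximated in Python by wrapping the script text in a generated class before executing it. This is an analogy built on string manipulation, not a description of how ImplicitBaseClassCompilerStep works internally; compile_with_implicit_base and the simplified BaseScheduler are hypothetical. Unlike Boo, Python requires the script to spell out self.

```python
import textwrap


class BaseScheduler:
    """Hypothetical, simplified base class."""

    def __init__(self):
        self.tasks = []

    def task(self, name):
        self.tasks.append(name)

    def prepare(self):
        raise NotImplementedError


def compile_with_implicit_base(script, base_class, method_name,
                               default_imports=()):
    """Wrap bare script text in a subclass of base_class (illustrative only)."""
    lines = [f"import {module}" for module in default_imports]
    lines.append("class Script(Base):")
    lines.append(f"    def {method_name}(self):")
    lines.append(textwrap.indent(textwrap.dedent(script).strip(), " " * 8))
    scope = {"Base": base_class}
    exec("\n".join(lines), scope)
    return scope["Script"]


script = """
self.task("crawl site")
self.task("back up database at " + str(math.floor(2.5)) + "am")
"""
ScriptClass = compile_with_implicit_base(
    script, BaseScheduler, "prepare", default_imports=["math"])
```

The script uses math without importing it, because the import was supplied as a default, and its statements only run when prepare() is called on an instance of the generated class.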

We’re nearly at the point when we can execute our DSL. The one thing that’s still missing is the DslFactory intervention. Listing 3.17 shows how we can work with that.

Listing 3.17. Executing a Scheduling DSL script

//initialization
DslFactory factory = new DslFactory();
factory.Register<BaseScheduler>(new SchedulingDslEngine());

//get the DSL instance
BaseScheduler scheduler = factory.Create<BaseScheduler>(
                             @"path/to/ValidateWebSiteUp.boo");

//This is where we run the code from the DSL file
scheduler.Prepare();

//Run the prepared scheduler
scheduler.Run();

First, we initialize the DslFactory, and then create and register a DslEngine for the specific base type we want. Note that you should only do this once, probably during the startup of the application. This usually means in the Main method in console and Windows applications, and in Application_Start in web applications.

We then get the DSL instance from the factory. We pass both the base type we want and the path to the DSL script. The base type selects the DslEngine that we registered for it, and it is also the return type of this method. Usually the path will be a path in the filesystem, but I have seen embedded resources, URLs, and even source control links used.

Once we have the DSL instance, we can do whatever we want with it. Usually, this depends on the type of DSL it is. When using an imperative DSL, I would tend to call the Run() or Execute() methods. With a declarative DSL, I would usually call a Prepare() or Build() method, which would execute the code that we wrote using the DSL, and then I would call the Run() or Execute() method, which would take the result of the previous method call and act upon it. In more complex scenarios, you might ask a separate class to process the results, instead of having the base class share both responsibilities.

In the case of the Scheduling DSL, we use a declarative approach, so we call the Prepare() method to get whatever declarations were made in the DSL, and then we run the code. The Run() method in such a DSL will usually perform some sort of registration into a scheduling engine.
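The two-phase flow of a declarative DSL can be sketched like this. The Python below is illustrative only: FakeSchedulingEngine and WebsiteCheck are hypothetical, and prepare() is written by hand where a real DSL would run the script.

```python
class FakeSchedulingEngine:
    """Hypothetical engine that only records what was registered."""

    def __init__(self):
        self.registered = []

    def register(self, name, interval_minutes, action):
        self.registered.append((name, interval_minutes))


class WebsiteCheck:
    def __init__(self, engine):
        self._engine = engine
        self.name = None
        self.interval_minutes = None
        self.action = None

    def prepare(self):
        # Declarative phase: only record what the script asked for.
        self.name = "warn if website is down"
        self.interval_minutes = 3
        self.action = lambda: "server down!"

    def run(self):
        # Second phase: act on the declarations gathered by prepare().
        self._engine.register(self.name, self.interval_minutes, self.action)


engine = FakeSchedulingEngine()
check = WebsiteCheck(engine)
check.prepare()
registered_after_prepare = list(engine.registered)
check.run()
```

Nothing reaches the engine during prepare(); registration happens only in run(). That gap is what allows the host application to inspect or validate the declarations before acting on them.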

And that’s it—all the building blocks that you need to write a good DSL. We’re going to spend a lot more time discussing all the things we can do with DSLs, how we can integrate them into real applications, and version, test, and deploy them, but you should now have an overall understanding of what’s involved.

3.9. Summary

We’ve gone over quite a bit of information in this chapter. We contrasted the implementation of a simple problem (scheduling tasks) using both fluent interfaces in C# and a full-blown Boo-based DSL, and we saw that it’s very easy to take a DSL further than a fluent interface. And that’s aside from the syntactic differences between the two solutions.

We also explored why we might want to build DSLs and what types of DSLs we can build: technical, business, and extensibility DSLs.

Then we rolled up our sleeves and went to work building the Scheduling DSL, from the initial syntax, to implementing the DSL base class, to creating the DSL engine and running the code.

Along the way we took a quick peek at combining DSLs and DDD, explored the differences between imperative and declarative DSLs, and generally had fun. We covered (at a high level) just about everything you’ll need to create a useful DSL.

But not quite everything. We’re still focused at too high a level. It’s time to get down into the details and start practicing what we’ve discussed so far. That’s up next.
