Defining data structures as code in Groovy

An important and powerful part of Groovy is its implementation of the Builder pattern. This pattern was made famous by the seminal work Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

With builders, data can be defined in a semi-declarative way. Builders are appropriate for generating XML, defining UI components, and, more generally, for anything that involves constructing object graphs. Consider:

Teacher t = new Teacher('Steve')
Student s1 = new Student('John')
Student s2 = new Student('Richard')
t.addStudent(s1)
t.addStudent(s2)

There are two issues with the previous code: its verbosity and the lack of a visible hierarchical relationship between the objects. This is what we can do with a Builder in Groovy:

teacher ('Jones') {
  student ('Bob')
  student ('Sue')
}

Out of the box, Groovy includes a suite of builders for most of the common construction tasks that we might encounter:

  • MarkupBuilder for building an XML-style tagged output (see the Constructing XML content recipe in Chapter 5, Working with XML in Groovy)
  • DOMBuilder for constructing a W3C DOM tree in memory from the GroovyMarkup-like syntax (http://groovy.codehaus.org/GroovyMarkup)
  • JsonBuilder for building data structures using the JSON format (see the Constructing JSON messages with JsonBuilder recipe in Chapter 6, Working with JSON in Groovy)
  • SwingBuilder to build Swing-based UIs
  • ObjectGraphBuilder to construct a graph of objects that follow the Java Beans rules
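As a quick illustration of the first builder in the list, the following minimal sketch uses MarkupBuilder to express the teacher and student hierarchy shown earlier as XML. Rendering each name as a name attribute is an assumption made for this example; it is not taken from the recipe.

```groovy
import groovy.xml.MarkupBuilder

// Collect the generated markup in memory
def writer = new StringWriter()
def xml = new MarkupBuilder(writer)

// Each method call becomes an element; named arguments become attributes
// (the 'name' attribute is an assumption for this illustration)
xml.teacher(name: 'Jones') {
    student(name: 'Bob')
    student(name: 'Sue')
}

def markup = writer.toString()
println markup
```

The nesting of the closures mirrors the nesting of the generated elements, which is what makes the builder syntax read like the data it describes.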

Builders are the fundamental building blocks for creating DSLs (Domain-Specific Languages) in Groovy. Martin Fowler, in his book Domain-Specific Languages, defines a DSL as a computer programming language of limited expressiveness focused on a particular domain. The limits of a DSL are not bound to its usefulness, but rather to its scope within the domain. A typical example of this trade-off is SQL: the language has enough expressiveness to operate on a database, but it lacks the eloquence to write an operating system.

A DSL is a small-scale, specifically focused language, rather than a general-purpose language like Java. Chapter 9, Metaprogramming and DSLs in Groovy, contains two recipes that show you how to create DSLs in great detail.

In this recipe, we are going to explore how builders can simplify the creation of an object hierarchy for testing purposes.

Getting ready

Generating test data is a tedious task, especially when we need realistic data that can be used to simulate different situations in our code. Normally, it all starts from a domain model that has to be manually built and fed to some function:

Book book1 = new Book()
book1.id = 200
book1.title = 'Twenty Thousand Leagues Under the Sea'
book1.author = 'Jules Verne'
Book book2 = new Book()
book2.id = 201
...

...you get the idea. Test cases quickly become an endless list of hard-to-read object graph definitions and before you know it, your tests are very hard to maintain.

In this recipe, we want to create a simple DSL mainly based on Builders to draw our domain model without the Java Bean ceremony and, as a bonus, be able to generate random data using different strategies.

This is our domain model:

import groovy.transform.*

@Canonical
class ShoppingCart {
    List<Book> items = []
    User user
    Address shippingData
}

@Canonical
class Book {
   Long id
   String title
   BigDecimal price
}

@Canonical
class User {
   Long id
   String name
   Address address
}

@Canonical
class Address {
   String street
   String city
   String country
}

It's a simplistic domain model for a book e-commerce site. Our goal is to build a DSL that uses the metaprogramming features of Groovy to express the object graph in a concise way:

def shoppingCart = new ECommerceTestDataBuilder().build {
  items(2) {
    title RANDOM_TITLE
    id RANDOM_ID, 100, 200000
    price 100
  }
  user {
    id RANDOM_ID, 1, 500
    firstName RANDOM_STRING
    lastName RANDOM_STRING
    address RANDOM_US_ADDRESS
  }
}

The preceding snippet generates a ShoppingCart object containing two books. Each book has a random title fetched from the amazing Gutenberg project (http://www.gutenberg.org/), a random unique ID with values ranging from 100 to 200000, and a fixed price, set to 100.

How to do it...

First of all, let's create a new Groovy file named randomTestData.groovy and paste in the domain model classes defined in the previous section.

  1. In the same file, following the domain classes, add the definition for the new builder:
    class ECommerceTestDataBuilder {
    
    }
  2. Add the main builder method, build, to the body of the class:
    ShoppingCart shoppingCart
    def books = []
    
    ShoppingCart build(closure) {
      shoppingCart = new ShoppingCart()
      closure.delegate = this
      closure()
    
      shoppingCart.items = books
      shoppingCart
    }
  3. The next method to add is the one for defining the number of books to add to the shopping cart:
    void items (int quantity, closure) {
      closure.delegate = this
      quantity.times {
         books << new Book()
         closure()
      }
    }
  4. Add the methodMissing method, which is a key part of the DSL architecture, as explained in the next section:
    def methodMissing(String name, args) {
      Book book = books.last()
      if (book.hasProperty(name)) {
        def dataStrategy = isDataStrategy(args)
        if (dataStrategy) {
          book.@"$name" = dataStrategy.execute()
        } else {
          book.@"$name" = args[0]
        }
      } else {
        throw new MissingMethodException(
                name,
                ECommerceTestDataBuilder,
                args)
      }
    }
  5. The last method to add for the ECommerceTestDataBuilder class is required for adding random data generation strategies to our DSL:
    def isDataStrategy(strategyData) {
      def strategyClass = null
      try {
        if (strategyData.length == 1) {
          strategyClass = strategyData[0].newInstance()
        } else {
          strategyClass = strategyData[0].
                            newInstance(*strategyData[1..-1])
        }
        if (!(strategyClass instanceof
                  DataPopulationStrategy)) {
          strategyClass = null
        }
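      } catch (Exception e) {
        // Instantiation failed: the argument is a plain value,
        // not a strategy class, so strategyClass stays null
      }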
      strategyClass
    }
  6. The builder code is complete. Now let's add a couple of strategy classes to the script and the main strategy interface:
    interface DataPopulationStrategy {
      def execute()
    }
    
    class RANDOM_TITLE implements DataPopulationStrategy {
    
      def titleCache = []
    
      def ignoredTitleWords = ['Page', 'Sort', 'Next']
    
      void getRandomBookTitles() {
        def slurper = new XmlSlurper()
        slurper.setFeature(
          'http://apache.org/xml/features/' +
          'nonvalidating/load-external-dtd',
          false)
        def dataUrl = 'http://m.gutenberg.org' +
                      '/ebooks/search.mobile'
        def orderBy = '/?sort_order=random'
        def htmlParser = slurper.parse("${dataUrl}${orderBy}")
        htmlParser.'**'.findAll{ it.@class == 'title'}.each {
          if (it.text().tokenize().disjoint(ignoredTitleWords)) {
            titleCache << it.text()
          }
        }
      }
    
      def execute() {
        if (titleCache.isEmpty()) {
          getRandomBookTitles()
        }
        titleCache.pop()
      }
    
    }
    
    class RANDOM_ID implements DataPopulationStrategy {
    
      Long minVal
      Long maxVal
    
      RANDOM_ID (min, max) {
        minVal = min
        maxVal = max
      }
    
      def execute() {
        double rnd = new Random().nextDouble()
        minVal + (long) (rnd * (maxVal - minVal))
      }
    
    }
  7. It is time to put our code to the test:
    def shoppingCart = new ECommerceTestDataBuilder().build {
      items(5) {
        title RANDOM_TITLE
        id RANDOM_ID, 100, 200000
        price 100
      }
    }
    
    assert shoppingCart.items.size() == 5
    shoppingCart.items.each {
      assert it.price == 100
      // RANDOM_ID can return minVal itself but never maxVal
      assert it.id >= 100 && it.id < 200000
    }

How it works...

The domain model's classes are standard Groovy Beans annotated with the @Canonical annotation. The annotation is discussed in detail in the Writing less verbose Java Beans with Groovy Beans recipe. In short, it adds an implementation of equals, hashCode, and toString, along with a tuple constructor, to a bean.

@Canonical
class Book {
   Long id
   String title
   BigDecimal price
}
Book b = new Book(2001, 'Pinocchio', 22.3)
println b.toString()

The preceding code snippet will print:

Book(2001, Pinocchio, 22.3)
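The generated equals and hashCode methods can be checked in the same way. The following is a minimal sketch that repeats the Book class so that it can run on its own; the variable names first and second are only illustrative.

```groovy
import groovy.transform.Canonical

@Canonical
class Book {
    Long id
    String title
    BigDecimal price
}

// Tuple constructor: arguments follow the property declaration order
def first = new Book(2001L, 'Pinocchio', 22.3)
// Map-style constructor, still available on a @Canonical bean
def second = new Book(id: 2001L, title: 'Pinocchio', price: 22.3)

// equals and hashCode compare property values, not object identity
assert first == second
assert first.hashCode() == second.hashCode()
assert !first.is(second)
```

Value-based equality is what makes such beans convenient in tests, because an expected object can be compared directly with the one produced by the code under test.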

The method build that was displayed at step 2 is the builder's entry method:

def shoppingCart = new ECommerceTestDataBuilder().build {
  ...
}

The build method takes a closure as its only argument. The closure is where most of the magic happens. Let's dig into the closure code:

items(5) {
  title RANDOM_TITLE
  id RANDOM_ID, 100, 200000
  price 100
}

The items method that was defined at step 3 is invoked with two arguments: the number of books to create, and another closure where the random data strategies are defined. In Groovy, if the last argument of a method is a closure, it does not need to be inside the parentheses of the invoked method:

def doSomething(int i, Closure c) {
  c(i)
}
doSomething(5) {
  // closure code
}

You may have noticed that both methods, build and items, set the delegate property of the closure just before the closure is invoked:

closure.delegate = this
closure()

Setting the delegate property changes how the closure resolves names: method calls made inside the closure that cannot be resolved in its enclosing scope are routed to the builder instance.
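The effect of the delegate can be seen in isolation with a small sketch. The Greeter class and its say method are hypothetical names introduced only for this illustration.

```groovy
// Hypothetical helper class used only for this illustration
class Greeter {
    List messages = []
    void say(String text) { messages << text }
}

def greeter = new Greeter()

// 'say' is not defined anywhere in the enclosing script
def block = {
    say 'hello'
    say 'world'
}

block.delegate = greeter   // unresolved calls now go to the Greeter instance
block()

assert greeter.messages == ['hello', 'world']
```

By default a closure tries its owner first and falls back to the delegate. Setting block.resolveStrategy = Closure.DELEGATE_FIRST reverses that order when the delegate should take precedence.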

Inside the items block, we define the (random) values that we want to be assigned to each property of the Book object. The properties are only defined in the Book object and are not visible to the Builder. So how is the Builder able to resolve a call to the title or price property of Book and assign a value? In fact, none of the methods invoked inside the items block actually exists.

Thanks to Groovy's metaprogramming capabilities, we can intercept method calls and create methods on the fly. In particular, the most common technique for intercepting calls in Groovy is to implement the methodMissing method on a Groovy class. This method can be considered a safety net for undefined methods in a class. Every time a call is made to a missing method, the runtime routes the call to methodMissing just before throwing a MissingMethodException. This offers a chance to define an implementation for these ghost methods.

Let's take a closer look:

title RANDOM_TITLE

The method title does not exist in the Builder code. When an invocation to this method is executed from within the closure, the dispatcher, before giving up and throwing a MissingMethodException, tries to see if methodMissing can be used to resolve the method.
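The interception can be tried out on its own with a minimal sketch. The Recorder class below is a hypothetical example, not part of the recipe: it simply remembers the name and the arguments of every call that has no real method behind it.

```groovy
// Hypothetical class used only to illustrate methodMissing
class Recorder {
    Map calls = [:]

    // Invoked only when no real method matches the call
    def methodMissing(String name, args) {
        calls[name] = args.toList()   // args arrives as an Object[]
        this
    }
}

def recorder = new Recorder()
recorder.title 'Pinocchio'   // no 'title' method exists on Recorder
recorder.id 1, 500           // several arguments end up in the same array

assert recorder.calls == [title: ['Pinocchio'], id: [1, 500]]
```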

Note

Groovy allows you to call a method and omit the parentheses if there is at least one argument and there is no ambiguity. This is the case for the title method, which could also be written as title(RANDOM_TITLE), but the DSL is much more readable without parentheses.

Inside the Builder's methodMissing method, the code does the following:

  1. Fetches the last book created by the items method.
  2. Checks that the book has a property named as the missing method, using the hasProperty method on the object itself.
  3. If the missing method is named as one of the book's properties, the code either directly assigns the parameter passed after the missing method to the appropriate property of Book, or tries to resolve a random strategy through the isDataStrategy method (step 5).
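The three steps above can be sketched in a reduced form that leaves out the strategy lookup. TinyBuilder is a hypothetical, stripped-down stand-in for ECommerceTestDataBuilder, and the Book class is trimmed to two properties so that the example can run on its own.

```groovy
class Book {
    Long id
    String title
}

// Hypothetical, reduced stand-in for the recipe's builder
class TinyBuilder {
    List books = [new Book()]

    def methodMissing(String name, args) {
        def book = books.last()              // 1. fetch the most recent book
        if (book.hasProperty(name)) {        // 2. is it a property of Book?
            book.@"$name" = args[0]          // 3. write the field directly
        } else {
            throw new MissingMethodException(name, TinyBuilder, args)
        }
    }
}

def builder = new TinyBuilder()
builder.title 'Pinocchio'
builder.id 42L

assert builder.books.last().title == 'Pinocchio'
assert builder.books.last().id == 42L

// A name that is not a Book property is still reported as missing
def rejected = false
try {
    builder.isbn '123'
} catch (MissingMethodException e) {
    rejected = true
}
assert rejected
```

Rethrowing MissingMethodException for unknown names keeps typos in the DSL from being silently ignored.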

The object property is accessed through the .@ operator, which reads or writes the field directly, bypassing the getters and setters. The if/else block from step 4 can be condensed into a single line:

book.@"$name" = (dataStrategy) ? dataStrategy.execute() : args[0]

In the previous code snippet, the field is populated either with the value defined in the DSL (for example, price 100) or with the result of the random data strategy call. The random data strategy classes must implement the DataPopulationStrategy interface (step 6). The interface exposes a single method, execute. If a strategy requires arguments, these have to be passed through the constructor (see the RANDOM_ID strategy, where the minimum and maximum values are set via the class' constructor). The isDataStrategy method is invoked for each field specified in the items block. It inspects the argument passed after the field name:

title RANDOM_TITLE

It tries to instantiate the class as a DataPopulationStrategy instance. The argument passed to the property must match the class name of the strategy class in order for the strategy resolution to work.

The isDataStrategy method employs a small trick to instantiate the random data strategy class in case the DSL specifies additional arguments, such as:

id RANDOM_ID, 1, 500

In the previous snippet, the id field will be populated by the result of the RANDOM_ID strategy that will generate a random number between 1 and 500. If the strategy has no arguments, the class is instantiated with newInstance:

strategyClass = strategyData[0].newInstance()

The strategyData variable corresponds to the args variable of the methodMissing method, which is the caller. args is an array containing whatever values are passed from the DSL. For instance, id RANDOM_ID, 1, 500 corresponds to calling a method with the following signature:

def id(Object... args) { }

This is why we call newInstance on the first element of the strategyData variable. If the args list contains more than one argument, we assume that the strategy class requires the additional values, so the strategy class is instantiated in this fashion:

strategyClass = strategyData[0].newInstance(*strategyData[1..-1])

In this one-liner, we take advantage of several features of Groovy. The first one is the possibility of calling newInstance with an array of objects. Java makes instantiating a class dynamically through a constructor with arguments much more cumbersome. The second feature is the spread operator. The spread operator (*) is used to tear a List apart into single elements.

This can be used to call a method that has more than one parameter, automatically assigning each element of the list to the corresponding parameter. We know that strategyData contains the arguments specified in the DSL, and that the first item must be skipped because it is the actual strategy class to instantiate. The remaining elements of the list must be used as arguments for the class' constructor. The spread operator does exactly that in conjunction with a Groovy range, strategyData[1..-1], which selects everything from the second element to the last. Note that the two dots matter: a comma, as in strategyData[1, -1], would pick only the second and the last element.
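The difference between a range and a list of indexes, and the way the spread operator feeds a constructor, can be checked with a short sketch. The Interval class and the dslArgs variable are hypothetical names used only for this illustration.

```groovy
// Hypothetical class with a two-argument constructor
class Interval {
    Long min
    Long max

    Interval(min, max) {
        this.min = min
        this.max = max
    }
}

// What a DSL line such as 'id Interval, 1, 500' would pass along
def dslArgs = [Interval, 1, 500]

// A range selects every element from index 1 to the last one
assert dslArgs[1..-1] == [1, 500]

// A comma selects individual indexes: only the second and the last element
def longer = ['first', 1, 2, 3]
assert longer[1..-1] == [1, 2, 3]
assert longer[1, -1] == [1, 3]

// The spread operator turns the remaining elements into constructor arguments
def interval = dslArgs[0].newInstance(*dslArgs[1..-1])
assert interval.min == 1 && interval.max == 500
```

With exactly two extra arguments both forms happen to return the same elements, which is why the comma form can go unnoticed until a strategy takes one or three arguments.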

The two strategies defined in the code (step 6) are outside the scope of this recipe. The first strategy simply fetches random book titles from the Internet. The second strategy generates random Long values within a specified range. In real life, random data would probably be pulled from a database or an existing source.
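Because RANDOM_TITLE depends on a remote page being reachable, a strategy that needs no network can be handy when running the tests offline. The sketch below is an assumption for illustration: FIXED_TITLE and its TITLES list are hypothetical names that are not part of the recipe; only the DataPopulationStrategy interface is taken from step 6.

```groovy
interface DataPopulationStrategy {
    def execute()
}

// Hypothetical strategy: picks a title from a fixed in-memory list
class FIXED_TITLE implements DataPopulationStrategy {

    static final List TITLES = ['Pinocchio', 'Moby Dick', 'Dracula']

    def random = new Random()

    def execute() {
        TITLES[random.nextInt(TITLES.size())]
    }
}

def title = new FIXED_TITLE().execute()
assert title in FIXED_TITLE.TITLES
```

Since the class implements DataPopulationStrategy and can be created without arguments, it could be referenced from the DSL in the same way as RANDOM_TITLE, for example title FIXED_TITLE.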

You may have also noticed that this recipe doesn't fully implement the DSL specified at the beginning. The code doesn't support creating users and addresses. We will leave this for the reader as an exercise to further understand builders and DSLs in Groovy.
