Test the Data Permutations

Data5 drives almost all software. When testing user interfaces and public APIs, boundary and validation conditions significantly impact the security and stability of your software. At more programmatic levels, various forms of data-controlled behaviors can comprise non-trivial portions of the functionality. Even in statically typed languages like Java and C#, higher levels of abstraction in your system design naturally decrease the effectiveness of code coverage as a guide for complete testing. Dynamic languages and features like reflection-based execution compound the challenge.

5. I will talk only about the values used in testing here. Database testing warrants the attention of a whole different book.

Boundary Conditions

One of the more common forms of data variations in software behavior arises from boundary conditions. Boundary conditions occur for a wide range of reasons. Your happy and alternate path tests verify the behavior within normal input values, but may not test all input values. Boundary condition tests verify how the software behaves

• At the edges of the normal inputs to detect problems like off-by-one errors

• At the edges of the abnormal inputs also to detect off-by-one errors

• Using anticipated variations of abnormal inputs for concerns like security

• Using specifically dysfunctional abnormal inputs, such as those that cause divide-by-zero errors, or inputs that trigger contextually determined limits such as numerical accuracy or representation ranges

You may have tested some boundary conditions when testing error paths. However, looking at the variations from the perspective of boundary conditions can highlight omissions in error-handling logic and drive more thorough test coverage.

Natural or pragmatic value and resource constraints provide a rich vein of boundary conditions. Natural limits occur when using values with a naturally finite set of states. True/false and yes/no are the most trivial of these. Menu picks that ask the user to choose from a limited number of options also provide contextually natural constraints. Pragmatic limits like field lengths yield a rich source of boundary conditions, especially when you manipulate or append to the input data internal to the software. At the resource-constrained or extreme end of the spectrum, you can test limits like memory and file size.
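To make this concrete, boundary tests for a pragmatic limit probe the last accepted value, the first rejected value, and the degenerate empty input. The following is a minimal sketch, assuming JUnit 4 and a hypothetical UsernameValidator with a documented 20-character maximum; the names are illustrative only.

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class UsernameLengthBoundaryTest {
  private final UsernameValidator validator = new UsernameValidator();

  @Test
  public void acceptsNameAtMaximumLength() {
    // Exactly at the pragmatic limit: the last valid value.
    assertTrue(validator.isValid(repeat('a', 20)));
  }

  @Test
  public void rejectsNameJustOverMaximumLength() {
    // One past the limit, where off-by-one errors typically hide.
    assertFalse(validator.isValid(repeat('a', 21)));
  }

  @Test
  public void rejectsEmptyName() {
    // The lower edge of the abnormal inputs.
    assertFalse(validator.isValid(""));
  }

  private static String repeat(char c, int count) {
    StringBuilder builder = new StringBuilder(count);
    for (int i = 0; i < count; i++) {
      builder.append(c);
    }
    return builder.toString();
  }
}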

Numerical and mathematical variations can be thought of as natural or pragmatic but have a broad yet specialized enough affinity to deserve their own treatment and attention. Division-by-zero errors are perhaps the most common mathematical issues in programming, requiring attention regardless of representation format or size. Value limits due to discrete representations continue to factor into consideration, as the migration to wider representations is balanced by the inevitable increase in data volumes. Precision presents a more complicated set of conditions to test, as accuracy issues affect both the code being tested and the test code.
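As an illustration of how precision cuts both ways, a floating-point test must state its own tolerance, because the expected values in the test suffer from the same representation limits as the computation under test. A minimal sketch, assuming JUnit 4; the values demonstrate representation limits rather than any particular production code:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class PrecisionBoundaryTest {
  @Test
  public void additionAccumulatesRepresentationError() {
    double sum = 0.1 + 0.2;
    // 0.1 and 0.2 have no exact binary representation, so an exact
    // comparison would fail; the test must state its tolerance.
    assertEquals(0.3, sum, 1e-9);
  }

  @Test
  public void largeDoublesLoseIntegerPrecision() {
    // Above 2^53, consecutive integers are no longer representable,
    // so adding 1.0 leaves the value unchanged.
    double atLimit = 9007199254740992.0; // 2^53
    assertEquals(atLimit, atLimit + 1.0, 0.0);
  }
}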

Standards- and convention-based formats yield structured and predictable, yet sometimes complex, patterns from which to derive boundary conditions, particularly as they evolve. For example, the syntactic rules of the Domain Name System (DNS)6 are relatively simple. However, you can find opportunities for startling variations even within this simplicity. Security concerns drive people to attempt to validate domains. Those who choose not to validate them through lookup, regardless of whether for good or bad reasons, must make assumptions about the rules of domain names that go beyond the syntactic conventions. I have seen code that assumes that all top-level domains (TLDs) must be two or three characters in length, as was true for most of the original set of TLDs. This ignores the originally allocated single-letter domains used for administrative purposes and does not automatically account for the longer TLDs that have been and will be added, such as .name and .info. Expansion of the DNS syntax to allow non-European character sets adds another wrinkle to validation.

6. See http://tools.ietf.org/html/rfc1035#section-2.3.1.
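A handful of boundary inputs can expose such assumptions. The sketch below assumes JUnit 4 and a hypothetical DomainValidator with a looksValid() method that checks syntax without a DNS lookup; both names are illustrative only.

import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class DomainBoundaryTest {
  private final DomainValidator validator = new DomainValidator();

  @Test
  public void acceptsLongTopLevelDomains() {
    // Breaks the common "TLDs are two or three characters" assumption.
    assertTrue(validator.looksValid("example.info"));
    assertTrue(validator.looksValid("example.museum"));
  }

  @Test
  public void acceptsSingleLetterDomain() {
    // Single-letter names exist among the originally allocated domains.
    assertTrue(validator.looksValid("x.org"));
  }
}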

More ad hoc or unstructured sources provide some of the most challenging inputs to predict. Any free-form text field has numerous considerations to validate. The simplest may involve restrictions on or stripping of white space or selection from a limited character set. The more complex can include evaluating inputs to detect SQL injection or cross-site scripting attacks and natural language processing for semantic content.
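A few representative cases suggest the shape of such tests. The sketch below assumes JUnit 4 and a hypothetical CommentSanitizer that trims surrounding white space and flags markup; real suites would draw on a much larger catalog of attack inputs.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;

import org.junit.Test;

public class FreeFormTextTest {
  private final CommentSanitizer sanitizer = new CommentSanitizer();

  @Test
  public void stripsSurroundingWhiteSpace() {
    assertEquals("hello", sanitizer.clean("  hello\t\n"));
  }

  @Test
  public void rejectsEmbeddedScriptTags() {
    // A representative cross-site scripting probe.
    assertFalse(sanitizer.isSafe("<script>alert('xss')</script>"));
  }
}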

Data-Driven Execution

Guiding tests by code coverage, particularly at the unit level, works well to test behavioral variations that derive from code structure. However, many constructs provide significant behavioral variations without explicit branches in the code. The so-called Fundamental Theorem of Software Engineering7 says, “We can solve any problem by introducing an extra level of indirection.”

7. Not really a theorem; there are conflicting attributions for this quote. See http://en.wikipedia.org/wiki/Fundamental_theorem_of_software_engineering and http://en.wikipedia.org/wiki/David_Wheeler_(computer_scientist) for two of them.

A common data-driven scenario arises when processing command-line or some remote-invocation interfaces in which a dispatcher uses an Abstract Factory to generate Command pattern [DP] objects for execution, as shown in Listing 3-2. The function of the CommandFactory and each of the available Command implementations should be tested in their own right, but the CommandDispatcher integrates the behaviors to create a larger set of behaviors that cannot be identified through static analysis or evaluated for coverage.

Listing 3-2: A dispatcher using an Abstract Factory in a data-driven way to create Command pattern objects to do the work

class CommandDispatcher {
  private CommandFactory commandFactory;

  public void dispatch(String commandName) {
    Command command =
      commandFactory.createCommand(commandName);
    command.execute();
  }
}

When testing these constructs at the unit level, we should verify the correctness of the dispatch mechanism. Ideally, the definition of the dispatch targets is dynamic or separate in a manner conducive to independent testing. We should test each of the dispatch targets independently.
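A unit test of the dispatch mechanism can substitute a recording factory and command so that only the plumbing in Listing 3-2 is under test. This is a minimal sketch, assuming JUnit 4, that Command and CommandFactory are interfaces, and that CommandDispatcher accepts its factory through a constructor; the recording doubles are hand-rolled for the example.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class CommandDispatcherTest {
  @Test
  public void dispatchCreatesAndExecutesNamedCommand() {
    RecordingCommand command = new RecordingCommand();
    RecordingFactory factory = new RecordingFactory(command);
    CommandDispatcher dispatcher = new CommandDispatcher(factory);

    dispatcher.dispatch("stop");

    assertEquals("stop", factory.requestedName);
    assertTrue(command.executed);
  }

  private static class RecordingCommand implements Command {
    boolean executed;
    public void execute() { executed = true; }
  }

  private static class RecordingFactory implements CommandFactory {
    private final Command commandToReturn;
    String requestedName;
    RecordingFactory(Command commandToReturn) {
      this.commandToReturn = commandToReturn;
    }
    public Command createCommand(String commandName) {
      requestedName = commandName;
      return commandToReturn;
    }
  }
}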

For tests at a larger scope, like system or integration tests, we must test each of the dynamic variations to ensure thorough testing of the software. A dispatch mechanism that works generically at the unit level typically has a well-defined and finite set of possibilities when integrated into a component or system.
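At this larger scope, one approach is to enumerate the component's finite command set and drive the real factory and dispatcher with each name. A minimal sketch, assuming JUnit 4, the classes from Listing 3-2, a constructor-injected factory, and a hypothetical concrete ProductionCommandFactory; assertions on each command's observable effect would follow the dispatch.

import org.junit.Test;

public class CommandDispatchIntegrationTest {
  // The component's documented, finite set of dispatchable commands.
  private static final String[] SUPPORTED_COMMANDS = {
    "start", "stop", "status"
  };

  @Test
  public void everySupportedCommandDispatchesWithoutError() {
    CommandDispatcher dispatcher =
        new CommandDispatcher(new ProductionCommandFactory());
    for (String commandName : SUPPORTED_COMMANDS) {
      // Exercise each dynamic variation; verify its effect here.
      dispatcher.dispatch(commandName);
    }
  }
}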

Run-Time and Dynamic Binding

Most languages that run in a virtual machine or are dynamically bound, such as scripting languages, have a feature called reflection. Reflection provides the ability to inspect the program’s namespace at runtime to discover or verify the existence of elements like classes, functions, methods, variables, attributes, return types, and parameters and, where applicable, to invoke them.

The ability to access or invoke arbitrary symbols resembles a built-in form of data-driven execution based on data maintained by the runtime system, but with a higher degree of capability and flexibility than most applications will create on their own. The power of reflection has led many teams to discourage or outright ban it from their applications to avoid some justifiably distasteful uses. In languages like Java (Listing 3-3) or Perl, such a ban will not inhibit most applications excessively. Languages like Smalltalk and JavaScript (Listing 3-4) suffer without these features. Even if your team avoids writing reflection-based code, many frameworks, like Java Spring and Quartz, use reflection extensively to enable configuration-based application assembly and dependency injection.

Listing 3-3: Basic dynamic invocation in Java using reflection, omitting error handling and exceptions

import java.lang.reflect.Method;

class Invoker {
  public static void invokeVoidMethodNoArgs(String className,
      String methodName) throws Exception {
    Class<?> clazz = Class.forName(className);
    Object object = clazz.newInstance();
    Method method = clazz.getMethod(methodName);
    method.invoke(object);
  }
}

Listing 3-4: Basic dynamic invocation in JavaScript

function invokeNoArgsNoReturn(object, func) {
  if (object[func] && typeof object[func] === "function") {
    object[func]();
  }
}

Even languages with little or no reflection capability, such as C and C++, can exhibit some of the dynamic-binding properties of reflection-capable languages through POSIX dynamic library APIs like dlopen(3), as shown in Listing 3-5. This API lets the application load a shared library dynamically and invoke functions within it, all by specifying the library and function names as strings, under the constraint that the invocation signature is known.

Listing 3-5: Runtime binding with the POSIX dynamic library API in C without error handling

#include <dlfcn.h>

/* Loads the shared library named in argv[1] and invokes the
   no-argument function named in argv[2]. */
int main(int argc, char **argv)
{
  void *lib;
  void (*func)(void);

  lib = dlopen(argv[1], RTLD_LAZY);
  func = (void (*)(void)) dlsym(lib, argv[2]);
  (*func)();
  dlclose(lib);
  return 0;
}

Just as in data-driven execution, tests need to verify that the mechanism for the dynamic invocation works at the unit level and that the assembled pieces work together at the higher levels.
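At the unit level, a test of the reflective plumbing from Listing 3-3 can invoke a method on a known class by name and observe the side effect. This is a minimal sketch, assuming JUnit 4; the Target class exists only to verify that the named class and method are found and invoked.

import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class InvokerTest {
  public static class Target {
    static boolean invoked;
    public void ping() { invoked = true; }
  }

  @Test
  public void invokesNamedMethodOnNamedClass() throws Exception {
    Target.invoked = false;
    Invoker.invokeVoidMethodNoArgs(
        InvokerTest.Target.class.getName(), "ping");
    assertTrue(Target.invoked);
  }
}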
