Chapter 5. Dependency management

This chapter covers

  • Understanding automated dependency management
  • Declaring and organizing dependencies
  • Targeting various types of repositories
  • Understanding and tweaking the local cache
  • Dependency reporting and version conflict resolution

In chapter 3, you learned how to declare a dependency on the Servlet API to implement web components for the To Do application. Gradle’s DSL configuration closures make it easy to declare dependencies and the repositories to retrieve them from. First, you define what libraries your build depends on with the dependencies closure. Second, you tell your build the origin of these dependencies using the repositories closure. With this information in place, Gradle automatically resolves the dependencies, downloads them to your machine if needed, stores them in a local cache, and uses them for the build.

This chapter covers Gradle’s powerful support for dependency management. We’ll take a close look at key DSL configuration elements for grouping dependencies and targeting different types of repositories.

Dependency management sounds like an easy nut to crack, but can become difficult when it comes to dependency resolution conflicts. Transitive dependencies, the dependencies a declared dependency relies on, can be a blessing and a curse. Complex dependency graphs can cause a mix-up of dependencies with multiple versions resulting in unreliable, nondeterministic builds. Gradle provides dependency reports for analyzing the dependency tree. You’ll learn how to find answers to questions like “Where does a specific dependency come from?” and “Why was this specific version picked?” to resolve version conflicts.

Gradle rolls its own dependency management implementation. Having learned from the shortcomings of other dependency managers like Ivy and Maven, Gradle’s special concerns are performance, build reliability, and reproducibility.

5.1. A quick overview of dependency management

Almost all JVM-based software projects depend on external libraries to reuse existing functionality. For example, if you’re working on a web-based project, there’s a high likelihood that you rely on one of the popular open source frameworks like Spring MVC or Play to improve developer productivity. Libraries in Java get distributed in the form of a JAR file. The JAR file specification doesn’t require you to indicate the version of the library. However, it’s common practice to attach a version number to the JAR filename to identify a specific release (for example, spring-web-3.1.3.RELEASE.jar). You’ve seen small projects grow big very quickly, along with the number of third-party libraries and modules your project depends on. Organizing and managing your JAR files is critical.

5.1.1. Imperfect dependency management techniques

Because the Java language doesn’t provide or propose any tooling for managing versioned dependencies, teams will have to come up with their own strategies to store and retrieve them. You may have encountered the following common practices:

  • Manually copying JAR files to the developer machine. This is the most primitive, nonautomated, and error-prone approach to handle dependencies.
  • Using a shared storage for JAR files (for example, a folder on a shared network drive), which gets mounted on the developer’s machine, or retrieving binaries over FTP. This approach requires the developer to initially establish the connection to the binary repository. New dependencies will need to be added manually, which potentially requires write permissions or access credentials.
  • Checking JAR files into the VCS alongside the project source code. This approach doesn’t require any additional setup and bundles source code and all dependencies as one consistent unit. Your team retrieves changes whenever they update their local copy of the repository. On the downside, binary files unnecessarily use up space in the repository, and a library that’s still changing has to be checked in again every time its source code changes. This is especially true if you’re working with projects that depend on each other.

5.1.2. Importance of automated dependency management

While all of these approaches work, they’re far from being sufficient solutions, because they don’t provide a standardized way to name and organize the JAR files. At the very least, you’ll need to know the exact version of the library and the dependencies it depends on, the transitive dependencies. Why is this so important?

Knowing the exact version of a dependency

Working with a project that doesn’t clearly state the versions of its dependencies quickly becomes a maintenance nightmare. If not documented meticulously, you can never be sure which features are actually supported by the library version in your project. Upgrading a library to a newer version becomes a guessing game, because you don’t know exactly what version you’re upgrading from. In fact, you may actually be downgrading without knowing it.

Managing transitive dependencies

Transitive dependencies are of concern even at an early stage of development. These are the libraries your first-level dependencies require in order to work correctly. Popular Java development stacks like the combination of Spring and Hibernate can easily bring in more than 20 additional libraries from the start. A single library may require many other libraries in order to work correctly. Figure 5.1 shows the dependency graph for Hibernate’s core library.

Figure 5.1. Dependency graph of Hibernate core library

Trying to manually determine all transitive dependencies for a specific library can be a real time-sink. Many times this information is nowhere to be found in the library’s documentation and you end up on a wild-goose chase to get your dependencies right. As a result, you can experience unexpected behavior like compilation errors and runtime class-loading issues.

I think we can agree that a more sophisticated solution is needed to manage dependencies. Optimally, you’ll want to be able to declare your dependencies and their respective versions as project metadata. As part of an automated process, they can be retrieved from a central location and installed for your project. Let’s look at existing open source solutions that support these features.

5.1.3. Using automated dependency management

The Java space is mostly dominated by two projects that support declarative and automated dependency management: Apache Ivy, a pure dependency manager that’s mostly used with Ant projects, and Maven, which contains a dependency manager as part of its build infrastructure. I’m not going to go into the details of either solution. Instead, the purpose of this section is to explain the concepts and mechanics of automated dependency management.

In Ivy and Maven, dependency configuration is expressed through an XML descriptor file. The configuration consists of two parts: the dependency identifiers plus their respective versions, and the location of the binary repositories (for example, an HTTP address you want to retrieve them from). The dependency manager evaluates this information and automatically targets those repositories to download the dependencies onto your local machine. Libraries can define transitive dependencies as part of their metadata. The dependency manager is smart enough to analyze this information and resolve those dependencies as part of the retrieval process. If a dependency version conflict is recognized, as demonstrated by the example of Hibernate core, the dependency manager will try to resolve it. Once downloaded, the libraries are stored in a local cache. Now that the configured libraries are available on your developer machine, they can be used for your build. Subsequent builds will first check the local cache for a library to avoid unnecessary requests to a repository. Figure 5.2 illustrates the key elements of automated dependency management.

Figure 5.2. Anatomy of automated dependency management

Using a dependency manager frees you from the burden of manually having to copy or organize JAR files. Gradle provides a powerful out-of-the-box dependency management implementation that fits into the architecture just described. It describes the dependency configuration as part of Gradle’s expressive DSL, has support for transitive dependency management, and plays well with existing repository infrastructures. Before we dive into the details, let’s look at some of the challenges you may face with dependency management and how to cope with them.

5.1.4. Challenges of automated dependency management

Even though dependency management significantly simplifies the handling of external libraries, at some point you’ll find yourself dealing with certain shortcomings that may compromise the reliability and reproducibility of your build.

Potential unavailability of centrally hosted repositories

It’s not uncommon for enterprise software to rely on open source libraries. Many of these projects publish their releases to a centrally hosted repository. One of the most widely used repositories is Maven Central. If Maven Central is the only repository your build relies on, you’ve automatically created a single point of failure for your system. In case the repository is down, you’ve stripped yourself of the ability to build your project if a dependency is required that isn’t available in your local cache.

You can avoid this situation by configuring your build to use your own custom in-house repository, which gives you full control over server availability. If you’re eager to learn about it, feel free to directly jump to chapter 14, which talks about how to set up and use open source and commercial repository managers like Sonatype Nexus and JFrog’s Artifactory.

Bad metadata and missing dependencies

Earlier you learned that metadata is used to declare transitive dependencies for a library. A dependency manager analyzes this information, builds a dependency graph from it, and resolves all nested dependencies for you. Using transitive dependency management is a huge timesaver and enables traceability for your dependency graph.

Unfortunately, neither the metadata nor the repository guarantees that any of the artifacts declared in the metadata actually exist, are defined correctly, or are even needed. You may encounter problems like missing dependencies, especially on repositories that don’t enforce any quality control, which is a known issue on Maven Central. Figure 5.3 demonstrates the artifact production and consumption lifecycle for a Maven repository.

Figure 5.3. Bad metadata complicates the use of transitive dependencies. Dependency metadata in Maven repositories is represented by a project object model (POM) file. If the library developer provides incorrect metadata, the consumer will inherit the problems.

Gradle allows for excluding transitive dependencies on any level of the dependency graph. Alternatively, you can omit the provided metadata and instate your own transitive dependency definition.

You’ll find that popular libraries will appear in your transitive dependency graph with different versions. This is often the case for commonly used functionality like logging frameworks. The dependency manager tries to find a smart solution for this problem by picking one of these versions based on a certain resolution strategy to avoid version conflicts. Sometimes you’ll need to tweak those choices. To do so, you’ll first want to find out which dependencies bring in what version of a transitive dependency. Gradle provides meaningful dependency reports to answer these questions. Later, we’ll see these reports in action. Now let’s see how Gradle implements these ideas with the help of a full-fledged example.

5.2. Learning dependency management by example

In chapter 3, you saw how to use the Jetty plugin to deploy a To Do application to an embedded Jetty Servlet container. Jetty is a handy container for use during development. With its lightweight container implementation, it provides fast startup times. Many enterprises use other web application container implementations in their production environments. Let’s assume you want to build support for deploying your web application to a different container product, such as Apache Tomcat.

The open source project Cargo (http://cargo.codehaus.org/) provides versatile support for web application deployment to a variety of Servlet containers and application servers. Cargo supports two implementations you can use in your project. On the one hand, you can utilize a Java API, which gives you fine-grained access to each and every aspect of configuring Cargo. On the other hand, you can choose to execute a set of preconfigured Ant tasks that wrap the Java API. Because Gradle provides excellent integration with Ant, our examples will be based on the Cargo Ant tasks.

Let’s revisit figure 5.2 and see how the components change in the context of a Gradle use case. In chapter 3 you learned that dependency management for a project is configured with the help of two DSL configuration blocks: dependencies and repositories. The names of the configuration blocks directly map to methods of the interface Project. For your use case, you’re going to use Maven Central because it doesn’t require any additional setup. Figure 5.4 shows that dependency definitions are provided through Gradle’s DSL in the build.gradle file. The dependency manager will evaluate this configuration at runtime, download the required artifacts from a central repository, and store them in your local cache. You’re not using a local repository, so it’s not shown in the figure.

Figure 5.4. Declaring a dependency on the Cargo libraries in a Gradle build

The following sections of this chapter discuss each of the Gradle build script configuration elements one by one. Not only will you learn how to apply them to the Cargo example, you’ll also learn how to apply dependency management to implement the requirements of your own project. Let’s first look at a concept that will become more important in the context of our example: dependency configurations.

5.3. Dependency configurations

In chapter 3, you saw that plugins can introduce configurations to define the scope for a dependency. The Java plugin brings in a variety of standard configurations to define which bucket of the Java build lifecycle a dependency should apply to. For example, dependencies required for compiling production source code are added with the compile configuration. In the build of your web application, you used the compile configuration to declare a dependency on the Apache Commons Lang library. To get a better understanding of how configurations are stored, configured, and accessed, let’s look at responsible interfaces in Gradle’s API.

5.3.1. Understanding the configuration API representation

Configurations can be directly added and accessed at the root level of a project; you can decide to use one of the configurations provided by a plugin or declare your own. Every project owns a container of class ConfigurationContainer that manages the corresponding configurations. Configurations are very flexible in their behavior. You can control whether transitive dependencies should be part of the dependency resolution, define the resolution strategy (for example, how to respond to conflicting artifact versions), and even make one configuration extend another. Figure 5.5 shows the relevant Gradle API interfaces and their methods.

Figure 5.5. Configurations can be added and accessed through the Project instance
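The behaviors just described can be sketched in a build script as follows. This is a minimal illustration, not a recommended setup: the configuration names deployTools and deployToolsTest and the description text are hypothetical, and failing on a version conflict is only one of the resolution strategies you could choose.

```groovy
// Hypothetical configuration names, used for illustration only
configurations {
   deployTools {
      description = 'Libraries used by deployment tooling.'
      // Include transitive dependencies during resolution (this is the default)
      transitive = true
      resolutionStrategy {
         // Fail the build instead of silently picking a version on conflict
         failOnVersionConflict()
      }
   }
   deployToolsTest {
      // Inherits every dependency assigned to deployTools
      extendsFrom configurations.deployTools
   }
}
```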

Another way of thinking of configurations is in terms of a logical grouping. Grouping dependencies by configuration is a similar concept to organizing Java classes into packages. Packages provide unique namespaces for classes they contain. The same is true for configurations. They group dependencies that serve a specific responsibility.

The Java plugin already provides six configurations out of the box: compile, runtime, testCompile, testRuntime, archives, and default. Couldn’t you just use one of those configurations to declare a dependency on the Cargo libraries? Generally, you could, but you’d mix up dependencies that are relevant to your application code and the infrastructure code you’re writing for deploying the application. Adding unnecessary libraries to your distribution can lead to unforeseen side effects at runtime and should be avoided at all costs. For example, using the compile configuration will result in a WAR file that contains the Cargo libraries. Next, I’ll show how to define a custom configuration for the Cargo libraries.

5.3.2. Defining a custom configuration

To clearly identify the dependencies needed for Cargo, you’ll need to declare a new configuration with the unique name cargo, as demonstrated in the following listing.

Listing 5.1. Defining a configuration for Cargo libraries
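A minimal sketch of such a declaration follows. The name cargo and the description string are taken from the surrounding text and report output; using the visible property to keep the configuration private to the project is an assumption about how the visibility is limited, and newer Gradle versions may treat that property as deprecated.

```groovy
configurations {
   cargo {
      // Shown next to the configuration name in the dependency report
      description = 'Classpath for Cargo Ant tasks.'
      // Assumption: keeps the configuration from being visible outside this project
      visible = false
   }
}
```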

For now, you’re only dealing with a single Gradle project. Limiting the visibility of this configuration to this project is a conscious choice in preparation for a multiproject setup. If you want to learn more about builds consisting of multiple projects, check out chapter 6. You don’t want to let configurations spill into other projects if they’re not needed. The description that was set for the configuration is directly reflected when you list the dependencies of the project:

$ gradle dependencies
:dependencies

------------------------------------------------------------
Root project
------------------------------------------------------------

cargo - Classpath for Cargo Ant tasks.
No dependencies

After adding a configuration to the configuration container of a project, it can be accessed by its name. Next, you’ll use the cargo configuration to make the third-party Cargo Ant task available to the build script.

5.3.3. Accessing a configuration

Essentially, Ant tasks are Java classes that adhere to Ant’s extension endpoint for defining custom logic. To add a nonstandard Ant task like the Cargo deployment task to your project, you’ll need to declare it using the Taskdef Ant task. To resolve the Ant task implementation classes, the Cargo JAR files containing them need to be assigned to the Taskdef’s classpath. The next listing shows how easy it is to access the configuration by name. The task uses the resolved dependencies and assigns them to the classpath required for the Cargo Ant task.

Listing 5.2. Accessing the cargo configuration by name
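One way this could look is sketched below. The lookup through configurations.getByName and the task name deployToLocalTomcat come from the text; everything passed to the Cargo Ant task (the cargo.tasks resource, the container ID, the home directory /opt/tomcat, and the file name todo.war) is an assumption based on Cargo's Ant documentation and should be checked against the Cargo version you use.

```groovy
task deployToLocalTomcat {
   doLast {
      // Resolve the cargo configuration by name and turn its files into a classpath string
      def cargoClasspath = configurations.getByName('cargo').asPath

      // Register the Cargo Ant tasks (assumed resource name: cargo.tasks)
      ant.taskdef(resource: 'cargo.tasks', classpath: cargoClasspath)

      // Assumed attributes: a local Tomcat 7 installation in a hypothetical directory
      ant.cargo(containerId: 'tomcat7x', home: '/opt/tomcat', action: 'run') {
         configuration {
            // Hypothetical WAR file produced by the build
            deployable(type: 'war', file: 'todo.war')
         }
      }
   }
}
```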

Don’t worry if you don’t understand everything in the code example. The important part is that you recognize the Gradle API methods that allow you access to a configuration. The rest of the code is mostly Ant-specific configurations expressed through Gradle’s DSL. Chapter 9 will give you the inside scoop on using Ant tasks from Gradle. With the deployment task set up, it’s time to assign the Cargo dependencies to the cargo configuration.

5.4. Declaring dependencies

Chapter 3 gave you a first taste of how to tell your project that an external library is needed for it to function correctly. The DSL configuration block dependencies is used to assign one or more dependencies to a configuration. External dependencies are not the only dependencies you can declare for your project. Table 5.1 gives you an overview of the various types of dependencies. In this book we’ll discuss and apply many of these options. Some of the dependency types are explained in this chapter, but others will make more sense in the context of another chapter. The table references each of the use cases.

Table 5.1. Dependency types for a Gradle project

Type | Description | Where to Go for More Information
External module dependency | A dependency on an external library in a repository including its provided metadata | Section 5.4.2
Project dependency | A dependency on another Gradle project | Section 6.3.3
File dependency | A dependency on a set of files in the file system | Section 5.4.3
Client module dependency | A dependency on an external library in a repository with the ability to declare the metadata yourself | Not covered; refer to the online manual
Gradle runtime dependency | A dependency on Gradle’s API or a library shipped with the Gradle runtime | Section 8.5.7

In this chapter we’ll cover external module dependencies and file dependencies, but first let’s see how dependency support is represented in Gradle’s API.

5.4.1. Understanding the dependency API representation

Every Gradle project has an instance of a dependency handler, which is represented by the interface DependencyHandler. You obtain a reference to the dependency handler by using the project’s getter method getDependencies(). Each of the dependency types presented in table 5.1 is declared through a method of the dependency handler within the project’s dependencies configuration block. Each dependency is an instance of type Dependency. The attributes group, name, version, and classifier clearly identify a dependency. Figure 5.6 illustrates the relationship between the project, the dependency handler, and the actual dependencies.

Figure 5.6. Different types of dependencies can be added on the project level.
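To make the relationship concrete, here is a small sketch that talks to the dependency handler directly and then reads the identifying attributes back from the declared dependencies. It assumes the cargo configuration from section 5.3.2 exists; the task name printCargoDependencies is hypothetical.

```groovy
// Has the same effect as declaring the dependency inside a dependencies { } block
project.dependencies.add('cargo', 'org.codehaus.cargo:cargo-ant:1.3.1')

// Hypothetical helper task that prints the attributes of each declared dependency
task printCargoDependencies {
   doLast {
      configurations.cargo.dependencies.each { dep ->
         println "${dep.group}:${dep.name}:${dep.version}"
      }
   }
}
```

Note that this lists only the dependencies declared on the configuration, not the transitive dependencies that are added during resolution.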

Let’s first look at how to declare external module dependencies, their notation, and how to configure them to meet your needs.

5.4.2. External module dependencies

In Gradle’s terminology, external libraries, usually in the form of JAR files, are called external module dependencies. They represent a dependency on a module outside of the project hierarchy. This type of dependency is characterized by attributes that clearly identify it within a repository. In the following section, we’ll discuss each attribute one by one.

Dependency attributes

When the dependency manager looks for a dependency on a repository, it locates it through the combination of attributes. At a minimum, a dependency needs to provide a name. Let’s review the dependency attributes with the help of the Hibernate core library we examined in section 5.1.2:

  • group: This attribute usually identifies an organization, company, or project. The group may use a dot notation, but it’s not mandatory. In the case of the Hibernate library, the group is org.hibernate.
  • name: An artifact’s name uniquely describes the dependency. The name of Hibernate’s core library is hibernate-core.
  • version: A library may be available in many versions. Many times the version string consists of a major and a minor version. The version you selected for Hibernate core is 3.6.3.Final.
  • classifier: Sometimes an artifact defines another attribute, the classifier, which is used to distinguish artifacts that share the same group, name, and version but need further specification (for example, the runtime environment). Hibernate’s core library doesn’t provide a classifier.

Now that we’ve reviewed some dependency attributes, we can look more closely at how Gradle expects them to be declared in the build script.

Dependency notation

To declare dependencies in your project, you can use the following syntax:

dependencies {
   configurationName dependencyNotation1, dependencyNotation2, ...
}

You first state the name of the configuration you want to assign the dependencies to and then a list of dependencies in the notation of your choice. The dependency notation comes in two flavors. You can either provide a map of attribute names and their values, or the shortcut notation as a string that separates each attribute by a colon (see figure 5.7). We’ll look at both notations in the example.

Figure 5.7. Dependency attributes in shortcut notation

After defining the configuration, you can easily use it to assign the relevant Cargo dependencies. To use Cargo in your project, you’ll need to provide JAR files containing the Cargo API, the core container implementations, and the Cargo Ant tasks. Thankfully, Cargo provides an UberJar, a single JAR file that packages the API and container functionality, which will make the dependency management easier. The following listing shows how to assign the relevant Cargo dependencies to the cargo configuration.

Listing 5.3. Assigning Cargo dependencies to cargo configuration
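A sketch of such a declaration, using both notations, might look like this. The group, module names, and version match the ones in the console output later in this section; the property names cargoGroup and cargoVersion are assumptions chosen for illustration.

```groovy
// Extra properties for attributes shared by both dependencies (names are assumptions)
ext.cargoGroup = 'org.codehaus.cargo'
ext.cargoVersion = '1.3.1'

dependencies {
   // Map notation: attribute names and their values
   cargo group: cargoGroup, name: 'cargo-core-uberjar', version: cargoVersion
   // Shortcut notation: attributes separated by colons in a single string
   cargo "$cargoGroup:cargo-ant:$cargoVersion"
}
```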

If you deal with a lot of dependencies in your project, it’s helpful to break out commonly used dependency attributes as extra properties. You do that in the example code by creating and using properties for Cargo’s dependency group and version attributes.

Gradle doesn’t select a default repository for you. Trying to run the task deployToLocalTomcat without configuring a repository would result in an error, as shown in the following console output:

$ gradle deployToLocalTomcat
:deployToLocalTomcat FAILED

FAILURE: Build failed with an exception.

* Where: Build file '/Users/benjamin/gradle-in-action/code/
 chapter5/cargo-configuration/build.gradle' line: 10

* What went wrong:
Execution failed for task ':deployToLocalTomcat'.
> Could not resolve all dependencies for configuration ':cargo'.
   > Could not find group:org.codehaus.cargo, module:cargo-core-
      uberjar, version:1.3.1.
     Required by:
         :cargo-configuration:unspecified
   > Could not find group:org.codehaus.cargo, module:cargo-ant,
      version:1.3.1.
     Required by:
         :cargo-configuration:unspecified

So far, we haven’t talked about different types of repositories and how to configure them. For the sake of getting this example running, add the following repositories configuration block:

repositories {
   mavenCentral()
}

There’s no need to fully understand the intricacies of this code snippet. The important point is that you configured your project to use Maven Central to download the Cargo dependencies. Later in this chapter, you’ll learn how to configure other repositories.

Inspecting the dependency report

When you run the dependencies help task, you can now see that the full dependency tree is printed. The tree shows the top-level dependencies you declared in the build script, as well as their transitive dependencies:

If you examine the dependency tree carefully, you’ll see that dependencies marked with an asterisk have been omitted. That means that the dependency manager selected either the same or another version of the library because it was declared as a transitive dependency of another top-level dependency. Interestingly, this is the case for the UberJar, so you don’t even have to declare it in your build script. The Ant tasks library will automatically make sure that the library gets pulled in. Gradle’s default resolution strategy for version conflicts is newest first—that is, if the dependency graph contains two versions of the same library, it automatically selects the newest. In the case of the library xml-apis, Gradle chooses version 1.3.03 over 1.0.b2, which is indicated by an arrow (->). As you can see, it’s very helpful to analyze the information exposed by the dependency report. When you want to find out which top-level dependency declares a specific transitive dependency and why a specific version of a library has been selected or omitted, the dependency report is a good place to start. Next, we’ll look at how to exclude transitive dependencies.
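If the version chosen by the default strategy isn't the one you want, the resolution strategy of a configuration can be adjusted. The following is a minimal sketch under the assumption that you want to pin xml-apis to the version reported above; it isn't the only way to influence the outcome.

```groovy
configurations.cargo.resolutionStrategy {
   // Always use this version of xml-apis, whatever the dependency graph asks for
   force 'xml-apis:xml-apis:1.3.03'
   // Alternative: call failOnVersionConflict() to make every conflict an error
}
```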

Excluding transitive dependencies

When dealing with a public repository like Maven Central, you may encounter poorly maintained dependency metadata. Gradle gives you full control over transitive dependencies, so you can decide to either fully exclude all transitive dependencies or selectively exclude specific dependencies. Let’s say you explicitly want to specify a different version of the library xml-apis instead of using the transitive dependency provided by Cargo’s UberJar. In practice, this is often the case when some of your own functionality is built on top of a specific version of an API or framework. The next listing shows how to use the exclude method from ModuleDependency to exclude a transitive dependency.

Listing 5.4. Excluding a single dependency
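A sketch of such an exclusion is shown below. It assumes the transitive xml-apis dependency is reachable through the Cargo Ant tasks dependency, and the replacement version 2.0.2 is only an illustrative choice, not one prescribed by the text.

```groovy
dependencies {
   cargo('org.codehaus.cargo:cargo-ant:1.3.1') {
      // Exclusions are expressed with group and/or module, never with a version
      exclude group: 'xml-apis', module: 'xml-apis'
   }
   // Declare the version you want instead (assumed version, for illustration)
   cargo 'xml-apis:xml-apis:2.0.2'
}
```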

Notice that the exclusion attributes are slightly different from the regular dependency notation. You can use the attributes group and/or module. Gradle doesn’t allow you to exclude only a specific version of a dependency, so the version attribute isn’t available.

Sometimes the metadata of a dependency declares transitive dependencies that don’t exist in the repository. As a result, your build will fail. This is only one of the situations when you want to have full control over transitive dependencies. Gradle lets you exclude all transitive dependencies using the transitive attribute, as shown in the following listing.

Listing 5.5. Excluding all transitive dependencies
dependencies {
   cargo('org.codehaus.cargo:cargo-ant:1.3.1') {
      transitive = false
   }
   // Selectively declare required dependencies
}

So far, you’ve only declared dependencies on specific versions of an external library. Let’s see how to resolve the latest version of a dependency or the latest within a range of versions.

Dynamic version declaration

Dynamic version declarations have a specific syntax. If you want to use the latest version of a dependency, you’ll have to use the placeholder latest.integration. For example, to declare the latest version for the Cargo Ant tasks, you’d use org.codehaus.cargo:cargo-ant:latest.integration. Alternatively, you can declare the part of the version attribute you want to be dynamic by demarcating it with a plus sign (+). The following listing shows how to resolve the latest 1.x version of the Cargo Ant library.

Listing 5.6. Declaring a dependency on the latest Cargo 1.x version
dependencies {
   cargo 'org.codehaus.cargo:cargo-ant:1.+'
}

Gradle’s dependencies help task clearly indicates which version has been picked:

$ gradle -q dependencies
------------------------------------------------------------
Root project
------------------------------------------------------------
cargo - Classpath for Cargo Ant tasks.
--- org.codehaus.cargo:cargo-ant:1.+ -> 1.3.1
     --- ...

Another option is to select the latest within a range of versions for a dependency. To learn more about the syntax, feel free to check Gradle’s online manual.
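For comparison, the other two forms could be written as in the following sketch. The range bounds are an assumption picked for illustration; consult the manual for the exact range semantics before relying on them.

```groovy
dependencies {
   // Newest version available in the repository
   cargo 'org.codehaus.cargo:cargo-ant:latest.integration'

   // Alternative: newest version from 1.0 (inclusive) up to 2.0 (exclusive)
   // cargo 'org.codehaus.cargo:cargo-ant:[1.0,2.0)'
}
```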

When should I use dynamic versions?

The short answer is rarely or even never. A reliable and reproducible build is paramount. Choosing the latest version of a library may cause your build to fail. Even worse, without knowing it, you may introduce incompatible library versions and side effects that are hard to find and only occur at runtime of your application. Therefore, declaring the exact version of a library should be the norm.

5.4.3. File dependencies

As described earlier, projects that don’t use automated dependency management organize their external libraries as part of the source code or in the local file system. Especially when migrating your project to Gradle, you don’t want to change every aspect of your build at once. Gradle makes it easy for you to configure file dependencies. You’ll emulate this for your project by referencing the Cargo libraries in the local file system. The following listing shows a task that copies the dependencies resolved from Maven Central to the subdirectory libs/cargo under your user home directory.

Listing 5.7. Copying the Cargo dependencies to your local file system
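Such a task could be sketched with Gradle's Copy task type as follows. The task name copyDependenciesToLocalDir is an assumption; the target directory matches the one used in the file dependency declaration that follows.

```groovy
// Hypothetical task name; copies every file resolved for the cargo configuration
task copyDependenciesToLocalDir(type: Copy) {
   from configurations.cargo
   into "${System.properties['user.home']}/libs/cargo"
}
```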

After running the task, you’ll be able to declare the Cargo libraries in your dependencies configuration block. The next listing demonstrates how to assign all JAR files to the cargo configuration as a file dependency.

Listing 5.8. Declaring file dependencies
dependencies {
   cargo fileTree(dir: "${System.properties['user.home']}/libs/cargo",
                   include: '*.jar')
}

Because you’re not dealing with a repository that requires you to declare dependencies with a specific pattern, you also don’t need to define a repositories configuration block. Next, we’ll focus on the various repository types supported by Gradle and how they’re configured.

5.5. Using and configuring repositories

Gradle puts a special emphasis on supporting existing repository infrastructures. You’ve already seen how to use Maven Central in your build. By using a single method call, mavenCentral(), you configured your build to target the most popular Java binary repository. Apart from the preconfigured repository support, you can also assign an arbitrary URL of a Maven or Ivy repository and configure it to use authentication if needed. Alternatively, a simple file system repository can be used to resolve dependencies. If metadata is found for a dependency, it will be downloaded from the repository as well. Table 5.2 shows the different types of repositories and which section covers each of them.

Table 5.2. Repository types for a Gradle project

Type | Description | Where to go for more information
Maven repository | A Maven repository on the local file system or a remote server, or the preconfigured Maven Central | Section 5.5.2
Ivy repository | An Ivy repository on the local file system or a remote server with a specific layout pattern | Section 5.5.3
Flat directory repository | A repository on the local file system without metadata support | Section 5.5.4

Feel free to jump to the section that describes the repository you want to use in your project. In the next section, we’ll look at Gradle’s API support for defining and configuring repositories before we apply each of them to practical examples.

5.5.1. Understanding the repository API representation

Central to defining repositories in your project is the interface RepositoryHandler, which provides methods to add various types of repositories. From the project, these methods are invoked within your repositories configuration block. You can declare more than one repository. When the dependency manager tries to download the dependency and its metadata, it checks the repositories in the order of declaration. The repository that provides the dependency first wins. Subsequent repository declarations won’t be checked further for the specific dependency. As shown in figure 5.8, each of the repository interfaces exposes different methods specific to the type of repository.

Figure 5.8. Relevant interfaces in Gradle’s API for configuring various types of repositories. Gradle supports repository implementations for flat directories, Maven, and Ivy.
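As a small illustration of the declaration order, the following sketch uses only the two preconfigured repositories discussed in the next section. Which repositories you combine, and in what order, is an assumption made for the example.

```groovy
repositories {
   mavenCentral()   // declared first, so it's checked first
   mavenLocal()     // only consulted for dependencies Maven Central couldn't provide
}
```

Swapping the two lines would make Gradle look in the local Maven repository before going to Maven Central.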

Gradle doesn’t prefer any of the repository types. It’s up to your project’s needs to declare the repository most fitting. In the next section, we’ll look at the syntax to declare Maven repositories.

5.5.2. Maven repositories

Maven repositories are among the most commonly used repository types in Java projects. The library is usually represented in the form of a JAR file. The metadata, stored in the POM file, is expressed in XML and describes relevant information about the library and its transitive dependencies. Both artifacts are stored in a predefined directory structure in the repository. When you declare a dependency in your build script, its attributes are used to derive the exact location in the repository. Each dot character in the group attribute of a dependency indicates a subdirectory in the Maven repository. Figure 5.9 shows how the Cargo Ant dependency attributes are mapped to determine the location of the JAR and POM files in the repository.

Figure 5.9. How a dependency declaration maps to artifacts in a Maven repository
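The mapping can also be spelled out in a few lines of build script code. This is only a sketch of the standard Maven layout: the helper method mavenPath and the task printMavenLayout are hypothetical names and not part of Gradle's API.

```groovy
// Hypothetical helper: derives the repository-relative path of an artifact
def mavenPath(String group, String name, String version, String ext) {
   // Dots in the group become directory separators
   "${group.replace('.', '/')}/${name}/${version}/${name}-${version}.${ext}"
}

// Hypothetical task that prints the JAR and POM locations for Cargo Ant
task printMavenLayout {
   doLast {
      println mavenPath('org.codehaus.cargo', 'cargo-ant', '1.3.1', 'jar')
      println mavenPath('org.codehaus.cargo', 'cargo-ant', '1.3.1', 'pom')
   }
}
```

Running gradle -q printMavenLayout prints org/codehaus/cargo/cargo-ant/1.3.1/cargo-ant-1.3.1.jar followed by the matching path ending in .pom. Appending either path to the repository's base URL gives the full location.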

The interface RepositoryHandler provides two methods that allow you to define preconfigured Maven repositories. The method mavenCentral() adds a reference to Maven Central to the list of repositories, and the method mavenLocal() refers to a local Maven repository in your file system. Let’s review both repository types and discuss when you’d use them in your project.

Adding the preconfigured Maven Central repository

Maven Central is a commonly used repository in a build. Gradle wants to make it as easy for the build developer as possible, and therefore provides you with a shortcut to declare Maven Central. Instead of having to define the URL http://repo1.maven.org/maven2 each and every time, you can just call the method mavenCentral(), as shown in the following code snippet:

repositories {
   mavenCentral()
}

A similar shortcut exists for defining a local Maven repository that by default is available under <USER_HOME>/.m2/repository.

Adding the preconfigured local Maven repository

When Gradle resolves a dependency, it’s located in the repository, downloaded, and then stored in the local cache. The location of this cache in your local file system is different from the directory in which Maven stores artifacts after downloading them. You may wonder when you’d want to use a local Maven repository now that you’re dealing with Gradle. It’s especially useful if you work in a mixed environment of build tools. Imagine you’re working on one project that uses Maven to produce a library, and another project operating with Gradle wants to consume the library. Especially during development, you’ll go through cycles of implementing changes and trying out the changes on the consuming side. To prevent you from having to publish the library to a remote Maven repository for every little change, Gradle provides you with the option to target a local Maven repository, as shown in the following repository declaration:

repositories {
   mavenLocal()
}

Be aware that using a local Maven repository should be limited to this specific use case, as it may cause unforeseen side effects. You’re explicitly dependent on artifacts that are only available in the local file system. Running the script on other machines or a continuous integration server may cause the build to fail if the artifacts don’t exist.

Adding a custom Maven repository

There are multiple reasons why you’d want to target a repository other than Maven Central. Perhaps a specific dependency is simply not available, or you want to ensure that your build is reliable by setting up your own enterprise repository. One of the options a repository manager gives you is to configure a repository with a Maven layout. This means that it adheres to the artifact storage pattern we discussed before. Additionally, you can protect access to your repository by requiring the user to provide authentication credentials. Gradle’s API supports two ways of configuring a custom repository: maven() and mavenRepo(). The following listing shows how to target an alternative public Maven repository if an artifact isn’t available in Maven Central.

Listing 5.9. Declaring a custom Maven repository
repositories {
   mavenCentral()
   maven {
      name 'Custom Maven Repository'
      url 'http://repository-gradle-in-action.forge.cloudbees.com/release/'
   }
}
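If the repository requires authentication, you can add a credentials block to the declaration. The sketch below is illustrative only: the URL and the project property names repoUser and repoPassword are assumptions. It reads both values from project properties so that they don't end up hardcoded in the build script.

```groovy
repositories {
   maven {
      name 'Secured Enterprise Repository'           // hypothetical repository
      url 'https://repo.example.com/releases/'       // hypothetical URL
      credentials {
         // Hypothetical property names, for example set in ~/.gradle/gradle.properties
         username project.hasProperty('repoUser') ? project.property('repoUser') : ''
         password project.hasProperty('repoPassword') ? project.property('repoPassword') : ''
      }
   }
}
```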

I can’t discuss every available configuration option in this chapter, so please refer to the online documentation for more information. Let’s see how an Ivy repository and its configuration differ from a Maven repository.

5.5.3. Ivy repositories

Artifacts in a Maven repository have to be stored with a fixed layout. Any deviation from that structure results in irresolvable dependencies. On the other hand, even though an Ivy repository proposes a default layout, it’s fully customizable. In an Ivy repository, dependency metadata is stored in a file named ivy.xml. Gradle provides a wide variety of methods to configure Ivy repositories and their specific layout in your build. It goes beyond the scope of this book to cover all options, but let’s look at one example. Imagine you want to resolve the Cargo dependencies from an Ivy repository. The following listing demonstrates how to define the repository base URL, as well as the artifact and metadata layout pattern.

Listing 5.10. Declaring an Ivy repository
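A declaration of this kind could look like the following sketch. The repository URL and both layout patterns are assumptions chosen for illustration. The layout 'pattern' form matches the Gradle generation this chapter targets; newer Gradle releases provide a patternLayout block for the same purpose.

```groovy
repositories {
   ivy {
      // Hypothetical base URL of the Ivy repository
      url 'http://repository.example.com/ivy/release'
      layout 'pattern', {
         // Where to find the binaries, relative to the base URL
         artifact '[organisation]/[module]/[revision]/[artifact]-[revision].[ext]'
         // Where to find the ivy.xml metadata, relative to the base URL
         ivy '[organisation]/[module]/[revision]/ivy-[revision].xml'
      }
   }
}
```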

As with the POM in Maven repositories, you’re not forced to use the Ivy metadata to resolve transitive dependencies. The Ivy repository is perfect for resolving dependencies that don’t necessarily follow the standard Maven artifact pattern. For example, you could decide to place JAR files into a specific directory of a web server and serve them up via HTTP. To complete our discussion about repositories, we’ll look at flat directories.

5.5.4. Flat directory repositories

The simplest and most rudimentary form of a repository is the flat directory repository. It’s a single directory in the file system that contains only the JAR files, with no metadata. If you’re used to manually maintaining libraries with your project sources and planning to migrate to automated dependency management, this approach will interest you.

When you declare your dependencies, you can only use the attributes name and version. The group attribute is not evaluated and leads to an unresolved dependency if you try to use it. The next listing shows how to declare the Cargo dependencies, in both map and shortcut notation, so that they’re retrieved from a flat directory repository.

Listing 5.11. Cargo dependencies declaration retrieved from a flat directory repository
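The following sketch shows the idea for the two Cargo libraries named in this chapter. It assumes the JAR files were copied to libs/cargo under your user home directory and that the cargo configuration is already defined. A complete declaration would also have to list every library these two depend on.

```groovy
repositories {
   flatDir {
      // Directory that contains the JAR files, without any metadata
      dirs "${System.properties['user.home']}/libs/cargo"
   }
}

dependencies {
   // Map notation: only name and version, no group
   cargo name: 'cargo-core-uberjar', version: '1.3.1'
   // Shortcut notation: the group segment is left empty
   cargo ':cargo-ant:1.3.1'
   // Every transitive dependency would need its own declaration here
}
```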

This listing also perfectly demonstrates how useful it is to be able to use metadata that automatically declares transitive dependencies. In the case of the flat directory repository, you don’t have this information, so you need to declare every single dependency by itself, which can become quite tiring.

5.6. Understanding the local dependency cache

So far we’ve discussed how to declare dependencies and configure various types of repositories to resolve those artifacts. Gradle automatically determines whether a dependency is needed for the task you want to execute, downloads the artifacts from the repositories, and stores them in the local cache. Any subsequent build will try to reuse these artifacts. In this section, we’ll dig deeper by analyzing the cache structure, identifying how the cache works under the hood and how to tweak its behavior.

5.6.1. Analyzing the cache structure

Let’s explore the local cache structure through the example of your Cargo libraries. You know Gradle downloaded the JAR files when you ran the deployment task, but where did it put them? If you check the Gradle forum, you’ll find that many users ask about this. You can use Gradle’s API to find out. The following listing shows how to print out the full path of all dependencies assigned to the configuration cargo.

Listing 5.12. Printing the concatenated file path of all Cargo dependencies
task printDependencies << {
   configurations.getByName('cargo').each { dependency ->
      println dependency
   }
}

If you run the task, you’ll see that all JAR files get stored in the directory /Users/benjamin/.gradle/caches/artifacts-15/filestore:

$ gradle -q printDependencies
/Users/benjamin/.gradle/caches/artifacts-15/filestore/
 org.codehaus.cargo/cargo-core-uberjar/1.3.1/jar/
 3d6aff857b753e36bb6bf31eccf9ac7207ade5b7/cargo-core-uberjar-1.3.1.jar
/Users/benjamin/.gradle/caches/artifacts-15/filestore/
 org.codehaus.cargo/cargo-ant/1.3.1/jar/
 a5a790c6f1abd6f4f1502fe5e17d3b43c017e281/cargo-ant-1.3.1.jar
...

This path will probably look slightly different on your machine. Let’s dissect this path and give it some more meaning. Gradle’s root directory for storing dependencies in the local cache is <USER_HOME>/.gradle/caches. The next part of the path, artifacts-15, is an identifier that’s specific to the Gradle version. It’s needed to differentiate changes to the way metadata is stored.

Bear in mind that this structure may change with newer versions of Gradle. The actual cache is divided into two parts. The subdirectory filestore contains the raw binaries downloaded from the repository. Additionally, you’ll find some binary files that store metadata about the downloaded artifacts. You’ll never need to look at them during your day-to-day business. The following directory tree shows the contents from the root level of a local dependency cache:

The filestore directory is a natural representation of a dependency. The attributes group, name, and version directly map to subdirectories in the file system. In the next section, we’ll discuss the benefits Gradle’s cache brings to your build.

5.6.2. Notable caching features

The real power of Gradle’s cache lies in its metadata. It enables Gradle to implement additional optimizations that lead to smarter, faster, and more reliable builds. Let’s discuss the features one by one.

Storing the origin of a dependency

Imagine a situation where you declare a dependency in your script. While running the build for the first time, the dependency gets downloaded and stored in the cache. Subsequent builds will happily use the dependency available in the cache. The build is successful. What would happen if the structure of the repository were changed (for example, one of the attributes was renamed or the dependency moved or was simply deleted)—something you as an artifact consumer have no control over? With many other dependency managers like Maven and Ivy, the build would work just fine, because the dependency exists in the local cache and can be resolved. However, for any other developer that runs the build on a different machine, the build would fail. This is a problem and leads to inconsistent builds. Gradle takes a different approach to this situation. It knows the location a dependency originates from and stores this information in the cache. As a result, your build becomes more reliable.

Artifact change detection

Gradle tries to reduce the network traffic to remote repositories. This is not only the case for dependencies that were already downloaded. If a dependency cannot be resolved in a repository, this metadata is stored in the cache. Gradle uses this information to avoid having to check the repository every time the build runs.

Reduced artifact downloads and improved change detection

Gradle provides tight integration with Maven’s local repository to avoid having to download existing artifacts. If a dependency can be resolved locally, it’s reused. The same is true for artifacts that were stored with other versions of Gradle.

Gradle detects if an artifact was changed in the repository by comparing its local and remote checksum. Unchanged artifacts are not downloaded again and reused from the local cache. Imagine the artifact was changed on the repository but the checksum is still the same. This could happen if the administrator of the repository replaces an artifact with the same version. Ultimately, your build will use an outdated version of the artifact. Gradle’s dependency manager tries to eliminate this situation by taking additional information into consideration. For example, it can ensure an artifact’s uniqueness by comparing the value of the HTTP header parameter content-length or the last modified date. This is an advantage Gradle’s implementation has over other dependency managers like Ivy.

Offline mode

If your build declares remote repositories, Gradle may have to check them for dependency changes. Sometimes this behavior is undesirable; for example, if you’re traveling and don’t have access to the Internet. You can tell Gradle to avoid checking remote repositories by running in offline mode with the --offline command-line option. Instead of performing dependency resolution over the network, only dependencies from the local cache will be used. If a required dependency doesn’t exist in the cache, the build will fail.

5.7. Troubleshooting dependency problems

Version conflicts can be a hard nut to crack. If your project deals with many dependencies and you choose to use automatic resolution for transitive dependencies, version conflicts are almost inevitable. Gradle’s default strategy to resolve those conflicts is to pick the newest version of a dependency. The dependency report is an invaluable tool for finding out which version was selected for the dependencies you requested. In the following section, I’ll show how to troubleshoot version conflicts and tweak Gradle’s dependency resolution strategy to your specific use case.

5.7.1. Responding to version conflicts

Gradle won’t automatically inform you that your project dealt with a version conflict. Having to constantly run the dependency report to find out isn’t a practical approach to the problem. Instead, you can change the default resolution strategy to fail the build whenever a version conflict is encountered, as shown in the following code example:

configurations.cargo.resolutionStrategy {
   failOnVersionConflict()
}

Failing can be helpful for debugging purposes, especially in the early phases of setting up the project and changing the set of dependencies. Running any of the project’s tasks will also indicate the version conflict, as shown in the following sample output:

$ gradle -q deployToLocalTomcat

FAILURE: Build failed with an exception.

* Where:
Build file '/Users/benjamin/Dev/books/gradle-in-action/code/chapter4/
 cargo-dependencies-fail-on-version-conflict/build.gradle' line: 10

* What went wrong:
Execution failed for task ':deployToLocalTomcat'.
> Could not resolve all dependencies for configuration ':cargo'.
   > A conflict was found between the following modules:
      - xml-apis:xml-apis:1.3.03
      - xml-apis:xml-apis:1.0.b2

Rich API to access resolved dependency graph

In memory, Gradle builds a model of the resolved dependency graph. Gradle’s resolution result API gives you even more fine-grained control over the requested and selected dependencies. A good place to start getting familiar with the API is the interface ResolutionResult.
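As a starting point, the sketch below walks the resolved graph of the cargo configuration and prints what was requested next to what was selected. The task name printSelectedVersions is hypothetical, and the API has evolved across Gradle versions, so check the names against the documentation of the release you use.

```groovy
import org.gradle.api.artifacts.result.ResolvedDependencyResult

// Hypothetical task name; assumes the cargo configuration from this chapter
task printSelectedVersions {
   doLast {
      def result = configurations.cargo.incoming.resolutionResult
      result.allDependencies.each { dependency ->
         // Unresolved dependencies have no selected module, so skip them
         if (dependency instanceof ResolvedDependencyResult) {
            println "${dependency.requested} -> ${dependency.selected.id}"
         }
      }
   }
}
```

Lines where the requested and the selected version differ point to a conflict that was resolved for you.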

5.7.2. Enforcing a specific version

The more projects you have to manage, the more you may feel the need to standardize the build environment. You’ll want to share common tasks or make sure that all projects use a specific version of a library. For example, you want to unify all of your web projects to be deployed with Cargo version 1.3.0, even though the dependency declaration may request a different version. With Gradle, it’s really easy to implement such an enterprise strategy. It enables you to enforce a specific version of a top-level dependency, as well as a transitive dependency.

The following code snippet demonstrates how to reconfigure the default resolution strategy for the configuration cargo to force a dependency on version 1.3.0 of the Ant tasks:

configurations.cargo.resolutionStrategy {
   force 'org.codehaus.cargo:cargo-ant:1.3.0'
}
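The same mechanism works for transitive dependencies and can be applied to every configuration at once. The following sketch assumes you want to pin both the Cargo Ant tasks and the transitive xml-apis module from the earlier conflict output. The choice of versions is purely illustrative.

```groovy
configurations.all {
   resolutionStrategy {
      // Top-level dependency, forced regardless of the declared version
      force 'org.codehaus.cargo:cargo-ant:1.3.0'
      // Transitive dependency that was never declared directly
      force 'xml-apis:xml-apis:1.3.03'
   }
}
```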

Now when you run the dependency report task, you’ll see that the requested Cargo Ant version was overruled by the globally enforced module version:

5.7.3. Using the dependency insight report

A change to the resolution strategy of a configuration, as shown previously, is perfectly placed in an initialization script so it can be enforced on a global level. The build script user may not know why this particular version of the Cargo Ant tasks has been picked. The only thing they see is that the dependency report indicates that a different version was selected. Sometimes you may want to know what forced this version to be selected.

Gradle provides a different type of report: the dependency insight report, which explains how and why a dependency is in the graph. To run the report, you’ll need to provide two parameters: the name of the configuration (which defaults to the compile configuration) and the dependency itself. The following invocation of the help task dependencyInsight shows the reason, as well as the requested and selected version of the dependency xml-apis:xml-apis:

While the dependency report starts from the top-level dependencies of a configuration, the insight report shows the dependency graph starting from the particular dependency down to the configuration. As such, the insight report represents the inverted view of the regular dependency report, as shown in figure 5.10.

Figure 5.10. View of dependency graph with different report types

5.7.4. Refreshing the cache

To avoid having to hit a repository over and over again for specific types of dependencies, Gradle applies certain caching strategies. This is the case for snapshot versions of a dependency and dependencies that were declared with a dynamic version pattern. Once resolved, they’re cached for 24 hours, which leads to snappier, more efficient builds. After the artifact caching timeframe has expired, the repository is checked again and a new version of the artifact is downloaded if it has changed.

You can manually refresh the dependencies in your cache by using the command-line option --refresh-dependencies. This flag forces a check for changed artifact versions with the configured repositories. If the checksum changed, the dependency will be downloaded again and replace the existing copy in the cache. Having to add the command-line option can become tiring after a while, or you may forget to tack it on. Alternatively, you can configure a build to change the default behavior of your cache.

Let’s say you always want to get the latest 1.x version of the Cargo Ant tasks you declared with org.codehaus.cargo:cargo-ant:1.+. You can set the cache timeout for dynamic dependency versions to 0 seconds, as shown in the following code snippet:

configurations.cargo.resolutionStrategy {
   cacheDynamicVersionsFor 0, 'seconds'
}

You may have good reasons for not wanting to cache a SNAPSHOT version of an external module. For example, another team in your organization works on a reusable library that’s shared among multiple projects. During development the code changes a lot, and you always want to get the latest and (hopefully) greatest additions to the code. The following code block modifies the resolution strategy for a configuration to not cache SNAPSHOT versions at all:

configurations.compile.resolutionStrategy {
   cacheChangingModulesFor 0, 'seconds'
}
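A timeout of 0 seconds means a repository check on every build. If that's too aggressive, you can pick a middle ground. The following sketch applies shorter timeouts to all configurations; the concrete values are assumptions you'd tune to how often the modules you depend on actually change.

```groovy
configurations.all {
   resolutionStrategy {
      // Re-check dynamic versions such as 1.+ every 10 minutes
      cacheDynamicVersionsFor 10, 'minutes'
      // Re-check changing modules such as SNAPSHOT versions every 4 hours
      cacheChangingModulesFor 4, 'hours'
   }
}
```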

5.8. Summary

Most projects, be they open source projects or an enterprise product, are not completely self-contained. They depend on external libraries or components built by other projects. While you can manage those dependencies yourself, the manual approach doesn’t fulfill the requirements of modern software development. The more complex a project becomes, the harder it is to figure out the relationships between dependencies, resolve potential version conflicts, or even know why you need a specific dependency.

With automated dependency management, you declare dependencies by unique identifiers within your project without having to manually touch the artifacts. At runtime, the dependent artifacts are automatically resolved in a repository, downloaded, stored in a local cache, and made available to your project. Automated dependency management doesn’t come without challenges. We discussed potential pitfalls and how to cope with them.

Gradle provides powerful out-of-the-box dependency management. You learned how to declare different types of dependencies, group them with the help of configurations, and target various types of repositories to download them. The local cache is an integral part of Gradle’s dependency management infrastructure and is responsible for high-performance and reliable builds. We analyzed its structure and discussed its essential features. Knowing how to troubleshoot dependency version conflicts and fine-tune the cache is key to a stable and reliable build. You used Gradle’s dependency reporting to get a good understanding of the resolved dependency graph, as well as why a specific version of a dependency was selected and where it came from. I showed strategies for changing the default resolution strategy and cache behavior, as well as appropriate situations that make them necessary.

In the next chapter, you’ll take your To Do application to the next level by modularizing the code. You’ll learn how to use Gradle’s multiproject build support to define dependencies between individual components and make them function as a whole.
