Chapter 3. Asynchronous Apex for Fun and Profit

In this chapter, we will focus on processing large amounts of data. We'll talk about the Batchable interface and its options, and how we can schedule batch jobs using the corresponding Schedulable interface. We will also cover the @future annotation for methods and the newer Queueable interface. All told, we will cover:

  • Batchable
  • Schedulable
  • Queueable
  • The @future annotation

Using batchable classes

Collectively referred to as asynchronous code, Batchable and Queueable classes and @future methods strike a tradeoff with the Salesforce1 platform. In exchange for running the code asynchronously, meaning we have no control over when it's actually executed, the platform relaxes certain governor limits. For instance, during normal, synchronous Apex execution, you are limited to modifying 10,000 records via DML per transaction; attempting to modify more than that throws a limit exception. During the execution of a batchable class, however, those governor limits are reset every time the execute method is called.

Conceptually, asynchronous code executes in two steps. First, the code is queued. As system resources allow, the system pulls jobs off the queue and executes them. As usual, the devil is in the details. Batchable, Queueable, and the @future annotation all have various tradeoffs that distinguish them from each other. Knowing when to use which one is crucial. As we work through each of the options, we'll talk about appropriate use cases for each.

For batchable jobs, there is an intermediary step between queuing and execution: chunking. When you execute your batch job, the system runs the initial query and chunks up the results, enqueueing one job for every X results. The X factor is the batch execution context, or the number of records that each batch will process. As the developer, you can specify what the batch execution context is when you start the job. If left unspecified, it defaults to 200 records per chunk.
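For example, to start a hypothetical batchable class (the myBatchable class we'll define in a moment) with a chunk size of 500 instead of the default 200, you could write something like this:

//500 records will be handed to each execute() invocation
Id batchJobId = Database.executeBatch(new myBatchable(), 500);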

The Database.Batchable interface, provided by the platform, is straightforward. Simply have your class implement Database.Batchable, like this:

public with sharing class myBatchable implements Database.Batchable<sObject>

Once you've implemented Database.Batchable, you're required to implement the three method signatures the interface requires. The first of the three required methods is the start method, which the platform calls at the beginning of the batch's overall execution:

global Database.QueryLocator start(Database.BatchableContext BC) {
  return Database.getQueryLocator([
    SELECT Id, FirstName, LastName, BillingAddress
    FROM Account
    WHERE Active = true AND BillingState = 'Tx'
    ORDER BY Id DESC
  ]);
}

This first method gathers the records or objects that the execute method will act on and, in conjunction with the execution context, determines the number of chunks that will be executed. The start method has two possible return types: either a Database.QueryLocator object or an Iterable object, such as a List. The preceding example uses a QueryLocator. Regardless of the return type, the start method must accept a single parameter of the Database.BatchableContext type. That context object provides the job's ID, which you can use to query the AsyncApexJob object. The AsyncApexJob object gives you insight into the status of a batch job. The second required method is execute:

global void execute(Database.BatchableContext BC, List<Account> scope) {
  for (Account a : scope) {
    //Complex operations against individual
    //Account records here
  }
  update scope;
}

The Database.BatchableContext BC variable is essentially dependency injected by the system; our second method, the execute method, must accept it to fulfill the Batchable interface. The execute method is run once per chunk. It accepts not only the Database.BatchableContext, but also a chunk of records: a list of whatever object the start() method's query returns. In the preceding example code, I've called that chunk scope, and in this case, it's a list of Account objects. Because the execute method is called once per chunk, scope is essentially a local iteration variable. Thankfully, the platform dependency injects a chunk into every execute call, meaning we're not responsible for invoking the execute() calls ourselves.
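As a quick, hedged sketch of what that looks like in practice, BC.getJobId() hands you the job's ID, which you can use to query AsyncApexJob; where you log or surface the status is up to you:

global void execute(Database.BatchableContext BC, List<Account> scope) {
  //Look up this job's progress using the injected context
  AsyncApexJob job = [
    SELECT Id, Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
    FROM AsyncApexJob
    WHERE Id = :BC.getJobId()
  ];
  System.debug('Chunks processed so far: ' + job.JobItemsProcessed +
    ' of ' + job.TotalJobItems);
  //... business logic against scope here ...
}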

The general pattern for a batchable class's execute method is to iterate over the given scope using a for loop. This for loop is where you implement your custom business logic, either by placing the logic directly in the execute method or, better yet, by calling a dedicated class method, much like we talked about with triggers. While you modify individual records inside the for loop, as in a trigger, you must keep DML outside of your loop! Loop over your scope records, modify them, and then insert, update, or delete all of them at once just after the loop ends. If you need to use the loop to determine what kind of DML ultimately happens to the records, do so with multiple lists. For instance, if you're iterating over a scope of accounts and determining which of them to delete, create an additional list and use that to delete those records in bulk, as shown here:

global void execute(Database.BatchableContext BC, List<Account> scope) {
  List<Account> toDelete = new List<Account>();
  List<Account> toUpdate = new List<Account>();

  for (Account a : scope) {
    if (a.Bad_Account__c) {
      toDelete.add(a);
    } else {
      toUpdate.add(a);
    }
  }
  update toUpdate;
  delete toDelete;
}

In this example, we're not updating the scope directly, but rather splitting the records into two lists: one to delete and one to update. Only after the loop is finished do we call the appropriate DML on each list.

Of all the methods of a Batchable class, the execute method is conceptually the most straightforward. However, it's often the most difficult to perfect. It pays to keep in mind the governor limits affecting batchable class execution. While asynchronous Apex, such as jobs conforming to the Batchable interface, relaxes some governor limits, your execute method is still limited to 10,000 rows of DML per invocation. If, for instance, you use the extra resources available to you to modify more than 10,000 records, you'll hit governor limits even if you use the bulk pattern discussed earlier. Likewise, if you're executing SOQL queries within your execute method, but outside your for loop, you're still bound by the standard governor limits; they just reset every time the execute method is invoked.
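For example, if each chunk needs related data, query for it once per execute invocation, outside the for loop. This is a hedged sketch; the Contact lookup is an assumption for illustration, not part of the earlier example:

global void execute(Database.BatchableContext BC, List<Account> scope) {
  //One bulkified query per chunk; the SOQL limits reset with each
  //execute invocation, so this is safe.
  Map<Id, Account> accountsById = new Map<Id, Account>(scope);
  List<Contact> relatedContacts = [
    SELECT Id, AccountId, Email
    FROM Contact
    WHERE AccountId IN :accountsById.keySet()
  ];
  //... work with scope and relatedContacts here ...
  update scope;
}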

The third method in the Batchable interface is the finish() method. The finish method runs after the start() method has queued all of the chunks and those chunks have been executed. The general idea is that after your batch job is functionally complete, the finish method can send e-mail notifications to users or handle execution exceptions. In general, such cleanup tasks should rarely rely on specific data, but they can reliably rely on the job's execution being complete. For instance, if you use a batch job to look at all of your opportunity records in order to sum the value of all records meeting certain criteria, your finish() method would be responsible for persisting the summary record.
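As a sketch of that kind of cleanup (the recipient address and message wording here are placeholders, not anything the platform requires), a finish method that notifies an admin might look like this:

global void finish(Database.BatchableContext BC) {
  //Pull the final job statistics
  AsyncApexJob job = [
    SELECT Id, Status, NumberOfErrors, JobItemsProcessed, TotalJobItems
    FROM AsyncApexJob
    WHERE Id = :BC.getJobId()
  ];
  //E-mail a summary to an admin (placeholder address)
  Messaging.SingleEmailMessage mail = new Messaging.SingleEmailMessage();
  mail.setToAddresses(new String[] { 'admin@example.com' });
  mail.setSubject('Batch job ' + job.Status);
  mail.setPlainTextBody('Chunks processed: ' + job.JobItemsProcessed + ' of ' +
    job.TotalJobItems + ', errors: ' + job.NumberOfErrors);
  Messaging.sendEmail(new Messaging.SingleEmailMessage[] { mail });
}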

Knowing what to use the finish method for can be one of the most confusing aspects of Batchable Apex classes. It is safe to use the finish() method for things that are guaranteed to have run. However, it is not safe to depend upon any data generated by the batch, because the data generated by an execute method may not always exist. Exceptions within the execute() method force a rollback of that invocation's DML; thus, if an exception occurs, the data you expect may not actually be there when your finish method runs. On the other hand, knowing when to use a batch class is fairly straightforward. Use a batch class when you need to carry out the same logic against a large number of records; specifically, more than 10,000 records. Using a Database.Batchable class allows you to access or modify up to 50 million records per QueryLocator. At the maximum chunk size of 2,000 records, this means your batchable class can queue as many as 25,000 chunks.

Additional extensions

Batchable classes can implement two additional interfaces: Database.Stateful and Database.AllowsCallouts. Implementing these two interfaces is as simple as adding them to your Batchable class signature, as follows:

global class CustomBatchable implements
    Database.Batchable<sObject>,
    Database.Stateful,        //This implements the Stateful interface
    Database.AllowsCallouts { //This enables callouts
}

The Database.Stateful interface allows you to maintain state across all of the execute method invocations. The crucial difference is that when Database.Stateful is implemented, instance variables are not reset each time the execute method is called. This is incredibly useful, for example, if you want to maintain a running total of calculations. Previously, I talked about the hypothetical example where we iterated over all our opportunities to calculate a sum for those opportunities that met certain criteria. The Database.Stateful interface makes this simple. Simply add Database.Stateful to the list of interfaces and create an instance variable like this:

global Double totalSum = 0;
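A minimal sketch of how that instance variable gets used across the job (the Opportunity criteria and the Summary__c object used to persist the result are assumptions for illustration):

global class OpportunitySummingBatchable implements
    Database.Batchable<sObject>, Database.Stateful {
  global Double totalSum = 0;

  global Database.QueryLocator start(Database.BatchableContext BC) {
    return Database.getQueryLocator([SELECT Id, Amount FROM Opportunity WHERE IsWon = true]);
  }

  global void execute(Database.BatchableContext BC, List<Opportunity> scope) {
    for (Opportunity o : scope) {
      if (o.Amount != null) {
        totalSum += o.Amount; //accumulates across chunks thanks to Database.Stateful
      }
    }
  }

  global void finish(Database.BatchableContext BC) {
    //Persist the running total to an assumed custom summary object
    insert new Summary__c(Total__c = totalSum);
  }
}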

During your execute method, simply add to the totalSum variable, and then, in your finish method, write that sum to the database. Because class instance variables are not reset when Database.Stateful is implemented, we can simply add to the instance variable in the execute method, and that same instance variable is available in the finish method for persisting to the database. Another classic use case for implementing Database.Stateful is error handling. Exceptions within the execute method result in a rollback of that invocation's scope. If you wish to allow partial scope updates while also capturing the records that fail, you must implement Database.Stateful and use Database.update() for DML updates. There are two key parts to this recipe.

First, DML updates must be done with the Database.update() method with the optional allOrNone parameter set to false. This allows some records to fail while updating the others. The Database.update() method returns a list of Database.SaveResult objects indicating which records succeeded and which failed.

Secondly, in order to maintain a list of failed records for processing later, you need to implement Database.Stateful and create an instance list variable to hold your failed records. Once all of the execute invocations have completed, your finish() method can access or manipulate the list of failed records. This is an incredibly useful pattern, as it gives you a clean path for updating records in bulk while still handling exceptions!

Here is an example of a CustomBatchable class that implements Database.Stateful to capture failed records. Ensure that you read the comments dispersed throughout that explain what's happening:

global class CustomBatchable implements Database.Batchable<sObject>, Database.Stateful, Database.AllowsCallouts {
  //failedToUpdate persists throughout the entire job.
  private Set<Account> failedToUpdate = new Set<Account>();
  //Because it's marked static, updatedSuccessfully
  //resets every time the execute method runs.
  private static Set<Id> updatedSuccessfully = new Set<Id>();
  String query;

  global CustomBatchable() {
    //Optional constructor, useful for setting query 
    //variables like Dates etc. Setting the query in the 
    // constructor allows you to use dynamic SOQL as well
    this.query = 'SELECT Id, Name, BillingStreet, BillingState ' +
                 'FROM Account ORDER BY Id DESC';
  }

  global Database.QueryLocator start(Database.BatchableContext BC) {
    return Database.getQueryLocator(query);
  }

  global void execute(Database.BatchableContext BC, List<Account> scope) {
    for (Account a : scope) {
      if (a.BillingState == 'Tx' || a.Name.contains('awesome')) {
        a.Active__c = 'true';
      }
    }
    Database.SaveResult[] results = Database.update(scope, false);
      
    for (Database.SaveResult sr : results) {
      if (sr.isSuccess()) {
        updatedSuccessfully.add(sr.getId());
      }
    }

    for (Account a : scope) {
      if (!updatedSuccessfully.contains(a.Id)) {
        failedToUpdate.add(a);
      }
    }
  }

  global void finish(Database.BatchableContext BC) {
    if (failedToUpdate.size() > 0) {
      //Email admin about failed updates
      //Or process them individually, attempting to 
      //auto-correct DML issues.
    }
    //Once done processing the failed records, retry the update:
    List<Account> retryUpdate = new List<Account>();
    retryUpdate.addAll(failedToUpdate);
    update retryUpdate;
  }
}

Note how the execute method references both a class instance variable, failedToUpdate, and a class static variable, updatedSuccessfully. The static variable, updatedSuccessfully, resets every time the execute method is run, but the instance variable, failedToUpdate, does not. This allows us to add records to the instance variable whenever an object fails to update. In the preceding example class, we use updatedSuccessfully to build a temporary set of the IDs the SaveResult list reports as successfully updated. Then, we compare that set of successfully updated records against the full list of records we attempted to update. Those records whose ID is not in the updatedSuccessfully set are added to the failedToUpdate set, which persists across all execute runs. When all the chunks have been executed, the finish method acts on the records in the failedToUpdate set.

If your batch job needs to make callouts to external web services, you'll need to write your batch class to implement Database.AllowsCallouts. Thankfully, like the Database.Stateful interface, you only need to add Database.AllowsCallouts to your class definition. Also, note that your callout logic does not need to reside directly inside your batch class. Indeed, you should implement your callouts in their own classes and call that code from within your batchable class. That said, if your batch class invokes code that makes a callout, you must still implement Database.AllowsCallouts.

There are a few governor caveats to implementing Database.AllowsCallouts. While your execute method resets governor limits, you are still limited to a maximum of 100 callouts per invocation. If you are working with a batch execution context of 200 or 2,000, this means that your callouts have to be bulk safe, as making a callout for each of the 200 records would necessarily cause a governor limit exception. Alternatively, if the web service you are calling is not bulk safe and requires one call per record, you must set your execution context size to 100 or lower. However, this will limit the number of records you can process overall, so ensure that you embed some sort of processed flag in your data model and initial query so that you can execute the batch job multiple times. Again, remember that while you can make callouts in the start() and finish() methods, each method is limited to a total of 100 callouts.
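Here's a hedged sketch of a bulk-safe callout from within execute (the endpoint URL, the JSON payload shape, and the Synced__c processed flag are all assumptions; in practice you'd also delegate the callout to its own class, as mentioned above):

global void execute(Database.BatchableContext BC, List<Account> scope) {
  //One callout for the entire chunk rather than one per record,
  //staying well under the 100-callouts-per-invocation limit.
  HttpRequest req = new HttpRequest();
  req.setEndpoint('https://example.com/api/accounts/sync'); //placeholder endpoint
  req.setMethod('POST');
  req.setHeader('Content-Type', 'application/json');
  req.setBody(JSON.serialize(scope));
  HttpResponse res = new Http().send(req);

  if (res.getStatusCode() == 200) {
    for (Account a : scope) {
      a.Synced__c = true; //assumed custom "processed" flag
    }
    update scope;
  }
}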

Regardless of whether your batchable class implements Database.Stateful or Database.AllowsCallouts, all batchable classes are executed by calling the Database.executeBatch() method, providing both an instance of the batchable class and the batch execution context as an integer, as shown here:

Id batchJobId = Database.executeBatch(new CustomBatchable(), 200);

The resulting ID helps you find the status of the batch job via the AsyncApexJob object. Batch jobs can also be scheduled using the System.schedule() method, but it should be noted that System.schedule() runs as the system user, executing all classes regardless of the running user's permissions. Alternatively, you can write a schedulable class that starts your batch job via Database.executeBatch().
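A minimal sketch of that schedulable wrapper (the class name and the cron expression, nightly at 1 a.m., are just examples):

global class CustomBatchableScheduler implements Schedulable {
  global void execute(SchedulableContext sc) {
    Database.executeBatch(new CustomBatchable(), 200);
  }
}

//From setup code or anonymous Apex:
//System.schedule('Nightly account batch', '0 0 1 * * ?', new CustomBatchableScheduler());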
