Chapter 7. Using, Extending, and Creating API Integrations

Increasingly, development is less about individual programs and more about building and integrating entire systems of code. In this chapter, we'll look at using the Salesforce1 sObject API and the bulk API, how we can extend the sObject API with custom endpoints, and how to call external APIs from within the Salesforce1 platform. Specifically, we'll discuss the following topics:

  • The various methods for integrating different systems together
  • Calling the Salesforce sObject API to create, read, update, and delete individual Salesforce records from outside Salesforce
  • Using the bulk API for integrating bulk data into our Salesforce instance
  • Building our own custom REST endpoint in Apex, and calling that from outside Salesforce
  • Building a REST client that we can use to quickly and simply consume external APIs from within Salesforce

In the beginning, we physically moved tapes around

When computers were the size of small houses and required their own nuclear power station to run, data was moved between systems on magnetic tapes. Because computing time was a precious resource, operations were often done in bulk. Over time, a common pattern emerged: extracting data from the database in bulk, transforming that data in bulk, and finally, loading the transformed data back into the database in bulk. This process is also known as the Extract, Transform, and Load (ETL) process. The idea was to pull records from a data store, run some kind of calculation or transformation on them, and then load that data back into a data store. This worked not only intrasystem, loading data from an internal data source, but also intersystem, as data extracted from one system could be loaded into a second system from magnetic tapes that were physically transported between them.

In a sense, ETL was the first API. APIs are built from a combination of one or more data interchange formats and a transportation protocol. In the case of ETL, the data interchange format was almost always comma- or tab-delimited flat files: a delimiter, such as a comma, separated record fields, and line returns separated records. ETL's transportation protocol was magnetic tapes physically moved between systems, buildings, and cities. Today, APIs are far more automated, utilizing transportation protocols that don't rely on people or machines moving data on tapes. Nevertheless, they fundamentally rely on the same two components: machine-readable data and a transportation protocol. Modern APIs utilize machine-readable, text-based data interchange formats, such as XML and JSON, transported over modern protocols, such as HTTP and HTTPS. In fact, these building blocks underpin not only modern APIs but also the contemporary ETL tools built on top of them.

SOAP then REST – the evolution of modern APIs

In the early 2000s, two technologies, XML and the HTTP protocol, evolved together into a technology for remotely accessing data and procedures. Deceptively named Simple Object Access Protocol (SOAP), it proved to be less than simple and far more procedural than object oriented. Still, for many years, it was the preeminent method of exposing and consuming APIs. Part of the allure of SOAP was the extensive tooling that surrounded it. API vendors distributed Web Service Definition Language (WSDL) files that tooling could use to generate object code. Virtually every language from PHP to Apex provided such WSDL parsers. This made it convenient, if not simple, to consume APIs. The Salesforce1 platform provides the WSDL2Apex tool, which has now been open sourced, to do this conversion for you. While there are some limitations with obscure types of WSDL, in general, if you need to consume a SOAP API, WSDL2Apex will take care of it for you.

Newer APIs are almost always released by vendors as REST APIs. Representational State Transfer (REST) APIs, or RESTful APIs as they are sometimes called, are fundamentally very different from SOAP APIs. First and foremost, REST is less tightly coupled, allowing for greater flexibility in things such as data interchange formats. SOAP, conversely, is tightly coupled to XML and dictates that all SOAP APIs use XML for data exchange. In contrast, RESTful APIs can use a variety of data formats. JSON is the most common, but you can use Google's Protocol Buffers, Microsoft's OData, or even XML. Like SOAP APIs, RESTful APIs utilize the HTTP(S) protocol for data transmission; however, that's where the similarities end. While SOAP requests encode the action and data in the payload of the HTTP(S) request, REST uses the standard set of HTTP actions, which map by convention to types of API methods. Most RESTful APIs follow this convention:

  • GET: This has multiple uses, but it is primarily used for data retrieval.
  • POST: This processes the incoming data; it is typically used to create a new record.
  • PUT / PATCH: This updates a record.
  • DELETE: This deletes data.

Note

The PATCH action is a relatively new addition to the HTTP protocol. On servers or APIs that predate it, PUT is used.

In practice, these actions, combined with formatted data, allow you to Create, Read, Update, and Delete (CRUD) data. Because RESTful APIs are HTTP(S)-based, their URL endpoints typically define the object as well as the API version and (optionally) namespace.
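To make the convention concrete, here is how a single Salesforce-style resource changes meaning depending on the verb. This is purely illustrative: the version number is arbitrary and the record ID is truncated:

GET    /services/data/v35.0/sobjects/Account/001...   reads the record
POST   /services/data/v35.0/sobjects/Account/         creates a new record from the request body
PATCH  /services/data/v35.0/sobjects/Account/001...   updates the record from the request body
DELETE /services/data/v35.0/sobjects/Account/001...   deletes the record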

I oAuth, therefore I am

Salesforce provides a number of RESTful APIs, notably the sObject API, but also APIs for bulk data manipulation and streaming updates. These APIs are uniformly secured by the oAuth 2 authorization protocol. oAuth is conceptually simple; however, due to the nature of what it's doing, it's still relatively complex in practice. oAuth uses a client key and a client secret in concert with the information you provide, such as a username and password combination, to establish two things: first, that the app the oAuth server is communicating with is an approved one, and second, the identity of the user who is authenticating. In return for proof of an approved app and valid credentials, you'll receive an oAuth access token. This token is essentially a long, pseudo-random string of characters that functions as a proof of identity for a given app.

These tokens usually have a finite time to live, for instance, an hour, during which the token identifies you to the server for access. With RESTful APIs, you often have to provide this token in a request header called Authorization. Often, this value is prefixed by an identifying string, such as oAuth or Bearer. Salesforce's RESTful APIs all utilize this mechanism to authorize API calls. Specifically, Salesforce authorization headers should be formatted this way:

Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

We'll be making use of this authorization header throughout this chapter, so it's important that we figure out how to authorize our API calls. In order to make dealing with REST calls easier, let's use Postman, a free REST client plugin for Chrome. You can get Postman at www.getpostman.com. Postman lets you define collections of REST calls and save them together. For instance, you might create a sandbox collection with an authentication call and an upsert call saved in the collection. Handily, Postman is cloud based, so you can access your collections and requests anywhere you've logged into Chrome. The foundation for any Salesforce collection is the access token request, so let's set that up now:

(Screenshot: the Postman interface, with the token request being built)

In the preceding image, you'll see the basic Postman interface. On the left-hand side is the list of collections and their requests. On the right-hand side, we have the interface for building a request. At the top is the HTTP action selector, in this case, set to POST along with the endpoint URL. For our authentication call, we'll need to POST our information to https://login.salesforce.com/services/oauth2/token. We'll need to pass certain parameters with the POST call and they are set up in the upper portion of the right-hand side pane. You'll need to fill out the following keys and their values:

  • client_id: This can be found on the detail page of your connected app.
  • client_secret: This is also found on the detail page of your connected app.
  • grant_type: This must say password; this is not where you place your password.
  • username: Your normal Salesforce username.
  • password: Your password, immediately followed by your security token.

Once you've filled in that information, it's time to hit the Send button at the top. If all goes well, when you scroll down, you'll see a JSON-encoded response containing your access token! It should look something like this:

{
  "access_token": "00Di0000000ap76!MYDAU6HT3R.T3S5A.Isaw350m3AaBVlSd6sC30sYDoHS9_UTZK3_Mtj7TaIjtflXF96J3ybrrvVTqpU3rezCgjVFe56xdzY6m5AKht_d",
  "instance_url": "https://na15.salesforce.com",
  "id": "https://login.salesforce.com/id/NOTREAL0000ap76EAA/003i0000000fm0Faq3",
  "token_type": "Bearer",
  "issued_at": "1444961351427",
  "signature": "8cTp+CtTAr3Y0uR3ad1ngTh1$sIQHJg25gGSrivMKpYA="
}

This is the access token we'll need to authorize any other REST request we make.
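If you'd rather script this token request than use Postman, the same POST can be made from any HTTP library. Here's a minimal sketch using OkHttp, the same library used in the generated Java code later in this chapter; every value in the form body is a placeholder you'll need to replace with your own, and the @ sign in the username is percent-encoded as %40:

OkHttpClient client = new OkHttpClient();

// The token endpoint expects a form-encoded body; all five values are placeholders.
RequestBody body = RequestBody.create(
    MediaType.parse("application/x-www-form-urlencoded"),
    "grant_type=password"
    + "&client_id=YOUR_CONNECTED_APP_CLIENT_ID"
    + "&client_secret=YOUR_CONNECTED_APP_CLIENT_SECRET"
    + "&username=yourname%40example.com"
    + "&password=yourPasswordAndSecurityToken");

Request request = new Request.Builder()
    .url("https://login.salesforce.com/services/oauth2/token")
    .post(body)
    .build();

Response response = client.newCall(request).execute();
// The JSON body contains access_token and instance_url; parse it with your
// preferred JSON library and hold on to both values for subsequent calls.
System.out.println(response.body().string());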

Achievement unlocked – access token

Now that we have an access token, we can start to manipulate data. Because REST requests utilize HTTP(S) actions and URLs to access data, we first need to understand how the URLs are structured. All Salesforce REST APIs share a common root, /services, and a common structure:

/services/API_to_use/version/object

Thus, /services/data/v20.0/query is a request to the query endpoint of version 20 of the data API. Likewise, a simple request for information based on a known ID results in a URI like /services/data/v33.0/sobjects/account/001i000000FOKzS, which returns a JSON object containing the account's details. If you need a particular subset of fields, or need to find more than one record, use the query endpoint. The query endpoint accepts a URL-encoded SOQL query and returns a JSON object with a records key. This array contains the results of the query; however, only the fields you specify are returned. Importantly, each record will also have an attributes object attached to it. This attributes object details the sObject type and, most importantly, its URL. Take a look at the following example of a JSON response from Salesforce containing two fields:

{
  "totalSize": 1,
  "done": true,
  "records": [
    {
      "attributes": {
        "type": "Account",
        "url": "/services/data/v33.0/sobjects/Account/001i000000FOKzOAAX"
      },
      "Name": "GenePoint",
      "Id": "001i000000FOKzOAAX"
    }
  ]
}

Here, my SOQL statement requested Id and Name; the attributes child object, which I've highlighted, is attached to each record automatically. The url attribute is important, as it's the URL we'll make update and delete requests to.
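For reference, a response like the one above could have been produced by a GET request to the query endpoint along these lines; the SOQL is passed in the q parameter with spaces encoded as + signs, and depending on your client, the quotes may also need to be percent-encoded:

GET /services/data/v33.0/query?q=SELECT+Id,+Name+FROM+Account+WHERE+Name+=+'GenePoint'
Host: na15.salesforce.com
Authorization: Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX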

Making updates is simple: make a PATCH request to the url attribute of the object you want to update. In your request body, send a JSON object with the fields you wish to update and their new values, as shown here:

{
    "name": "Most Awesome Company!",
    "industry": "Yak Shaving"
}

Likewise, deletion is equally simple: send a DELETE request to the record's attributes url. It's important to note that both of these requests return a response code of 204 (No Content) when successful. If you want the updated version of the record, you'll have to make a follow-up GET request.
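As a rough sketch of what those two calls look like outside of Postman, here's an OkHttp version. The record URL reuses the url attribute from the earlier query response (which is relative, so the instance URL is prepended), and the bearer token is a placeholder:

OkHttpClient client = new OkHttpClient();
// The url attribute from the query response is relative, so prepend your instance URL.
String recordUrl = "https://na15.salesforce.com"
    + "/services/data/v33.0/sobjects/Account/001i000000FOKzOAAX";
String token = "Bearer XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX";   // placeholder

// PATCH only the fields we want to change; a 204 (No Content) response means success.
RequestBody patchBody = RequestBody.create(
    MediaType.parse("application/json"),
    "{ \"name\": \"Most Awesome Company!\", \"industry\": \"Yak Shaving\" }");
Request update = new Request.Builder()
    .url(recordUrl)
    .patch(patchBody)
    .addHeader("Authorization", token)
    .build();
client.newCall(update).execute();

// DELETE the same record; again, expect a 204 response with no body.
Request remove = new Request.Builder()
    .url(recordUrl)
    .delete()
    .addHeader("Authorization", token)
    .build();
client.newCall(remove).execute();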

Creating a new record isn't difficult either, but it does highlight how the HTTP actions work. While we make GET requests to retrieve data, PATCH requests to update data, and DELETE calls to remove data, we use the POST action to create data. The POST request is really more like a process request, in that it will process the request body in relation to the URL the request was made to. Thus, we can POST to /services/data/v20.0/sobjects/Account/ with a JSON request body, as follows:

{
    "name": "Most Awesome Co!",
    "industry": "Alchemy: Turning Caffeine into Code"
}

In response, the API will give us an ID but not the actual record or its attribute information. Specifically, the returned JSON object looks like this:

{
  "id": "001i000001e1PyqAAE",
  "success": true,
  "errors": []
}

You can use the success and errors keys to determine whether or not your object creation was successful. In the event of a failure, the errors key will be populated with information about what went wrong, helping you resolve the issue.

Putting it all together

We've talked about querying, creating, reading, updating, and deleting records via the REST API. There are a couple of additional tips that can make or break a REST API integration, namely, the governor limit and Location response headers. Governor limits on API access can be confusing, as they're calculated on a rolling 24-hour basis. In response to this, Salesforce applies a header titled Sforce-Limit-Info to every response. This header has an easy-to-understand value that can be machine read:

api-usage=10/15000

This allows you to build an integration with the Salesforce sObject API and intelligently determine whether you're nearing your rolling 24-hour limit. If you are, throttle back your calls. This prevents the potential loss of data that occurs when your integration loses access to the API due to governor limits.

Salesforce appends another custom header to data creation (POST) calls: the Location header. The value of this header is the actionable URL for the object you just created. Rather than making an additional query request to get the new object and its modification or deletion URL, simply inspect the response headers for the Location key.
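Both of these headers are easy to read programmatically. Here's a minimal OkHttp-style sketch, assuming response is the Response object returned by one of the calls above; the 90 percent threshold is an arbitrary choice for illustration:

// Sforce-Limit-Info arrives on every response as "api-usage=<used>/<limit>".
String limitInfo = response.header("Sforce-Limit-Info");
if (limitInfo != null) {
    String[] usage = limitInfo.replace("api-usage=", "").split("/");
    int used = Integer.parseInt(usage[0].trim());
    int max = Integer.parseInt(usage[1].trim());
    if (used >= max * 0.9) {
        // Within 10 percent of the rolling 24-hour limit -- throttle back our calls.
    }
}

// On a successful POST, the Location header holds the new record's URL.
String newRecordUrl = response.header("Location");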

One of the benefits of REST APIs is that virtually every programming environment has a robust HTTP(S) stack, capable of making requests to a URL and handling the marshalling and unmarshalling of data. This is so true that boilerplate templates have been created for common languages. Postman has these templates built in and can take your request and export working code in a number of languages. For instance, here's a Java version (using OkHttp) of a create call:

OkHttpClient client = new OkHttpClient();

MediaType mediaType = MediaType.parse("application/json");
RequestBody body = RequestBody.create(mediaType,
    "{ \"name\": \"Most Awesome Co!\", \"industry\": \"Alchemy: Turning Caffeine into Code\" }");
Request request = new Request.Builder()
  .url("https://na15.salesforce.com/services/data/v33.0/sobjects/Account/")
  .post(body)
  .addHeader("authorization", "Bearer 00Di0000000ap76!AQUAQDzVKaqKP4muP.AaBVlSd6sC30sYDoHS9_UTZK3_Mtj7TaIjtflXF96J3ybrrvVTqpU3KJPCgjVFe56xdzY6m5AKht_d")
  .addHeader("content-type", "application/json")
  .addHeader("cache-control", "no-cache")
  .addHeader("postman-token", "e1ae9a9a-c842-42e4-4d10-8f470fb56531")
  .build();

Response response = client.newCall(request).execute();

These generated code snippets are great for prototyping mobile applications, but you should always make sure that you understand and approve of the code before putting it into a production environment.

Bulk data for everyone! Look under your seat!

Sooner or later, you'll surely have to do a data load. The dataloader tools available for Salesforce are both legion and, for the most part, fantastic; in a way, they're the last vestiges of the L in ETL. Unfortunately, while most dataloaders are to some extent scriptable and schedulable, they are not truly APIs. They generally require generating a CSV file as an intermediary step. Newer dataloader systems use the bulk data API, but historically they utilized the standard sObject REST or SOAP APIs. In contrast to the oAuth-authenticated RESTful sObject API, the bulk data API is designed to authenticate via a SOAP call. Additionally, it is architected in a way that facilitates data loads from external systems and applications in a much faster and more efficient manner than the standard sObject API. This efficiency comes from the ability to process multiple batches of data in parallel. Let's compare the standard sObject REST API and the bulk data API side by side:

(Figure: the standard sObject REST API and the bulk data API compared side by side)

Uploading using the standard sObject REST API is done sequentially. Summer '15 introduced the ability to batch multiple requests into a single call using the /composite/batch resource, but even with this, the records are still processed sequentially. In other words, each object's upsert has its own potential cascade of triggers, workflow rules, and so on, which must complete before the next record is processed. This is fine for 5-10 records, but it breaks down with hundreds or thousands of records. In contrast, the bulk API requires multiple requests: first to create a job and then to upload batches of data. These batches, however, can contain up to 10,000 records each. Additionally, you can submit up to 5,000 batches in a rolling 24-hour period, in effect allowing you to upload 50 million records a day! What's even more amazing is that these batches are processed in parallel with each other.

To use the bulk API, you'll first need to authenticate. Like the sObject API, once we've authenticated, we'll pass our credentials via a request header. Unlike the sObject API, the bulk API does not use the Authorization header, but rather a custom header titled X-SFDC-Session. To get your session ID, utilize the SOAP API's login method, which returns one.
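For reference, the SOAP login call is just a small XML envelope POSTed over HTTPS. Here is a sketch of that request, with the API version, username, and password as placeholders; the response body contains a sessionId element, and that value is what goes into the X-SFDC-Session header:

POST https://login.salesforce.com/services/Soap/u/35.0
Content-Type: text/xml; charset=UTF-8
SOAPAction: login

<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:urn="urn:partner.soap.sforce.com">
 <env:Body>
  <urn:login>
   <urn:username>[email protected]</urn:username>
   <urn:password>passwordPlusSecurityToken</urn:password>
  </urn:login>
 </env:Body>
</env:Envelope>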

Once you've authenticated, you can create a job. Jobs act as the parent record for batches and give you insight into the overall progress of the data load. Once you've created a job, you'll need to add batches to it. Batches are essentially just data with a reference to their parent job. However, like jobs, it's possible to use the batch ID and the bulk API to monitor an individual batch. Create and monitor these objects using the following resource URIs and HTTP actions:

  • Create Job: POST to /services/async/APIversion/job
  • Close Job: POST to /services/async/APIversion/job/jobId
  • Job Details: GET to /services/async/APIversion/job/jobId
  • Abort Job: POST to /services/async/APIversion/job/jobId
  • Add Batch: POST to /services/async/APIversion/job/jobId/batch
  • Batch Details: GET to /services/async/APIversion/job/jobId/batch/batchId
  • Details for all Batches: GET to /services/async/APIversion/job/jobId/batch
  • Batch Results: GET to /services/async/APIversion/job/jobId/batch/batchId/result

If you look carefully at those URI and action pairs, you'll notice that a couple of them seem to be identical. This isn't a mistake. Remember that POST should be understood as "process this data". POST calls always require data to be sent, and the content of that data is what differentiates a close job request from an abort request. I want to draw your attention to these POST requests, as that's where your input matters the most.

To create a job, the request body must contain a well-formatted XML document describing a JobInfo object, with three information nodes: operation, object, and contentType. The operation can be any one of the following: insert, upsert, update, or delete. It's important to keep your operation value lowercase, as Insert or Update will fail. The object node specifies which of your org's objects, standard or custom, to manipulate. This field is not case sensitive. Finally, you'll need to explicitly specify the data's contentType. Your options here are XML, CSV, ZIP_XML, and ZIP_CSV. The ZIP versions are for objects with binary attachments! Here's an example POST request body to create a new CSV-based bulk upsert job on Contact:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
   xmlns="http://www.force.com/2009/06/asyncapi/dataload">
 <operation>upsert</operation>
 <object>Contact</object>
 <contentType>CSV</contentType>
</jobInfo>

POSTing this request body to /services/async/35.0/job would result in a response similar to this:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo
   xmlns="http://www.force.com/2009/06/asyncapi/dataload">
 <id>750D00000000023IAF</id>
 <operation>upsert</operation>
 <object>Contact</object>
 <state>Open</state>
 <contentType>CSV</contentType>
</jobInfo>

Two fields are especially important in this response: the id and the state fields. We need the ID to add batches to this job, and the state tells us whether or not the job is open for new batches. Closing a job prevents other batches from being added. Armed with our job's ID, we can now start adding batches with POST requests to:

/services/async/35.0/job/750D00000000023IAF/batch

Because our job is set to use CSV data, our POST request body looks like this:

FirstName,LastName,Title,ReportsTo.Email,Birthdate,Description
Stephanie,Poorman,Senior Director Poorman family,[email protected],1984-06-07,"Best darn wife this side of Pluto"
Tessa,Poorman,Destructosaurus,[email protected],2014-03-11,"World-renowned expert in toddling about and making messes."

In response to that request, you'll receive an XML response with details of the batch, like this:

<?xml version="1.0" encoding="UTF-8"?>
<batchInfo
   xmlns="http://www.force.com/2009/06/asyncapi/dataload">
 <id>751D0000000004fIAA</id>
 <jobId>750D00000000023IAF</jobId>
 <state>Queued</state>
 <createdDate>2015-10-14T18:15:59.000Z</createdDate>
 <systemModstamp>2015-10-14T18:15:59.000Z</systemModstamp>
 <numberRecordsProcessed>0</numberRecordsProcessed>
</batchInfo>

With the id attribute returned by the batch creation POST, we can grab its details with a GET request to:

/services/async/35.0/job/750D00000000023IAF/batch/751D0000000004fIAA

Such a GET request will return largely the same information the creation call gave, but with up-to-the-minute data. Once the batch progresses from the Queued state through InProgress and on to Completed, we can query for results. Now that we have at least one batch associated with our job, we can close or abort it. Closing the job requires a payload like this:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
 <state>Closed</state>
</jobInfo>
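Aborting a job, by contrast, is the same POST to the same URI with a different state value; a sketch of that payload, based on the same jobInfo structure, looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">
 <state>Aborted</state>
</jobInfo>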

Having closed the job, we've covered the life cycle of the bulk API: creating a job, adding batches, and closing the job. Monitoring jobs and their batches can also be done in the UI by visiting the Bulk Data Load Jobs page in Setup.
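If you'd rather drive this flow from code than from Postman, the same three calls can be strung together with any HTTP library. Here's a compact OkHttp sketch of the job life cycle; the instance URL, session ID, and job ID are placeholders, and in real code you would parse the job ID out of the job-creation response rather than hardcoding it:

OkHttpClient client = new OkHttpClient();
String instance = "https://na15.salesforce.com";
String sessionId = "SESSION_ID_FROM_SOAP_LOGIN";   // placeholder
String jobInfoXml = "...";   // the jobInfo request body shown earlier
String csvData = "...";      // the CSV batch data shown earlier

// 1. Create the job; the response XML contains the job's <id>.
Request createJob = new Request.Builder()
    .url(instance + "/services/async/35.0/job")
    .post(RequestBody.create(MediaType.parse("application/xml"), jobInfoXml))
    .addHeader("X-SFDC-Session", sessionId)
    .build();
Response jobResponse = client.newCall(createJob).execute();
String jobId = "750D00000000023IAF";   // parse this from jobResponse in real code

// 2. Add a CSV batch to the open job.
Request addBatch = new Request.Builder()
    .url(instance + "/services/async/35.0/job/" + jobId + "/batch")
    .post(RequestBody.create(MediaType.parse("text/csv; charset=UTF-8"), csvData))
    .addHeader("X-SFDC-Session", sessionId)
    .build();
client.newCall(addBatch).execute();

// 3. Close the job so no further batches can be added.
String closeXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
    + "<jobInfo xmlns=\"http://www.force.com/2009/06/asyncapi/dataload\">"
    + "<state>Closed</state></jobInfo>";
Request closeJob = new Request.Builder()
    .url(instance + "/services/async/35.0/job/" + jobId)
    .post(RequestBody.create(MediaType.parse("application/xml"), closeXml))
    .addHeader("X-SFDC-Session", sessionId)
    .build();
client.newCall(closeJob).execute();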

All good things have their limits

It's important to know the limits of the bulk API before using it. Unlike the simple limits of the sObject API, the bulk API's limits can be a little more complex. The first few limits are straightforward:

  • You can only submit 5,000 batches every 24 hours.
  • Each batch can have, at the most, 10,000 records.
  • Importantly, the entire CSV or XML can contain no more than 10,000,000 characters. If your records contain a large number of fields, your limit won't be the number of records but the number of characters.
  • Likewise, each field has a maximum character capacity of 32,000 characters, but the entire record must contain less than 400,000 characters.
Dizzying numbers aside, the bulk data limits described at https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/asynch_api_concepts_limits.htm also illustrate a neat trick. Note that batches can only be added to jobs that are less than 24 hours old. What's not said is that once the job is created, you can add batches to it all day long, leaving it open until the day is done. This allows you to open one job per object and add batches as needed, lowering the overall call overhead of using the bulk API.

Use cases for the bulk API

While the bulk API is vastly different from the sObject API, it still follows RESTful principles, relying on endpoint URLs and HTTP action verbs to map actions. Determining when and where to use the bulk API isn't as clear-cut, however. In general, the use cases for the bulk API are different from those of the sObject API, so it's often easy to determine when to use it. Here are some guidelines for determining which of these two APIs to use.

It takes at least three API calls to use the bulk API: one to create the job, one to add the batch, and one to close the job. With those three calls, however, you can upload up to 10,000 records. Conversely, using the sObject API, you can only create 600 records in three API calls. Thus, the first guideline is based on the dataset size: if you have more than 600 records, use the bulk API.

The hardest cases to determine are those whose data source is transactional, for instance, an enterprise order system where the data volume of an individual transaction may be small, but the number of transactions may be high. In these situations, where the data source is creating two or three records at a time, hundreds or thousands of times a day, you have to evaluate the number of transactions against your available API calls. For instance, if each transaction creates fewer than 200 records, each transaction counts against one of your API calls. This is fine until you start hitting tens of thousands of transactions a day. If you start to hit one-third of your 24-hour API call limit, I suggest that you find a way to collect and combine these transactions into larger batches you can upload with the bulk API.

I always try to bias toward using the bulk API whenever I'm not creating a single record. My thought behind this is that if I'm creating 25 records with this integration today, when my company hits triple-digit year-over-year growth, it's likely that the 25 will become 25,000 or more. Planning for the bulk API is admittedly a premature optimization, but at the cost of two extra API calls and a little forethought, it's a premature optimization I'm comfortable with.
