Including a Twitter search in your application

This example shows you how to include the results of a Twitter search in your application. Unlike the first recipe of this chapter, it is not client based: the result is downloaded to the server and then rendered for the client. This poses a possible problem.

What happens if your server cannot reach Twitter? There can be any number of reasons: your DNS is flaky, Twitter is down, the routing to Twitter is broken, or you are issuing too many requests and have been banned, among many others. However, this should not affect your application. It may, of course, affect what data is displayed on the page, but it must never stop or block any of your services. Your complete system has to remain unaffected by the failure of any external system. This recipe shows a small example and happens to use the Twitter API for it; you can, however, apply the principle behind it to any arbitrary API you connect to. You can get more information about the Twitter API used here at http://dev.twitter.com/doc/get/search.

The source code of the example is available at examples/chapter4/mashup-twitter-search.

Getting ready

All you need is a small application which gets some data from an external source.

How to do it...

In order to be a little bit dynamic, add the following query to the application.conf file:

twitter.query=http://search.twitter.com/search.json?q=playframework%20OR%20from.playframework&lang=en

Create a POJO (Plain Old Java Object) which models the mandatory fields of a Twitter search query response, as shown in the following code snippet:

import java.io.Serializable;
import java.util.Date;

import com.google.gson.annotations.SerializedName;

public class SearchResult implements Serializable {

  @SerializedName("from_user") public String from;
  @SerializedName("created_at") public Date date;
  @SerializedName("text") public String text;
}
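The fields mapped above come from the search reply itself. A trimmed response in the shape the API returns looks roughly like this (the values here are made up for illustration; the field names are the ones the API uses):

```json
{
  "query": "playframework",
  "results": [
    {
      "from_user": "someuser",
      "created_at": "Mon, 10 Jan 2011 08:30:00 +0000",
      "text": "Just tried the new Play framework release"
    }
  ]
}
```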

Write a job which queries the Twitter service every 10 minutes and stores the results, as shown in the following code snippet:

@OnApplicationStart
@Every("10min")
public class TwitterSearch extends Job {

  public void doJob() {
    String url = Play.configuration.getProperty("twitter.query");
    
    if (url == null) {
      return;
    }
    
    JsonElement element = WS.url(url).get().getJson();
    
    if (!element.isJsonObject()) {
      return;
    }
    
    JsonObject jsonObj = (JsonObject) element;

    Gson gson = new GsonBuilder().setDateFormat("EEE, dd MMM yyyy HH:mm:ss Z").create();
    
    Type collectionType = new TypeToken<Collection<SearchResult>>(){}.getType();
    Collection<SearchResult> search = gson.fromJson(jsonObj.get("results"), collectionType);
    search = removeDuplicates(search);

    Cache.set("twitterSearch", search);
  }

 
  private Collection<SearchResult> removeDuplicates(Collection<SearchResult> search) {
    Collection<SearchResult> nonduplicateSearches = new LinkedHashSet<SearchResult>();
    Set<String> contents = new HashSet<String>();

    for (SearchResult searchResult : search) {
      if (!contents.contains(searchResult.text)) {
        nonduplicateSearches.add(searchResult);
        contents.add(searchResult.text);
      }
    }

    return nonduplicateSearches;
  }
}
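The date pattern handed to GsonBuilder above must match Twitter's created_at format. A quick stand-alone check of that pattern with the plain JDK (no Play or Gson required; the timestamp is an illustrative value in the API's format):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateFormatCheck {
  public static void main(String[] args) throws Exception {
    // Same pattern as handed to GsonBuilder.setDateFormat() in the job
    SimpleDateFormat twitterFormat =
        new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss Z", Locale.ENGLISH);

    // A created_at value in the shape the search API returns
    Date date = twitterFormat.parse("Mon, 10 Jan 2011 08:30:00 +0000");

    // Render it back in UTC to verify the parse
    SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ENGLISH);
    utc.setTimeZone(TimeZone.getTimeZone("UTC"));
    System.out.println(utc.format(date));  // 2011-01-10 08:30
  }
}
```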

Put the Twitter results in your page rendering code, as shown in the following code snippet:

public class Application extends Controller {

  @Before
  public static void twitterSearch() {
    Collection<SearchResult> results = Cache.get("twitterSearch", Collection.class);
    if (results == null) {
      results = Collections.emptyList();
    }
    renderArgs.put("twitterSearch", results);
  }
  
  public static void index() {
    render();
  }
}
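The fallback in twitterSearch() is what keeps the page rendering even if the job has never run or Twitter was unreachable. The same pattern, sketched with a plain Map standing in for Play's Cache:

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CacheFallback {
  public static void main(String[] args) {
    Map<String, Object> cache = new HashMap<>();  // stand-in for Play's Cache

    // Nothing cached yet: the job has not run, or the remote call failed
    @SuppressWarnings("unchecked")
    Collection<String> results = (Collection<String>) cache.get("twitterSearch");
    if (results == null) {
      results = Collections.emptyList();  // render an empty list, never fail
    }
    System.out.println("tweets to render: " + results.size());
  }
}
```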

The final step is to create the template code, as shown in the following code snippet:

#{extends 'main.html' /}
#{set title:'Home' /}

#{cache 'twitterSearchesRendered', for:'10min'}
  <ul>
    #{list twitterSearch, as:'search'}
    <li><i>${search.text}</i> by ${search.from}, ${search.date.since()}</li>
    #{/list}
  </ul>
#{/cache}

How it works...

As you can see, it is not overly complex to achieve independence from your API providers with a few tricks.

Configuration is pretty simple. The URL represents a simple query searching for everything that contains Play framework, or is from the @playframework Twitter account. The created page should therefore stay up to date with the Play framework.

The SearchResult class maps the mandatory fields of the JSON search reply returned by the URL defined in the configuration file. If you put this URL into a browser, you will see a JSON reply whose entries have from_user, created_at, and text fields. As this naming scheme does not fit Java conventions, the class uses the @SerializedName annotation to map each JSON field to a property. You could map more fields if you wanted. Note that @SerializedName is a helper annotation from the Gson library.

The logic is placed in the TwitterSearch class, which is a job. This decouples the query to Twitter from any incoming HTTP request: you do not want to query an external API at the moment a new request comes in. Of course, there are special cases, such as market rates, that have to be live data; in most cases, however, it is no problem if the data is not real-time. Decoupling solves several problems. The client no longer has to wait while the remote request is issued, nor while the response is parsed. This is especially important for Play, in order not to block other clients that are requesting resources at the same time.
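This decoupling can be sketched with plain JDK tools: a scheduled background task refreshes a shared reference, while request handlers only ever read it. This is a stand-in for Play's job and cache machinery, not the real classes; fetchRemote() is a hypothetical placeholder for the WS call:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class DecoupledFetch {
  // Stand-in for Play's Cache: request handlers only ever read this reference
  static final AtomicReference<String> cache = new AtomicReference<>("");

  // Hypothetical placeholder for the remote call (WS.url(...).get() in the recipe)
  static String fetchRemote() {
    return "latest tweets";
  }

  static void refresh() {
    try {
      cache.set(fetchRemote());
    } catch (RuntimeException e) {
      // Remote failure: keep serving the last good value
    }
  }

  public static void main(String[] args) {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    // The @Every("10min") equivalent: refresh in the background, off the request path
    scheduler.scheduleAtFixedRate(DecoupledFetch::refresh, 0, 10, TimeUnit.MINUTES);

    refresh();                        // run once directly so the demo is deterministic
    System.out.println(cache.get());  // what a request handler would read
    scheduler.shutdown();
  }
}
```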

The TwitterSearch doJob() method checks whether a configuration URL has been provided. If this is the case, it is fetched via the WS class, a very useful helper class, and directly parsed as a JSON element. If the returned JSON element is a complex JSON object, a Google Gson parser is created. It is created with a special date format, which automatically parses the date in the created_at field and saves the need to write a custom date deserializer. The results field inside the JSON reply contains all Twitter messages; this field should be deserialized into a collection of SearchResult instances. Because this is a generic collection, a special collection type has to be created and handed over to the gson.fromJson() method; this is done with the TypeToken class. Finally, the removeDuplicates() method filters out retweets by not allowing duplicate text content in the collection of SearchResult instances, which makes sure that boring retweets do not clutter your list of tweets. After the collection has been cleared of duplicates, it is put into the cache.

The controller has a method with a @Before annotation, which checks the cache for a list of SearchResults. If there is no cache hit, it creates an empty collection. In any case, the collection is put into the renderArgs variable and can be used in any template of this controller.

The last step is to display the content. If you take a look at the template, the caching feature is used here again. Using the cache here is absolutely optional. At worst, you get a delay of 20 minutes until your data is updated, because the job only runs every ten minutes and the template additionally caches for ten minutes. Think about whether such caching makes sense in your application before implementing it.

There's more...

Even though caching is easy and fast to implement, there are scenarios where it might not be needed, for example when you can tell the client to fetch the data itself. As usual, try to create applications without caching, and only add it when needed.

Make it a client side API

Check out the search documentation on the Twitter site at http://dev.twitter.com/doc/get/search and note that the JSON API supports a callback parameter. This makes it pretty easy to build this as a client-side search, so your servers do not have to issue the request to Twitter at all. You should check any API to see whether it is possible to offload work to the client like this. It keeps your application even more scalable and independent, from a server point of view if not from a functionality point of view.

Add caching to your code late

Whenever you develop new features and start integrating APIs, you should add the caching feature as late as possible. Otherwise you might stumble over strange exceptions caused by putting invalid or incomplete data into the cache because of incorrect error handling, or by putting non-serializable objects into the cache. Keep this in mind as soon as you come across error messages or exceptions when trying to read data from the cache. Again, cover everything with tests as much as possible.

Be fast with asynchronous queries

If you have a use case where you absolutely must get live data from another system, you can still speed things up a little. Imagine the following controller call, which returns two daily quotes and queries remote hosts in order to always get the latest quote instead of a boring cached one, as shown in the following code snippet:

public static void quotes() throws Exception {

  Promise<HttpResponse> promise1 = WS.url("http://jamocreations.com/widgets/travel-quotes/get.php").getAsync();
  Promise<HttpResponse> promise2 = WS.url("http://www.iheartquotes.com/api/v1/random?format=json").getAsync();

  // code here, preferably long running like db queries...
  // ...
  List<HttpResponse> resps = Promise.waitAll(promise1, promise2).get();

  if(resps.get(0) != null) {
    renderArgs.put("cite1", resps.get(0).getXml().getElementsByTagName("quote").item(0).getChildNodes().item(1).getTextContent());
  }

  if(resps.get(1) != null) {
    renderArgs.put("cite2", ((JsonObject) resps.get(1).getJson()).get("quote").getAsString());
  }
  
  render();
}

This triggers both external HTTP requests in a non-blocking fashion instead of calling them sequentially, and then lets you perform some database queries in the meantime. After that you can access the promises, at which point the probability that they have already finished is much higher. The preceding snippet is included in examples/chapter5/mashup-api.
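The same fire-early, collect-late pattern can be sketched with the JDK's CompletableFuture as a stand-in for Play's Promise and WS.getAsync(); the endpoints are replaced by hypothetical slow calls so the example is self-contained:

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ParallelQuotes {
  // Stand-in for a remote HTTP call; real code would use an HTTP client here
  static String slowCall(String name, long millis) {
    try { Thread.sleep(millis); } catch (InterruptedException ignored) {}
    return name + " response";
  }

  public static void main(String[] args) {
    // Fire both requests without blocking, like WS.url(...).getAsync()
    CompletableFuture<String> f1 = CompletableFuture.supplyAsync(() -> slowCall("quotes1", 200));
    CompletableFuture<String> f2 = CompletableFuture.supplyAsync(() -> slowCall("quotes2", 200));

    // ... do other work here, for example database queries ...

    // Equivalent of Promise.waitAll(...).get(): block until both responses are in.
    // Both calls ran concurrently, so this typically takes ~200ms rather than ~400ms.
    CompletableFuture.allOf(f1, f2).join();
    List<String> resps = Arrays.asList(f1.join(), f2.join());
    System.out.println(resps);
  }
}
```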
