Working with real data: fetching trending tweets

Many online entities format their response data as JSON or XML in their Application Programming Interfaces (APIs), exposing pertinent information to third-party developers who can then integrate this data into their applications.

One such online entity is Twitter. In this recipe, we are going to make a command-line application that makes two requests to Twitter's REST service. The first will retrieve the most popular current topics on Twitter and the second will return the most recent tweets regarding the hottest topic on Twitter.

Getting ready

Let's create our file and name it twitter_trends.js. We may also wish to install the third-party colors module to make our output more beautiful:

npm install colors

How to do it...

We'll need the http module in order to make requests, and the colors module to get some color in our console output:

var http = require('http');
var colors = require('colors');

We're going to be making a GET request inside another GET request. Between these requests we'll be processing JSON data to either pass into the subsequent request or to output to console. In the spirit of DRY (Don't Repeat Yourself), and to demonstrate how to avoid spaghetti code, we'll abstract our GET requests and JSON handling into a function called makeCall.

function makeCall(urlOpts, cb) {
  http.get(urlOpts, function (response) { //make a call to the twitter API  
    trendingTopics.jsonHandler(response, cb);
  }).on('error', function (e) {
    console.log("Connection Error: " + e.message);
  });
}

Notice the mysterious appearance of trendingTopics and its jsonHandler method. trendingTopics is an object that is going to provide all the settings and methods for our Twitter interactions. jsonHandler is a method on the trendingTopics object for receiving the response stream and converting the JSON to an object.

We need to set up options for our calls to the trends and tweets APIs, along with some Twitter interaction-related functionality. So above our makeCall function, we'll create the trendingTopics object as follows:

var trendingTopics = module.exports = {
  trends: {
    urlOpts: {
      host: 'api.twitter.com',
      path: '/1/trends/1.json', //1.json provides global trends,
      headers: {'User-Agent': 'Node Cookbook: Twitter Trends'}
    }
  },
  tweets: {
    maxResults: 3, //twitter applies this very loosely for the "mixed" type
    resultsType: 'realtime', //choice of mixed, popular or realtime
    language: 'en', //ISO 639-1 code
    urlOpts: {
      host: 'search.twitter.com',
      headers: {'User-Agent': 'Node Cookbook: Twitter Trends'}
    }
  },
  jsonHandler: function (response, cb) {
    var json = '';
    response.setEncoding('utf8');
    if (response.statusCode === 200) {
      response.on('data', function (chunk) {
        json += chunk;
      }).on('end', function () {
        cb(JSON.parse(json));
      });
    } else {
      throw ("Server Returned statusCode error: " + response.statusCode);
    }
  },
  tweetPath: function (q) {
    var p = '/search.json?lang=' + this.tweets.language + '&q=' + q +
        '&rpp=' + this.tweets.maxResults + '&include_entities=true' +
        '&with_twitter_user_id=true&result_type=' +
        this.tweets.resultsType;
    this.tweets.urlOpts.path = p;
  }
};

While creating the trendingTopics variable, we also turn the object into a module by simultaneously loading it into module.exports. See how we use this in the There's more... section.

Within our trendingTopics object, we have the trends and tweets objects and two methods: jsonHandler and tweetPath.

Finally, we'll invoke our makeCall function to request the top global trends from the Twitter trends API, convert the returned JSON to an object, and use this object to build the path for requesting tweets on the highest trending topic with a second, nested makeCall invocation.

makeCall(trendingTopics.trends.urlOpts, function (trendsArr) {
  trendingTopics.tweetPath(trendsArr[0].trends[0].query);
  makeCall(trendingTopics.tweets.urlOpts, function (tweetsObj) {
    tweetsObj.results.forEach(function (tweet) {
      console.log("\n" + tweet.from_user.yellow.bold + ': ' + tweet.text);
    });
  });
});

How it works...

Let's pick apart the trendingTopics object. trends and tweets provide settings relevant to Twitter's API. For trends this is simply a URL options object to be passed into http.get later on. In the tweets object, we have the URL object along with some other properties pertaining to options we can set within our REST call to the Twitter search API.

Twitter API and the User-Agent header

Notice we've gone to the trouble of setting a User-Agent header. This is due to the Twitter API policy, which penalizes a lack of User-Agent string by imposing a lower rate limit.

Our jsonHandler method on the trendingTopics object takes response and cb (callback) parameters. trendingTopics.jsonHandler uses the response object from the http.get call to capture the incoming data stream into a variable (json). When the stream has ended, which is detected using the end event listener on response, cb is invoked with the parsed JSON as its parameter. The callback passed to trendingTopics.jsonHandler is the same one that was handed to makeCall.
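To see why the handler waits for the end event before parsing, here is a minimal offline sketch. It reuses the same accumulate-then-parse logic, but feeds it from a plain EventEmitter instead of a real http response. The fakeResponse object and the sample JSON body are assumptions made purely for illustration.

```javascript
var EventEmitter = require('events').EventEmitter;

// Same accumulate-then-parse logic as trendingTopics.jsonHandler.
function jsonHandler(response, cb) {
  var json = '';
  response.setEncoding('utf8');
  if (response.statusCode === 200) {
    response.on('data', function (chunk) {
      json += chunk; // a chunk may end in the middle of a JSON token
    }).on('end', function () {
      cb(JSON.parse(json)); // only the complete string is valid JSON
    });
  } else {
    throw ("Server Returned statusCode error: " + response.statusCode);
  }
}

// Hypothetical stand-in for the http response: an emitter carrying the
// two members jsonHandler touches (statusCode and setEncoding).
var fakeResponse = new EventEmitter();
fakeResponse.statusCode = 200;
fakeResponse.setEncoding = function () {};

var parsed;
jsonHandler(fakeResponse, function (obj) { parsed = obj; });

// Deliver the (made-up) body in two pieces, split mid-token.
fakeResponse.emit('data', '[{"trends":[{"query":"%23no');
fakeResponse.emit('data', 'dejs"}]}]');
fakeResponse.emit('end');

console.log(parsed[0].trends[0].query); // %23nodejs
```

Neither chunk is valid JSON on its own, which is why parsing inside the data listener would fail.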

makeCall abstractly combines the GET request and JSON handling, and provides a callback with a single parameter which is the data returned by Twitter as parsed JSON (in this case, it is an array of objects).

In the outer makeCall invocation we call the parameter trendsArr, because Twitter returns its JSON data in an array wrapper. We use trendsArr to locate the query fragment representing Twitter's top trend and pass it to the final method of our trendingTopics object: trendingTopics.tweetPath. This method takes a query fragment (q) as its single parameter. It then uses this parameter along with the options in trendingTopics.tweets to build the final Search API path. It injects this path into the urlOpts object of trendingTopics.tweets, which is then passed through into the inner makeCall invocation.
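As a quick check of what tweetPath produces, the following sketch copies the method onto a cut-down object and runs it without any network access. The demo object and the '%23nodejs' query fragment are assumptions for illustration; the settings mirror those in the recipe.

```javascript
// Cut-down, hypothetical copy of trendingTopics holding only what
// tweetPath needs.
var demo = {
  tweets: {
    maxResults: 3,
    resultsType: 'realtime',
    language: 'en',
    urlOpts: { host: 'search.twitter.com' }
  },
  tweetPath: function (q) {
    var p = '/search.json?lang=' + this.tweets.language + '&q=' + q +
        '&rpp=' + this.tweets.maxResults + '&include_entities=true' +
        '&with_twitter_user_id=true&result_type=' +
        this.tweets.resultsType;
    this.tweets.urlOpts.path = p; // the path is stored, not returned
  }
};

demo.tweetPath('%23nodejs'); // made-up, already URL-encoded fragment
console.log(demo.tweets.urlOpts.path);
// /search.json?lang=en&q=%23nodejs&rpp=3&include_entities=true&with_twitter_user_id=true&result_type=realtime
```

Note that tweetPath returns nothing. Its only effect is to set the path property on the urlOpts object, which is why the same urlOpts can be handed straight to the next request.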

In the inner makeCall invocation we name the parameter tweetsObj. This is an object returned from the Twitter Search API in response to a query for the top trend discovered via the former (outer) call to the Trends API; its results property is an array of objects containing tweet data. We loop through that array using the ES5 forEach method, handling each element passed through the loop as tweet.

The objects contained in the tweetsObj.results array hold lots of data, such as time information, number of retweets, and so forth. However, we're just interested in the content of the tweet and who tweeted it, so we log the from_user and text properties of each tweet to the console.

This is also where the colors module comes in handy since, within console.log we have tweet.from_user.yellow.bold. The colors are not properties on the object returned by Twitter, but rather some trickery performed by the colors module to provide an easy interface for styling console text.
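The general idea behind this kind of trickery can be sketched with getters defined on String.prototype that wrap the string in ANSI escape codes. This is an illustration of the technique only, not the colors module's actual source, and the property names demoYellow and demoBold are hypothetical (chosen so they cannot clash with the real module).

```javascript
// Illustrative only: NOT the colors module's implementation.
// Each entry maps a hypothetical property name to the ANSI codes
// that switch a style on and off again.
var ansi = {
  demoYellow: ['\x1b[33m', '\x1b[39m'],
  demoBold: ['\x1b[1m', '\x1b[22m']
};

Object.keys(ansi).forEach(function (style) {
  Object.defineProperty(String.prototype, style, {
    // A getter runs on property access, so no parentheses are needed.
    get: function () {
      return ansi[style][0] + this + ansi[style][1];
    }
  });
});

// Each getter returns a new string, so the properties can be chained.
var styled = 'nodejs'.demoYellow.demoBold;
console.log(styled + ': hello');
```

Because each getter returns an ordinary string, the result can be concatenated and logged like any other text; the terminal interprets the escape codes when printing.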

There's more...

Let's look into working with an XML-based service.

Cross referencing Google Hot Trends with Twitter tweets

Trending tweets tend to be driven by fads generated within the Twitter community itself. Google Hot Trends is another source of trending information. It provides hourly updates of the highest trending searches.

We can extend our example to access and process Google's Hot Trends XML atom feed, and then integrate the top result into our Twitter Search API request. To do this, let's create a new file called google_trends.twitter.js. It's nice to work with XML data as a JavaScript object, so we'll require the non-core xml2js featured in the Converting an object to XML and back again recipe in this chapter, along with http, colors, and our own trendingTopics module.

var http = require('http');
var xml2js = new (require('xml2js')).Parser();

var colors = require('colors'); //for prettifying the console output
var trendingTopics = require('./twitter_trends'); //load trendingTopics obj

Now we'll extend our trendingTopics object by inheriting from it using the ECMAScript 5 Object.create method.

var hotTrends = Object.create(trendingTopics, {
  trends: {
    value: {
      urlOpts: {
        host: 'www.google.com',
        path: '/trends/hottrends/atom/hourly',
        headers: {'User-Agent': 'Node Cookbook: Twitter Trends'}
      }
    }
  }
});

hotTrends.xmlHandler = function (response, cb) {
  var hotTrendsfeed = '';
  response.on('data', function (chunk) {
    hotTrendsfeed += chunk;
  }).on('end', function () {
    xml2js.parseString(hotTrendsfeed, function (err, obj) {
      if (err) { throw (err.message); }
      xml2js.parseString(obj.entry.content['#'], function (err, obj) {
        if (err) { throw (err.message); }
        cb(encodeURIComponent(obj.li[0].span.a['#']));
      });
    });
  });
};

We declared a variable called hotTrends and used Object.create to make a new object that inherits from trendingTopics, redefining the trends property via the property declarations object (the second parameter of Object.create). This means that instead of trends being an inherited property, it now belongs to hotTrends, and adding it to hotTrends has not overwritten the trends property in trendingTopics.
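The difference between own and inherited properties can be checked with a small standalone sketch. The base and derived objects below are simplified, hypothetical stand-ins for trendingTopics and hotTrends.

```javascript
// Simplified stand-in for trendingTopics.
var base = {
  trends: { urlOpts: { host: 'api.twitter.com' } },
  tweets: { language: 'en' }
};

// Simplified stand-in for hotTrends: inherits from base, but defines
// its own trends property through the property declarations object.
var derived = Object.create(base, {
  trends: { value: { urlOpts: { host: 'www.google.com' } } }
});

console.log(derived.trends.urlOpts.host);      // www.google.com
console.log(base.trends.urlOpts.host);         // api.twitter.com
console.log(derived.hasOwnProperty('trends')); // true
console.log(derived.hasOwnProperty('tweets')); // false
console.log(derived.tweets === base.tweets);   // true
```

One consequence worth keeping in mind: tweets is inherited rather than copied, so both objects refer to the same tweets object. A method that writes to this.tweets.urlOpts through the derived object therefore changes what the base object sees as well.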

We then add a new method: hotTrends.xmlHandler. This combines all the incoming chunks into the hotTrendsfeed variable. Once the stream has ended, it invokes xml2js.parseString and passes the XML contained in hotTrendsfeed into it. In the callback of the first parseString method, we invoke xml2js.parseString again. Why? Because we have to parse two sets of XML, or rather one set of XML and one set of adequately formed HTML. (If we head to http://www.google.com/trends/hottrends/atom/hourly it will be rendered as HTML. If we view the source, we'll then see an XML document with embedded HTML content.)

Google's Hot Trends XML feed delivers the Hot Trends as HTML inside of its content XML node.

The HTML is wrapped within a CDATA section, so it isn't parsed by xml2js the first time round. Ergo, we call parseString a second time, passing it the HTML held in obj.entry.content['#'].

Finally, the hotTrends.xmlHandler method completes in the second embedded xml2js callback where it executes its own callback parameter (cb) with a query fragment generated from the top list item element in HTML.

Now all we have to do is make some adjustments to makeCall:

function makeCall(urlOpts, handler, cb) {
  http.get(urlOpts, function (response) { //make a call to the twitter api  
    handler(response, cb);
  }).on('error', function (e) {
    console.log("Connection Error: " + e.message);
  });
}

makeCall(hotTrends.trends.urlOpts, hotTrends.xmlHandler, function (query) {
  hotTrends.tweetPath(query);
  makeCall(hotTrends.tweets.urlOpts, hotTrends.jsonHandler, function (tweetsObj) {
    tweetsObj.results.forEach(function (tweet) {
      console.log("\n" + tweet.from_user.yellow.bold + ': ' + tweet.text);
    });
  });
});

As we are now dealing with both JSON and XML, we slipped in another parameter to our makeCall function declaration: handler. The handler parameter allows us to specify whether to use the inherited jsonHandler method or our supplemented xmlHandler method.

When we invoke the outer makeCall, we pass in hotTrends.xmlHandler and name the callback's parameter query. We do this because the callback now receives the query fragment generated by xmlHandler instead of the array returned from Twitter. The fragment is passed directly into the tweetPath method, which consequently updates the path property of the hotTrends.tweets.urlOpts object.

We pass hotTrends.tweets.urlOpts into the second makeCall, this time setting the handler parameter to hotTrends.jsonHandler.

The second makeCall callback behaves exactly the same as in the main recipe. It outputs the tweets to the console. This time, however, it outputs tweets based on Google Hot Trends.
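The pluggable-handler pattern can be tried offline with a rough sketch like the one below. Everything network-related is replaced by assumptions: fakeGet is a hypothetical stand-in for http.get that returns canned bodies synchronously, and textHandler is a made-up handler that merely strips tags rather than parsing XML.

```javascript
// Hypothetical offline stand-in for http.get: it hands back a canned
// body synchronously instead of opening a connection.
var canned = {
  '/feed': '<q>node.js</q>',
  '/search': '{"results":[{"text":"hello"}]}'
};
function fakeGet(urlOpts, onResponse) {
  onResponse({ body: canned[urlOpts.path] });
  return { on: function () { return this; } }; // mimics .on('error', ...)
}

function makeCall(urlOpts, handler, cb) {
  fakeGet(urlOpts, function (response) {
    handler(response, cb); // the caller decides how the body is decoded
  }).on('error', function (e) {
    console.log('Connection Error: ' + e.message);
  });
}

// Two hypothetical handlers sharing the (response, cb) signature.
function textHandler(response, cb) {
  cb(encodeURIComponent(response.body.replace(/<[^>]+>/g, '')));
}
function jsonHandler(response, cb) {
  cb(JSON.parse(response.body));
}

var output = [];
makeCall({ path: '/feed' }, textHandler, function (query) {
  output.push(query);
  makeCall({ path: '/search' }, jsonHandler, function (obj) {
    output.push(obj.results[0].text);
  });
});
console.log(output); // [ 'node.js', 'hello' ]
```

Since makeCall only relies on the handler's signature, supporting another response format means writing another handler, with no further changes to makeCall itself.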

See also

  • Using Node as an HTTP client discussed in Chapter 2, Exploring the HTTP Object
  • Converting an object to JSON and back again discussed in this chapter
  • Converting an object to XML and back again discussed in