Real data

Up to this point, we've been working on an example JSON that the Twitter documentation provides. I assume by now you have your Twitter API access. So, let's get real Twitter data!

To get your API keys from the developer portal, click on the Get Started link. You will come to a page such as this:

Select Create an app. You will be brought to a page that looks like this:

I had previously created a Twitter app a long time ago (it had very similar features to the one we're creating in this project); hence, I have an app there already. Click on the blue Create an app button at the top right. You will be brought to the following form:

Fill in the form, then click Submit. Be sure to be truthful in the description. It might take a few days before you receive an email saying the app has been approved for development. Once it has, you should be able to click into your app and get the following page, which shows your API key and secret:

Click Create to create your access token and access token secret. You'll be needing them.

Now that we have our API access key, this is how you'd access Twitter using the Anaconda package:

const (
	ACCESSTOKEN       = "_____"
	ACCESSTOKENSECRET = "______"
	CONSUMERKEY       = "_____"
	CONSUMERSECRET    = "_______"
)

func main() {
	// Create the client and fetch the home timeline.
	twitter := anaconda.NewTwitterApiWithCredentials(ACCESSTOKEN, ACCESSTOKENSECRET, CONSUMERKEY, CONSUMERSECRET)
	raw, err := twitter.GetHomeTimeline(nil)
	dieIfErr(err) // check the request error before err is reused below

	// Save the tweets to a local JSON file.
	f, err := os.OpenFile("dev.json", os.O_TRUNC|os.O_WRONLY|os.O_CREATE, 0644)
	dieIfErr(err)
	enc := json.NewEncoder(f)
	enc.Encode(raw)
	f.Close()
}

At first glance, this snippet of code is a little weird, so let's go through it line by line. The first six lines deal with the access tokens and keys. Obviously, they should not be hardcoded. A good way to handle secrets like these is to put them in environment variables; I'll leave that as an exercise for the reader. We'll move on to the rest of the code:

 twitter := anaconda.NewTwitterApiWithCredentials(ACCESSTOKEN, ACCESSTOKENSECRET, CONSUMERKEY, CONSUMERSECRET)
raw, err := twitter.GetHomeTimeline(nil)

These two lines use the Anaconda library to get the tweets found in the home timeline. The nil being passed in may be of interest. Why would one do this? The GetHomeTimeline method takes a url.Values, a type found in the standard library package net/url. Values is defined thus:

type Values map[string][]string

But what do the values represent? It turns out that you may pass some parameters to the Twitter API. The parameters and what they do are enumerated here: https://developer.twitter.com/en/docs/tweets/timelines/api-reference/get-statuses-home_timeline. I don't wish to limit anything, so passing in nil is acceptable.
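As a rough sketch of what a non-nil argument could look like, the following builds a url.Values holding two of the parameters listed on that page, count and exclude_replies. The helper name timelineParams and the values chosen are my own illustration and not part of Anaconda:

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// timelineParams is a hypothetical helper that builds optional query
// parameters for a home timeline request. The parameter names follow
// Twitter's documentation; the values are only illustrative.
func timelineParams(count int, excludeReplies bool) url.Values {
	v := url.Values{}
	v.Set("count", strconv.Itoa(count))
	if excludeReplies {
		v.Set("exclude_replies", "true")
	}
	return v
}

func main() {
	v := timelineParams(50, true)
	// Encode sorts by key, so this prints: count=50&exclude_replies=true
	fmt.Println(v.Encode())
	// The value would then be passed in place of nil:
	//   raw, err := twitter.GetHomeTimeline(v)
}
```

Because url.Values is just a map, a nil value behaves like an empty set of parameters, which is why passing nil works when no limits are wanted.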

The result is []anaconda.Tweet, all neatly packaged up for us to use. The following few lines are therefore quite odd:

 f, err := os.OpenFile("dev.json", os.O_TRUNC|os.O_WRONLY|os.O_CREATE, 0644)
dieIfErr(err)
enc := json.NewEncoder(f)
enc.Encode(raw)
f.Close()

Why would I want to save this as a JSON file? The answer is simple: when using machine learning algorithms, you may need to tune the algorithm. Saving the response as a JSON file serves two purposes:

  • It allows for consistency. Under active development, you would expect to tweak the algorithm a lot. If the JSON file keeps changing, how do you know whether an improvement comes from your tweaks or from the data having changed?
  • It makes you a good citizen. Twitter's API is rate limited, which means you cannot make the same request too many times in a short period. While testing and tuning machine learning algorithms, you are likely to process the same data over and over. Instead of hammering the Twitter servers, you should use a locally cached copy.
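To make the caching idea concrete, here is a minimal sketch of a read-through file cache: it reads the local file if it exists, and only calls the remote source when it does not. The names cachedFetch and fakeFetch are hypothetical, and fakeFetch merely stands in for the real Twitter request so that the example runs without credentials:

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// cachedFetch returns the contents of path if the file exists. Otherwise it
// calls fetch once, writes the result to path, and returns it.
func cachedFetch(path string, fetch func() ([]byte, error)) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err == nil {
		return data, nil // cache hit: no network call needed
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return nil, err // a real I/O problem, not just a missing file
	}
	data, err = fetch()
	if err != nil {
		return nil, err
	}
	if err := os.WriteFile(path, data, 0644); err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	calls := 0
	// fakeFetch is a stand-in for the real API call.
	fakeFetch := func() ([]byte, error) {
		calls++
		return []byte(`[{"text":"hello"}]`), nil
	}

	path := filepath.Join(os.TempDir(), "dev-cache-example.json")
	os.Remove(path) // start from a clean slate for the demonstration
	defer os.Remove(path)

	for i := 0; i < 3; i++ {
		if _, err := cachedFetch(path, fakeFetch); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("fetch calls:", calls) // prints "fetch calls: 1"
}
```

Deleting the file is then the explicit way to ask for fresh data, which keeps the input stable while the algorithm is being tuned.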

We defined load earlier. Again, we shall see its usefulness in the context of tweaking the algorithms.
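Returning briefly to the credentials, which should not be hardcoded: one possible take on the environment variable approach is sketched below. The variable names (TWITTER_ACCESS_TOKEN and so on), the credentials type, and loadCredentials are all assumptions made for illustration; use whatever naming suits your setup.

```go
package main

import (
	"fmt"
	"os"
)

// credentials groups the four secrets the Twitter client needs.
type credentials struct {
	AccessToken       string
	AccessTokenSecret string
	ConsumerKey       string
	ConsumerSecret    string
}

// loadCredentials reads each secret via lookup (os.LookupEnv in normal use)
// and reports the first variable that is missing or empty.
// The environment variable names are hypothetical.
func loadCredentials(lookup func(string) (string, bool)) (credentials, error) {
	var c credentials
	fields := []struct {
		name string
		dst  *string
	}{
		{"TWITTER_ACCESS_TOKEN", &c.AccessToken},
		{"TWITTER_ACCESS_TOKEN_SECRET", &c.AccessTokenSecret},
		{"TWITTER_CONSUMER_KEY", &c.ConsumerKey},
		{"TWITTER_CONSUMER_SECRET", &c.ConsumerSecret},
	}
	for _, f := range fields {
		v, ok := lookup(f.name)
		if !ok || v == "" {
			return credentials{}, fmt.Errorf("environment variable %s is not set", f.name)
		}
		*f.dst = v
	}
	return c, nil
}

func main() {
	c, err := loadCredentials(os.LookupEnv)
	if err != nil {
		fmt.Println("credentials not loaded:", err)
		return
	}
	_ = c
	fmt.Println("credentials loaded")
	// The values would then replace the constants:
	//   anaconda.NewTwitterApiWithCredentials(c.AccessToken, c.AccessTokenSecret, c.ConsumerKey, c.ConsumerSecret)
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, is a small design choice that makes the function easy to exercise with fake values.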
