Accessing Flickr's data

Now that we have a brief overview of Flickr, let's begin our usual ritual of getting access to a platform's data. As with the other social networks we have seen, Flickr exposes its data through a set of APIs, and we need to create an app to use them. For the use cases in this chapter, we will call the latest Flickr API endpoints directly instead of relying on R packages. Flickr datasets, similar to the StackExchange data dumps, are also available on the Internet, though not always from official sources; they are beyond the scope of this chapter.

Flickr's APIs accept requests and return responses in several formats, such as JSON, XML, and SOAP. API kits are available for various programming languages such as C, Java, and Python, but unfortunately not for R. R does have a few packages for working with the Flickr APIs, but most of them have not kept up with the latest API changes, which makes them problematic to rely on.

Note

A complete list of Flickr APIs and documentation related to them is available here: https://www.flickr.com/services/api/

Flickr API methods are cleanly segregated into various categories based on their purpose, such as activity, auth, interestingness, groups, favorites and so on. We will be using API methods under some of these categories for our use cases.
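Whatever its category, a method is called by passing its name in the method query parameter of the REST endpoint. The following is a minimal sketch of that pattern, not code from the chapter: flickr_rest_url() is a hypothetical helper name, the key is a placeholder, and the date is an arbitrary example value.

```r
library(httr)

# Hypothetical helper (an assumption, not part of httr or the chapter):
# build a REST URL for a Flickr API method and its query parameters.
flickr_rest_url <- function(method, api_key, ...) {
  modify_url(
    "https://api.flickr.com/services/rest/",
    query = list(
      method = method,
      api_key = api_key,
      ...,                 # method-specific parameters, such as date
      format = "json",
      nojsoncallback = 1
    )
  )
}

example_url <- flickr_rest_url(
  "flickr.interestingness.getList",
  api_key = "YOUR_API_KEY",  # placeholder, not a real key
  date = "2017-01-01"        # arbitrary example date
)
print(example_url)
```

Only the method name and the method-specific parameters change from one call to the next, which is why a single helper of this kind can cover methods from different categories.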

Creating the Flickr app

The process of creating a Flickr app is pretty straightforward. We will go through a step-by-step process to create one for this chapter to be used across the use cases.

To begin with, we assume you have a Flickr or a Yahoo! account. Once you have one, go to the following URL: https://www.flickr.com/services/apps/create/. Then click on the Request an API Key link:

Creating a Flickr app

On the next screen, select the appropriate option related to commercial use or non-commercial use of the APIs.

The next screen is where you enter details related to the app. The following image shows the brief form you need to fill out:

Providing app details

Once submitted, the final screen presents the app details with an API key and secret for use. It also adds your app to what Flickr calls The App Garden. The App Garden is a repository of all Flickr apps and is a good place to draw inspiration on what can be done using Flickr data and APIs. The following is a screenshot of our newly created app:

App is ready

This page also outlines resources and links related to API Terms of Use, Community Documentation, Flickr API Group, and Community Guidelines. We ask readers to go through them before proceeding with the use cases.

Connecting to R

Now that we have an app on Flickr with the required credentials, our next step is to test the connection to its APIs.

As mentioned earlier, we will be directly interacting with Flickr APIs instead of using outdated R packages. Using an API directly also gives us an opportunity to learn how to extract data in cases where helper methods/abstractions through packages are not available. Also, in cases where APIs are undergoing constant change, it is hard to keep the packages relevant and updated unless there is strong demand.

To interact directly with the APIs, we will use the OAuth helpers from the httr package, namely oauth_app(), oauth_endpoint(), and oauth1.0_token(), to connect and extract information. The following code snippet sets up the required variables and connects to our Flickr app using R:

library(httr)

# API key and secret from the app page (placeholders shown here)
api_key <- "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
secret <- "XXXXXXXXXXXXXXXXX"

# describe our app to httr
flickr.app <- oauth_app("Flickr Sample App", api_key, secret)

# Flickr's OAuth endpoints
flickr.endpoint <- oauth_endpoint(
  request = "https://www.flickr.com/services/oauth/request_token",
  authorize = "https://www.flickr.com/services/oauth/authorize",
  access = "https://www.flickr.com/services/oauth/access_token"
)

# run the OAuth 1.0 flow; cache = FALSE does not save the token to disk
tok <- oauth1.0_token(
  flickr.endpoint,
  flickr.app,
  cache = FALSE
)

Upon execution, the code snippet takes us to Flickr's authorization page to allow this session to be authenticated and to proceed with using its API methods, as shown in the following screenshot:

App authorization page

Once you select the OK, I'LL AUTHORIZE IT button, your credentials are verified and the following screen is presented:

Authorization complete

This completes our connection to Flickr using R and now we can proceed to extract data and uncover interesting insights.
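The setup code keeps the API key and secret as literal strings in the script. One common alternative is to read them from environment variables so that they stay out of shared code. The following is a minimal sketch under assumptions: the variable names FLICKR_API_KEY and FLICKR_API_SECRET and the helper read_flickr_credentials() are hypothetical, and the values set here are placeholders for demonstration only.

```r
# Hypothetical helper: read the app credentials from environment variables.
# The variable names are assumptions; set them outside the script,
# for example in ~/.Renviron.
read_flickr_credentials <- function() {
  api_key <- Sys.getenv("FLICKR_API_KEY")
  secret <- Sys.getenv("FLICKR_API_SECRET")
  if (api_key == "" || secret == "") {
    stop("Set FLICKR_API_KEY and FLICKR_API_SECRET before connecting")
  }
  list(api_key = api_key, secret = secret)
}

# Demonstration with placeholder values only
Sys.setenv(
  FLICKR_API_KEY = "placeholder-key",
  FLICKR_API_SECRET = "placeholder-secret"
)
creds <- read_flickr_credentials()
print(names(creds))
```

The returned values can then be passed to oauth_app() in place of the hard-coded strings.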

Getting started with Flickr data

Before we get into the actual use cases, let's spend some time extracting sample data from the APIs and see what it looks like. The following code snippet uses the interestingness method to fetch interesting photos from yesterday. We use the GET() function from the httr package to call the API as follows:

# the URL has two %s placeholders: the API key and yesterday's date
raw_sample_data <- GET(url = sprintf(
  "https://api.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=%s&date=%s&format=json&nojsoncallback=1",
  api_key,
  format(Sys.Date() - 1, "%Y-%m-%d")
))

The returned object is a list-like response with multiple attributes, the most important ones being $status_code and $content. A status code of 200 means OK, that is, the server has indeed returned a response with some data. Since we requested JSON as the format in the URL, $content needs to be parsed as JSON to make sense of it.
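The two checks described above, the HTTP status code and the parsed JSON body, can be sketched as follows. This is an illustrative sketch rather than the chapter's code: check_flickr_response() is a hypothetical helper, and the two JSON strings are hand-written stand-ins for the text of a real response, so the example runs without a network connection. It also relies on the stat field that Flickr includes in its JSON bodies.

```r
library(jsonlite)

# Hypothetical helper: validate the HTTP status code and the Flickr JSON body.
check_flickr_response <- function(status_code, body_text) {
  if (status_code != 200) {
    stop("HTTP request failed with status ", status_code)
  }
  parsed <- fromJSON(body_text)
  # Flickr reports API-level problems in the body through the stat field
  if (!identical(parsed$stat, "ok")) {
    stop("Flickr API error ", parsed$code, ": ", parsed$message)
  }
  parsed
}

# Hand-written sample bodies (assumptions standing in for real responses)
ok_body <- '{"photos":{"page":1,"pages":5,"perpage":100,"total":500,
  "photo":[{"id":"1","owner":"a","title":"Sunset"}]},"stat":"ok"}'
fail_body <- '{"stat":"fail","code":100,
  "message":"Invalid API Key (Key has invalid format)"}'

parsed_ok <- check_flickr_response(200, ok_body)
print(parsed_ok$photos$photo)

failure <- tryCatch(
  check_flickr_response(200, fail_body),
  error = function(e) conditionMessage(e)
)
print(failure)
```

With a real response, the same helper could be called with raw_sample_data$status_code and the text obtained from content(raw_sample_data, as = "text").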

Since our response object needs to go through multiple preprocessing steps, we make use of an R package called pipeR. This package helps in chaining function calls using the %>>% operator, with the . symbol standing for the value being piped. We use it to extract the data retrieved from the Flickr API under the $content attribute. We also chain helper functions from the jsonlite package to handle JSON.

Note

For more details on pipeR, refer to this tutorial: https://renkun.me/pipeR-tutorial/
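As a quick illustration of the two pipeR forms used in this chapter, the following sketch pipes a small numeric vector first into a function call and then into a parenthesized expression that refers to the piped value as the . symbol. The vector and variable names are arbitrary examples.

```r
library(pipeR)

squares <- c(1, 4, 9)

# pipe to the first argument of a function call: sqrt(squares)
roots <- squares %>>% sqrt()

# enclose an expression in parentheses and refer to the piped value as .
total <- squares %>>% sqrt() %>>% (sum(.))

print(roots)
print(total)
```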

The following snippet transforms the raw response object into a DataFrame by extracting photo-related information from the $photos attribute under $content:

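library(pipeR)

# extract relevant photo data
processed_sample_data <- raw_sample_data %>>%
  content(as = "text") %>>%
  jsonlite::fromJSON() %>>%
  # keep only the photo records under $photos
  ( .$photos$photo ) %>>%
  ( data.frame(
      # the sample was fetched for yesterday
      date = format(Sys.Date() - 1, "%Y-%m-%d"),
      .,
      stringsAsFactors = FALSE
  ) )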

The output from this snippet generates a DataFrame as shown in the following screenshot:

Sample data from Flickr

The preceding exercise helped us extract a day's worth of raw interestingness data from the Flickr API and convert it into an R DataFrame. We wrap this code in a utility function, getInterestingData(), so that we can reuse it in the upcoming use cases. The following snippet is a quick example that extracts three days' worth of data with it:

# extract multiple days' worth of data

# use this to specify how many days to analyze
daysAnalyze <- 3

raw_sample_data <- lapply(1:daysAnalyze, getInterestingData) %>>%
  # combine all the days into a data frame
  ( do.call(rbind, .) )
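The definition of getInterestingData() is not shown in this excerpt. A minimal sketch of what such a function could look like, assembled from the single-day steps covered earlier, is given below. It is an assumption about the function's shape, not the authors' implementation: parseInterestingData() is a hypothetical helper, api_key is assumed to be defined as in the setup code, and the sample JSON is hand-written so that the parsing step can be tried without calling the API.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: turn a Flickr JSON body into a data frame
# tagged with the date the photos were fetched for.
parseInterestingData <- function(json_text, date) {
  photos <- fromJSON(json_text)$photos$photo
  data.frame(date = date, photos, stringsAsFactors = FALSE)
}

# Sketch of the utility function: fetch interesting photos from i days ago.
# Assumes api_key already exists in the session.
getInterestingData <- function(i) {
  day <- format(Sys.Date() - i, "%Y-%m-%d")
  url <- sprintf(
    "https://api.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=%s&date=%s&format=json&nojsoncallback=1",
    api_key,
    day
  )
  response <- GET(url = url)
  stop_for_status(response)  # raise an error on a non-success HTTP status
  parseInterestingData(content(response, as = "text", encoding = "UTF-8"), day)
}

# Offline check of the parsing step with a hand-written sample body
sample_body <- '{"photos":{"photo":[{"id":"1","title":"Sunset"},
  {"id":"2","title":"Harbor"}]},"stat":"ok"}'
sample_df <- parseInterestingData(sample_body, "2017-01-01")
print(sample_df)
```

Because the function takes the day offset as its only argument, it can be passed straight to lapply() over 1:daysAnalyze.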