13.9 Introduction to Tweepy Cursors: Getting an Account’s Followers and Friends

When invoking Twitter API methods, you often receive collections of objects as results, such as tweets in your Twitter timeline, tweets in another account’s timeline or lists of tweets that match specified search criteria. A home timeline consists of tweets sent by that user and by that user’s friends, that is, other accounts that the user follows.

Each Twitter API method’s documentation discusses the maximum number of items the method can return in one call—this is known as a page of results. When you request more results than a given method can return, Twitter’s JSON responses indicate that there are more pages to get. Tweepy’s Cursor class handles these details for you. A Cursor invokes a specified method and checks whether Twitter indicated that there is another page of results. If so, the Cursor automatically calls the method again to get those results. This continues, subject to the method’s rate limits, until there are no more results to process. If you configure the API object to wait when rate limits are reached (as we did), the Cursor will adhere to the rate limits and wait as needed between calls. The following subsections discuss Cursor fundamentals. For more details, see the Cursor tutorial at:

http://docs.tweepy.org/en/latest/cursor_tutorial.html
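The paging idea described above can be sketched without contacting Twitter. The following is a minimal simulation, not Tweepy's implementation: fetch_page, iterate_items and FAKE_FOLLOWERS are hypothetical names invented for illustration, and the fake method signals "more pages" by returning the next starting position.

```python
from itertools import islice

# Made-up data standing in for an account's followers.
FAKE_FOLLOWERS = [f'user{i}' for i in range(45)]
calls = []  # records each simulated API call

def fetch_page(start=0, count=20):
    """Hypothetical paged method: returns one page and the next start (or None)."""
    calls.append(start)
    page = FAKE_FOLLOWERS[start:start + count]
    next_start = start + count if start + count < len(FAKE_FOLLOWERS) else None
    return page, next_start

def iterate_items(method, **kwargs):
    """Yield items one at a time, requesting another page only when needed."""
    start = 0
    while start is not None:
        page, start = method(start=start, **kwargs)
        yield from page

# Asking for 10 items requires only one page of 20.
first_ten = list(islice(iterate_items(fetch_page), 10))
calls_for_ten = len(calls)

# Asking for everything pages until no more results remain.
calls.clear()
everything = list(iterate_items(fetch_page))
calls_for_all = len(calls)

print(calls_for_ten, calls_for_all, len(everything))
```

Because the generator is lazy, it makes a second call only when the caller consumes more items than the first page supplied, which mirrors why a small request needs just one call.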

13.9.1 Determining an Account’s Followers

Let’s use a Tweepy Cursor to invoke the API object’s followers method, which calls the Twitter API’s followers/list method11 to obtain an account’s followers. Twitter returns these in groups of 20 by default, but you can request up to 200 at a time. For demonstration purposes, we’ll grab 10 of NASA’s followers.

Method followers returns tweepy.models.User objects containing information about each follower. Let’s begin by creating a list in which we’ll store the User objects:

In [17]: followers = []

Creating a Cursor

Next, let’s create a Cursor object that will call the followers method for NASA’s account, which is specified with the screen_name keyword argument:

In [18]: cursor = tweepy.Cursor(api.followers, screen_name='nasa')

The Cursor’s constructor receives as its argument the name of the method to call—api.followers indicates that the Cursor will call the api object’s followers method. If the Cursor constructor receives any additional keyword arguments, like screen_name, these will be passed to the method specified in the constructor’s first argument. So, this Cursor specifically gets followers for the @nasa Twitter account.

Getting Results

Now, we can use the Cursor to get some followers. The following for statement iterates through the results of the expression cursor.items(10). The Cursor’s items method initiates the call to api.followers and returns the followers method’s results. In this case, we pass 10 to the items method to request only 10 results:

In [19]: for account in cursor.items(10):
    ...:     followers.append(account.screen_name)
    ...:

In [20]: print('Followers:',
    ...:       ' '.join(sorted(followers, key=lambda s: s.lower())))
    ...:
Followers: abhinavborra BHood1976 Eshwar12341 Harish90469614 heshamkisha
Highyaan2407 JiraaJaarra KimYooJ91459029 Lindsey06771483 Wendy_UAE_NL

The preceding snippet displays the followers in ascending order by calling the built-in sorted function. The function’s second argument is the function used to determine how the elements of followers are sorted. In this case, we used a lambda that converts every follower name to lowercase letters so we can perform a case-insensitive sort.
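To see why the key argument matters, the following small sketch compares the default sort with a case-insensitive one. It uses three of the screen names from the output above; str.lower is used here as an equivalent alternative to the lambda.

```python
names = ['Wendy_UAE_NL', 'abhinavborra', 'BHood1976']

# Default ordering compares character codes, so all uppercase
# letters sort before any lowercase letter.
default_order = sorted(names)

# Sorting by the lowercase form of each name ignores case.
case_insensitive = sorted(names, key=str.lower)

print(default_order)     # ['BHood1976', 'Wendy_UAE_NL', 'abhinavborra']
print(case_insensitive)  # ['abhinavborra', 'BHood1976', 'Wendy_UAE_NL']
```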

Automatic Paging

If the number of results requested is more than can be returned by one call to followers, the items method automatically “pages” through the results by making multiple calls to api.followers. Recall that followers returns up to 20 followers at a time by default, so the preceding code needs to call followers only once. To get up to 200 followers at a time, we can create the Cursor with the count keyword argument, as in:

cursor = tweepy.Cursor(api.followers, screen_name='nasa', count=200)

If you do not specify an argument to the items method, the Cursor attempts to get all of the account’s followers. For large numbers of followers, this could take a significant amount of time due to Twitter’s rate limits. The Twitter API’s followers/list method can return a maximum of 200 followers at a time and Twitter allows a maximum of 15 calls every 15 minutes. Thus, you can only get 3000 followers every 15 minutes using Twitter’s free APIs. Recall that we configured the API object to automatically wait when it hits a rate limit, so if you try to get all followers and an account has more than 3000, Tweepy will automatically pause for 15 minutes after every 3000 followers and display a message. At the time of this writing, NASA has over 29.5 million followers. At 12,000 followers per hour, it would take over 100 days to get all of NASA’s followers.
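The rate-limit arithmetic in the preceding paragraph can be checked with a few lines of Python. The constants below are the figures quoted in the text (which Twitter could change); the variable names are our own.

```python
USERS_PER_CALL = 200          # maximum followers per followers/list call
CALLS_PER_WINDOW = 15         # calls allowed per rate-limit window
WINDOW_MINUTES = 15           # length of one rate-limit window
TOTAL_FOLLOWERS = 29_500_000  # NASA's approximate follower count in the text

followers_per_window = USERS_PER_CALL * CALLS_PER_WINDOW
followers_per_hour = followers_per_window * (60 // WINDOW_MINUTES)
hours_needed = TOTAL_FOLLOWERS / followers_per_hour
days_needed = hours_needed / 24

print(f'Followers per window: {followers_per_window}')  # 3000
print(f'Followers per hour: {followers_per_hour}')      # 12000
print(f'Days needed: {days_needed:.1f}')                # 102.4
```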

Note that for this example, we could have called the followers method directly, rather than using a Cursor, since we’re getting only a small number of followers. We used a Cursor here to show how you’ll typically call followers. In some later examples, we’ll call API methods directly to get just a few results, rather than using Cursors.

Getting Follower IDs Rather Than Followers

Though you can get complete User objects for a maximum of 200 followers at a time, you can get many more Twitter ID numbers by calling the API object’s followers_ids method. This calls the Twitter API’s followers/ids method, which returns up to 5000 ID numbers at a time (again, these rate limits could change).12 You can invoke this method up to 15 times every 15 minutes, so you can get 75,000 account ID numbers per rate-limit interval. This is particularly useful when combined with the API object’s lookup_users method. This calls the Twitter API’s users/lookup method13 which can return up to 100 User objects at a time and can be called up to 300 times every 15 minutes. So using this combination, you could get up to 30,000 User objects per rate-limit interval.
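Because users/lookup accepts at most 100 IDs per call, a list of follower IDs must be split into batches before looking up the User objects. The following is a minimal sketch of that batching step only; the helper name batches and the made-up ID values are assumptions for illustration, and no Twitter call is made. In a real session you would pass each batch to lookup_users, whose keyword argument name depends on your Tweepy version.

```python
def batches(items, size=100):
    """Yield successive slices of items, each containing at most size elements."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Made-up ID numbers standing in for the results of followers_ids.
follower_ids = list(range(1, 251))

id_batches = list(batches(follower_ids))
print([len(batch) for batch in id_batches])  # [100, 100, 50]

# Per-window capacities quoted in the text.
ids_per_window = 5000 * 15    # followers/ids: 75,000 IDs
users_per_window = 100 * 300  # users/lookup: 30,000 User objects
print(ids_per_window, users_per_window)
```

Note that the lookup step is the bottleneck: one window of ID retrieval can supply more IDs than one window of lookups can resolve.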

Self Check

  1. (Fill-In) Each Twitter API method’s documentation discusses the maximum number of items the method can return in one call; this is known as a ________ of results.
    Answer: page.

  2. (True/False) Though you can get complete User objects for a maximum of 200 followers at a time, you can get many more Twitter ID numbers by calling the API object’s followers_ids method.
    Answer: True.

  3. (IPython Session) Use a Cursor to get and display 10 followers of the NASAKepler account.
    Answer:

In [21]: kepler_followers = []

In [22]: cursor = tweepy.Cursor(api.followers, screen_name='NASAKepler')

In [23]: for account in cursor.items(10):
    ...:     kepler_followers.append(account.screen_name)
    ...:

In [24]: print(' '.join(kepler_followers))
cheleandre_ FranGlacierGirl Javedja88171520 Ameer90577310 c4rb0hydr8
rashadali77777 ICPN2019 us0OU5hSZ8BwnsA KHRSC1 xAquos

13.9.2 Determining Whom an Account Follows

The API object’s friends method calls the Twitter API’s friends/list method14 to get a list of User objects representing an account’s friends. Twitter returns these in groups of 20 by default, but you can request up to 200 at a time, just as we discussed for method followers. Twitter allows you to call the friends/list method up to 15 times every 15 minutes. Let’s get 10 of NASA’s friend accounts:

In [25]: friends = []

In [26]: cursor = tweepy.Cursor(api.friends, screen_name='nasa')

In [27]: for friend in cursor.items(10):
    ...:     friends.append(friend.screen_name)
    ...:

In [28]: print('Friends:',
    ...:       ' '.join(sorted(friends, key=lambda s: s.lower())))
    ...:
Friends: AFSpace Astro2fish Astro_Kimiya AstroAnnimal AstroDuke
NASA3DPrinter NASASMAP Outpost_42 POTUS44 VicGlover

Self Check

  1. (Fill-In) The API object’s friends method calls the Twitter API’s ________ method to get a list of User objects representing an account’s friends.
    Answer: friends/list.

13.9.3 Getting a User’s Recent Tweets

The API method user_timeline returns tweets from the timeline of a specific account. A user timeline contains the tweets that the account itself posted, including its retweets; tweets from the account’s friends appear in the home timeline instead. The method calls the Twitter API’s statuses/user_timeline method15, which returns the most recent 20 tweets by default, but can return up to 200 at a time. This method can return only an account’s 3200 most recent tweets. Applications using this method may call it up to 1500 times every 15 minutes.

Method user_timeline returns Status objects with each one representing a tweet. Each Status’s user property refers to a tweepy.models.User object containing information about the user who sent that tweet, such as that user’s screen_name. A Status’s text property contains the tweet’s text. Let’s display the screen_name and text for three tweets from @nasa:

In [29]: nasa_tweets = api.user_timeline(screen_name='nasa', count=3)

In [30]: for tweet in nasa_tweets:
    ...:     print(f'{tweet.user.screen_name}: {tweet.text}\n')
    ...:
NASA: Your Gut in Space: Microorganisms in the intestinal tract play an especially important role in human health. But wh… https://t.co/uLOsUhwn5p

NASA: We need your help! Want to see panels at @SXSW related to space exploration? There are a number of exciting panels… https://t.co/ycqMMdGKUB

NASA: “You are as good as anyone in this town, but you are no better than any of them,” says retired @NASA_Langley mathem… https://t.co/nhMD4n84Nf

These tweets were truncated (as indicated by …), meaning that they probably use the newer 280-character tweet limit. We’ll use the extended_tweet property shortly to access full text for such tweets.

In the preceding snippets, we chose to call the user_timeline method directly and use the count keyword argument to specify the number of tweets to retrieve. If you wish to get more than the maximum number of tweets per call (200), then you should use a Cursor to call user_timeline as demonstrated previously. Recall that a Cursor automatically pages through the results by calling the method multiple times, if necessary.
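To estimate how many calls a Cursor would need for a given request, you can combine the per-call maximum with the 3200-tweet cap mentioned earlier. The following sketch is plain arithmetic, assuming count=200 is used for every call; the function name calls_needed is our own.

```python
import math

TWEETS_PER_CALL = 200  # maximum tweets per user_timeline call
TIMELINE_CAP = 3200    # user_timeline reaches back only this many tweets

def calls_needed(requested):
    """Return the number of calls required to get the requested tweets."""
    available = min(requested, TIMELINE_CAP)
    return math.ceil(available / TWEETS_PER_CALL)

for requested in (3, 500, 10_000):
    print(requested, calls_needed(requested))
# 3 1
# 500 3
# 10000 16
```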

Grabbing Recent Tweets from Your Own Timeline

You can call the API method home_timeline, as in:

api.home_timeline()

to get tweets from your home timeline16—that is, your tweets and tweets from the people you follow. This method calls Twitter’s statuses/home_timeline method.17 By default, home_timeline returns the most recent 20 tweets, but can get up to 200 at a time. Again, for more than 200 tweets from your home timeline, you should use a Tweepy Cursor to call home_timeline.

Self Check

  1. (Fill-In) You can call the API method home_timeline to get tweets from your home timeline, that is, your tweets and tweets from ________.
    Answer: the people you follow.

  2. (IPython Session) Get and display two tweets from the NASAKepler account.
    Answer:

In [31]: kepler_tweets = api.user_timeline(
    ...:     screen_name='NASAKepler', count=2)
    ...:

In [32]: for tweet in kepler_tweets:
    ...:     print(f'{tweet.user.screen_name}: {tweet.text}\n')
    ...:
NASAKepler: RT @TheFantasyG: Learning that there are
#MorePlanetsThanStars means to me that there are near endless
possibilities of unique discoveries…

NASAKepler: @KerryFoster2 @NASA Refueling Kepler is not practical since
it currently sits 94 million miles from Earth. And with… https://t.co/D2P145EL0N