Searching for a source code repository at GitHub

As a Python programmer, you may already be familiar with GitHub (http://www.github.com), a source code-sharing website, as shown in the following screenshot. You can use GitHub to share your source code privately with a team or publicly with the world. It also offers an API for querying information about any source code repository. This recipe may give you a starting point for creating your own source code search engine.

Searching for a source code repository at GitHub

Getting ready

To run this recipe, you need to install the third-party Python library requests by entering $ pip install requests or $ easy_install requests.

How to do it...

We would like to define a search_repository() function that takes the name of the author (also known as the coder), the name of the repository, and a search key. It returns the value that the GitHub API reports for that search key. From the GitHub API, the following are the available search keys: issues_url, has_wiki, forks_url, mirror_url, subscription_url, notifications_url, collaborators_url, updated_at, private, pulls_url, issue_comment_url, labels_url, full_name, owner, statuses_url, id, keys_url, description, tags_url, network_count, downloads_url, assignees_url, contents_url, git_refs_url, open_issues_count, clone_url, watchers_count, git_tags_url, milestones_url, languages_url, size, homepage, fork, commits_url, issue_events_url, archive_url, comments_url, events_url, contributors_url, html_url, forks, compare_url, open_issues, git_url, svn_url, merges_url, has_issues, ssh_url, blobs_url, master_branch, git_commits_url, hooks_url, has_downloads, watchers, name, language, url, created_at, pushed_at, forks_count, default_branch, teams_url, trees_url, organization, branches_url, subscribers_url, and stargazers_url.

Listing 6.5 gives the code to search for details of a source code repository at GitHub, as shown:

#!/usr/bin/env python
# Python Network Programming Cookbook -- Chapter - 6
# This program is optimized for Python 2.7.
# It may run on any other version with/without modifications.

import argparse
import json

import requests

SEARCH_URL_BASE = 'https://api.github.com/repos'

def search_repository(author, repo, search_for='homepage'):
  # Build the repository endpoint, for example .../repos/django/django
  url = "%s/%s/%s" % (SEARCH_URL_BASE, author, repo)
  print("Searching Repo URL: %s" % url)
  response = requests.get(url)
  result = "No result found!"
  if response.ok:
    repo_info = json.loads(response.text or response.content)
    print("Github repository info for: %s" % repo)
    # A key matches when it contains the search term
    for key, value in repo_info.items():
      if search_for in key:
        result = value
  # Return only after every key has been examined
  return result

if __name__ == '__main__':
  parser = argparse.ArgumentParser(description='Github search')
  parser.add_argument('--author', action="store", dest="author",
                      required=True)
  parser.add_argument('--repo', action="store", dest="repo",
                      required=True)
  parser.add_argument('--search_for', action="store",
                      dest="search_for", required=True)

  given_args = parser.parse_args()
  result = search_repository(given_args.author, given_args.repo,
                             given_args.search_for)
  if isinstance(result, dict):
    # Nested values such as 'owner' are printed one key per line
    print("Got result for '%s'..." % given_args.search_for)
    for key, value in result.items():
      print("%s => %s" % (key, value))
  else:
    print("Got result for %s: %s" % (given_args.search_for, result))

If you run this script to search for the owner of the Python web framework Django, you can get the following result:

$ python 6_5_search_code_github.py --author=django --repo=django --search_for=owner 
Searching Repo URL: https://api.github.com/repos/django/django 
Github repository info for: django 
Got result for 'owner'... 
following_url => https://api.github.com/users/django/following{/other_user} 
events_url => https://api.github.com/users/django/events{/privacy} 
organizations_url => https://api.github.com/users/django/orgs 
url => https://api.github.com/users/django 
gists_url => https://api.github.com/users/django/gists{/gist_id} 
html_url => https://github.com/django 
subscriptions_url => https://api.github.com/users/django/subscriptions 
avatar_url => https://1.gravatar.com/avatar/fd542381031aa84dca86628ece84fc07?d=https%3A%2F%2Fidenticons.github.com%2Fe94df919e51ae96652259468415d4f77.png 
repos_url => https://api.github.com/users/django/repos 
received_events_url => https://api.github.com/users/django/received_events 
gravatar_id => fd542381031aa84dca86628ece84fc07 
starred_url => https://api.github.com/users/django/starred{/owner}{/repo} 
login => django 
type => Organization 
id => 27804 
followers_url => https://api.github.com/users/django/followers 

How it works...

This script takes three command-line arguments: repository author (--author), repository name (--repo), and the item to search for (--search_for). The arguments are processed by the argparse module.
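The argument handling can be tried without a shell by passing an explicit list to parse_args(). The following is a minimal sketch of that idea; the build_parser() helper is a hypothetical name introduced here for illustration and is not part of the recipe's listing.

```python
import argparse

def build_parser():
    # Same three required options as the recipe's script
    parser = argparse.ArgumentParser(description='Github search')
    parser.add_argument('--author', action='store', dest='author', required=True)
    parser.add_argument('--repo', action='store', dest='repo', required=True)
    parser.add_argument('--search_for', action='store', dest='search_for',
                        required=True)
    return parser

# parse_args() accepts a list of strings in place of sys.argv[1:]
args = build_parser().parse_args(
    ['--author=django', '--repo=django', '--search_for=owner'])
print("%s %s %s" % (args.author, args.repo, args.search_for))
```

Because all three options are marked as required, argparse prints a usage message and exits if any of them is missing.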

Our search_repository() function appends the command-line arguments to a fixed search URL and receives the content by calling the requests module's get() function.
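The URL construction and the request can be separated so that each step is easy to inspect. The sketch below is one possible arrangement, not the recipe's own code: build_repo_url() and fetch_repo_info() are hypothetical helper names, and the 10-second timeout is an assumed value. It relies on the timeout argument of requests.get() and on the raise_for_status() and json() methods of the response object.

```python
SEARCH_URL_BASE = 'https://api.github.com/repos'

def build_repo_url(author, repo):
    # Append the author and repository name to the fixed base URL
    return "%s/%s/%s" % (SEARCH_URL_BASE, author, repo)

def fetch_repo_info(author, repo, timeout=10):
    # requests is third-party; importing it here keeps build_repo_url()
    # usable even where requests is not installed
    import requests
    response = requests.get(build_repo_url(author, repo), timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of failing silently
    response.raise_for_status()
    return response.json()

print(build_repo_url('django', 'django'))
```

Calling fetch_repo_info('django', 'django') would perform the actual network request and return the parsed repository information as a dictionary.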

By default, the response is returned in the JSON format. This content is parsed with the json module's loads() function. The search key is then looked up among the keys of the parsed result, and the value of the matching key is returned to the caller of the search_repository() function.
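The parsing and lookup step can be illustrated offline. The sketch below uses a small hand-written JSON string in place of a real API response; it contains only a few fields taken from the sample output shown earlier. The find_value() helper is a hypothetical name, and its preference for an exact key match before falling back to a substring match is an assumption made here for predictability, not the behavior of the recipe's listing.

```python
import json

# Hand-trimmed stand-in for a GitHub API response (not real API output)
SAMPLE_RESPONSE = (
    '{"name": "django",'
    ' "owner": {"login": "django", "type": "Organization", "id": 27804}}'
)

def find_value(repo_info, search_for, default="No result found!"):
    # Prefer an exact key match
    if search_for in repo_info:
        return repo_info[search_for]
    # Otherwise fall back to the first key (in sorted order) containing the term
    for key in sorted(repo_info):
        if search_for in key:
            return repo_info[key]
    return default

repo_info = json.loads(SAMPLE_RESPONSE)
owner = find_value(repo_info, 'owner')
print(owner['login'])
print(find_value(repo_info, 'no_such_key'))
```

Checking every key before returning matters here: returning from inside the loop would stop the search at whichever key happened to be visited first.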

In the main user code, we check whether the search result is a Python dictionary. If it is, each key/value pair is printed on its own line. Otherwise, the value is printed directly.
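The same branching can be written as a function that returns the lines to print, which makes it easier to test. This is a sketch under that assumption; format_result() is a hypothetical name, and sorting the keys is an addition made here so that the order of the lines is stable.

```python
def format_result(search_for, result):
    # Dictionaries (such as the 'owner' value) get one line per key
    if isinstance(result, dict):
        lines = ["Got result for '%s'..." % search_for]
        for key, value in sorted(result.items()):
            lines.append("%s => %s" % (key, value))
        return lines
    # Any other value is reported on a single line
    return ["Got result for %s: %s" % (search_for, result)]

for line in format_result('owner', {'login': 'django', 'type': 'Organization'}):
    print(line)
```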
