As a Python programmer, you may already be familiar with GitHub (http://www.github.com), a source code-sharing website, as shown in the following screenshot. You can use GitHub to share your source code privately with a team or publicly with the world. It provides an API for querying any source code repository. This recipe may give you a starting point for creating your own source code search engine.
To run this recipe, you need to install the third-party Python library requests by entering $ pip install requests or $ easy_install requests.
We would like to define a search_repository() function that takes the name of the author (also known as the coder), the repository, and the search key. In return, it gives us the value stored under that search key. The GitHub API makes the following search keys available: issues_url, has_wiki, forks_url, mirror_url, subscription_url, notifications_url, collaborators_url, updated_at, private, pulls_url, issue_comment_url, labels_url, full_name, owner, statuses_url, id, keys_url, description, tags_url, network_count, downloads_url, assignees_url, contents_url, git_refs_url, open_issues_count, clone_url, watchers_count, git_tags_url, milestones_url, languages_url, size, homepage, fork, commits_url, issue_events_url, archive_url, comments_url, events_url, contributors_url, html_url, forks, compare_url, open_issues, git_url, svn_url, merges_url, has_issues, ssh_url, blobs_url, master_branch, git_commits_url, hooks_url, has_downloads, watchers, name, language, url, created_at, pushed_at, forks_count, default_branch, teams_url, trees_url, organization, branches_url, subscribers_url, and stargazers_url.
Listing 6.5 gives the code to search for details of a source code repository at GitHub, as shown:
#!/usr/bin/env python
# Python Network Programming Cookbook -- Chapter - 6
# This program is optimized for Python 2.7.
# It may run on any other version with/without modifications.

import argparse
import json

import requests

SEARCH_URL_BASE = 'https://api.github.com/repos'


def search_repository(author, repo, search_for='homepage'):
    url = "%s/%s/%s" % (SEARCH_URL_BASE, author, repo)
    print("Searching Repo URL: %s" % url)
    response = requests.get(url)
    # Returned unchanged if the request fails or no key matches
    result = "No result found!"
    if response.ok:
        repo_info = json.loads(response.text or response.content)
        print("Github repository info for: %s" % repo)
        # A key matches when it contains the search string
        for key, value in repo_info.items():
            if search_for in key:
                result = value
    return result


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Github search')
    parser.add_argument('--author', action="store", dest="author", required=True)
    parser.add_argument('--repo', action="store", dest="repo", required=True)
    parser.add_argument('--search_for', action="store", dest="search_for", required=True)
    given_args = parser.parse_args()
    result = search_repository(given_args.author, given_args.repo, given_args.search_for)
    if isinstance(result, dict):
        # Nested objects such as 'owner' are printed one key per line
        print("Got result for '%s'..." % given_args.search_for)
        for key, value in result.items():
            print("%s => %s" % (key, value))
    else:
        print("Got result for %s: %s" % (given_args.search_for, result))
If you run this script to search for the owner of the Python web framework Django, you can get the following result:
$ python 6_5_search_code_github.py --author=django --repo=django --search_for=owner
Searching Repo URL: https://api.github.com/repos/django/django
Github repository info for: django
Got result for 'owner'...
following_url => https://api.github.com/users/django/following{/other_user}
events_url => https://api.github.com/users/django/events{/privacy}
organizations_url => https://api.github.com/users/django/orgs
url => https://api.github.com/users/django
gists_url => https://api.github.com/users/django/gists{/gist_id}
html_url => https://github.com/django
subscriptions_url => https://api.github.com/users/django/subscriptions
avatar_url => https://1.gravatar.com/avatar/fd542381031aa84dca86628ece84fc07?d=https%3A%2F%2Fidenticons.github.com%2Fe94df919e51ae96652259468415d4f77.png
repos_url => https://api.github.com/users/django/repos
received_events_url => https://api.github.com/users/django/received_events
gravatar_id => fd542381031aa84dca86628ece84fc07
starred_url => https://api.github.com/users/django/starred{/owner}{/repo}
login => django
type => Organization
id => 27804
followers_url => https://api.github.com/users/django/followers
This script takes three command-line arguments: repository author (--author), repository name (--repo), and the item to search for (--search_for). The arguments are processed by the argparse module.
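The argument handling described above can be tried on its own. The following is a minimal sketch, assuming a hypothetical helper named parse_cli_args() (it is not part of the listing) that accepts an explicit argument list so the parser can be exercised without a real command line.

```python
import argparse


def parse_cli_args(argv=None):
    """Parse the three required options (hypothetical helper, not in the listing)."""
    parser = argparse.ArgumentParser(description='Github search')
    parser.add_argument('--author', action="store", dest="author", required=True)
    parser.add_argument('--repo', action="store", dest="repo", required=True)
    parser.add_argument('--search_for', action="store", dest="search_for", required=True)
    # With argv=None, argparse falls back to sys.argv[1:], as the listing does.
    return parser.parse_args(argv)


args = parse_cli_args(['--author=django', '--repo=django', '--search_for=owner'])
print("%s/%s -> %s" % (args.author, args.repo, args.search_for))
```

Because all three options are marked required=True, omitting any of them makes argparse print a usage message and exit instead of running the search.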
Our search_repository() function appends the command-line arguments to a fixed search URL and receives the content by calling the requests module's get() function.
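The URL construction and the request can be separated to make each step easier to see. This is only a sketch: build_repo_url() and fetch_repo_text() are hypothetical names introduced here, and the timeout value is an assumption rather than something the listing sets.

```python
SEARCH_URL_BASE = 'https://api.github.com/repos'


def build_repo_url(author, repo):
    """Join the fixed base URL with the author and repository names."""
    return "%s/%s/%s" % (SEARCH_URL_BASE, author, repo)


def fetch_repo_text(author, repo, timeout=10):
    """Return the response body as text, or None if the request fails.

    Hypothetical helper; the 10-second timeout is an assumed value.
    """
    # Imported here so that build_repo_url() stays usable without requests.
    import requests

    response = requests.get(build_repo_url(author, repo), timeout=timeout)
    return response.text if response.ok else None


print(build_repo_url('django', 'django'))
```

Calling fetch_repo_text('django', 'django') would perform the actual network request; the sketch only prints the URL so that it runs without network access.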
The search results are, by default, returned in the JSON format. This content is then processed with the json module's loads() function. The keys of the result are then scanned for the search key, and the value of the matching key is returned to the caller of the search_repository() function.
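The parsing and lookup step can be sketched offline with a small JSON string. The sample body below is made up for illustration and is not real GitHub output; find_value() is a hypothetical name for the lookup loop used in the listing.

```python
import json

# Made-up sample body; a real response contains many more keys.
SAMPLE_BODY = (
    '{"forks": 2, '
    '"forks_url": "https://example.org/forks", '
    '"homepage": "https://example.org/"}'
)


def find_value(body, search_for):
    """Return the value of the last key that contains search_for."""
    repo_info = json.loads(body)
    result = "No result found!"
    for key, value in repo_info.items():
        # Substring test, as in the listing: 'forks' also matches 'forks_url'.
        if search_for in key:
            result = value
    return result


print(find_value(SAMPLE_BODY, 'homepage'))
print(find_value(SAMPLE_BODY, 'missing'))
```

Note that the test is a substring match, so a short search key can match several keys and the last match wins. On Python 3.7 or later, where dictionaries keep insertion order, find_value(SAMPLE_BODY, 'forks') returns the forks_url value rather than 2. If exact matching is preferred, repo_info.get(search_for, "No result found!") would be one alternative.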
In the main user code, we check whether the search result is a Python dictionary. If it is, the key/value pairs are printed one by one. Otherwise, the value is printed directly.
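The branching on the result type can be illustrated without any network access. In this sketch, format_result() is a hypothetical helper that builds the output lines instead of printing them directly; the two owner fields are taken from the sample run shown earlier, and the homepage value is a placeholder.

```python
def format_result(search_for, result):
    """Build the output lines for a search result (hypothetical helper)."""
    if isinstance(result, dict):
        # Nested objects such as 'owner' are expanded one key per line.
        lines = ["Got result for '%s'..." % search_for]
        for key, value in result.items():
            lines.append("%s => %s" % (key, value))
        return lines
    # Plain values (strings, numbers, booleans) fit on a single line.
    return ["Got result for %s: %s" % (search_for, result)]


print("\n".join(format_result('owner', {'login': 'django', 'type': 'Organization'})))
print("\n".join(format_result('homepage', 'https://example.org/')))
```

Returning a list of lines rather than printing inside the function is a design choice made for this sketch, since it keeps the formatting easy to check separately from the output.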