Many websites use cookies to store information on your local disk. You may want to inspect this cookie information and perhaps log in to a website automatically using cookies.
Let us pretend to log in to a popular code-sharing website, www.bitbucket.org. We will submit the login information to the login page, https://bitbucket.org/account/signin/?next=/. The following screenshot shows the login page:
We note down the IDs of the form elements and decide which fake values to submit. We first access the login page, and then access the home page to observe which cookies have been set.
Listing 4.3 shows how to extract cookie information:
#!/usr/bin/env python
# Python Network Programming Cookbook -- Chapter - 4
# This program is optimized for Python 2.7.
# It may run on any other version with/without modifications.

import cookielib
import urllib
import urllib2

ID_USERNAME = 'id_username'
ID_PASSWORD = 'id_password'
USERNAME = '[email protected]'
PASSWORD = 'mypassword'
LOGIN_URL = 'https://bitbucket.org/account/signin/?next=/'
NORMAL_URL = 'https://bitbucket.org/'


def extract_cookie_info():
    """ Fake login to a site with cookie"""
    # setup cookie jar
    cj = cookielib.CookieJar()
    login_data = urllib.urlencode({ID_USERNAME : USERNAME,
                                   ID_PASSWORD : PASSWORD})
    # create url opener
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    resp = opener.open(LOGIN_URL, login_data)  # send login info
    for cookie in cj:
        print "----First time cookie: %s --> %s" %(cookie.name, cookie.value)
    print "Headers: %s" %resp.headers

    # now access without any login info
    resp = opener.open(NORMAL_URL)
    for cookie in cj:
        print "++++Second time cookie: %s --> %s" %(cookie.name, cookie.value)
    print "Headers: %s" %resp.headers


if __name__ == '__main__':
    extract_cookie_info()
Running this recipe results in the following output:
$ python 4_3_extract_cookie_information.py
----First time cookie: bb_session --> aed58dde1228571bf60466581790566d
Headers: Server: nginx/1.2.4
Date: Sun, 05 May 2013 15:13:56 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 21167
Connection: close
X-Served-By: bitbucket04
Content-Language: en
X-Static-Version: c67fb01467cf
Expires: Sun, 05 May 2013 15:13:56 GMT
Vary: Accept-Language, Cookie
Last-Modified: Sun, 05 May 2013 15:13:56 GMT
X-Version: 14f9c66ad9db
ETag: "3ba81d9eb350c295a453b5ab6e88935e"
X-Request-Count: 310
Cache-Control: max-age=0
Set-Cookie: bb_session=aed58dde1228571bf60466581790566d; expires=Sun, 19-May-2013 15:13:56 GMT; httponly; Max-Age=1209600; Path=/; secure
Strict-Transport-Security: max-age=2592000
X-Content-Type-Options: nosniff

++++Second time cookie: bb_session --> aed58dde1228571bf60466581790566d
Headers: Server: nginx/1.2.4
Date: Sun, 05 May 2013 15:13:57 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 36787
Connection: close
X-Served-By: bitbucket02
Content-Language: en
X-Static-Version: c67fb01467cf
Vary: Accept-Language, Cookie
X-Version: 14f9c66ad9db
X-Request-Count: 97
Strict-Transport-Security: max-age=2592000
X-Content-Type-Options: nosniff
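The Set-Cookie header in the first response above carries the cookie value along with its attributes. As a minimal sketch of how such a header value could be taken apart, the following uses `SimpleCookie` from the standard library's `http.cookies` module (Python 3). The helper name `parse_set_cookie` is hypothetical and not part of the recipe; the header string is copied from the output above.

```python
from http.cookies import SimpleCookie

# Header value copied from the first response in the output above.
SET_COOKIE = ("bb_session=aed58dde1228571bf60466581790566d; "
              "expires=Sun, 19-May-2013 15:13:56 GMT; httponly; "
              "Max-Age=1209600; Path=/; secure")


def parse_set_cookie(header_value):
    """Hypothetical helper: map each cookie name to its value and set attributes."""
    jar = SimpleCookie()
    jar.load(header_value)
    parsed = {}
    for name, morsel in jar.items():
        # A Morsel holds every reserved attribute; keep only those that were set.
        attrs = {key: val for key, val in morsel.items() if val}
        attrs["value"] = morsel.value
        parsed[name] = attrs
    return parsed


info = parse_set_cookie(SET_COOKIE)["bb_session"]
print(info["value"], info["max-age"], info["path"])
```

The `Max-Age` of 1209600 seconds corresponds to 14 days, which agrees with the `expires` date of 19 May for a response sent on 5 May. The `secure` and `httponly` flags mean the cookie is only sent over HTTPS and is hidden from page scripts.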
We have used Python's cookielib and set up a cookie jar, cj. The login data has been encoded using urllib.urlencode. urllib2 has a build_opener() function, which takes an instance of HTTPCookieProcessor() wrapping the predefined cookie jar and returns a URL opener. We call this opener twice: once for the login page and once for the home page of the website. Only one cookie, bb_session, appears to have been set, through the Set-Cookie header of the first response. More information about cookielib can be found on the official Python documentation site at http://docs.python.org/2/library/cookielib.html.
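The listing targets Python 2.7. In Python 3 the same modules live under different names: cookielib became `http.cookiejar`, urllib.urlencode became `urllib.parse.urlencode`, and urllib2 became `urllib.request`. The following is a rough sketch of the same flow under those names, not the book's own code. The function names (`build_cookie_opener`, `encode_form`, `describe_cookies`) are hypothetical, and the URLs and form fields are passed in as parameters rather than hard-coded.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_cookie_opener():
    """Return a (cookie jar, opener) pair; the opener stores cookies in the jar."""
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    return cj, opener


def encode_form(fields):
    """URL-encode form fields; Python 3 expects the POST body as bytes."""
    return urllib.parse.urlencode(fields).encode("ascii")


def describe_cookies(cj, label):
    """Format every cookie in the jar as '<label> cookie: name --> value'."""
    return ["%s cookie: %s --> %s" % (label, c.name, c.value) for c in cj]


def extract_cookie_info(login_url, normal_url, fields):
    """Post the form once, then fetch a second page with the same opener."""
    cj, opener = build_cookie_opener()
    # Passing data makes this a POST request.
    with opener.open(login_url, encode_form(fields)) as resp:
        print("\n".join(describe_cookies(cj, "First time")))
        print("Headers: %s" % resp.headers)
    # No data, so this is a GET; cookies stored in the jar are sent back.
    with opener.open(normal_url) as resp:
        print("\n".join(describe_cookies(cj, "Second time")))
        print("Headers: %s" % resp.headers)
```

Calling `extract_cookie_info(LOGIN_URL, NORMAL_URL, {ID_USERNAME: USERNAME, ID_PASSWORD: PASSWORD})` with the constants from the listing would mirror the original recipe. It requires network access, and the cookies and headers returned depend on how the site behaves today, so the output may differ from the one shown earlier.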