Introduction to urllib2

urllib2 can read data from a URL using various protocols, such as HTTP, HTTPS, and FTP. This module provides the urlopen function, which creates a file-like object that we can use to read from the URL. This object has methods such as read(), readline(), readlines(), and close(), which work exactly as they do on file objects, although in reality we are working with a wrapper that spares us from using sockets at a low level.

As you will remember, the read method reads the complete "file", or only the number of bytes passed as a parameter; readline reads a single line; and readlines reads all the lines and returns them as a list.
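To make the file-like behaviour concrete without depending on a network connection, here is a minimal sketch that reads a temporary local file through a file:// URL. The read_demo name, the sample contents, the POSIX-style path handling, and the fallback import (so the sketch also runs on Python 3, where urllib2 became urllib.request) are all assumptions made for illustration:

```python
import os
import tempfile

try:
    import urllib2  # Python 2, as used in this chapter
except ImportError:
    import urllib.request as urllib2  # assumed stand-in so the sketch runs on Python 3


def read_demo():
    """Read a small local file through urlopen() using the file-like methods."""
    # Create a throwaway file with three lines (hypothetical sample data).
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"first\nsecond\nthird\n")
    try:
        # Assumes a POSIX-style absolute path, giving file:///tmp/...
        response = urllib2.urlopen("file://" + path)
        first_line = response.readline()   # one line
        rest = response.readlines()        # remaining lines, as a list
        leftover = response.read()         # nothing left to read
        response.close()
    finally:
        os.remove(path)
    return first_line, rest, leftover
```

Calling read_demo() returns the first line, a list with the two remaining lines, and an empty string, which mirrors what the same calls would do on an ordinary file object.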

The object also has a couple of extra methods: geturl, which returns the URL of the resource we are actually reading (useful for checking whether there was a redirection), and info, which returns an object with the server response headers (also accessible through the headers attribute).
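As a rough illustration of geturl and info, the following sketch opens a temporary local file through a file:// URL, so it does not need a web server. The inspect_response name and the sample contents are hypothetical, the path handling assumes a POSIX system, and the fallback import is only there so the sketch also runs on Python 3:

```python
import os
import tempfile

try:
    import urllib2  # Python 2, as used in this chapter
except ImportError:
    import urllib.request as urllib2  # assumed stand-in so the sketch runs on Python 3


def inspect_response():
    """Return the final URL, the Content-length header, and whether
    info() and the headers attribute refer to the same object."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"hello\n")  # 6 bytes of hypothetical sample data
    try:
        response = urllib2.urlopen("file://" + path)
        final_url = response.geturl()            # URL actually retrieved
        headers = response.info()                # header object
        length = headers["Content-length"]       # headers behave like a mapping
        same_object = headers is response.headers
        response.close()
    finally:
        os.remove(path)
    return final_url, length, same_object
```

With an HTTP URL, the same calls would let us compare geturl() against the address we requested to detect a redirection.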

In the next example, we open a web page using urlopen(). When we pass a URL to the urlopen() function, it returns an object, and we can call its read() method to get the data as a string.

You can find the following code in the urllib2_basic.py file:

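import urllib2

try:
    # Open the URL and get a file-like response object
    response = urllib2.urlopen("http://www.python.org")
    # Read the whole body and print it
    print response.read()
    response.close()
except urllib2.HTTPError as e:
    # The server answered with an HTTP error code (for example, 404)
    print e.code
except urllib2.URLError as e:
    # The request failed before getting a response (for example, unknown host)
    print e.reason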

When working with the urllib2 module, we also need to handle errors, which are raised as exceptions of type URLError. If we work with HTTP, we can also get HTTPError, a subclass of URLError, which is raised when the server returns an HTTP error code, such as a 404 error when the resource is not found. Because HTTPError is a subclass of URLError, its except clause must come before the one for URLError.
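The relationship between the two exception types can be sketched as follows. The fetch helper and its return values are hypothetical, chosen only to make the two error paths easy to tell apart, and the fallback import is an assumption so that the sketch also runs on Python 3:

```python
try:
    import urllib2  # Python 2, as used in this chapter
except ImportError:
    import urllib.request as urllib2  # assumed stand-in so the sketch runs on Python 3


def fetch(url):
    """Return a (status, detail) tuple instead of letting exceptions escape."""
    try:
        response = urllib2.urlopen(url)
    except urllib2.HTTPError as e:
        # Must be listed first: HTTPError is a subclass of URLError.
        return ("http_error", e.code)
    except urllib2.URLError as e:
        # Any other failure, such as an unknown host or a missing file.
        return ("url_error", e.reason)
    try:
        return ("ok", response.read())
    finally:
        response.close()
```

For example, fetch("file:///no/such/file") takes the URLError path because the file does not exist, while a request for a missing page on a real web server would be expected to take the HTTPError path with code 404.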

The urlopen function has an optional data parameter for sending information to HTTP addresses using POST (the parameters are sent in the body of the request itself), for example, to submit a form. This parameter must be a properly encoded string, following the format used in URL query strings.
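A minimal sketch of preparing such a POST request is shown below. It uses urlencode to build the encoded string and only constructs the request object, without sending it. The build_post_request name, the form fields, and the example.com address are hypothetical, and the fallback imports are an assumption so that the sketch also runs on Python 3, where the data must be bytes:

```python
try:
    import urllib2  # Python 2, as used in this chapter
    from urllib import urlencode
except ImportError:
    import urllib.request as urllib2  # assumed stand-in so the sketch runs on Python 3
    from urllib.parse import urlencode


def build_post_request(url, fields):
    """Build (but do not send) a request whose body holds the encoded fields."""
    # urlencode produces "key=value&key=value" with special characters escaped;
    # the result is plain ASCII, so encoding it to bytes is safe on both versions.
    data = urlencode(fields).encode("ascii")
    # Supplying data is what turns the request into a POST.
    return urllib2.Request(url, data)


# Hypothetical form fields and address, for illustration only.
request = build_post_request(
    "http://www.example.com/form",
    [("name", "Ada Lovelace"), ("lang", "python")],
)
```

Passing the resulting request object (or the URL together with the same data string) to urlopen() would then submit the form. Here, request.get_method() reports POST and the body is name=Ada+Lovelace&lang=python.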
