Response objects

Let's take a closer look at our response object. We can see from the preceding example that urlopen() returns an http.client.HTTPResponse instance. The response object gives us access to the data of the requested resource, and the properties and the metadata of the response. To view the URL for the response that we received in the previous section, do this:

>>> response.url
'http://www.debian.org'

We get the data of the requested resource through a file-like interface using the readline() and read() methods. We saw the readline() method in the previous section. This is how we use the read() method:

>>> response = urlopen('http://www.debian.org')
>>> response.read(50)
b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '

The read() method returns up to the specified number of bytes, starting from the current position in the data. Here, on a freshly opened response, that's the first 50 bytes. A call to the read() method with no argument will return all the remaining data in one go.
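
As a minimal sketch of how the size argument can be put to use, the following helper reads a file-like object in fixed-size pieces until read() returns an empty bytes object. The name iter_chunks and the 1024-byte default are our own illustrative choices, and an io.BytesIO object stands in for the response so that the example runs without a network connection; a response returned by urlopen() could be passed in the same way.

```python
import io

def iter_chunks(fileobj, size=1024):
    """Yield successive chunks of at most `size` bytes from fileobj."""
    while True:
        chunk = fileobj.read(size)
        if not chunk:  # b'' means there is no data left to read
            break
        yield chunk

# io.BytesIO is a stand-in for a response object (an assumption for this demo).
fake_response = io.BytesIO(b'<!DOCTYPE HTML>\n<html lang="en">\n</html>\n')
chunks = list(iter_chunks(fake_response, size=16))
body = b''.join(chunks)  # reassemble the full body from the pieces
print(len(chunks), len(body))
```

Reading in chunks like this keeps memory use bounded for large resources, whereas read() with no argument loads everything at once.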

The file-like interface is limited. Once the data has been read, it's not possible to go back and re-read it by using either of the aforementioned methods. To demonstrate this, try doing the following:

>>> response = urlopen('http://www.debian.org')
>>> response.read()
b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">\n<html lang="en">\n<head>\n  <meta http-equiv
...
>>> response.read()
b''

We can see that when we call the read() method a second time, it returns an empty bytes object. The response is not seekable, so we cannot reset the position. Hence, it's best to capture the read() output in a variable.
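
The following sketch shows one way to act on this advice: read the body once, keep it in a variable, and wrap it in io.BytesIO if a seekable file-like object is needed. So that it runs without Internet access, it starts a throwaway HTTP server on the local machine and uses an opener with proxies disabled. The handler class, the page content, and the variable names are all assumptions made for the demo, not part of the earlier examples.

```python
import io
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

PAGE = b'<html lang="en">\n<head>\n</head>\n</html>\n'  # made-up page

class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves PAGE for every GET request."""
    def do_GET(self):
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

# Port 0 asks the operating system for any free port.
server = HTTPServer(('127.0.0.1', 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# An opener with no proxies keeps the request on this machine;
# opener.open() returns the same kind of response object as urlopen().
opener = build_opener(ProxyHandler({}))
try:
    response = opener.open('http://127.0.0.1:%d/' % server.server_port)
    seekable = response.seekable()  # the response reports it cannot seek
    data = response.read()          # the whole body, captured once
    leftover = response.read()      # nothing is left on the second call
    response.close()
finally:
    server.shutdown()
    server.server_close()

buffer = io.BytesIO(data)       # in-memory copy that can be re-read
first_line = buffer.readline()
buffer.seek(0)                  # rewinding works on the copy
reread = buffer.read()
print(seekable, leftover, reread == data)
```

Because data is an ordinary bytes object, it can be sliced, searched, or passed around as often as needed without touching the network again.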

Both the readline() and read() methods return bytes objects, and neither http nor urllib makes any effort to decode the data they receive to Unicode. Later in the chapter, we'll look at a way to handle this with the help of the Requests library.
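
Until then, the bytes can be decoded by hand. The sketch below is one hedged way to do it: decode_body is a hypothetical helper of our own, and falling back to UTF-8 with errors='replace' is an assumption that will not suit every site. For a real response, the charset declared by the server can usually be obtained with response.headers.get_content_charset(), which returns None when no charset was declared.

```python
def decode_body(data, charset=None):
    """Decode a bytes body to str, guessing UTF-8 if no charset is given."""
    encoding = charset or 'utf-8'  # assumption: UTF-8 is a sensible default
    # errors='replace' swaps undecodable bytes for U+FFFD instead of raising
    return data.decode(encoding, errors='replace')

# Sample bytes standing in for the output of response.read().
word = 'caf\u00e9'  # "cafe" with an acute accent on the e
text = decode_body(word.encode('utf-8'))
latin = decode_body(word.encode('latin-1'), charset='latin-1')
print(text == latin)

# With a real response this might look like:
#   text = decode_body(response.read(),
#                      response.headers.get_content_charset())
```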
