Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

The urlparse Module

The urlparse module contains functions to process URLs, and to convert between URLs and platform-specific filenames. Example 7-16 demonstrates.

Example 7-16. Using the urlparse Module

File: urlparse-example-1.py

import urlparse

print urlparse.urlparse("http://host/path;params?query#fragment")

('http', 'host', '/path', 'params', 'query', 'fragment')

A common use is to split an HTTP URL into host and path components (an HTTP request involves asking the host to return data identified by the path), as shown in Example 7-17.

Example 7-17. Using the urlparse Module to Parse HTTP Locators

File: urlparse-example-2.py

import urlparse

scheme, host, path, params, query, fragment =
        urlparse.urlparse("http://host/path;params?query#fragment")

if scheme == "http":
    print "host", "=>", host
    if params:
        path = path + ";" + params
    if query:
        path = path + "?" + query
    print "path", "=>", path

host => host
path => /path;params?query

Alternatively, Example 7-18 shows how you can use the urlunparse function to put the URL back together again.

Example 7-18. Using the urlparse Module to Parse HTTP Locators

File: urlparse-example-3.py

import urlparse

scheme, host, path, params, query, fragment =
        urlparse.urlparse("http://host/path;params?query#fragment")

if scheme == "http":
    print "host", "=>", host
    print "path", "=>", urlparse.urlunparse(
    (None, None, path, params, query, None)
    )

host => host
path => /path;params?query

Example 7-19 uses the urljoin function to combine an absolute URL with a second, possibly relative URL.

Example 7-19. Using the urlparse Module to Combine Relative Locators

File: urlparse-example-4.py

import urlparse

base = "http://spam.egg/my/little/pony"

for path in "/index", "goldfish", "../black/cat":
    print path, "=>", urlparse.urljoin(base, path)

/index => http://spam.egg/index
goldfish => http://spam.egg/my/little/goldfish
../black/cat => http://spam.egg/my/black/cat

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for The urlparse Module

Create new playlist

Sign In

Sign Up

The urlparse Module

Table of Contents for
The urlparse Module