Networking often refers to connecting multiple computers together for the purpose of allowing some communication among them. But, for our purposes, we are less interested in allowing computers to communicate with one another and more interested in allowing processes to communicate with one another. Whether the processes are on the same computer or different computers is irrelevant for the techniques that we’re going to show.
This chapter will focus on writing Python programs that connect to other processes using the standard socket library (as well as libraries built on top of socket) and then interacting with those other processes.
While servers sit and wait for a client to connect to them, clients initiate connections. The Python Standard Library contains implementations of many commonly used network clients. This section will discuss some of the more common and frequently useful ones.
The socket module provides a Python interface to your operating system's socket implementation. This means that you can do whatever can be done to or with sockets, using Python. In case you have never done any network programming before, this chapter does provide a brief overview of networking. It should give you a flavor of what kinds of things you can do with the Python networking libraries.
The socket module provides the factory function socket(). The socket() function, in turn, returns a socket object. While there are a number of arguments to pass to socket() for specifying the kind of socket to create, calling the socket() factory function with no arguments returns a socket object with sensible defaults, namely a TCP/IP socket:
In [1]: import socket

In [2]: s = socket.socket()

In [3]: s.connect(('192.168.1.15', 80))

In [4]: s.send("GET / HTTP/1.0\n\n")
Out[4]: 16

In [5]: s.recv(200)
Out[5]: 'HTTP/1.1 200 OK\r\nDate: Mon, 03 Sep 2007 18:25:45 GMT\r\nServer: Apache/2.0.55 (Ubuntu) DAV/2 PHP/5.1.6\r\nContent-Length: 691\r\nConnection: close\r\nContent-Type: text/html; charset=UTF-8\r\n\r\n<!DOCTYPE HTML P'

In [6]: s.close()
This example created a socket object called s from the socket() factory function. It then connected to a local default web server, indicated by port 80, which is the default port for HTTP. Then, it sent the server the text string "GET / HTTP/1.0\n\n" (which is simply an HTTP request). Following the send, it received the first 200 bytes of the server's response, which is a 200 OK status message and HTTP headers. Finally, we closed the connection.
The socket methods demonstrated in this example represent the methods that you are likely to find yourself using most often. Connect() establishes a communication channel between your socket object and the remote end (specifically meaning "not this socket object"). Send() transmits data from your socket object to the remote end. Recv() receives any data that the remote end has sent back. And close() terminates the communication channel between the two sockets. This is a really simple example that shows the ease with which you can create socket objects and then send and receive data over them.
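The same connect/send/recv/close cycle can be exercised without a remote web server. The sketch below, written in modern Python 3 syntax (where sockets exchange bytes rather than strings), stands up a throwaway listener in a thread and talks to it over loopback; the listener is an assumption made for illustration, not part of the original example.

```python
# A minimal connect/send/recv/close round trip over loopback.
# Python 3 sockets deal in bytes, hence the b"..." literals.
import socket
import threading

def answer_once(server_sock):
    # accept a single connection, read the request, send a canned reply
    conn, _ = server_sock.accept()
    conn.recv(1024)
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
    conn.close()

server = socket.socket()
server.bind(('127.0.0.1', 0))          # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=answer_once, args=(server,)).start()

s = socket.socket()
s.connect(('127.0.0.1', port))
s.sendall(b"GET / HTTP/1.0\r\n\r\n")
chunks = []
while True:                            # read until the server closes
    data = s.recv(200)
    if not data:
        break
    chunks.append(data)
reply = b"".join(chunks)
s.close()
server.close()
print(reply.split(b"\r\n")[0])
```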
Now we’ll look at a slightly more useful example. Suppose you have a server that is running some sort of network application, such as a web server. And suppose that you are interested in watching this server to be sure that, over the course of a day, you can make a socket connection to the web server. This sort of monitoring is minimal, but it proves that the server itself is still up and that the web server is still listening on some port. See Example 5-1.
#!/usr/bin/env python

import socket
import re
import sys

def check_server(address, port):
    #create a TCP socket
    s = socket.socket()
    print "Attempting to connect to %s on port %s" % (address, port)
    try:
        s.connect((address, port))
        print "Connected to %s on port %s" % (address, port)
        return True
    except socket.error, e:
        print "Connection to %s on port %s failed: %s" % (address, port, e)
        return False

if __name__ == '__main__':
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option("-a", "--address", dest="address", default='localhost',
                      help="ADDRESS for server", metavar="ADDRESS")
    parser.add_option("-p", "--port", dest="port", type="int", default=80,
                      help="PORT for server", metavar="PORT")
    (options, args) = parser.parse_args()
    print 'options: %s, args: %s' % (options, args)
    check = check_server(options.address, options.port)
    print 'check_server returned %s' % check
    sys.exit(not check)
All of the work occurs in the check_server() function. Check_server() creates a socket object. Then, it tries to connect to the specified address and port number. If it succeeds, it returns True. If it fails, the socket.connect() call will throw an exception, which is handled, and the function returns False. The main section of the code calls check_server(). This "main" section parses the arguments from the user and puts the user-requested arguments into an appropriate format to pass in to check_server(). This whole script prints out status messages as it goes along. The last thing it prints out is the return value of check_server(). The script returns the opposite of the check_server() return code to the shell. The reason that we return the opposite of this return code is to make this script a useful scriptable utility. Typically, utilities like this return 0 to the shell on success and something other than 0 on failure (typically something positive). Here is an example of the piece of code successfully connecting to the web server we connected to earlier:
jmjones@dinkgutsy:code$ python port_checker_tcp.py -a 192.168.1.15 -p 80
options: {'port': 80, 'address': '192.168.1.15'}, args: []
Attempting to connect to 192.168.1.15 on port 80
Connected to 192.168.1.15 on port 80
check_server returned True
The last output line, which contains check_server returned True, means that the connection was a success.
Here is an example of a connection call that failed:
jmjones@dinkgutsy:code$ python port_checker_tcp.py -a 192.168.1.15 -p 81
options: {'port': 81, 'address': '192.168.1.15'}, args: []
Attempting to connect to 192.168.1.15 on port 81
Connection to 192.168.1.15 on port 81 failed: (111, 'Connection refused')
check_server returned False
The last log line, which contains check_server returned False, means that the connection was a failure. In the penultimate output line, which contains Connection to 192.168.1.15 on port 81 failed, we also see the reason, 'Connection refused'. Just a wild guess here, but it may have something to do with there being nothing running on port 81 of this particular server.
We've created three examples to demonstrate how you can use this utility in shell scripts. First, we give a shell command to run the script and to print out SUCCESS if the script succeeds. We use the && operator in place of an if-then statement:
$ python port_checker_tcp.py -a 192.168.1.15 -p 80 && echo "SUCCESS"
options: {'port': 80, 'address': '192.168.1.15'}, args: []
Attempting to connect to 192.168.1.15 on port 80
Connected to 192.168.1.15 on port 80
check_server returned True
SUCCESS
This script succeeded, so after executing and printing status results, the shell prints SUCCESS:
$ python port_checker_tcp.py -a 192.168.1.15 -p 81 && echo "FAILURE"
options: {'port': 81, 'address': '192.168.1.15'}, args: []
Attempting to connect to 192.168.1.15 on port 81
Connection to 192.168.1.15 on port 81 failed: (111, 'Connection refused')
check_server returned False
This script failed, so it never printed FAILURE:
$ python port_checker_tcp.py -a 192.168.1.15 -p 81 || echo "FAILURE"
options: {'port': 81, 'address': '192.168.1.15'}, args: []
Attempting to connect to 192.168.1.15 on port 81
Connection to 192.168.1.15 on port 81 failed: (111, 'Connection refused')
check_server returned False
FAILURE
This script failed, but we changed the && to ||. This just means if the script returns a failure result, print FAILURE. So it did.
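The exit-status convention the script relies on can be sketched in a couple of lines of plain truthiness arithmetic (not code from the examples above):

```python
# sys.exit(not check) maps the boolean onto shell conventions:
# True (check passed) exits 0, False (check failed) exits 1,
# because not True is False (0) and not False is True (1).
success_status = int(not True)    # what the shell sees on success
failure_status = int(not False)   # what the shell sees on failure
print(success_status, failure_status)
```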
The fact that a web server allows a connection on port 80 doesn’t mean that there is an HTTP server available for the connection. A test that will help us better determine the status of a web server is whether the web server generates HTTP headers with the expected status code for some specific URL. Example 5-2 does just that.
#!/usr/bin/env python

import socket
import re
import sys

def check_webserver(address, port, resource):
    #build up HTTP request string
    if not resource.startswith('/'):
        resource = '/' + resource
    request_string = "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (resource, address)
    print 'HTTP request:'
    print '|||%s|||' % request_string

    #create a TCP socket
    s = socket.socket()
    print "Attempting to connect to %s on port %s" % (address, port)
    try:
        s.connect((address, port))
        print "Connected to %s on port %s" % (address, port)
        s.send(request_string)
        #we should only need the first 100 bytes or so
        rsp = s.recv(100)
        print 'Received 100 bytes of HTTP response'
        print '|||%s|||' % rsp
    except socket.error, e:
        print "Connection to %s on port %s failed: %s" % (address, port, e)
        return False
    finally:
        #be a good citizen and close your connection
        print "Closing the connection"
        s.close()
    lines = rsp.splitlines()
    print 'First line of HTTP response: %s' % lines[0]
    try:
        version, status, message = re.split(r'\s+', lines[0], 2)
        print 'Version: %s, Status: %s, Message: %s' % (version, status, message)
    except ValueError:
        print 'Failed to split status line'
        return False
    if status in ['200', '301']:
        print 'Success - status was %s' % status
        return True
    else:
        print 'Status was %s' % status
        return False

if __name__ == '__main__':
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option("-a", "--address", dest="address", default='localhost',
                      help="ADDRESS for webserver", metavar="ADDRESS")
    parser.add_option("-p", "--port", dest="port", type="int", default=80,
                      help="PORT for webserver", metavar="PORT")
    parser.add_option("-r", "--resource", dest="resource", default='index.html',
                      help="RESOURCE to check", metavar="RESOURCE")
    (options, args) = parser.parse_args()
    print 'options: %s, args: %s' % (options, args)
    check = check_webserver(options.address, options.port, options.resource)
    print 'check_webserver returned %s' % check
    sys.exit(not check)
Similar to the previous example where check_server() did all the work, check_webserver() does all the work in this example, too. First, check_webserver() builds up the HTTP request string. The HTTP protocol, in case you don't know, is a well-defined way that HTTP clients and servers communicate. The HTTP request that check_webserver() builds is nearly the simplest HTTP request possible. Next, check_webserver() creates a socket object, connects to the server, and sends the HTTP request to the server. Then, it reads back the response from the server and closes the connection. When there is a socket error, check_webserver() returns False, indicating that the check failed. It then takes what it read from the server and extracts the status code from it. If the status code is either 200, meaning "OK," or 301, meaning "Moved Permanently," check_webserver() returns True; otherwise, it returns False. The main portion of the script parses the input from the user and calls check_webserver(). After it gets the result back from check_webserver(), it returns the opposite of the return value from check_webserver() to the shell. The concept here is similar to what we did with the plain socket checker. We want to be able to call this from a shell script and see if it succeeded or failed. Here is the code in action:
$ python web_server_checker_tcp.py -a 192.168.1.15 -p 80 -r apache2-default
options: {'resource': 'apache2-default', 'port': 80, 'address': '192.168.1.15'}, args: []
HTTP request:
|||GET /apache2-default HTTP/1.1
Host: 192.168.1.15

|||
Attempting to connect to 192.168.1.15 on port 80
Connected to 192.168.1.15 on port 80
Received 100 bytes of HTTP response
|||HTTP/1.1 301 Moved Permanently
Date: Wed, 16 Apr 2008 23:31:24 GMT
Server: Apache/2.0.55 (Ubuntu)
|||
Closing the connection
First line of HTTP response: HTTP/1.1 301 Moved Permanently
Version: HTTP/1.1, Status: 301, Message: Moved Permanently
Success - status was 301
check_webserver returned True
The last four output lines show that the HTTP status code for /apache2-default on this web server was 301, so this run was successful.
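The status-line handling inside check_webserver() comes down to one re.split() call. Here it is in isolation on a canned status line, in Python 3 syntax; maxsplit=2 keeps a multi-word reason phrase such as "Moved Permanently" in one piece.

```python
import re

# split "version status message" into exactly three fields
line = 'HTTP/1.1 301 Moved Permanently'
version, status, message = re.split(r'\s+', line, maxsplit=2)
print(version, status, message)
```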
Here is another run. This time, we'll intentionally specify a resource that isn't there to show what happens when the HTTP check fails:
$ python web_server_checker_tcp.py -a 192.168.1.15 -p 80 -r foo
options: {'resource': 'foo', 'port': 80, 'address': '192.168.1.15'}, args: []
HTTP request:
|||GET /foo HTTP/1.1
Host: 192.168.1.15

|||
Attempting to connect to 192.168.1.15 on port 80
Connected to 192.168.1.15 on port 80
Received 100 bytes of HTTP response
|||HTTP/1.1 404 Not Found
Date: Wed, 16 Apr 2008 23:58:55 GMT
Server: Apache/2.0.55 (Ubuntu) DAV/2 PH|||
Closing the connection
First line of HTTP response: HTTP/1.1 404 Not Found
Version: HTTP/1.1, Status: 404, Message: Not Found
Status was 404
check_webserver returned False
Just as the last four lines of the previous example showed that the run was successful, the last four lines of this example show that it was unsuccessful. Because there is no /foo on this web server, this checker returned False.
This section showed how to construct low-level utilities to connect to network servers and perform basic checks on them. The purpose of these examples was to introduce you to what happens behind the scenes when clients and servers communicate with one another. If you have an opportunity to write a network component using a higher-level library than the socket module, you should take it. It is not desirable to spend your time writing network components using a low-level library such as socket.
The previous example showed how to make an HTTP request using the socket module directly. This example will show how to use the httplib module.
When should you consider using the httplib module rather than the socket module? Or, more generically, when should you consider using a higher-level library rather than a lower-level library? A good rule of thumb is any chance you get. Sometimes using a lower-level library makes sense. You might need to accomplish something that isn't already in an available library, for example, or you might need to have finer-grained control of something already in a library, or there might be a performance advantage. But in this case, there is no reason not to use a higher-level library such as httplib over a lower-level library such as socket.
Example 5-3 accomplishes the same functionality as the previous example, but it does so with the httplib module.
#!/usr/bin/env python

import httplib
import socket
import sys

def check_webserver(address, port, resource):
    #create connection
    if not resource.startswith('/'):
        resource = '/' + resource
    try:
        conn = httplib.HTTPConnection(address, port)
        print 'HTTP connection created successfully'
        #make request
        req = conn.request('GET', resource)
        print 'request for %s successful' % resource
        #get response
        response = conn.getresponse()
        print 'response status: %s' % response.status
    except socket.error, e:
        print 'HTTP connection failed: %s' % e
        return False
    finally:
        conn.close()
        print 'HTTP connection closed successfully'
    if response.status in [200, 301]:
        return True
    else:
        return False

if __name__ == '__main__':
    from optparse import OptionParser
    parser = OptionParser()
    parser.add_option("-a", "--address", dest="address", default='localhost',
                      help="ADDRESS for webserver", metavar="ADDRESS")
    parser.add_option("-p", "--port", dest="port", type="int", default=80,
                      help="PORT for webserver", metavar="PORT")
    parser.add_option("-r", "--resource", dest="resource", default='index.html',
                      help="RESOURCE to check", metavar="RESOURCE")
    (options, args) = parser.parse_args()
    print 'options: %s, args: %s' % (options, args)
    check = check_webserver(options.address, options.port, options.resource)
    print 'check_webserver returned %s' % check
    sys.exit(not check)
In its conception, this example follows the socket example pretty closely. Two of the biggest differences are that you don't have to manually create the HTTP request and that you don't have to manually parse the HTTP response. The httplib connection object has a request() method that builds and sends the HTTP request for you. The connection object also has a getresponse() method that creates a response object for you. We were able to access the HTTP status by referring to the status attribute on the response object. Even if it isn't that much less code to write, it is nice to not have to go through the trouble of keeping up with creating, sending, and receiving the HTTP request and response. This code just feels more tidy.
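In Python 3 the httplib module was renamed http.client, but the request()/getresponse() dance is the same. This sketch assumes nothing about your network: it starts the stdlib http.server in a background thread on an OS-assigned port purely to have something local to talk to.

```python
import http.client
import http.server
import threading

# a throwaway local web server so the client has something to hit
server = http.server.HTTPServer(('127.0.0.1', 0),
                                http.server.SimpleHTTPRequestHandler)
port = server.server_address[1]      # the port the OS assigned
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection('127.0.0.1', port)
conn.request('GET', '/')             # builds and sends the request for us
response = conn.getresponse()        # parses the response for us
response.read()                      # drain the body
print(response.status)
conn.close()
server.shutdown()
```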
Here is a run that uses the same command-line parameters the previous successful scenario used. We're looking for / on our web server, and we find it:
$ python web_server_checker_httplib.py -a 192.168.1.15 -r /
options: {'resource': '/', 'port': 80, 'address': '192.168.1.15'}, args: []
HTTP connection created successfully
request for / successful
response status: 200
HTTP connection closed successfully
check_webserver returned True
And here is a run with the same command-line parameters as the failure scenario earlier. We're looking for /foo, and we don't find it:
$ python web_server_checker_httplib.py -a 192.168.1.15 -r /foo
options: {'resource': '/foo', 'port': 80, 'address': '192.168.1.15'}, args: []
HTTP connection created successfully
request for /foo successful
response status: 404
HTTP connection closed successfully
check_webserver returned False
As we said earlier, any time you have a chance to use a higher-level library, you should use it. Using httplib rather than using the socket module alone was a simpler, cleaner process. And the simpler you can make your code, the fewer bugs you'll have.
In addition to the socket and httplib modules, the Python Standard Library also contains an FTP client module named ftplib. ftplib is a full-featured FTP client library that will allow you to programmatically perform any tasks you would normally use an FTP client application to perform. For example, you can log in to an FTP server, list files in a particular directory, retrieve files, put files, change directories, and log out, all from within a Python script. You can even use one of the many GUI frameworks available in Python and build your own GUI FTP application.
Rather than give a full overview of this library, we’ll show you Example 5-4 and then explain how it works.
#!/usr/bin/env python

from ftplib import FTP
import ftplib
import sys

from optparse import OptionParser

parser = OptionParser()
parser.add_option("-a", "--remote_host_address", dest="remote_host_address",
                  help="REMOTE FTP HOST.",
                  metavar="REMOTE FTP HOST")
parser.add_option("-r", "--remote_file", dest="remote_file",
                  help="REMOTE FILE NAME to download.",
                  metavar="REMOTE FILE NAME")
parser.add_option("-l", "--local_file", dest="local_file",
                  help="LOCAL FILE NAME to save remote file to",
                  metavar="LOCAL FILE NAME")
parser.add_option("-u", "--username", dest="username",
                  help="USERNAME for ftp server", metavar="USERNAME")
parser.add_option("-p", "--password", dest="password",
                  help="PASSWORD for ftp server", metavar="PASSWORD")

(options, args) = parser.parse_args()

if not (options.remote_file and
        options.local_file and
        options.remote_host_address):
    parser.error('REMOTE HOST, LOCAL FILE NAME, '
                 'and REMOTE FILE NAME are mandatory')

if options.username and not options.password:
    parser.error('PASSWORD is mandatory if USERNAME is present')

ftp = FTP(options.remote_host_address)
if options.username:
    try:
        ftp.login(options.username, options.password)
    except ftplib.error_perm, e:
        print "Login failed: %s" % e
        sys.exit(1)
else:
    try:
        ftp.login()
    except ftplib.error_perm, e:
        print "Anonymous login failed: %s" % e
        sys.exit(1)

#open the local file before the try block so the finally clause
#can always close it
local_file = open(options.local_file, 'wb')
try:
    ftp.retrbinary('RETR %s' % options.remote_file, local_file.write)
finally:
    local_file.close()
    ftp.close()
The first part of the working code (past all the command-line parsing) creates an FTP object by passing the FTP server's address to FTP's constructor. Alternatively, we could have created an FTP object by passing nothing to the constructor and then calling the connect() method with the FTP server's address. The code then logs into the FTP server, using the username and password if they were provided, or anonymous authentication if they were not. Next, it creates a file object to store the data from the file on the FTP server. Then it calls the retrbinary() method on the FTP object. Retrbinary(), as the name implies, retrieves a binary file from an FTP server. It takes two parameters: the FTP retrieve command and a callback function. You might notice that our callback function is the write method on the file object we created in the previous step. It is important to note that we are not calling the write() method in this case. We are passing the write method in to the retrbinary() method so that retrbinary() can call write(). Retrbinary() will call whatever callback function we pass it with each chunk of data that it receives from the FTP server. This callback function could do anything with the data. The callback function could just log that it received N number of bytes from the FTP server. Passing in a file object's write method causes the script to write the contents of the file from the FTP server to the file object. Finally, it closes the file object and the FTP connection. We did a little error handling in the process: we set up a try block around retrieving the binary file from the FTP server and a finally block around the call to close the local file and FTP connection. If anything bad happens, we want to clean up our files before the script terminates. For a brief discussion of callbacks, see the Appendix.
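The callback mechanism itself is easy to demonstrate without an FTP server. The fake_retrbinary() function below is a made-up stand-in for FTP.retrbinary(): like the real method, it calls whatever callback it is given once per chunk of data. Passing a file-like object's write method as that callback streams the chunks straight into the "file" (here an in-memory buffer).

```python
import io

def fake_retrbinary(chunks, callback):
    # stand-in for ftplib's retrbinary: hand each chunk to the callback
    for chunk in chunks:
        callback(chunk)

local_file = io.BytesIO()   # in-memory stand-in for open(name, 'wb')
fake_retrbinary([b'first chunk, ', b'second chunk'], local_file.write)
print(local_file.getvalue())
```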
Moving up the standard library modules to a higher-level library, we arrive at urllib. When you think of urllib, it's easy to think of HTTP libraries only and forget that FTP resources also can be identified by URLs. Consequently, you might not have considered using urllib to retrieve FTP resources, but the functionality is there. Example 5-5 is the same as the ftplib example earlier, except it uses urllib.
#!/usr/bin/env python
"""
url retriever

Usage:

url_retrieve_urllib.py URL FILENAME

URL:
If the URL is an FTP URL the format should be:
ftp://[username[:password]@]hostname/filename
If you want to use absolute paths to the file to download,
you should make the URL look something like this:
ftp://user:password@host/%2Fpath/to/myfile.txt
Notice the '%2F' at the beginning of the path to the file.

FILENAME:
absolute or relative path to the filename to save downloaded file as
"""

import urllib
import sys

if '-h' in sys.argv or '--help' in sys.argv:
    print __doc__
    sys.exit(1)

if not len(sys.argv) == 3:
    print 'URL and FILENAME are mandatory'
    print __doc__
    sys.exit(1)

url = sys.argv[1]
filename = sys.argv[2]

urllib.urlretrieve(url, filename)
This script is short and sweet. It really shows off the power of urllib. There are actually more lines of usage documentation than code in it. There is even more argument parsing than code, which says a lot because there isn't much of that, either. We decided to go with a very simple argument parsing routine with this script. Since both of the "options" were mandatory, we decided to use positional arguments rather than option switches. Effectively, the only line of code in this example that performs work is this one:
urllib.urlretrieve(url, filename)
After retrieving the options with sys.argv, this line of code pulls down the specified URL and saves it to the specified local filename. It works with HTTP URLs and FTP URLs, and will even work when the username and password are included in the URL.
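In Python 3, urlretrieve() lives at urllib.request.urlretrieve() and behaves the same way. To keep this sketch self-contained it retrieves a file:// URL pointing at a temporary file instead of a live HTTP or FTP server; the filenames are invented for the example.

```python
import pathlib
import tempfile
import urllib.request

with tempfile.TemporaryDirectory() as d:
    # fabricate a local "remote" resource to retrieve
    src = pathlib.Path(d) / 'source.txt'
    src.write_text('hello from a URL')
    dest = pathlib.Path(d) / 'copy.txt'
    # pull down the URL and save it to the local filename
    urllib.request.urlretrieve(src.as_uri(), str(dest))
    retrieved = dest.read_text()
print(retrieved)
```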
A point worth emphasizing here is that if you think that something should be easier than the way you are doing it with another language, it probably is. There is probably some higher-level library out there somewhere that will do what you need, and frequently that library will be in the Python Standard Library. In this case, urllib did exactly what we wanted to do, and we didn't have to go anywhere beyond the standard library docs to find out about it. Sometimes, you might have to go outside the Python Standard Library, but you will find other Python resources such as the Python Package Index (PyPI) at http://pypi.python.org/pypi.
Another high-level library is urllib2. Urllib2 contains pretty much the same functionality as urllib, but expands on it. For example, urllib2 contains better authentication support and better cookie support. So if you start using urllib and think it isn't doing everything for you that it should, take a look at urllib2 to see if it meets your needs.
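For what it's worth, in Python 3 the urllib/urllib2 pair was merged into urllib.request, and the extras urllib2 was known for (custom headers, handlers, and so on) hang off the Request class. The fragment below only builds a request object, so it never touches the network; the URL and header value are placeholders.

```python
import urllib.request

# a Request carries headers and other extras that plain urlretrieve lacks
req = urllib.request.Request('http://www.example.com/',
                             headers={'User-Agent': 'psa-checker/0.1'})
print(req.host)
print(req.get_header('User-agent'))   # note: header keys are stored capitalized
```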
Typically, the reason for writing networking code is that you need interprocess communication (IPC). Often, plain IPC, such as HTTP or a plain socket, is good enough. However, there are times when it would be even more useful to execute code in a different process, or even on a different computer, as though it were in the same process as the code you are working on. If you could, in fact, execute code remotely in some other process from your Python program, you might expect that the return values from the remote calls would be Python objects, which you could deal with more easily than chunks of text that you have to parse manually. The good news is that there are several tools for remote procedure call (RPC) functionality.
XML-RPC exchanges a specifically formatted XML document between two processes to perform a remote procedure call. But you don't need to worry about the XML part; you'll probably never have to know the format of the document that is being exchanged between the two processes. The only thing you really need to know to get started using XML-RPC is that there is an implementation of both the client and the server portions in the Python Standard Library. Two things that might be useful to know are that XML-RPC is available for most programming languages and that it is very simple to use.
Example 5-6 is a simple XML-RPC server.
#!/usr/bin/env python

import SimpleXMLRPCServer
import os

def ls(directory):
    try:
        return os.listdir(directory)
    except OSError:
        return []

def ls_boom(directory):
    return os.listdir(directory)

def cb(obj):
    print "OBJECT::", obj
    print "OBJECT.__class__::", obj.__class__
    return obj.cb()

if __name__ == '__main__':
    s = SimpleXMLRPCServer.SimpleXMLRPCServer(('127.0.0.1', 8765))
    s.register_function(ls)
    s.register_function(ls_boom)
    s.register_function(cb)
    s.serve_forever()
This code creates a new SimpleXMLRPCServer object and binds it to port 8765 on 127.0.0.1, the loopback interface, which makes this accessible to processes only on this particular machine. It then registers the functions ls(), ls_boom(), and cb(), which we defined in the code. We'll explain the cb() function in a few moments. The ls() function will list the contents of the directory passed in using os.listdir() and return those results as a list. ls() masks any OSError exceptions that we may get. ls_boom() lets any exception that we hit find its way back to the XML-RPC client. Then, the code enters into the serve_forever() loop, which waits for a connection it can handle. Here is an example of this code used in an IPython shell:
In [1]: import xmlrpclib

In [2]: x = xmlrpclib.ServerProxy('http://localhost:8765')

In [3]: x.ls('.')
Out[3]:
['.svn',
 'web_server_checker_httplib.py',
....
 'subprocess_arp.py',
 'web_server_checker_tcp.py']

In [4]: x.ls_boom('.')
Out[4]:
['.svn',
 'web_server_checker_httplib.py',
....
 'subprocess_arp.py',
 'web_server_checker_tcp.py']

In [5]: x.ls('/foo')
Out[5]: []

In [6]: x.ls_boom('/foo')
---------------------------------------------------------------------------
<class 'xmlrpclib.Fault'>                 Traceback (most recent call last)
...
.
.
<<big nasty traceback>>
.
.
...
    786         if self._type == "fault":
--> 787             raise Fault(**self._stack[0])
    788         return tuple(self._stack)
    789

<class 'xmlrpclib.Fault'>: <Fault 1: "<type 'exceptions.OSError'> :[Errno 2] No such file or directory: '/foo'">
First, we created a ServerProxy() object by passing in the address of the XML-RPC server. Then, we called .ls('.') to see which files were in the server's current working directory. The server was running in a directory that contains example code from this book, so those are the files you see from the directory listing. The really interesting thing is that on the client side, x.ls('.') returned a Python list. Had this server been implemented in Java, Perl, Ruby, or C#, you could expect the same thing. The language that implements the server would have done a directory listing; created a list, array, or collection of filenames; and the XML-RPC server code would have then created an XML representation of that list or array and sent it back over the wire to your client. We also tried out ls_boom(). Since ls_boom() lacks the exception handling of ls(), we can see that the exception passes from the server back to the client. We even see a traceback on the client.
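In Python 3 the same round trip works with SimpleXMLRPCServer living in xmlrpc.server and xmlrpclib renamed xmlrpc.client. This self-contained sketch serves a trivial function (double(), invented for the demo) from a background thread and calls it through a proxy; note that the return value comes back as a real Python int.

```python
import threading
import xmlrpc.client
import xmlrpc.server

def double(n):
    return n * 2

# bind to port 0 so the OS assigns a free port
server = xmlrpc.server.SimpleXMLRPCServer(('127.0.0.1', 0),
                                          logRequests=False)
port = server.server_address[1]
server.register_function(double)
threading.Thread(target=server.serve_forever, daemon=True).start()

proxy = xmlrpc.client.ServerProxy('http://127.0.0.1:%d' % port)
result = proxy.double(21)
print(result)
server.shutdown()
```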
The interoperability possibilities that XML-RPC opens up to you are certainly interesting. But perhaps more interesting is the fact that you can write a piece of code to run on any number of machines and be able to execute that code remotely whenever you wish.
XML-RPC is not without its limitations, though. Whether you think these limitations are problematic or not is a matter of engineering taste. For example, if you pass in a custom Python object, the XML-RPC library will convert that object to a Python dictionary, serialize it to XML, and pass it across the wire. You can certainly work around this, but it would require writing code to extract your data from the XML version of the dictionary so that you could pass it back into the original object that was dictified. Rather than go through that trouble, why not use your objects directly on your RPC server? You can’t with XML-RPC, but there are other options.
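You can watch the dictifying happen without running a server at all by marshalling an object through the xmlrpc.client serializer (the Python 3 spelling of xmlrpclib). The class below mirrors the PSACB class used later in Example 5-8: after the round trip, only the attribute dictionary survives, and the cb() method is gone.

```python
import xmlrpc.client

class PSACB:
    def __init__(self):
        self.some_attribute = 1
    def cb(self):
        return "PSA callback"

# serialize to XML-RPC and immediately deserialize it again
payload = xmlrpc.client.dumps((PSACB(),))
(roundtripped,), _method = xmlrpc.client.loads(payload)
print(roundtripped)                 # a plain dict of the attributes
print(hasattr(roundtripped, 'cb'))
```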
Pyro is one framework that alleviates the shortcomings of XML-RPC. Pyro stands for Python Remote Objects (capitalization intentional). It lets you do everything you could do with XML-RPC, but rather than dictifying your objects, it maintains their types when you pass them across. If you do want to use Pyro, you will have to install it separately; it doesn't come with Python. Also be aware that Pyro only works with Python, whereas XML-RPC can work between Python and other languages. Example 5-7 is an implementation of the same ls() functionality from the XML-RPC example.
#!/usr/bin/env python

import Pyro.core
import os

from xmlrpc_pyro_diff import PSACB

class PSAExample(Pyro.core.ObjBase):

    def ls(self, directory):
        try:
            return os.listdir(directory)
        except OSError:
            return []

    def ls_boom(self, directory):
        return os.listdir(directory)

    def cb(self, obj):
        print "OBJECT:", obj
        print "OBJECT.__class__:", obj.__class__
        return obj.cb()

if __name__ == '__main__':
    Pyro.core.initServer()
    daemon = Pyro.core.Daemon()
    uri = daemon.connect(PSAExample(), "psaexample")

    print "The daemon runs on port:", daemon.port
    print "The object's uri is:", uri

    daemon.requestLoop()
The Pyro example is similar to the XML-RPC example. First, we created a PSAExample class with ls(), ls_boom(), and cb() methods on it. We then created a daemon from Pyro's internal plumbing. Then, we associated the PSAExample with the daemon. Finally, we told the daemon to start serving requests.
Here we access the Pyro server from an IPython prompt:
In [1]: import Pyro.core
/usr/lib/python2.5/site-packages/Pyro/core.py:11: DeprecationWarning: The sre module is deprecated, please import re.
  import sys, time, sre, os, weakref

In [2]: psa = Pyro.core.getProxyForURI("PYROLOC://localhost:7766/psaexample")
Pyro Client Initialized. Using Pyro V3.5

In [3]: psa.ls(".")
Out[3]:
['pyro_server.py',
....
 'subprocess_arp.py',
 'web_server_checker_tcp.py']

In [4]: psa.ls_boom('.')
Out[4]:
['pyro_server.py',
....
 'subprocess_arp.py',
 'web_server_checker_tcp.py']

In [5]: psa.ls("/foo")
Out[5]: []

In [6]: psa.ls_boom("/foo")
---------------------------------------------------------------------------
<type 'exceptions.OSError'>               Traceback (most recent call last)

/home/jmjones/local/Projects/psabook/oreilly/<ipython console> in <module>()
.
.
...
<<big nasty traceback>>
...
.
.
--> 115             raise self.excObj
    116     def __str__(self):
    117         s=self.excObj.__class__.__name__

<type 'exceptions.OSError'>: [Errno 2] No such file or directory: '/foo'
Nifty. It returned the same output as the XML-RPC example. We
expected as much. But what happens when we pass in a custom object?
We’re going to define a new class, create an object from it, and then
pass it to the XML-RPC cb()
function and the Pyro cb()
method
from the examples above. Example 5-8 shows
the piece of code that we are going to execute.
import Pyro.core
import xmlrpclib

class PSACB:
    def __init__(self):
        self.some_attribute = 1

    def cb(self):
        return "PSA callback"

if __name__ == '__main__':
    cb = PSACB()

    print "PYRO SECTION"
    print "*" * 20
    psapyro = Pyro.core.getProxyForURI("PYROLOC://localhost:7766/psaexample")
    print "-->>", psapyro.cb(cb)
    print "*" * 20

    print "XML-RPC SECTION"
    print "*" * 20
    psaxmlrpc = xmlrpclib.ServerProxy('http://localhost:8765')
    print "-->>", psaxmlrpc.cb(cb)
    print "*" * 20
The calls to the Pyro and XML-RPC implementations of cb() should both invoke cb() on the object passed in to them. And in both instances, they should return the string PSA callback.
And here is what happens when we run it:
jmjones@dinkgutsy:code$ python xmlrpc_pyro_diff.py
/usr/lib/python2.5/site-packages/Pyro/core.py:11: DeprecationWarning: The sre module is deprecated, please import re.
  import sys, time, sre, os, weakref
PYRO SECTION
********************
Pyro Client Initialized. Using Pyro V3.5
-->> PSA callback
********************
XML-RPC SECTION
********************
-->> Traceback (most recent call last):
  File "xmlrpc_pyro_diff.py", line 23, in <module>
    print "-->>", psaxmlrpc.cb(cb)
  File "/usr/lib/python2.5/xmlrpclib.py", line 1147, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.5/xmlrpclib.py", line 1437, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.5/xmlrpclib.py", line 1201, in request
    return self._parse_response(h.getfile(), sock)
  File "/usr/lib/python2.5/xmlrpclib.py", line 1340, in _parse_response
    return u.close()
  File "/usr/lib/python2.5/xmlrpclib.py", line 787, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault 1: "<type 'exceptions.AttributeError'>:'dict' object has no attribute 'cb'">
The Pyro implementation worked, but the XML-RPC implementation failed and left us a traceback. The last line of the traceback says that a dict object has no cb attribute. This will make more sense when we show you the output from the XML-RPC server. Remember that the cb() function had some print statements in it to show some information about what was going on. Here is the XML-RPC server output:
OBJECT:: {'some_attribute': 1}
OBJECT.__class__:: <type 'dict'>
localhost - - [17/Apr/2008 16:39:02] "POST /RPC2 HTTP/1.0" 200 -
In dictifying the object that we created in the XML-RPC client,
some_attribute
was converted to a
dictionary key. While this one attribute was preserved, the cb()
method was not.
Here is the Pyro server output:
OBJECT: <xmlrpc_pyro_diff.PSACB instance at 0x9595a8>
OBJECT.__class__: xmlrpc_pyro_diff.PSACB
Notice that the class of the object is PSACB, just as it was when the client created it. On the Pyro server side, we had to import the module that defines PSACB, the same code the client uses. It makes sense that the Pyro server needs to import the client's code: Pyro serializes objects with the Python standard pickle module, and pickle likewise requires that the receiving side be able to import the class definition before it can reconstruct an instance.
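The difference between the two transports can be demonstrated with no RPC machinery at all. The following sketch (our own illustration, written in Python 3 unlike the book's examples) mimics each behavior: flattening an instance into a dict of its attributes loses its methods, while a pickle round trip reconstructs a real instance, provided the class definition is importable on the receiving side:

```python
import pickle

class PSACB:
    """Stand-in for the callback class from the example."""
    def __init__(self):
        self.some_attribute = 1

    def cb(self):
        return "PSA callback"

obj = PSACB()

# Roughly what XML-RPC does: marshal the instance as a dict of its
# attributes. The cb() method is not an attribute, so it is lost.
dictified = vars(obj)
print(dictified)                   # {'some_attribute': 1}
print(hasattr(dictified, 'cb'))    # False: the method is gone

# What pickle (and therefore Pyro) does: record the class so the
# receiving side can rebuild a real PSACB instance, as long as it
# can import the same class definition.
restored = pickle.loads(pickle.dumps(obj))
print(restored.cb())               # PSA callback
```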
In summary, if you want a simple RPC solution, don’t want external dependencies, can live with the limitations of XML-RPC, and think that interoperability with other languages could come in handy, then XML-RPC is probably a good choice. On the other hand, if the limitations of XML-RPC are too constraining, you don’t mind installing external libraries, and you don’t mind being limited to using only Python, then Pyro is probably a better option for you.
SSH is an incredibly powerful, widely used protocol. You can also think of it as a tool, since its most common implementation includes a command-line utility of the same name. SSH allows you to securely connect to a remote server, execute shell commands, transfer files, and forward ports in both directions across the connection.
If you have the command-line ssh utility, why would you ever want to script the SSH protocol directly? The main reason is that doing so gives you the full power of SSH combined with the full power of Python.
The SSH2 protocol is implemented for Python by a library called paramiko. From within a Python script, writing nothing but Python code, you can connect to an SSH server and accomplish those pressing SSH tasks. Example 5-9 shows how to connect to an SSH server and execute a simple command.
#!/usr/bin/env python
import paramiko

hostname = '192.168.1.15'
port = 22
username = 'jmjones'
password = 'xxxYYYxxx'

if __name__ == "__main__":
    paramiko.util.log_to_file('paramiko.log')
    s = paramiko.SSHClient()
    s.load_system_host_keys()
    s.connect(hostname, port, username, password)
    stdin, stdout, stderr = s.exec_command('ifconfig')
    print stdout.read()
    s.close()
As you can see, we import the paramiko
module and define three variables.
Next, we create an SSHClient
object.
Then we tell it to load the host keys, which, on Linux, come from the
“known_hosts” file. After that we connect to the SSH server. None of
these steps is particularly complicated, especially if you’re already
familiar with SSH.
Now we’re ready to execute a command remotely. The call to
exec_command()
executes the
command that you pass in and returns three file handles
associated with the execution of the command: standard input, standard
output, and standard error. And to show that this is being executed on a
machine with the same IP address as the address we connected to with the
SSH call, we print out the results of ifconfig
on the remote server:
jmjones@dinkbuntu:~/code$ python paramiko_exec.py
eth0      Link encap:Ethernet  HWaddr XX:XX:XX:XX:XX:XX
          inet addr:192.168.1.15  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: xx00::000:x0xx:xx0x:0x00/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9667336 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11643909 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1427939179 (1.3 GiB)  TX bytes:2940899219 (2.7 GiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:123571 errors:0 dropped:0 overruns:0 frame:0
          TX packets:123571 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:94585734 (90.2 MiB)  TX bytes:94585734 (90.2 MiB)
It looks exactly as if we had run ifconfig
on our local machine, except the IP
address is different.
Example 5-10 shows you how to
use paramiko
to SFTP files between a
remote machine and your local machine. This particular example
only retrieves files from the remote machine using the get()
method. If you want to send files to the
remote machine, use the put()
method.
#!/usr/bin/env python
import paramiko
import os

hostname = '192.168.1.15'
port = 22
username = 'jmjones'
password = 'xxxYYYxxx'
dir_path = '/home/jmjones/logs'

if __name__ == "__main__":
    t = paramiko.Transport((hostname, port))
    t.connect(username=username, password=password)
    sftp = paramiko.SFTPClient.from_transport(t)
    files = sftp.listdir(dir_path)
    for f in files:
        print 'Retrieving', f
        sftp.get(os.path.join(dir_path, f), f)
    t.close()
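One portability caveat worth noting (our observation, not part of the original example): os.path.join builds paths using the local platform's separator, so on a Windows client it would hand the SFTP server a backslash-separated path. Remote SFTP paths are always POSIX-style, so the standard library's posixpath module is a safer way to construct them. A minimal sketch:

```python
import posixpath

# Remote directory from the example above.
dir_path = '/home/jmjones/logs'

# posixpath.join always joins with forward slashes, regardless of
# the operating system the client script happens to run on.
remote_file = posixpath.join(dir_path, 'test.log')
print(remote_file)  # /home/jmjones/logs/test.log
```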
In case you want to use public/private keys rather than passwords, Example 5-11 is a modification of the remote execution example using an RSA key.
#!/usr/bin/env python
import paramiko

hostname = '192.168.1.15'
port = 22
username = 'jmjones'
pkey_file = '/home/jmjones/.ssh/id_rsa'

if __name__ == "__main__":
    key = paramiko.RSAKey.from_private_key_file(pkey_file)
    s = paramiko.SSHClient()
    s.load_system_host_keys()
    s.connect(hostname, port, pkey=key)
    stdin, stdout, stderr = s.exec_command('ifconfig')
    print stdout.read()
    s.close()
And Example 5-12 is a modification of the sftp script using an RSA key.
#!/usr/bin/env python
import paramiko
import os

hostname = '192.168.1.15'
port = 22
username = 'jmjones'
dir_path = '/home/jmjones/logs'
pkey_file = '/home/jmjones/.ssh/id_rsa'

if __name__ == "__main__":
    key = paramiko.RSAKey.from_private_key_file(pkey_file)
    t = paramiko.Transport((hostname, port))
    t.connect(username=username, pkey=key)
    sftp = paramiko.SFTPClient.from_transport(t)
    files = sftp.listdir(dir_path)
    for f in files:
        print 'Retrieving', f
        sftp.get(os.path.join(dir_path, f), f)
    t.close()
Twisted is an event-driven networking framework for Python that can tackle pretty much any type of network-related task you need it to. But such a comprehensive, single solution comes at the price of complexity. Twisted will begin to make sense after you've used it a few times, but it can be difficult to understand initially. Moreover, Twisted is such a large framework that finding a starting point for a specific problem can be daunting.
Despite that, though, we highly recommend that you become familiar with it and see if it fits the way you think. If you can easily tailor your thinking to "the Twisted way," then learning Twisted is likely to be a valuable investment. Twisted Network Programming Essentials by Abe Fettig (O'Reilly) is a good place to get started; it helps to mitigate the difficulties we just mentioned.
Twisted is event-driven, meaning that rather than writing code that initiates and drops connections and deals with the low-level details of data reception, you write code that handles those happenings when they occur.
What advantage would you gain by using Twisted? The framework encourages, and at times nearly requires, that you break your problems into small pieces. The network connection is decoupled from the logic of what occurs when connections are made, which gains you some level of automatic reusability from your code. Twisted also relieves you of much of the lower-level connection and error handling for network connections; your part in writing network code is mostly deciding what happens when certain events transpire.
Example 5-13 is a port checker that we've implemented in Twisted. It is very basic, but it will demonstrate the event-driven nature of Twisted as we go through the code. Before we do that, though, we'll go over a few basic concepts that you'll need to know: reactors, factories, protocols, and deferreds. Reactors are the heart of a Twisted application's main event loop; they handle event dispatching, network communications, and threading. Factories are responsible for spawning new protocol instances; each factory instance can spawn one type of protocol. Protocols define what to do with a specific connection; at runtime, a protocol instance is created for each connection. And deferreds are a way of chaining actions together.
#!/usr/bin/env python
from twisted.internet import reactor, protocol
import sys

class PortCheckerProtocol(protocol.Protocol):
    def __init__(self):
        print "Created a new protocol"

    def connectionMade(self):
        print "Connection made"
        reactor.stop()

class PortCheckerClientFactory(protocol.ClientFactory):
    protocol = PortCheckerProtocol

    def clientConnectionFailed(self, connector, reason):
        print "Connection failed because", reason
        reactor.stop()

if __name__ == '__main__':
    host, port = sys.argv[1].split(':')
    factory = PortCheckerClientFactory()
    print "Testing %s" % sys.argv[1]
    reactor.connectTCP(host, int(port), factory)
    reactor.run()
Notice that we defined two classes (PortCheckerProtocol
and PortCheckerClientFactory
), both of which
inherit from Twisted classes. We tied our factory, PortCheckerClientFactory
, to PortCheckerProtocol
by assigning PortCheckerProtocol
to PortCheckerClientFactory
’s class attribute
protocol
. If a factory attempts to
make a connection but fails, the factory’s clientConnectionFailed()
method will be
called. ClientConnectionFailed()
is a
method that is common to all Twisted factories and is the only method we
defined for our factory. By defining a method that “comes with” the
factory
class,
we are overriding the default behavior of the class. When a client
connection fails, we want to print out a message to that effect and stop
the reactor.
PortCheckerProtocol
is one of
the protocols we discussed earlier. An instance of this class will be
created once we have established a connection to the server whose port
we are checking. We have only defined one method on PortCheckerProtocol
: connectionMade()
. This is a method that
is common to all Twisted protocol classes. By defining this method
ourselves, we are overriding the default behavior. When a connection is
successfully made, Twisted will call this protocol’s connectionMade()
method. As you can see, it
prints out a simple message and stops the reactor. (We’ll get to the
reactor shortly.)
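Stripped of all the networking, the relationship between a factory and its protocol can be sketched in a few lines of plain Python. This is our own illustration (written in Python 3, unlike the book's examples) rather than real Twisted code, and the names EchoFactory and EchoProtocol are hypothetical:

```python
class EchoProtocol:
    """Stands in for a Twisted protocol: one instance per connection."""
    def connection_made(self):
        return "connection handled by %s" % self.__class__.__name__

class EchoFactory:
    """Stands in for a Twisted factory: spawns protocol instances."""
    protocol = EchoProtocol  # mirrors Twisted's protocol class attribute

    def build_protocol(self):
        # Twisted calls a similar hook (buildProtocol) for each new
        # connection; every connection gets a fresh protocol object.
        return self.protocol()

factory = EchoFactory()
p1 = factory.build_protocol()
p2 = factory.build_protocol()
print(p1.connection_made())
print(p1 is p2)  # False: two connections, two protocol instances
```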
In this example, both connectionMade()
and clientConnectionFailed()
demonstrate the
“event-driven” nature of Twisted. A connection being made is an event.
So also is when a client connection fails to be made. When these events
occur, Twisted calls the appropriate methods to handle the events, which
are referred to as event handlers.
In the main section of this example, we create an instance of
PortCheckerClientFactory
. We then
tell the Twisted reactor to connect to the hostname and port number,
which were passed in as command-line arguments, using the specified
factory. After telling the reactor to connect to a certain port on a
certain host, we tell the reactor to run. If we had not told the reactor
to run, nothing would have happened.
To summarize the flow chronologically, we start the reactor after
giving it a directive. In this case, the directive was to connect to a
server and port and use PortCheckerClientFactory
to help dispatch
events. If the connection to the given host and port fails, the event
loop will call clientConnectionFailed()
on PortCheckerClientFactory
. If the connection
succeeds, the factory creates an instance of the protocol, PortCheckerProtocol
, and
calls connectionMade()
on that
instance. Whether the connection succeeds or fails, the respective event
handlers will shut the reactor down and the program will stop
running.
That was a very basic example, but it showed the basics of Twisted’s event handling nature. A key concept of Twisted programming that we did not cover in this example is the idea of deferreds and callbacks. A deferred represents a promise to execute the requested action. A callback is a way of specifying an action to accomplish. Deferreds can be chained together and pass their results on from one to the next. This point is often difficult to really understand in Twisted. (Example 5-14 will elaborate on deferreds.)
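The chaining behavior is easy to mimic without Twisted at all. The toy class below is our own drastically simplified sketch (in Python 3) of the idea behind twisted.internet.defer.Deferred, not the real thing; it just feeds each callback's return value into the next one in the chain:

```python
class ToyDeferred:
    """A drastically simplified stand-in for Twisted's Deferred."""
    def __init__(self):
        self.callbacks = []

    def add_callback(self, fn):
        self.callbacks.append(fn)

    def fire(self, result):
        # Each callback receives the previous callback's return value,
        # which is how deferreds pass results down a processing chain.
        for fn in self.callbacks:
            result = fn(result)
        return result

d = ToyDeferred()
d.add_callback(lambda r: r + 1)
d.add_callback(lambda r: r * 10)
print(d.fire(4))  # 50
```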
Example 5-14 is an example of using Perspective Broker, an RPC mechanism that is unique to Twisted. This example is another implementation of the remote “ls” server that we implemented in XML-RPC and Pyro, earlier in this chapter. First, we will walk you through the server.
import os

from twisted.spread import pb
from twisted.internet import reactor

class PBDirLister(pb.Root):
    def remote_ls(self, directory):
        try:
            return os.listdir(directory)
        except OSError:
            return []

    def remote_ls_boom(self, directory):
        return os.listdir(directory)

if __name__ == '__main__':
    reactor.listenTCP(9876, pb.PBServerFactory(PBDirLister()))
    reactor.run()
This example defines one class, PBDirLister
. This is the Perspective Broker
(PB) class that will act as
a remote object when the client connects to it. This example defines
only two methods on this class: remote_ls()
and remote_ls_boom()
. Remote_ls()
is, not surprisingly, one of the
remote methods that the client will call. This remote_ls()
method will simply return a
listing of the specified directory. And remote_ls_boom()
will do the same thing that
remote_ls()
will do, except that it
won’t perform exception handling. In the main section of the example, we
tell the Perspective Broker
to bind
to port 9876 and then run the reactor.
Example 5-15, the client, is not quite as straightforward; it calls remote_ls().
#!/usr/bin/python
from twisted.spread import pb
from twisted.internet import reactor

def handle_err(reason):
    print "an error occurred", reason
    reactor.stop()

def call_ls(def_call_obj):
    return def_call_obj.callRemote('ls', '/home/jmjones/logs')

def print_ls(print_result):
    print print_result
    reactor.stop()

if __name__ == '__main__':
    factory = pb.PBClientFactory()
    reactor.connectTCP("localhost", 9876, factory)
    d = factory.getRootObject()
    d.addCallback(call_ls)
    d.addCallback(print_ls)
    d.addErrback(handle_err)
    reactor.run()
This client example defines three functions, handle_err()
, call_ls()
, and print_ls()
. Handle_err()
will handle any errors that occur
along the way. Call_ls()
will
initiate the calling of the remote “ls” method. Print_ls()
will print the results of the “ls”
call. It may seem a bit odd that one function initiates a remote call and another prints the results. But because Twisted is an asynchronous, event-driven network framework, it makes sense here: the framework intentionally encourages writing code that breaks work up into many small pieces.
The main section of this example shows how the reactor knows when
to call which callback function. First, we create a client Perspective Broker
factory and tell the
reactor to connect to localhost:9876
, using the PB
client factory to handle requests. Next, we
get a placeholder for the remote object by calling factory.getRootObject()
. This is actually a
deferred, so we can pipeline activity together by calling addCallback() on it.
The first callback that we add is the call_ls()
function call. Call_ls()
calls the callRemote()
method on the deferred object
from the previous step. CallRemote()
returns a deferred as well. The second callback in the processing chain
is print_ls()
. When the reactor calls
print_ls()
, print_ls()
prints the result of the remote
call to remote_ls()
in the previous
step. In fact, the reactor passes in the results of the remote call into
print_ls()
. Finally, we add handle_err() as an errback rather than a plain callback; it is simply an error handler that lets us know if an error occurred
along the way. When either an error occurs or the pipeline reaches
print_ls()
, the respective methods
shut the reactor down.
Here is what running this client code looks like:
jmjones@dinkgutsy:code$ python twisted_perspective_broker_client.py
['test.log']
The output is a list of files in the directory we specified, exactly as we would have expected.
This example seems a bit complicated for the simple RPC example we laid out here. The server side seems pretty comparable. Creating the client seemed to be quite a bit more work with the pipeline of callbacks, deferreds, reactors, and factories. But this was a very simple example. The structure that Twisted provides really shines when the task at hand is of a higher level of complexity.
Example 5-16 is a slight modification to the
Perspective Broker client code that we just demonstrated. Rather than
calling ls
on the remote side, it
calls ls_boom
. This will show us how
the client and server deal with exceptions.
#!/usr/bin/python
from twisted.spread import pb
from twisted.internet import reactor

def handle_err(reason):
    print "an error occurred", reason
    reactor.stop()

def call_ls(def_call_obj):
    return def_call_obj.callRemote('ls_boom', '/foo')

def print_ls(print_result):
    print print_result
    reactor.stop()

if __name__ == '__main__':
    factory = pb.PBClientFactory()
    reactor.connectTCP("localhost", 9876, factory)
    d = factory.getRootObject()
    d.addCallback(call_ls)
    d.addCallback(print_ls)
    d.addErrback(handle_err)
    reactor.run()
Here is what happens when we run this code:
jmjones@dinkgutsy:code$ python twisted_perspective_broker_client_boom.py
an error occurred [Failure instance: Traceback from remote host -- Traceback unavailable
]
And on the server:
Peer will receive following PB traceback:
Traceback (most recent call last):
...
<more traceback>
...
    state = method(*args, **kw)
  File "twisted_perspective_broker_server.py", line 13, in remote_ls_boom
    return os.listdir(directory)
exceptions.OSError: [Errno 2] No such file or directory: '/foo'
The specifics of the error were in the server code rather than the client. In the client, we only knew that an error had occurred. If Pyro or XML-RPC had behaved like this, we would have considered that to be a bad thing. However, in the Twisted client code, our error handler was called. Since this is a different model of programming from Pyro and XML-RPC (event-based), we expect to have to handle our errors differently, and the Perspective Broker code did what we would have expected it to do.
We gave a less-than-tip-of-the-iceberg introduction to Twisted here. Twisted can be a bit difficult to get started with because it is such a comprehensive project and takes such a different approach than what most of us are accustomed to. Twisted is definitely worth investigating further and having in your toolbox when you need it.
If you like writing network code, you are going to love Scapy. Scapy is an incredibly handy interactive packet manipulation program and library. It can discover networks and perform scans, traceroutes, and probes. There is also excellent documentation available for Scapy. If you like this intro, you should buy the book for even more details on Scapy.
The first thing to figure out about Scapy is that, as of this writing, it is kept in a single file. You will need to download the latest copy of Scapy here: http://hg.secdev.org/scapy/raw-file/tip/scapy.py. Once you download Scapy, you can run it as a standalone tool or import it and use it as a library. Let’s get started by using it as an interactive tool. Please keep in mind that you will need to run Scapy with root privileges, as it needs privileged control of your network interfaces.
Once you download and install Scapy, you will see something like this:
Welcome to Scapy (1.2.0.2)
>>>
You can do anything you would normally do with a Python interpreter, and there are special Scapy commands as well. The first thing we are going to do is run Scapy's ls() function, which lists all available layers:
>>> ls()
ARP           : ARP
ASN1_Packet   : None
BOOTP         : BOOTP
CookedLinux   : cooked linux
DHCP          : DHCP options
DNS           : DNS
DNSQR         : DNS Question Record
DNSRR         : DNS Resource Record
Dot11         : 802.11
Dot11ATIM     : 802.11 ATIM
Dot11AssoReq  : 802.11 Association Request
Dot11AssoResp : 802.11 Association Response
Dot11Auth     : 802.11 Authentication
[snip]
We truncated the output as it is quite verbose. Now, we'll perform a recursive DNS query of www.oreilly.com using Caltech University's public DNS server:
>>> sr1(IP(dst="131.215.9.49")/UDP()/DNS(rd=1,qd=DNSQR(qname="www.oreilly.com")))
Begin emission:
Finished to send 1 packets.
...*
Received 4 packets, got 1 answers, remaining 0 packets
IP version=4L ihl=5L tos=0x0 len=223 id=59364 flags=DF frag=0L ttl=239
proto=udp chksum=0xb1e src=131.215.9.49 dst=10.0.1.3 options='' |UDP
sport=domain dport=domain len=203 chksum=0x843 |DNS id=0 qr=1L
opcode=QUERY aa=0L tc=0L rd=1L ra=1L z=0L rcode=ok qdcount=1 ancount=2
nscount=4 arcount=3 qd=DNSQR qname='www.oreilly.com.' qtype=A qclass=IN |>
an=DNSRR rrname='www.oreilly.com.' type=A rclass=IN ttl=21600
rdata='208.201.239.36'
[snip]
Next, let’s perform a traceroute:
>>> ans,unans=sr(IP(dst="oreilly.com",
...               ttl=(4,25),id=RandShort())/TCP(flags=0x2))
Begin emission:
..............*Finished to send 22 packets.
*...........*********.***.***.*.*.*.*.*
Received 54 packets, got 22 answers, remaining 0 packets
>>> for snd, rcv in ans:
...     print snd.ttl, rcv.src, isinstance(rcv.payload, TCP)
...
[snip]
20 208.201.239.37 True
21 208.201.239.37 True
22 208.201.239.37 True
23 208.201.239.37 True
24 208.201.239.37 True
25 208.201.239.37 True
Scapy can even do pure packet dumps like tcpdump:
>>> sniff(iface="en0", prn=lambda x: x.show())
###[ Ethernet ]###
  dst= ff:ff:ff:ff:ff:ff
  src= 00:16:cb:07:e4:58
  type= IPv4
###[ IP ]###
     version= 4L
     ihl= 5L
     tos= 0x0
     len= 78
     id= 27957
     flags=
     frag= 0L
     ttl= 64
     proto= udp
     chksum= 0xf668
     src= 10.0.1.3
     dst= 10.0.1.255
     options= ''
[snip]
You can also do some very slick network visualization of traceroutes if you install Graphviz and ImageMagick. This example is borrowed from the official Scapy documentation:
>>> res,unans = traceroute(["www.microsoft.com","www.cisco.com","www.yahoo.com",
... "www.wanadoo.fr","www.pacsec.com"],dport=[80,443],maxttl=20,retry=-2)
Begin emission:
************************************************************************
Finished to send 200 packets.
******************Begin emission:
*******************************************Finished to send 110 packets.
**************************************************************Begin emission:
Finished to send 5 packets.
Begin emission:
Finished to send 5 packets.

Received 195 packets, got 195 answers, remaining 5 packets
   193.252.122.103:tcp443 193.252.122.103:tcp80 198.133.219.25:tcp443
   198.133.219.25:tcp80 207.46.193.254:tcp443 207.46.193.254:tcp80
   69.147.114.210:tcp443 69.147.114.210:tcp80 72.9.236.58:tcp443
   72.9.236.58:tcp80
You can now create a fancy graph from those results:
>>> res.graph()
>>> res.graph(type="ps",target="| lp")
>>> res.graph(target="> /tmp/graph.svg")
With Graphviz and ImageMagick installed, the network visualization will blow your mind!
The real fun in using Scapy, though, is when you create custom command-line tools and scripts. In the next section, we will take a look at Scapy the library.
Now that we can build something permanent with Scapy, an interesting tool to show right off the bat is an arping utility. Let's look at a platform-specific arping tool first:
#!/usr/bin/env python
import subprocess
import re
import sys

def arping(ipaddress="10.0.1.1"):
    """Arping function takes IP Address or Network, returns nested mac/ip list"""

    #Assuming use of arping on Red Hat Linux
    p = subprocess.Popen("/usr/sbin/arping -c 2 %s" % ipaddress,
                         shell=True, stdout=subprocess.PIPE)
    out = p.stdout.read()
    result = out.split()
    #pattern = re.compile(":")
    for item in result:
        if ':' in item:
            print item

if __name__ == '__main__':
    if len(sys.argv) > 1:
        for ip in sys.argv[1:]:
            print "arping", ip
            arping(ip)
    else:
        arping()
Now let’s look at how we can create that exact same thing using Scapy, but in a platform-neutral way:
#!/usr/bin/env python
from scapy import srp,Ether,ARP,conf
import sys

def arping(iprange="10.0.1.0/24"):
    """Arping function takes IP Address or Network, returns nested mac/ip list"""

    conf.verb=0
    ans,unans=srp(Ether(dst="ff:ff:ff:ff:ff:ff")/ARP(pdst=iprange),
                  timeout=2)

    collection = []
    for snd, rcv in ans:
        result = rcv.sprintf(r"%ARP.psrc% %Ether.src%").split()
        collection.append(result)
    return collection

if __name__ == '__main__':
    if len(sys.argv) > 1:
        for ip in sys.argv[1:]:
            print "arping", ip
            print arping(ip)
    else:
        print arping()
As you can see, the information contained in the output is quite handy, as it gives us the MAC and IP addresses of everyone on the subnet:
# sudo python scapy_arp.py
[['10.0.1.1', '00:00:00:00:00:10'], ['10.0.1.7', '00:00:00:00:00:12'],
 ['10.0.1.30', '00:00:00:00:00:11'], ['10.0.1.200', '00:00:00:00:00:13']]
From these examples, you should get the impression of how handy Scapy is and how easy it is to use.