NNTP: Accessing Newsgroups

So far in this chapter, we have focused on Python’s FTP and email processing tools and have met a handful of client-side scripting modules along the way: ftplib, poplib, smtplib, email, mimetools, urllib, and so on. This set is representative of Python’s client-side library tools for transferring and processing information over the Internet, but it’s not at all complete.

A more or less comprehensive list of Python’s Internet-related modules appears at the start of the previous chapter. Among other things, Python also includes client-side support libraries for Internet news, Telnet, HTTP, XML-RPC, and other standard protocols. Most of these are analogous to modules we’ve already met—they provide an object-based interface that automates the underlying sockets and message structures.

For instance, Python’s nntplib module supports the client-side interface to NNTP—the Network News Transfer Protocol—which is used for reading and posting articles to Usenet newsgroups on the Internet. Like other protocols, NNTP runs on top of sockets and merely defines a standard message protocol; like other modules, nntplib hides most of the protocol details and presents an object-based interface to Python scripts.

We won’t get into protocol details here, but in brief, NNTP servers store a range of articles on the server machine, usually in a flat-file database. If you have the domain or IP name of a server machine that runs an NNTP server program listening on the NNTP port, you can write scripts that fetch or post articles from any machine that has Python and an Internet connection. For instance, the script in Example 14-28 by default fetches and displays the last 10 articles from Python’s Internet newsgroup, comp.lang.python, from the news.rmi.net NNTP server at my ISP.

Example 14-28. PP3EInternetOther eadnews.py

############################################################################
# fetch and print usenet newsgroup posting from comp.lang.python via the
# nntplib module 
 which really runs on top of sockets; nntplib also
 supports
# posting new messages, etc.; note: posts not deleted after they are read;
############################################################################
listonly = 0
showhdrs = ['From', 'Subject', 'Date', 'Newsgroups', 'Lines']
try:
    import sys
    servername, groupname, showcount = sys.argv[1:]
    showcount  = int(showcount)
except:
    servername = 'news.rmi.net'
    groupname  = 'comp.lang.python'          # cmd line args or defaults
    showcount  = 10                          # show last showcount posts

# connect to nntp server
print 'Connecting to', servername, 'for', groupname
from nntplib import NNTP
connection = NNTP(servername)
(reply, count, first, last, name) = connection.group(groupname)
print '%s has %s articles: %s-%s' % (name, count, first, last)

# get request headers only
fetchfrom = str(int(last) - (showcount-1))
(reply, subjects) = connection.xhdr('subject', (fetchfrom + '-' + last))

# show headers, get message hdr+body
for (id, subj) in subjects:                  # [-showcount:] if fetch all hdrs
    print 'Article %s [%s]' % (id, subj)
    if not listonly and raw_input('=> Display?') in ['y', 'Y']:
        reply, num, tid, list = connection.head(id)
        for line in list:
            for prefix in showhdrs:
                if line[:len(prefix)] == prefix:
                    print line[:80]; break
        if raw_input('=> Show body?') in ['y', 'Y']:
            reply, num, tid, list = connection.body(id)
            for line in list:
                print line[:80]
    print
print connection.quit( )

As for FTP and email tools, the script creates an NNTP object and calls its methods to fetch newsgroup information and articles’ header and body text. The xhdr method, for example, loads selected headers from a range of messages.

For NNTP servers that require authentication, you may also have to pass a username, a password, and possibly a reader-mode flag to the NNTP call, and you may need to be connected directly to the server’s network in order to access it at all (e.g., not connected to an intermediate broadband provider). See the Python Library manual for more on other NNTP parameters and object methods.

When run, this program connects to the server and displays each article’s subject line, pausing to ask whether it should fetch and show the article’s header information lines (headers listed in the variable showhdrs only) and body text:

C:...PP3EInternetOther>python readnews.py
Connecting to news.rmi.net for comp.lang.python
comp.lang.python has 3376 articles: 30054-33447
Article 33438 [Embedding? file_input and eval_input]
=> Display?

Article 33439 [Embedding? file_input and eval_input]
=> Display?y
From: James Spears <[email protected]>
Newsgroups: comp.lang.python
Subject: Embedding? file_input and eval_input
Date: Fri, 11 Aug 2000 10:55:39 -0700
Lines: 34
=> Show body?

Article 33440 [Embedding? file_input and eval_input]
=> Display?

Article 33441 [Embedding? file_input and eval_input]
=> Display?

Article 33442 [Embedding? file_input and eval_input]
=> Display?

Article 33443 [Re: PYTHONPATH]
=> Display?y
Subject: Re: PYTHONPATH
Lines: 13
From: sp00fd <[email protected]>
Newsgroups: comp.lang.python
Date: Fri, 11 Aug 2000 11:06:23 -0700
=> Show body?y
Is this not what you were looking for?

Add to cgi script:
import sys
sys.path.insert(0, "/path/to/dir")
import yourmodule

-----------------------------------------------------------
Got questions?  Get answers over the phone at Keen.com.
Up to 100 minutes free!
http://www.keen.com

Article 33444 [Loading new code...]
=> Display?
Article 33445 [Re: PYTHONPATH]
=> Display?

Article 33446 [Re: Compile snags on AIX & IRIX]
=> Display?

Article 33447 [RE: string.replace( ) can't replace newline characters???]
=> Display?

205 GoodBye

We can also pass this script an explicit server name, newsgroup, and display count on the command line to apply it in different ways. Here is this Python script checking the last few messages in Perl and Linux newsgroups:

C:...PP3EInternetOther>python readnews.py news.rmi.net comp.lang.perl.misc 5
Connecting to news.rmi.net for comp.lang.perl.misc
comp.lang.perl.misc has 5839 articles: 75543-81512
Article 81508 [Re: Simple Argument Passing Question]
=> Display?

Article 81509 [Re: How to Access a hash value?]
=> Display?

Article 81510 [Re: London =?iso-8859-1?Q?=A330-35K?= Perl Programmers Required]
=> Display?

Article 81511 [Re: ODBC question]
=> Display?

Article 81512 [Re: ODBC question]
=> Display?

205 GoodBye


C:...PP3EInternetOther>python readnews.py news.rmi.net comp.os.linux 4
Connecting to news.rmi.net for comp.os.linux
comp.os.linux has 526 articles: 9015-9606
Article 9603 [Re: Simple question about CD-Writing for Linux]
=> Display?

Article 9604 [Re: How to start the ftp?]
=> Display?

Article 9605 [Re: large file support]
=> Display?

Article 9606 [Re: large file support]
=> Display?y
From: [email protected] (Andreas Schweitzer)
Newsgroups: comp.os.linux.questions,comp.os.linux.admin,comp.os.linux
Subject: Re: large file support
Date: 11 Aug 2000 18:32:12 GMT Lines: 19
=> Show body?n

205 GoodBye

With a little more work, we could turn this script into a full-blown news interface. For instance, new articles could be posted from within a Python script with code of this form (assuming the local file already contains proper NNTP header lines):

# to post, say this (but only if you really want to post!)
connection = NNTP(servername)
localfile = open('filename')      # file has proper headers
connection.post(localfile)        # send text to newsgroup
connection.quit( )

We might also add a Tkinter-based GUI frontend to this script to make it more usable, but we’ll leave such an extension on the suggested exercise heap (see also the PyMailGUI interface’s suggested extensions at the end of the next chapter—email and news messages have a similar structure).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.122.68