A Subscriber-Based Notification System

For now, let’s put LDAP and Active Directory on the back burner and build a simple notification system based on group membership. Notification enables groupware systems to find a middle ground between push and pull technologies. To use an HTTP or NNTP docbase, a user has to actively pull information from it. People are busy, though, and the cornucopia of information sources to which the browser gives access is overwhelming. Many useful docbases languish because, as their creators like to complain, “Nobody checks them.” To combat this problem, an old method, email, was given a new name: push technology. Docbases can use email updates to reach out and touch their users. To do this most effectively, you should strike a balance between push and pull. For example, to showcase the improved HTML rendering in the Navigator 3.0 mailreader, Netscape launched a campaign called InBox Direct, which delivered web pages from participating sites into subscribers’ email inboxes. This was a good way to make a point about HTML email, but wholesale replication of web pages into email inboxes can create a lot of clutter.

The Hybrid Push/Pull Technique

There is a hybrid solution that strikes a balance between the push and pull extremes. The PointCast system, the first of a new breed of push technologies that supplanted email with special protocols and receivers, had the right idea. A PointCast feed pushes headlines and summaries at you but links these items to more detailed documents that you can, at your discretion, pull from the Web. This is just good old-fashioned information layering and packaging, something newspapers and magazines have done for hundreds of years. You can’t put everything on the front page or the cover, but you can and should expose a little bit of every item there and link each headline or summary to its corresponding document. Docbase managers should do this too. It doesn’t require a PointCast or Active Desktop channel. Email can work well for this purpose, so long as you properly layer and package what you transmit.

Let’s apply this idea in a notification system for the ProductAnalysis docbase. Subscribers to the docbase will receive email updates that summarize and link to newly added reports, as shown in Figure 11.2.

Figure 11-2. An email update from the ProductAnalysis notifier

Two strategies limit the demands this notifier places on its subscribers. First, it sends just one message per update cycle. The message lists all the new reports that appeared in that cycle. Second, it associates each subscriber with a list of companies and includes only reports pertaining to companies on that list.

Directory Options for the Docbase Notifier

What role should a directory play in this application? We could use it to define the group of subscribers, to store their email addresses, and perhaps even to store their lists of preferred companies. On an intranet it would make sense to use an NOS directory for this purpose if it already enumerates all potential subscribers. Ideally you’d reuse that list of names and just assign a subset of them to the subscribers group. This approach works especially well when you can tie the web component of the application to the NOS directory. In an NT environment running Microsoft’s IIS web server, for example, HTTP authentication relies on the same user and group permissions that govern the regular filesystem. Since the emailed updates link back to web-based reports, it’s very convenient to be able to send URLs to people who, by virtue of their membership in an NOS-defined group, are precisely the ones whose NOS credentials give them access to those files.

On a large intranet, though, and certainly on the Internet, the NOS directory probably isn’t the way to go. Some intranets are bound together with global directories such as StreetTalk or NDS, but most aren’t. There’s no global directory on the Internet, so HTTP and NNTP have evolved their own ways of defining users and groups. Unix-based web servers, for example, have traditionally supported .htpasswd and .htgroup files that parallel the native NOS /etc/passwd and /etc/group. Why? It’s not a good idea to define users of public services in the same way that you define users of the internal network. Segregating these two populations from each other is a basic tenet of computer security.

Internal and External Populations

Unfortunately neither the Microsoft approach, which joins the two populations, nor the Unix way, which divides them, is really practical in its pure form. TCP/IP knows no boundaries. It wants to connect everything to everything, and we want that too. It’s not enough to deploy public services on the Internet and private services on the intranet. We want to use the Internet as an extension of our intranets, so that docbases and applications that we use in our offices are equally available to us when we’re working at home or on the road. And we want to let transient collaborators use these docbases and applications too. So Internet services wind up being neither purely public nor purely private but an awkward mixture of both.

Netscape’s SuiteSpot servers offer one interesting solution to this problem. If you define users and groups in Netscape’s LDAP-based Directory Server, the web, mail, news, and calendar servers can all share a common directory. That directory can stand apart from the native NOS directory, but it doesn’t have to. In Windows NT environments you can arrange to synchronize the LDAP and NOS directories. If Netscape’s Enterprise Server were the web engine, you’d get the same effect as with IIS: URLs sent to members of the group of subscribers could be governed by permissions defined in terms of that same group. What if, instead of Enterprise Server, you wanted to use a more popular server such as Apache or IIS? You’d need to empower your web-based applications to do the LDAP lookups themselves. That’s doable, as we’ll see shortly.

If your application is going to have to look up user and group information, does it matter whether it finds that information in a directory or in some other kind of database? The application doesn’t really care. It’s a question of what makes administrative sense. If your NOS directory already lists the email addresses of the users to whom you want to send updates, then use it—directly if you can or else indirectly by means of replication. If most or all of the users are outside your company, then you’ll probably wind up listing them in a conventional database. What about the lists of companies? With X.500-derived directories such as NDS, StreetTalk, and Netscape’s Directory Server, you can attach new attributes to directory entries; with conventional NOS directories, you can’t. But just because you can store these kinds of user preferences in a directory doesn’t mean that you should. In principle it makes sense to centralize user-related data. In practice, depending on the nature and quantity of that data, it may not.

A Simple Perl-based Group Directory

Enough theory. Let’s build a simple directory, get the notifier up and running, and then explore some alternate directory modules. Build a directory? That may sound strange, but it turns out that you can provide the basic ingredients—the ability to associate a user with a group, get and set user entries with simple or complex attributes, and test for a user’s membership in a group—in just a few lines of Perl, as shown in Example 11.1.

Example 11-1. Group::SimpleGroup, a Minimal User Directory Module

package Group::SimpleGroup;

use strict;
no strict 'refs';
use Data::Dumper;

my $root = "/subscribers";

sub new
  {
  my ($pkg,$group_db,$group_name) = @_;
  do "$root/$group_db";
  my $self = 
    {
    'group_object'      =>  ${$group_name},
    'group_db'          =>  $group_db,
    'group_name'        =>  $group_name,
    };
  bless $self,$pkg;
  return $self;
  }

sub members
  {
  my ($self) = @_;
  return keys %{$self->{group_object}};
  }

sub isMember
  {
  my ($self,$member) = @_;
  return ( defined $self->{group_object}->{$member} ) ? 1 : 0 ;
  }

sub setMember
  {
  my ($self,$member) = @_;
  $self->{group_object}->{$member} = {};
  }

sub getMember
  {
  my ($self,$member) = @_;
  return $self->{group_object}->{$member};
  }

sub setProperty
  {
  my ($self,$member,$prop_name,$prop_val) = @_;
  $self->{group_object}->{$member}->{$prop_name} = $prop_val;
  }

sub getProperty
  {
  my ($self,$member,$prop_name) = @_;
  return $self->{group_object}->{$member}->{$prop_name};
  }

sub dumpGroup
  {
  my ($self) = @_;
  print Dumper($self->{group_object});
  }

sub saveGroup
  {
  my ($self) = @_;
  my $dump = Data::Dumper->new([$self->{group_object}],
                               [$self->{group_name}]);
  my $db = "$root/$self->{group_db}";
  open (F,  ">$db") or die "cannot create $db $!";
  print F $dump->Dump;
  close F;
  }

1;

Group::SimpleGroup manages a file of Perl data structures in ASCII form; the file might look like this:

$subscribers = 
  {
  'joe' => 
    {
    'email' => '[email protected]',
    'phone' => 
      {
      'number' => '555-1212',
      'ext'    => '374',
      },
    },
  'sharon' => 
    {
    'email' => '[email protected]',
    'phone' => 
      {
      'number' => '555-1234',
      'ext'    => '393',
      }
    },
  };

You can create the file using a text editor or by means of scripts that make Group::SimpleGroup calls. As we’ve seen before, Perl enables you to compose nested data structures directly. Example 11.2 shows how to create a new entry for user Ringo, add an email property to the entry, fetch Ringo’s email address, list the contents of the directory, and save the directory.

Example 11-2. Using the Group::SimpleGroup Module

use Group::SimpleGroup;
                              
my $d =                                           # load the subscribers directory
  Group::SimpleGroup->new("subscriber_db", "subscribers"); 

                              
$d->setMember("ringo");                           # create user ringo 
                             
$d->setProperty("ringo",                          # set ringo's email address
  "email","[email protected]"); 
                             
my $email = $d->getProperty("ringo","email");     # get ringo's email address
                             
$d->dumpGroup();                                  # print out the subscriber
                                                  # directory
$d->saveGroup();                                  # rewrite the subscriber_db file

The text file written by saveGroup( ) is simultaneously code and data. When you create an instance of Group::SimpleGroup, the module’s constructor loads the file whose name you pass to it and interprets it as though it were a script (which it is) using Perl’s do statement. The result is a live in-memory representation of the data structure described in the file. That structure is a hashtable that lists members of a group and defines their properties. Its name, in this example, is $subscribers. The constructor receives the text of that name (“subscribers”), forms a symbolic reference ( ${"subscribers"} ), and stores that reference in the instance variable $self->{group_object}. If you alter the structure with setMember( ) and setProperty( ) calls and then call saveGroup( ), Perl’s Data::Dumper module externalizes it again as text. The first argument to Data::Dumper->new( ) supplies the structure to be written, and the second argument names the variable ($subscribers) that appears on the left-hand side of the assignment statement written by saveGroup( ). The filename plays no part in that call; saveGroup( ) opens the file itself and prints the dumped text to it.
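The round trip between Data::Dumper and do can be tried in isolation. The following is a minimal sketch, not part of Group::SimpleGroup: the sample entry, the address, and the use of a scratch file from File::Temp are assumptions made for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical sample data, not taken from the subscriber_db file.
my $group = { ringo => { email => 'ringo@example.com' } };

# First argument: the values to dump. Second argument: the names to give them.
my $dumper = Data::Dumper->new([$group], ['subscribers']);
my $text   = $dumper->Dump;            # text begins with "$subscribers = {"

# Write the text to a scratch file, much as saveGroup( ) writes its db file.
my ($fh, $path) = tempfile();
print $fh $text;
close $fh;

# Reload it. do FILE returns the value of the last statement it evaluates,
# which here is the assignment to $subscribers.
my $reloaded = do $path;
unlink $path;

print $reloaded->{ringo}{email}, "\n";
```

The sketch reads the result of do directly rather than forming a symbolic reference, which lets it run under strict; the module's constructor takes the symbolic route because it needs to find the variable by the name passed to it.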

A Data-Prototyping Strategy

This technique is to the realm of data management what scripting is to the world of programming: a way to get prototypes up and running in minutes or hours rather than days or weeks. Even if you plan eventually to use a directory, or an SQL or object database, it can be very useful to deploy a first version of your application on this kind of data-store prototype. Data structures need to evolve as rapidly and fluidly as code does, but industrial-strength data-management tools don’t allow the kind of freehand data modeling that can really speed up initial development.

Sometimes the prototype is all that you need. I once ran a notification system for 50 users based on the methods shown here. It wasn’t a scalable solution, but it didn’t need to be. Managing relatively static membership and subscription data in a text file was an appropriate solution for that application and that group of users. Similar opportunities abound for groupware developers. Try the simplest thing that could possibly work, get something up and running, and let usage determine your next steps. If you’ve built the wrong application, it’s doomed anyway. There’s no reason to waste effort preparing for a scale-up that won’t ever happen; cut your losses and try another tack. If you’ve built the right application, you can always swap out a simple-minded data store for a more robust one, if and when the need arises.

Anatomy of the Docbase Update Notifier

The notifier shown in Example 11.3 draws on three data sources:

A directory

The Group::SimpleGroup module defines the group of subscribers and lists their email addresses.

A docbase

The notifier tracks the records in the ProductAnalysis docbase and issues updates when new records appear.

An SQL database

One of the tables in this database, cmp_docs, associates docbase reports with the companies about which they’re written. Another table, cmp_users, maps subscribers to companies. We’ll assume these and related tables exist, are managed by another application, and are also used elsewhere. For example, the assignment mechanism described in Chapter 6 would use a company table in this database to populate the corresponding picklist in the report-assignment form. A database-backed web application would enable subscribers to define and update their own lists of subscribed companies.

Example 11.3 shows the docbase update notifier.

Example 11-3. The ProductAnalysis Notifier

use strict;

use Docbase::Docbase;
use Group::SimpleGroup;
use Net::SMTP;
use DBI;

# docbase setup
my $db = Docbase::Docbase->new('ProductAnalysis');

# directory setup
my $sg = Group::SimpleGroup->new('subscriber_db','subscribers');

# mail setup 
my $smtp = Net::SMTP->new('smtp.udell.roninhouse.com');

# db setup
my $dbh = DBI->connect('DBI:ODBC:SUBS','','') 
  or die ("connect, $DBI::errstr");
$dbh->{RaiseError} = 1;

# other setup
my $docdir    = "$db->{docbase_web_absolute}/$db->{app}/docs";
my $highwater = getHighwater();
my $hostname  = 'udell.roninhouse.com';


my (@docs) = reverse sort <$docdir/*.*>;        # load doc list
setHighwater($docs[0]);                         # remember highwater mark
my $metadata = {};                              # initialize metadata cache
loadDocsByCompany();                            # load metadata cache and SQL table
sendMessages();                                 # issue update messages

# fini
$dbh->disconnect;
$smtp->quit;

sub loadDocsByCompany
  {
  my $st = "delete from cmp_docs";              # empty out the table
  dbSqlExecute($dbh,$st);           
  foreach my $doc (@docs)                       # enumerate docbase
    {
    $doc =~ m#(\d{6,6})#;                      # isolate sequence number
    if ($1 gt $highwater)                       # compare to highwater mark
      {
      my $metarecord = $db->getMetadata($doc);  # get metadata for doc
      $metadata->{$doc} = $metarecord;          # save it
      my $company = $metarecord->{company};     # extract company name
      my $st = "insert into cmp_docs (cmp,doc) values ('$company','$doc')";
      dbSqlExecute($dbh,$st);                   # load a row of cmp_docs
      }
    }
  }

sub sendMessages
  {
  my $st = "select distinct cmp_users.user,cmp_docs.doc 
       from cmp_docs,cmp_users where cmp_docs.cmp = cmp_users.cmp";
  my $allrows = dbSqlReturnAllRows($dbh,$st);
  my $messages = {};
  foreach my $row (@$allrows)
    {
    my ($user,$doc) = @$row;
    $user = allTrim($user);
    $doc = allTrim($doc);
    my $url = $doc;
    my $email = $sg->getProperty($user,'email');
    push (@{$messages->{$email}}, "<a href=\"http://$hostname/$url\">
      $metadata->{$doc}->{company}: $metadata->{$doc}->{product} 
      (by $metadata->{$doc}{analyst})</a>");
    }

  foreach my $email (keys %$messages)
    {
    $smtp->mail($email);
    $smtp->to($email);
    $smtp->data();
    $smtp->datasend("To: $email\n");
    $smtp->datasend("From: notifier\n");
    $smtp->datasend("Subject: new ProductAnalysis reports\n");
    $smtp->datasend("Content-type: text/html\n");
    $smtp->datasend("\n");
    $smtp->datasend( join ("<p>",@{$messages->{$email}}));
    $smtp->dataend();
    }
  }

sub allTrim
  {
  my ($s) = @_;
  $s =~ s/^\s+//;
  $s =~ s/\s+$//;
  return $s;
  }

sub getHighwater
  {
  open (F,"highwater") or die "cannot open highwater $!";
  my $hw = <F>;
  $hw =~ m#(\d{6,6})#;                         # isolate sequence number
  close F;
  return $1;                                    # return sequence number
  }

sub setHighwater
  {
  my ($highwater) = @_;
  open (F,">highwater") or die "cannot create highwater $!";
  print F $highwater;
  close F;
  }

sub dbSqlExecute
  {
  my ($dbh,$st) = @_;
  my $sth = $dbh->prepare($st) or die ("prepare, $DBI::errstr");
  my $rv = $sth->execute or die ("execute, $DBI::errstr");
  $sth->finish;
  }

sub dbSqlReturnAllRows
  {
  my ($dbh,$st) = @_;
  my $sth = $dbh->prepare($st) or die ("prepare, $DBI::errstr");
  my $rv = $sth->execute or die ("execute, $DBI::errstr");
  my $allrows = $sth->fetchall_arrayref;
  $sth->finish;
  return $allrows;
  }

The modules used by the notifier are as follows:

Docbase::Docbase

A docbase is a collection of meta-tagged HTML files. The notifier uses Docbase::Docbase::getMetadata( ) to extract the structured header from each docbase record.

Group::SimpleGroup

The notifier uses this module to look up email addresses in a simple text-file-based directory. Later in this chapter, we’ll build two alternate directory modules, one for Windows NT domains and one for LDAP directories.

DBI (a CPAN module)

Perl’s universal database interface works on the Windows Open Database Connectivity (ODBC) model. A driver manager, DBI, accepts plug-in modules for individual databases: DBD::Oracle, DBD::Solid, and so on. One of these plug-ins, DBD::ODBC, adapts DBI for use with any data source that accepts ODBC connections. As a result, the notifier, or any Perl DBI application, will work the same way on a Unix system running the Solid database as on a Win32 system running Oracle. You can even use ODBC and the Microsoft Jet engine to control a Microsoft Access database. This technique, which bypasses Access and manages a .MDB file directly, is very convenient for standalone development in Win32 environments. It’s also surprisingly effective for low-intensity production use; see Chapter 15 for details.

Net::SMTP (a CPAN module)

Another of the modules in Perl’s LibNet family, Net::SMTP is a simple and effective way to send Internet mail messages programmatically. Why not just use a command-line mailer? You can if you like, but the availability and behavior of such tools vary from system to system. On Windows, there is no standard Internet mailer. The advantage of Net::SMTP is that it assumes only a TCP/IP connection, and it works the same way everywhere.

The notifier begins by loading and configuring each of its supporting modules. In this example, the DBI constructor names an ODBC data source that happens to refer to a .MDB file. In an Oracle environment, the connection call might look like DBI->connect("DBI:Oracle:","user","password"). The Net::SMTP constructor names the mail server that will relay messages to subscribers. The Group::SimpleGroup constructor names the file that lists subscribers and their email addresses. The Docbase::Docbase constructor names the docbase whose records the notifier is tracking.

The notifier stores a high-water mark—that is, the name of the most recent docbase record from the last update cycle—in a file called highwater. getHighwater( ) fetches that name into a variable. The array @docs receives a reverse-sorted list of docbase records. setHighwater( ) stores the first element of that list—the most recent record for the current update cycle—as the new high-water mark. The variable $metadata, initially an empty hashtable, will store the metadata extracted from each record in the current cycle’s update set.
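The high-water comparison can be sketched on its own, apart from the docbase and the highwater file. This is an illustration under stated assumptions: the helper name newer_than and the record names such as 000103.html are hypothetical, chosen only because the notifier expects a six-digit sequence number in each name.

```perl
use strict;
use warnings;

# Return the docs whose six-digit sequence number is above the old mark.
# Assumes names like "000123.html"; that naming is an illustration only.
sub newer_than {
    my ($highwater, @docs) = @_;
    my @fresh;
    foreach my $doc (reverse sort @docs) {      # newest first
        my ($seq) = $doc =~ m#(\d{6})#;         # isolate sequence number
        next unless defined $seq;               # skip names without one
        last if $seq le $highwater;             # reached the old mark
        push @fresh, $doc;
    }
    return @fresh;
}

my @docs  = ('000101.html', '000103.html', '000102.html', '000100.html');
my @fresh = newer_than('000101', @docs);
print join(', ', @fresh), "\n";
```

Because the sequence numbers are fixed-width strings, a string comparison orders them the same way a numeric comparison would, which is why the notifier can use gt on them.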

Implementing Attribute-Based Docbase Subscription

loadDocsByCompany( ) populates the two data structures used by the sendMessages( ) function. One of these is the metadata cache, $metadata. The other is the SQL table cmp_docs, which maps between company names and docbase records. loadDocsByCompany( ) begins by emptying the cmp_docs table. Then, for each record, it updates $metadata with the results of a call to getMetadata( ). At the same time it inserts a company-to-docname mapping into the cmp_docs table. Since the records appear in reverse order, loadDocsByCompany( ) can halt when it sees the record that matches the old high-water mark. Figure 11.3 illustrates the data structures in play at this point and the desired set of notification lists.

Figure 11-3. Notifier’s data structures and resulting lists

sendMessages( ) begins by combining two SQL tables—cmp_docs, which was just built, and cmp_users, which is separately managed—to create a result set that expresses the docbase records to be included in each subscriber’s update message.

An Alternate, Non-SQL Approach to Attribute-Based Subscription

The SELECT DISTINCT statement is one way to map subscribers to docbase records, but suppose that the subscription lists were stored in a directory rather than an SQL database. That might make sense if the subscription were considered to be part of an employee’s organizational role and if the definitions of those roles were maintained in an LDAP directory.

In that case you could still transfer the data from LDAP to SQL in order to be able to execute the same SELECT DISTINCT statement. But once you’ve pulled the lists out of the directory, you wouldn’t have to feed them to SQL. You could just combine the structures directly in Perl. Here’s an alternate version of loadDocsByCompany( ) that puts the docname-to-company mappings in a hashtable rather than an SQL table:

sub loadDocsByCompany
  {
  my ($highwater,@docs) = @_;    # scalar first: a leading array would absorb every argument
  my $docs_by_company = {};
  foreach my $doc (@docs)
    {
    last if ( $doc eq $highwater);
    my $metarecord = $db->getMetadata($doc);
    my $company = $metarecord->{company};
    push ( @{$docs_by_company->{$company}}, $doc);
    }
  return $docs_by_company;
  }

This function produces a hashtable whose keys are company names and whose values are lists of docbase records. Given this mapping, here’s a function to combine it with the per-user lists of subscribed companies:

sub getDocsByPerson
  {
  my ($docs_by_company) = @_;
  my $docs_by_person = {};
  my ($companies,$company);
  foreach my $person ( $sg->members )             # e.g., 'Joe'
    {
    $companies =                                  # e.g., [Netscape,Adobe]
        $sg->getProperty($person,'companies'); 
    foreach $company (@$companies)                # e.g., 'Adobe'
      { push ( @{$docs_by_person->{$person}}, @{$docs_by_company->{$company}}); }
    }
  return $docs_by_person;
  }

For each subscriber in the group, $companies gets that person’s list of preferred companies. Since we’re using the Perl-based Group::SimpleGroup here, the value returned from the getProperty( ) method is a reference to a native Perl list. If another kind of directory stored the list in comma-separated format, Perl’s split function could turn that into a list like this:

my @list = split(/\s*,\s*/, "Netscape, Adobe");   # the pattern also drops the space after each comma

The structure produced this way is actually a bit more refined than what SELECT DISTINCT yields. If you refer back to Example 11.3, you’ll see that the sendMessages( ) routine still has to massage the SQL result set, using a hashtable to condense multiple rows of per-user data into single per-user lists.
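That condensing step can be shown without a database. The sketch below assumes rows shaped like the ones fetchall_arrayref returns, one [user, doc] pair per row; the helper name group_rows_by_user and the sample users and record names are hypothetical.

```perl
use strict;
use warnings;

# Rows shaped like those returned by fetchall_arrayref: [user, doc] pairs.
# The user names and file names are hypothetical.
my $allrows = [
    [ 'joe',    '000103.html' ],
    [ 'sharon', '000103.html' ],
    [ 'joe',    '000102.html' ],
];

# Condense one-row-per-pair data into one list of docs per user.
sub group_rows_by_user {
    my ($rows) = @_;
    my %docs_by_user;
    foreach my $row (@$rows) {
        my ($user, $doc) = @$row;
        push @{ $docs_by_user{$user} }, $doc;   # autovivifies the list
    }
    return \%docs_by_user;
}

my $docs_by_user = group_rows_by_user($allrows);
foreach my $user (sort keys %$docs_by_user) {
    print "$user: @{ $docs_by_user->{$user} }\n";
}
```

The result has the same shape as the hashtable built by getDocsByPerson( ), which is the sense in which the non-SQL approach arrives at the per-user lists one step sooner.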
