Using Web APIs as Automation Interfaces

A standard web application is necessarily automatable. By “standard web application” I mean a server-based program that collects input from, and displays results in, an HTML/JavaScript browser. It’s fashionable nowadays to sneer at this basic model of web computing and to regard it as a poor first approximation of what newer browsers can do more gracefully using Java or dynamic HTML.

Java and/or DHTML may in fact usher in an era of more GUI-like web applications with capabilities—like drag-and-drop—that are beyond the scope of the standard HTML widget set and JavaScript object model. But the last web applications to exploit these capabilities will be the ones that most people rely on most of the time for most of the useful services that the Web offers: search, online shopping, airline reservations. Very few of these kinds of applications require drag-and-drop. Arguably none of them do. What they do require is what first-generation CGI technology has been delivering effectively for about four years: zero-install electronic forms connected to middle-tier logic and backend databases.

Public web-based services stick with this model because it’s simple and universal. Unless there are compelling reasons not to, your web-based groupware applications should stick with it too, and not only for reasons of simplicity and universality. Every application written to this model is at once interactive and scriptable. The implications of that fact are profound and far-reaching, and they take time to sink in.

My moment of epiphany came when I worked for a large publishing corporation with many divisional web sites. There was a corporate mandate to present the individual sites as a single coherent megasite, but no one could agree on how to aggregate all the content or unify the many different search mechanisms. Driving home from a meeting at which an IT executive had opined that multisite search was an eight-man-month, hundred-thousand-dollar problem, I suddenly saw that it was really a four-hour, zero-dollar problem that I could solve that very evening. How? All the necessary components were already deployed and web-accessible. The individual sites were periodically scanned and indexed by the public search engines. Some of those engines support advanced search pages that can be used to restrict search to pages drawn from specific sites. On AltaVista, for example, you can issue the following query:

host:udell.roninhouse.com and groupware

This query returns only pages on udell.roninhouse.com (my own server) that mention groupware. Similarly, for that set of divisional web sites, AltaVista had already done the necessary aggregation, and it already enabled a sophisticated user to search our sites! Documenting how to do that interactively, using AltaVista’s query syntax, was one solution to the problem. Much more compelling, though, was the solution I actually delivered the next morning. It was simply a form, hosted on my site, that listed the divisional sites. I wired the form to a script that created an advanced AltaVista query, shipped it to the AltaVista site, and reorganized and displayed the results.

How did that metasearch script work? We’ve already seen the basic ingredients. A web-client library (such as Perl’s LWP) transmits URL-formatted requests to target sites and fetches web pages in response. A script analyzes the resulting web pages, using regular expressions to match patterns and rearrange the output. (We’ll see several examples of this procedure later in this chapter.) I call this technique pipelining the Web, because it most resembles the good old-fashioned Unix pipeline.
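
To make the technique concrete, here is a minimal sketch of such a metasearch script in Perl. The query URL and the result-page pattern are assumptions made for illustration, not AltaVista's actual interface; what matters is the shape of the pipeline: build a query, fetch the page, extract and reprint the results.

use LWP::Simple;                               # Perl's web-client library

# Hypothetical divisional sites and query syntax, for illustration only
my @sites = ('www.division1.com', 'www.division2.com');
my $query = '(' . join(' or ', map { "host:$_" } @sites) . ') and groupware';
$query =~ s/(\W)/sprintf('%%%02X', ord($1))/ge;    # URL-encode the query

my $page = get("http://www.altavista.com/cgi-bin/query?q=$query");

# Match result links with a regular expression (pattern assumed)
while ( $page =~ m{<a href="(http://[^"]+)"[^>]*>([^<]+)</a>}gi ) {
    print "$2\n  $1\n";                        # title, then URL
}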

Pipelining the Web

One of Unix’s greatest sources of power and flexibility is its notion of a pipeline of text-based components. To count the number of Apache daemons currently running on my Red Hat Linux system, for example, I don’t need a count-apache-daemons command, because I can just type:

ps aux | grep httpd | wc -l

In other words: list all processes, restrict the list to lines containing httpd, then count those lines. If you find yourself using this idiom often, you can package it into an alias called count-apache-daemons, which you can then use in shell scripts indistinguishably from the native commands ps, grep, and wc. With just these two simple ideas—scripts and a pipeline to connect them—Unix invented lightweight software components and rapid application development.
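
In bash, for example, that packaging is a one-liner:

alias count-apache-daemons='ps aux | grep httpd | wc -l'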

The Unix way reigned supreme during the era of text-based computing. Then the rise of the graphical user interface threatened to make it obsolete. In the GUI model, programs received input as a stream of user-interface events—mouse clicks and keystrokes—and produced output by manipulating a bitmapped window display. For most people, the benefits of this new approach far outweighed the drawbacks. The prime benefit—what people really mean when they talk about “point-and-click ease of use”—is discoverability. All of the functions of a GUI-based system can be discovered by exploring its toolbars, menu trees, and dialog boxes. The chief drawback—what Unix veterans curse when they find themselves managing NT systems—is that there often isn’t any way to automate those functions.

The notion of web applications as scriptable components is especially deep and powerful, because it addresses two completely different problems, both longstanding: how to make distributed computing easy enough for routine use and how to use graphical applications as pipelined components in the manner of ps, grep, and wc.

The web model radically simplified the programming of an important subset of the GUI widget set—namely, the widgets used in electronic forms. An easy way to create and use listboxes, radio buttons, and text-input fields was part of the story, but that in itself wasn’t new. For over a decade, from HyperCard’s debut in the mid-’80s through the many incarnations of Visual Basic, there have been scripting tools that could hide the arduous details of GUI programming—things like mouse-hit detection and event-loop management.

What the web model brought to the party was a simple, text-based, language-neutral interface to the GUI. People could easily read and write the descriptions that drove web pages and forms, and so could programs written in any script language. In an environment in which those web pages and forms were often built dynamically by CGI programs, the interface to those CGI programs was again text based and language neutral—URLs that could be read and written by people or by scripts written in any language.
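
For example, a search form with a single text field named q might boil down to nothing more than a URL like this one (the hostname and script name are invented for illustration):

http://search.example.com/cgi-bin/search?q=groupware&max=10

A person can type that request into a browser's address window; a script written in any language can assemble the same string and send it.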

Web Interfaces Versus GUI Interfaces

Web APIs are both discoverable and scriptable. They’re discoverable because, by interactive use of a web application (such as AltaVista), we can expose the URLs that control its functions. They’re scriptable because we can wield those URLs using web-client tools. It’s easy to take these properties of web APIs for granted. To put things in perspective, think about what you can’t do with conventional GUI applications that lack these properties.

Consider a typical GUI application, Windows NT’s User Manager. It enables you to point and click your way through a series of dialog boxes but affords no means of automating those actions. Administrators of large domains had to do a whole lot of pointing and clicking until ADSI (see Chapter 11) made User Manager’s underlying functions available to scripts.

Notice, though, that the kind of automation made possible by ADSI is not achieved by way of User Manager, which remains trapped in its point-and-click world. Instead ADSI provides alternate paths to the underlying functions. Could User Manager itself be scriptable? Yes. A number of Windows applications, notably Word and Excel, expose many of their interactive functions to scripts by way of OLE automation interfaces. But Windows applications don’t automatically expose automation interfaces to scripts. It takes a lot of extra work to do that.
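
In Perl, for instance, driving Word through its OLE automation interface takes just a few lines. Here's a minimal sketch using the standard Win32::OLE module and Word's automation methods:

use Win32::OLE;

# Launch (or connect to) Word via its OLE automation interface
my $word = Win32::OLE->new('Word.Application')
    or die 'Cannot start Word: ', Win32::OLE->LastError;
$word->{Visible} = 1;                    # show the interactive GUI too
my $doc = $word->Documents->Add;         # File->New, driven from a script
$word->Selection->TypeText('Hello from a script');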

Thought Experiment: A Web-Style Win32 Application

Imagine a web-style version of User Manager. Why not? There’s nothing particularly graphical about this tool. It just displays lists of names, adds and deletes names, and pushes names from one list to another. These functions are well within the capability of server-generated HTML.

As an HTML application, User Manager would present forms to list users, edit their accounts, and adjust their group memberships. The HTTP GET or POST requests issuing from those forms would define User Manager’s API. Even if that API weren’t explicitly documented, you could discover it just by browsing and by using your browser’s View Source function to inspect each form’s widget names and CGI wiring.
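
The HTML behind one of those forms might look like this. It's a sketch, with invented script and widget names, but everything a script would need to know (the script's name and the names of its arguments) is right there in the source:

<FORM METHOD="POST" ACTION="/cgi-bin/usermgr">
<INPUT TYPE="hidden" NAME="action" VALUE="addusertogroup">
User: <INPUT TYPE="text" NAME="user">
Group: <SELECT NAME="group">
<OPTION>Administrators
<OPTION>Power Users
<OPTION>Guests
</SELECT>
<INPUT TYPE="submit" VALUE="Add">
</FORM>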

This ability to inspect, clone, and modify web forms helps make the Web a nearly frictionless environment for software development. If a form uses the HTTP GET method, the script name and all its arguments will be left sitting in what we might call the browser’s command line—that is, the Location window in Navigator or the Address window in Internet Explorer. If the form uses the HTTP POST method, then only the script name appears on the browser’s command line. How do you discover the script’s arguments? The form’s HTML source contains all their names, so you can look there. Alternatively you can save a copy of the form, change POST to GET, and then submit it. Now the arguments will appear on the browser’s command line.
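
Continuing the hypothetical User Manager form above: change its METHOD from POST to GET, submit it, and the whole request lands on the browser's command line:

http://ntserver/cgi-bin/usermgr?action=addusertogroup&user=jdoe&group=Administrators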

This version of User Manager wouldn’t have to do anything special to expose a discoverable and scriptable API. It would do so just because it was a web application. And it would also exhibit the following useful properties.

Local and remote capability

Many of the NT administrative tools, including User Manager, can use remote procedure call (RPC) technology to connect to remote machines. That means you can use an instance of User Manager running on one machine to manage the user database on another machine. Like an OLE automation interface, this kind of RPC interface is an optional, not intrinsic, feature of a Win32 application. A web-enabled User Manager, built on an HTTP service, would inherently support local or remote clients. We’ll see examples of this technique in the next chapter.
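
In Perl terms, pointing at a local or a remote instance is just a matter of which hostname appears in the URL. A sketch, again with a hypothetical usermgr script:

use LWP::Simple;

# Identical requests; only the hostname differs
my $local  = get('http://localhost/cgi-bin/usermgr?action=listusers');
my $remote = get('http://ntserver2/cgi-bin/usermgr?action=listusers');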

Bookmarks

Many of NT’s admin tools are primarily navigators and editors of specialized information spaces. A web-enabled User Manager would be able to remember and replay paths through its space—the directory—using bookmarks. The tool most in need of this capability isn’t User Manager, though. It’s RegEdit, the registry editor. Spend a day within earshot of an NT administrator and you’ll hear mantras like this chanted repeatedly: “HKEY_LOCAL_MACHINE, System, CurrentControlSet, Services, W3SVC, Parameters...” Once the target key is found and altered (and the machine has been rebooted), the problem may still not be solved. So the administrator must repeat the same mantra and travel the same path to the same registry key for another try. This is nuts! In web mode, you’d bookmark that page after the second or third visit to it.
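
In a web-style RegEdit, the whole mantra would collapse into a bookmarkable URL, hypothetically something like this (with the backslashes URL-encoded in practice):

http://localhost/regedit?key=HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\W3SVC\Parameters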

Pipelining

In a web-style environment, every one of User Manager’s dialog boxes would be a page with an address. You could visit those pages interactively, and scripts could visit them programmatically using the very same mechanism. The User Manager page that lists a user’s group affiliations, for example, might do so by writing an HTML <SELECT> statement. Knowing this, a script could treat User Manager as a component in a pipeline. It could invoke the page with a web-client call, parse the list of groups in the <SELECT> statement, then pass the list along to the next component in the pipeline—perhaps a report writer. As we’ll see shortly, XML-formatted responses can make this procedure simpler and more reliable.
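
Here's a minimal Perl sketch of that pipeline step. The URL and markup are assumptions carried over from the earlier examples; the technique is what matters: fetch the page with a web-client call, parse out the <OPTION> elements, and emit a plain list for the next component.

use LWP::Simple;

# Fetch the (hypothetical) page listing a user's group affiliations
my $page = get('http://ntserver/cgi-bin/usermgr?action=listgroups&user=jdoe');

# Isolate the <SELECT> statement, then pull out its <OPTION> values
my ($select) = $page =~ m{<SELECT NAME="group">(.*?)</SELECT>}is;
my @groups   = $select =~ m{<OPTION[^>]*>([^<\n]+)}gi;

print "$_\n" for @groups;        # hand the list to the next stage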

During the run-up to Windows 98, Microsoft fueled speculation that the whole Windows environment would switch over to this web mode of operation. All views of the system would be dynamic web pages rendered by a local HTML engine, and web-style scripting would be the way to automate Windows. That mostly didn’t happen, apart from the common view of web and local filespaces provided by the (now deemphasized) Active Desktop. But although Windows hasn’t gone that way—at least not yet—it’s still a great idea.
