Chapter 6. Spam Whack-a-Mole

Spam is the swamp in which almost every Internet crime festers. If we want to stop the growth of Internet crime, we must drain the swamp.

The spam crisis illustrates the consequence of letting trivial problems grow unchecked. In the space of about 18 months, the problem of spam grew from a minor nuisance to the biggest and most serious security problem on the Internet.

Spam is not merely a nuisance; it is a conduit for crime. A study by the Federal Trade Commission in 2003 reported that 66 percent of spam messages sent contained a statement that was false.[1] This number represents the proportion of messages that can be proved beyond a doubt to be illegal. Other studies suggest that a much larger proportion of spam sent is outright criminal—at least 80 percent and possibly as much as 95 percent.

It is not surprising that so much spam is criminal. Few legitimate businesses would want to use a marketing scheme so disreputable that it is guaranteed to create at least 20 enemies for every sale. Spam is subject to a peculiar variation of Gresham’s law that bad money drives out the good. The bad spam—the spam that is merely an annoyance—has been driven out by the worst. It is no exaggeration to say that spam has become the primary form of Internet crime.

The Green Card Spam

Spam began with a dishonest message offering to fill out applications for the U.S. green card immigration lottery, posted to every newsgroup in the Usenet news system on April 12, 1994, by the husband-and-wife law firm Canter and Siegal. Within a few years, spam had rendered large parts of Usenet unusable, as discussions were drowned in a sea of indiscriminate advertisements.

Few people realized the commercial potential of the Internet as early as Laurence Canter and Martha Siegal, tried as hard to make a fortune from it, or failed as utterly. There had been commercial messages on the Internet before the green card spam, but previous experiments had been short lived and had usually ended in an abject apology. Martha Siegal was unrepentant, considering herself a farsighted e-commerce visionary, the first to realize the true importance of the Internet. She dismissed her critics as unimportant nobodies too stupid to realize that the Internet would shortly be the property of people like herself.

Canter and Siegal did not “Make a Fortune on the Information Superhighway” as the title of Martha’s book proclaimed. Instead, both the book and the business flopped, and Canter quickly lost his license to practice law as his past was uncovered by Internet sleuths. The sleuthing might have been superfluous, because Canter was already facing disciplinary charges involving misuse of client funds and failure to pay employees.

The green card spam was typical of what was to come, offering a service that was both useless and dishonest. Canter and Siegal were offering to fill in forms that required no legal expertise for $75 a time. Having a lawyer fill out the form would not affect the chance of winning.

Canter and Siegal soon had many imitators, many of whom shared Martha Siegal’s fondness for self-promotion. Sanford Wallace moved into spam after Congress made his junk fax business illegal. Styling himself “Spamford,” Wallace quickly became the first spam king. Canter and Siegal’s business had quickly crumbled after being disconnected by their Internet service provider (ISP). To their surprise, ISPs turned out to be unintimidated by threats of legal action: Canter and Siegal were not the first lawyers to have discovered the Internet, after all. Wallace realized that keeping access to the Internet open was going to be the key to staying in business for any length of time.

Wallace was probably the first spammer to sign what soon became known as a pink contract with an ISP. Unlike a normal ISP contract that gives the ISP the right to terminate the account for abuse, a pink contract acknowledged that Wallace intended to send large volumes of unsolicited e-mails containing advertisements and that the ISP waived its right to terminate the contract for doing so. The name pink contract was coined after the improbably violent shade of pink of Hormel’s canned “ham product” SPAM.

Blacklists: Shutting Spammers Down

As the tide of spam rose, antispam activists tried to shut the spammers out of the Internet. This is a bit like playing the fairground game whack-a-mole. Each time the plastic mole sticks his head up out of a hole, you try to bash it on the head with a hammer. Hitting a mole provides a temporary feeling of satisfaction, but only until the next mole sticks his head up.

At first, complaints were directed at the ISP hosting the spammer, a strategy that experienced mixed results. Some ISPs dumped the spammers as soon as they could. Others saw an opportunity to serve a new market willing to pay premium rates for pink contracts. These ISPs were immune to complaints because all their customers were spammers. Some spammers took their operations offshore. Complaints slowed them down, but the spam continued.

As the spam tide rose further, blacklists of spammer IP addresses started to circulate among system administrators. As the number of spammers increased, maintaining these blacklists became a major undertaking.

Blacklists hit the mainstream when Paul Vixie established the MAPS (SPAM spelled backward) blacklist. Unlike previous blacklist schemes that had circulated as text files, Vixie used the DNS system to publish his blacklist. This meant that everyone had immediate access to the latest copy of the MAPS list. Using MAPS, any mail server could check whether a computer trying to connect to send mail was using an IP address listed as belonging to a spammer.
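The DNS trick Vixie introduced is simple, and it is still how DNS blacklists work today: the mail server reverses the octets of the connecting IP address, prepends them to the blacklist's zone name, and performs an ordinary DNS lookup; if the name resolves, the address is listed. The sketch below illustrates the idea; the zone name is a placeholder, not a real blacklist.

```python
import socket

def dnsbl_name(ip, zone):
    """Build the DNSBL query name: the IP address with its octets
    reversed, followed by the blacklist zone."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip, zone="bl.example.org"):
    """Return True if the address appears in the blacklist zone.
    By convention, a listed address resolves to an answer in
    127.0.0.0/8; an NXDOMAIN error means it is not listed."""
    try:
        socket.gethostbyname(dnsbl_name(ip, zone))
        return True
    except socket.gaierror:
        return False

# A mail server would call is_listed() on the client's address
# before accepting mail, e.g. is_listed("192.0.2.1")
```

Because the list is published through the DNS, every subscriber sees updates as soon as the cached records expire, which is what gave MAPS its advantage over blacklists circulated as text files.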

MAPS soon established a significant following. As the popularity of MAPS continued to grow, Vixie started to promote a policy he called collateral damage. If an ISP refused to terminate access to a spammer, MAPS would list all the IP addresses assigned to an ISP. This meant that mail from the other customers of the ISP would be blocked. The idea was that these customers would put pressure on their ISP to terminate the contract of the spammer so that their own mail would get through.

Collateral damage was a step too far. As Vixie himself admitted, he was attempting to force ISPs to terminate service to customers he disapproved of. Vixie had appointed himself as the unilateral arbiter of acceptable Internet e-mail use. Those listed on MAPS frequently disagreed with Vixie. Lawsuits followed.

Like many companies formed during the dot-com boom, MAPS had been established on a business model that was new and a legal theory that was untested. In statements to the media, Vixie appeared to believe that the First Amendment to the U.S. Constitution provided MAPS with a cast-iron defense against any legal challenge that might be brought against it. When the lawsuits began to fly, MAPS discovered that a constitutional right intended to protect freedom of speech is not the best defense for a company making its living from what is, viewed objectively, censorship.

Censorship is, of course, an ugly word. The blacklist’s managers did not see themselves as censors. Like many people who consider what they are doing to be righteous, they thought that the rest of the world would understand the purity of their motives and not question their deeds.

This might be why MAPS appeared to be completely unprepared for the flurry of lawsuits that resulted when others disputed their claim to be the ultimate arbiter of acceptable e-mail practice.

Real problems did occur. Some people would subscribe to the mailing lists of political groups they were opposed to in order to report them as spam and get them blacklisted.

Even blacklists that attempt to apply purely objective methods can be gamed. Some blacklists identify spam sources by monitoring honeypot e-mail addresses that have never been used by a real user and should never receive legitimate e-mail. This is a good test unless the honeypot address becomes known to an attacker, at which point it becomes a way to add any e-mail sender the attacker wants to block to the blacklist. Some attacks of this type have a commercial motive. In one case, a competitor caused a rival jewelry store to be blacklisted by subscribing a known blacklist honeypot address to the rival’s newsletter.

MAPS was hopelessly undercapitalized for a business that would inevitably be lawsuit intensive. At one point, AOL, Microsoft Hotmail, and most of the leading ISPs in the U.S. used MAPS to block spam. A business listed as a spammer on the MAPS blacklist was effectively cut off from a large part of the Internet. MAPS won some early victories in preliminary decisions, but court cases soon fell into a regular pattern. After initial bluster, a confidential settlement would be reached, and both sides would declare victory. MAPS would claim the plaintiff had agreed not to spam, and the plaintiff would state that MAPS had unblocked him.

Some observers started to believe that any spammer inconvenienced by MAPS could quickly become removed from the blacklist by threatening a lawsuit. Attention was also drawn to the substantial fees that MAPS was charging ISPs for its services.

Suddenly MAPS faced criticism from the user community as well as from people who claimed they had been unfairly blacklisted. Competing blacklists started to sprout like mushrooms. At one point, there were more than a thousand blacklists in operation, with accuracy varying from good to appalling.

The number of blacklists that gained a significant constituency is unknown. After one litigant claiming antitrust violations named not only MAPS but also its customers in a lawsuit, few ISPs would state which blacklists they use.

Each new blacklist was being marketed as tougher on spammers than the last. The criteria applied became ever more arbitrary. The number of IP addresses blocked and the aggressiveness with which “collateral damage” was caused became selling points. Adding entries to the blacklist generally took priority over resolving disputes over the accuracy of existing listings. Often there was no dispute resolution process.

Most blacklists were inaccurate because of mere negligence, but many became completely obsolete as the operator became bored with the game but continued to publish the out-of-date information. In occasional cases, this was due to actual malice. When one blacklist maintainer was told by his ISP that his Internet connection would be cut off if he didn’t pay his bills, he retaliated by blacklisting the ISP as a spammer. Many blacklist operators believed that they were answerable to no one.

Perversely, the lawsuits that should have held the blacklists accountable had the opposite effect. Each round of litigation sent blacklist operators searching for ways to put themselves even further beyond the reach of the law.

The nadir of this process was Spam Prevention Early Warning System (SPEWS), an underground blacklist with a Web site that regularly moved from country to country but always far from the jurisdiction of the U.S. courts. Membership of the cabal that operated SPEWS was secret, and the only way to complain was to post a message in a Usenet discussion group. All decisions were final with no appeal.

SPEWS was both “judgment proof” and impossible to trust. It is not beyond possibility that the list was from start to finish a front run by a spammer whose real purpose was to block messages from his competitors.

Blacklists are an unsatisfactory solution for many reasons. They are blunt instruments that affect the innocent as well as the guilty, in many cases by design. They act as a law unto themselves seeking to force ISPs to comply with their demands, however arbitrary. I have met many network managers, including several who work for major ISPs, who loathe blacklist vigilantes more than the spammers.

The blacklists failed because they tried to hold others accountable but refused to be held accountable themselves. All refused to accept liability for an incorrect or downright malicious listing regardless of fault.

Filters: An Effective Palliative, Not a Cure

The most effective method of keeping spam out of your e-mail Inbox is to use a spam filter. A spam filter looks for features in an e-mail message that mean it is likely to be spam. Like taking aspirin for a cold, filtering provides a necessary palliative but unfortunately not a cure. Without spam filtering, e-mail would now be useless to me. I receive 3,000 messages a day, 85 percent of which are spam. The problem with spam filtering as crime prevention is that it only stops spam affecting me personally; it does not put the spammers out of business if their fraudulent offers still reach enough people without a spam filter.

Blacklists are one source of information that some filters use, but only the most basic spam filters now rely on blacklist data alone, and many do not use externally maintained blacklists at all.

Early spam filters looked for words and phrases that were commonly used in spam but rarely appeared in legitimate e-mail. An e-mail message that contains the word Viagra is almost certainly spam, but it might be a genuine discussion of a medical condition for which Viagra is a treatment.

Other spam filters look at the way in which the messages are sent. Most spammers are outright criminals and do not want to be identified as the source of their messages. The tactics that the spammers use to conceal the origin of their messages often leave distinctive marks.

All filters fail part of the time. Spam that should have been squelched is delivered to the user. Wanted messages that should have been retained are marked as spam. Filters are usually written with a bias in favor of accepting suspect mail because seeing some spam is a nuisance, but losing important mail can be a catastrophe. If you get as much spam as I do, there is simply no time to second guess the filter and spend time checking the junk mail folder to see if an important message found its way in there by mistake.

Unless they have some means of being kept up-to-date with the latest spammer tricks, spam filters that are widely used will lose their effectiveness over time. Spammers have the greatest incentive to develop countermeasures against the filters that are most widely used.

Simply looking for Viagra does not work very well as a spam filter any more. Spammers have discovered that more of their messages get through if they use the spelling Vlagra or V1agra.

It is, of course, a minor matter to update the spam filter to the new spelling, but as soon as this is done, the spammer will have a new trick. The filter writers have to keep working to stay ahead of the spammers because the last player to move always holds the advantage. I don’t like that type of game. I don’t want the odds to be even; I want them to be tipped heavily in my favor. I want to make the bad guys have to work so much harder than the good guys that they start to think that they should look for another game to play.
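The escalation is easy to see in code. A filter that matched only the literal word can be updated to a character-class pattern that catches the common substitutions, but each update merely invites the next variant. The pattern below is an illustrative sketch, not an exhaustive rule:

```python
import re

# Matches "viagra" along with the common substitutions spammers
# adopted once the literal spelling was filtered: i -> 1, l, |, !
# Note that the character class still matches the plain spelling.
OBFUSCATED = re.compile(r"v[i1l|!]agra", re.IGNORECASE)

def looks_like_obfuscated_pitch(text):
    """Return True if the text contains the drug name in any of
    the spellings covered by the pattern."""
    return bool(OBFUSCATED.search(text))
```

A spammer's next move might be to insert spaces or punctuation between the letters, which this pattern misses entirely; the filter writer must then broaden the rule again, and so the game continues.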

An indication that a message is likely to be spam is known as a feature. A feature might be a word that often appears in spam, like Viagra, or the fact that the message comes from a blacklisted IP address. Some of the most useful features are telltale traces that are usually left by the spammer’s attempts to cover his tracks. An e-mail that contains the word Viagra is quite likely to be spam; an e-mail that contains the spelling V1agra is almost certain to be spam.

The best spam filters use large numbers of features, scoring each according to how likely a message with that feature is to be spam. Microsoft measures the effectiveness of more than 4,000 spam features on several million messages each day and uses the results to decide which features to use to filter the billions of e-mails Hotmail receives each day.[2]

Filtering works, but the effort required to make it work is huge. Imagine a room the size of a football field filled with row upon row of computers. That is what the data center of one of the largest ISPs looks like. The vast majority of that computing power is not handling the mail people want; it is handling the spam people do not want. Spam accounts for the majority of the messages received, and analyzing messages to identify probable spam content is the most processor-intensive step in the message-handling process. Each of those machines costs real money to buy, power, and maintain.

Sue and Jail Them

A spam control technique that has been used with a considerable degree of success is to track down the spammers and sue them for damages. In a recent case, a judge awarded a default judgment of more than a billion dollars against a spammer. Awards of more than a million dollars have become routine.

The recent CAN-SPAM legislation passed in the U.S. provides another useful tool. CAN-SPAM requires senders of unsolicited e-mail to provide an opt-out mechanism, a valid subject line, accurate routing information, and a legitimate physical address. Adult material must be labeled.

Although some have criticized CAN-SPAM as being too permissive, it has been successful in criminalizing the behavior of the most prolific spammers. Law enforcement resources are limited, and criminal prosecution must be targeted at the worst and most extreme offenders. The CAN-SPAM Act is a tool for prosecutors; it is not intended to define best practices for e-mail use.

Technology people tend to be at best skeptical about the idea of suing or prosecuting spammers. Using the law looks like moving a haystack one wisp of straw at a time while someone else is using a pitchfork to add new straw to the pile. This is a pity, because so far lawsuits and criminal prosecutions are the only methods that have put spammers out of business for good. They have also effectively persuaded the media to stop encouraging others to get a pitchfork and add to the pile.

Early newspaper stories about spam frequently described it as a foolproof get-rich-quick scheme. The bulk of the story would consist of an interview with a spammer and focus on his incredible financial success with at best a couple of comments from an antispam zealot tacked on to the end to give the appearance of balance.[3] The spammers had an interest in inflating their success; it helped them to convince new customers that their spamming services were an inexpensive and highly effective form of advertising.

The lawsuits and prosecutions have effectively ended this type of reporting. The spammers are considerably less keen to give interviews for a start; also, stories in the press are likely to attract unwanted attention from lawyers. Any story that does appear is likely to mention the lawsuits and drive off potential customers.

The threat of lawsuits and prosecutions has also changed the nature of the spammer’s business. At one time, a person trying to become a spammer could concentrate on sending spam for other businesses. Legitimate companies do not want to use an advertising method that might land them with a lawsuit regardless of how profitable or effective it might appear, so this route is effectively closed to entrants without criminal contacts.

The main drawback to using prosecutions is scale. Courts and lawyers are an expensive remedy, and prisons are even more so. If the law is going to be an effective deterrent, it must become a more credible deterrent, which means that the difficulty and cost of bringing a case must be significantly reduced. The main challenge in bringing a case against a spammer is finding out who the spammer is. The law provides a range of tools that can help uncover the true identity of a spammer, but they are expensive and cumbersome to use.

The Longitude of the Internet Age

In 1707, Admiral Sir Clowdisley Shovell ran his fleet aground off the Scilly Isles; four ships were lost, and almost 2,000 men died in the disaster. The admiral himself was one of only two men washed to shore, where a fisherwoman murdered him for his ring. The disaster led to the Longitude Act of 1714, which established a prize for finding a method of determining longitude, the pursuit of which quickly became synonymous with lunacy.[4]

My favorite crackpot longitude scheme is the “powder of sympathy,” which it was claimed would cause a patient to heal when sprinkled on an article belonging to them, such as a bandage. The sympathetic magic was alleged to work instantaneously and at any distance, but not painlessly. In fact, the patients were alleged to jump with pain when the powder was applied to the sympathetic article.

A supply of sympathetic powder provides a means of obtaining the longitude: A wounded dog would be taken aboard the ship, and its bandage kept at the home port. Each day at noon precisely, the powder of sympathy would be applied to the bandage, causing the dog to howl in agony and thereby tell the ship’s company the time of noon at the meridian.

The search for a solution to spam has become the longitude problem of the Internet. Like the longitude problem, the problem of spam is easily explained and understood; understanding the environment in which the problem must be solved is the hard part.

Galileo had discovered a precise means of measuring longitude using the moons of Jupiter more than a hundred years before Sir Clowdisley sailed. The problem was that it depended on precise astronomical observations that would be impossible to make on board a ship, even in the calmest of seas.

For a solution to the spam problem to be acceptable, it must have the minimum possible impact on the legitimate uses of e-mail. Solutions that attempt to deter spammers by introducing a charge for every e-mail sent completely fail in this regard because a charge that was large enough to deter spammers would completely change the economics of sending e-mail.

Another reason the “penny post” approach to spam control fails is that it presupposes the existence of an international payments infrastructure capable of transferring very small amounts of money. This solution is like suggesting inventing radio as a solution to the longitude problem without providing any further insight into solving the problem of actually inventing radio.

The idea of making it uneconomic to send spam is a sound one. But there is no need to charge the legitimate e-mail user fees in order to deter the illegitimate user. The way to deter spammers is to make spammers pay.

Another common problem in bad antispam schemes is that they actually create more spam than they eliminate. Of course, nobody is going to use an antispam scheme if it increases the amount of spam he receives. But lots of people use antispam schemes that send spam to other people.

The main scheme of this type is called challenge-response. The idea is that when you send someone an e-mail, his robot replies with a message that says, “Did you really mean to send this message?” The mail will only be delivered to the user if you send an e-mail confirming that you really did send it. The marginally less objectionable schemes of this type will only challenge you the first time you try to send someone a message. Others will send a challenge every time.
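The bookkeeping behind such a scheme is small, which is part of its appeal. The sketch below shows the core logic of the less objectionable variant: mail from an unknown sender is quarantined, a challenge token is issued, and once the sender confirms, the message is delivered and the sender is whitelisted for the future. The class and token scheme are invented for illustration.

```python
import secrets

class ChallengeResponseMailbox:
    """Minimal challenge-response filter: mail from unknown senders
    is quarantined until the sender answers a challenge."""

    def __init__(self):
        self.whitelist = set()  # senders who have proved themselves
        self.pending = {}       # challenge token -> (sender, message)
        self.inbox = []

    def receive(self, sender, message):
        """Deliver mail from known senders; quarantine the rest and
        return a challenge token (in reality, e-mailed back)."""
        if sender in self.whitelist:
            self.inbox.append((sender, message))
            return None  # no challenge needed
        token = secrets.token_hex(8)
        self.pending[token] = (sender, message)
        return token

    def confirm(self, token):
        """The sender replied to the challenge: deliver the held
        message and whitelist the sender."""
        sender, message = self.pending.pop(token)
        self.whitelist.add(sender)
        self.inbox.append((sender, message))
```

Note what the sketch makes plain: the first message from any sender delivers nothing at all, and every held message whose challenge is never answered is silently lost, which is exactly the failure mode discussed later in this section.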

Most schemes only require a response to the e-mail. Others require you to solve a puzzle intended to distinguish a human from a machine, such as reading a word that has been distorted in a way that makes it difficult to read using standard machine-reading methods (see Figure 6-1).

Figure 6-1. Turing test to check that the user is a human

Whether these schemes actually work as intended is a matter of current debate. Are the problems genuinely hard for machines to solve, or is it just that nobody has had an incentive to make an attempt? The seminal CMU paper that introduced the idea of using Turing tests in this way (and the unlovely acronym CAPTCHA) is subtitled “How Lazy Cryptographers Do AI.”[5]

The schemes are also vulnerable to a man-in-the-middle attack. When the user answers a puzzle presented at one site, he might in fact be answering a puzzle set at another site. Alternatively, an attacker can simply pay people living in low-wage countries to spend their days solving the puzzles.

The basic challenge-response scheme has been reinvented several times. The earliest invention I am aware of is by John Mallery at MIT for the Open Meeting project, an online forum that the AI lab ran as a part of Vice President Al Gore’s Reinventing Government initiative.[6]

Some uses of challenge-response are justified. The messages sent to the Open Meeting would be forwarded to thousands of people; it was important to make sure that the comments really did come from the people they appeared to come from. There are many other uses that are generally considered legitimate.

Most e-mail lists now use challenge-response to verify subscribers before adding them to the list. This went from being rare to being a near-universal practice in a matter of a few days after someone decided it would be fun to subscribe the White House to a few thousand e-mail lists. To increase the effect, they caused some of the mailing lists to be subscribed to each other in such a way that a single e-mail sent to one mailing list would fan out to a thousand other lists, which would forward it on to the White House. Fortunately, the White House had already experienced a large number of e-mail attacks and was ready for this sort of thing. The mailing lists used to mount the attack were less prepared.

Challenge-response is a useful technique that is widely used on the Internet. It is also a pain in the neck. Every site on the Internet seems to want me to fill out a subscription form. Then they want to check my e-mail address so they can send me spam later on. Like many other frequent Internet users, I have an e-mail account I use just for this.

Using challenge-response to filter your personal e-mail is a spam-shifter, not a spam solution. You get less spam, but I get spammed by your challenges.

One thing I find objectionable about these schemes is the implicit assertion that their user is somehow especially important and that special permission is necessary before speaking to him. What I find really infuriating though is the way they will proclaim that they have never had anyone complain. Of course, there are no complaints when the people who find these schemes objectionable can’t talk to you without using them!

I recently got a message from a student asking me for some information on a protocol I had designed. Although I rarely have time to respond to these requests, I took ten minutes or so to write a reply even though the point was addressed in the specification, which was easy enough to find. In return, I received a challenge from the student’s mail service asking me to prove I was not a spam sender.

A question that I have never had a satisfactory answer to from proponents of these schemes is what happens when two people using this type of scheme try to talk to each other? Alice sends an e-mail to Bob, whose e-mail program sends back a challenge to Alice. With most challenge-response systems, this is the point at which Alice then sends back a challenge to Bob and the challenges just ping-pong back and forth between the systems. Sometimes the schemes try to be clever and always allow a challenge through, but how can they tell the difference between a challenge and spam? If challenges are allowed through, then a spammer can send spam disguised as a challenge.

Challenge-response is a good example of an antispam measure that faces the car burglar alarm problem: It works only for as long as the number of people using it is small. After large numbers of people start using the scheme, spam senders have an incentive to do the trivial amount of work necessary to bypass it.

This example shows the importance of measuring the success of spam control mechanisms correctly. Proponents of challenge-response schemes often claim that they have a zero false positive rate—that is, there is no risk that a legitimate message will be misclassified as spam. By my way of thinking, however, the false positive rate is actually 100 percent, because every message is considered spam and quarantined until proven legitimate.

What really matters is the amount of e-mail that will end up lost. Even if every recipient of a challenge was willing to respond to it (I never do), a high error rate is still inevitable. Many people will send out an e-mail and then shut down their computer. By the time they return, they will have forgotten that they had sent the message.

The challenge-response schemes are an attempt to reduce spam by checking that the purported sender of the e-mail is the real sender—that is, they are a form of authentication. The principle is sound, but the mechanism used is too disruptive. Every e-mail user is required to personally authenticate himself to every new correspondent.

The Worst of the Worst

The worst spam reduction idea of all is to fight fire with fire and attempt to hack into or otherwise disable the machines sending the spam. The idea of vigilante justice is always tempting in theory, particularly when it appears that law enforcement is ineffective. In practice, vigilantes create more problems than they solve; mob rule inevitably leads to attacks against the innocent as well as the guilty. The blacklists lost credibility by targeting the innocent as well as the guilty. There is no reason to believe that vigilantes would be any better.

The interdependent nature of much of the Internet infrastructure means that vigilante attacks will inevitably affect innocent parties even if they only intend to target the guilty. If a vigilante attempts to disable one customer of an ISP, the effects will inevitably be felt by every other customer.

People who are caught often attempt to claim that they are really misunderstood vigilantes attempting to protect the public interest. If vigilante actions are legitimized, there is no hope of ever establishing the rule of law on the Internet.

A screensaver promoted by a search engine company demonstrates how a scheme of this type can go wrong. The screensaver was designed to perform a denial-of-service attack against alleged spammer sites. The machine used to direct the attacks was quickly compromised and used to attack a series of entirely innocent machines until a cabal of network administrators effectively disconnected the controller from the Internet.

Out of the Ashes

So far, the history of spam control has been somewhat depressing. The e-mail system continues to function, but only at great cost and effort. The good guys are at best one step ahead of the spammers.

We can learn many lessons from this history, but I believe two particular lessons to be critical. First, a party that attempts to hold others accountable must accept accountability itself. Second, we cannot win if we allow the spammers to set the rules of the game. We will apply these lessons in Chapter 7 where we consider the problem of e-mail spam, and again in Chapters 13 through 15, when we consider the larger problem of establishing abuse-resistant communication systems.

Key Points

  • Many proposals have been made to stop spam.

    • It is easy to propose a mechanism that stops spam on a small scale, but these rarely work at Internet scale.

    • Schemes that require major changes to the way in which e-mail is used are unacceptable.

  • Blacklists are lists of IP addresses of known or alleged spammers.

    • Maintaining an accurate blacklist requires a considerable effort.

    • Complaints about spam may be malicious.

    • Blacklists demand accountability from others but refuse to be held accountable themselves.

  • Per-message charges

    • Would have a major impact on the way e-mail is used.

      • A charge that is large enough to deter spammers would deter many current uses, especially mailing list hosting.

    • Might legitimize sending practices considered to be spam.

    • Depend on the creation of a low-cost “micro-payments” infrastructure that nobody has succeeded in building.

  • Challenge-response schemes

    • Provide a lightweight authentication mechanism.

    • Are generally considered reasonable when used to prevent abuse of mailing lists.

    • Are highly objectionable when used as a personal spam control scheme. They are a spam displacement tool.

  • Retaliation schemes in which the spammer is counterattacked frequently end up attacking innocent parties.
