29 Dec 2005
A new company called Blue Security purports to have an innovative approach to getting rid of spam. I don't think much of it. As I said to an Associated Press reporter:
"It's the worst kind of vigilante approach," said John Levine, a board member with the Coalition Against Unsolicited Commercial E-mail. "Deliberate attacks against people's Web sites are illegal."
Before they started their current scheme they contacted every anti-spam organization around, including CAUCE, where I'm a board member, trying to find someone who would sponsor their scheme. Everyone, including CAUCE, said no. Since they announced their plan as a separate company, it is my understanding that at least two and maybe three web hosts have booted them off due to their abusive plans.
26 Dec 2005
We're taking a few days off for the holiday. Until then, we wish a joyous holiday season to our readers in the religious and/or cultural mode(s) you may choose to embrace, and we'll be back next week.
The Queen's Annual Message to the Commonwealth is well worth reading, whether or not you're a Commonwealth citizen (we're not), for its heartfelt thoughts about what Christmas means at the end of this difficult year.
12 Dec 2005
03 Dec 2005
02 Dec 2005
30 Nov 2005
In one of the more peculiar developments at this week's ICANN meeting, the ICANN board took the contentious .XXX domain off the agenda for the board meeting at the end of the week. Multiple sources say that the European Union threatened to withdraw all of their delegates to the Government Advisory Committee if the board didn't do so.
But Stuart Lawley, head of the ICM Registry that is proposing .XXX has told me that he spoke privately to the EU delegates, all of whom told him that they have no objection to .XXX, but are using the domain as a hostage in an argument with ICANN about ICANN's processes and the accuracy of information provided by ICANN to the EU.
Update: The court denied the TRO request, saying that CFIT "has not shown that the need for immediate relief is clear." Instead the judge treated the request as notice for preliminary injunction, with papers to be served by December 5th, and a hearing on February 10th.
Yesterday, two separate suits to stop the settlement were filed by newly created organizations, Coalition for ICANN Transparency and World Association of Domain Name Developers. I'm reading the complaints and will post an analysis when I have a chance.
20 Nov 2005
16 Nov 2005
The Knowledge@Wharton newsletter published by the Wharton School at the University of Pennsylvania has an interesting article on the Google Print cases.
Professor Kevin Werbach offers a fine capsule summary of the case:
"Google is clearly going out on a limb with respect to copyright. The limb may well hold, which I think would be the better result as a matter of public policy. On the other hand, the limb could easily break. The courts will decide."
They also excerpt an interview with Pat Schroeder, president of the AAP, which filed the second case:
"snippet" isn't a legal term and could evolve from meaning a sentence to meaning a complete chapter.
I read that to say that what Google is doing is OK, and even though there's no evidence that they plan to do anything else, we need to stop them just because of what they might do. I hope the judge has enough sense to recognize a cloud of smoke when he or she sees it.
06 Nov 2005
An article in CNET reports that Google hasn't resumed scanning library books. They say it's "an operational thing" and confirm that when they do resume, they'll be scanning older books rather than those that are still in print.
02 Nov 2005
01 Nov 2005
31 Oct 2005
30 Oct 2005
25 Oct 2005
08 Oct 2005
01 Oct 2005
25 Sep 2005
06 Sep 2005
Blue Security was scheduled to give a talk on the 4th at TAUSEC, the Security and Computer Forensics Forum at Tel-Aviv University in Israel. This would have been an excellent chance for Blue Security's developers, who work nearby in Herzliya Pituach, to describe what they accomplished to a knowledgeable audience.
Unfortunately, they cancelled at the last minute with no explanation. Oh, well.
28 Aug 2005
25 Aug 2005
Sending pornography to children is really bad, right? Then making it illegal to e-mail porn to children is a great idea, right? Nope.
An article in USA Today describes the perverse effects of new laws in Michigan and Utah. Both laws make it illegal to send ads to minors for things that minors aren't allowed to buy, with serious legal penalties if you do. Both have established opt-out lists on which parents can list addresses used by their children, and mailers can pay to have their lists scrubbed against the opt-out lists. Both states use a new scrubbing service from Unspam, LLC, run by Matthew Prince, who also runs the interesting Project Honeypot. The scrubbing service's website is coy about the cost of scrubbing, but the Utah regulations prescribe a fee of 1/2 cent per address and Michigan allows up to 3 cents per address. Even 1/2 cent is a significant tax on senders, and 3 cents per address is probably more than the entire cost of sending a large email campaign. Both state laws say you have to scrub and pay the "email tax" every 30 days to keep your lists clear of opted-out addresses.
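To get a feel for what those per-address fees add up to, here is a minimal sketch of the arithmetic. The two fee figures are the ones quoted above; the list size of one million addresses and the figure of 12 scrubs (roughly a year of 30-day cycles) are assumptions chosen only for illustration.

```python
from decimal import Decimal

# Per-address fees quoted in the post, in dollars.
UTAH_FEE = Decimal("0.005")         # 1/2 cent per address
MICHIGAN_MAX_FEE = Decimal("0.03")  # up to 3 cents per address


def scrub_cost(addresses: int, fee_per_address: Decimal, scrubs: int = 1) -> Decimal:
    """Total fee for scrubbing a list of `addresses` entries `scrubs` times."""
    return fee_per_address * addresses * scrubs


if __name__ == "__main__":
    list_size = 1_000_000  # hypothetical list size, not a figure from the article
    for name, fee in (("Utah", UTAH_FEE), ("Michigan (max)", MICHIGAN_MAX_FEE)):
        once = scrub_cost(list_size, fee)
        yearly = scrub_cost(list_size, fee, scrubs=12)  # about one scrub per 30 days
        print(f"{name}: ${once:,.2f} per scrub, ${yearly:,.2f} for 12 scrubs")
```

Because the fee is charged per address on every scrub, the cost scales with list size times the number of 30-day cycles, whether or not any mail is actually sent.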
31 Jul 2005
MAAWG is the Messaging Anti-Abuse Working Group. It was started by Openwave, a vendor that sells e-mail hardware and software to large ISPs, and originally consisted only of Openwave customers, but has evolved into an active forum in which large ISPs and software vendors exchange notes on anti-spam and other anti-abuse activities. Members now include nearly every large ISP, including AOL, Earthlink, Yahoo, Comcast and Verizon, along with ESPs like Doubleclick, Bigfoot, and Checkfree, and vendors like Ciscom, Ironport, Messagelabs, Kelkea/Trend, and Habeas. They've also been quietly active in codifying best practices and working on some small but useful standards like a common abuse reporting format.
Earlier in July their technical committee quietly released an evaluation of SPF and Sender-ID. Although it is worded very tactfully, the message is clear from phrases like:
While MAAWG neither endorses nor discourages the use of SPF or Sender ID, the technical committee's findings highlight real-world risks to the delivery of legitimate e-mail when the specifications are implemented.
At about the same time, Earthlink equally quietly removed the SPF records they'd been publishing for at least a year. That was particularly surprising because SPF originator Meng Wong had been working with Earthlink to get their SPF set up. If Meng can't make SPF work, who can?
I particularly look forward to seeing what happens in November when Hotmail says they will start showing a yellow warning box (the Big Yellow Box Of Death, or BYBOD to the cognoscenti) on any incoming mail that doesn't pass Sender-ID. With no SPF records at all, Earthlink's mail won't pass Sender-ID, and will, we assume, be 100% BYBOD compatible. Will Hotmail blink and add their own synthetic SPF records for Earthlink? Will Earthlink publish SPF records that only Hotmail can see (and if they do, how could we tell?) Should be interesting.
(Claimer: most of MAAWG's members are companies that pay a substantial membership fee, but they also have a few invited individual members, including me.)
14 Jul 2005
Last month, Microsoft's Hotmail decided to check Sender-ID on all of its incoming mail, and display a warning box for messages where the Sender-ID said they came from the wrong place. This provoked widespread sceptical responses (including one here.) They further said that in November, they'll even show the warning box for mail with no Sender-ID info at all and perhaps move it into the junk mailbox.
This aggressive move was surprisingly out of character for Hotmail, and we couldn't figure out why they were making a move that in all likelihood will route lots of real mail to the spam folder and leave 100% Sender-ID compliant phishes in the inbox. But one extra bit of info makes it all clear.
Remember when Bill Gates said that the spam problem would be solved in two years? Well, that was in January 2004, and the time will run out pretty soon. If they shoot the Sender-ID magic bullet in November, all the spam will be dead by the end of January, right? I can hardly wait.
12 Jul 2005
The second annual Conference on Email and Spam will happen at Stanford University next week, 21-22 July. I'm on the program committee, and the quality of the accepted papers is pleasantly high. I'm speaking on Experience with Greylisting and no, I didn't review my own paper.
If you're interested in both what people are doing to understand spam and what they're doing to fight it, this conference is by far the best place to learn about them and meet the people involved.
The conference registration is limited, but they still have space available. Visit their web site to see the list of papers and to register.
30 Jun 2005
26 Jun 2005
Here we have a piece of mail purportedly from MBNA (a large credit card bank headquartered in an impressively large and anonymous building in Wilmington DE that I walked past a few weeks ago) about a utility bill that perhaps is available in their system for me to pay. Again the only thing I changed was to turn the target address to firstname.lastname@example.org. All of the X- headers were in the original mail.
23 Jun 2005
Phishing is a big problem, and banks have given us lots of advice like don't click on links in e-mail messages and watch for mail from fake sources. So take a look at this message that I got earlier this year and tell me whether it's real or a phish. (I already know the answer. This is a thought experiment.)
19 Jun 2005
18 Jun 2005
Sender-ID is Microsoft's entry in the anti-spam technology sweepstakes. It's a scheme developed during last year's MARID fiasco in which their earlier Caller ID proposal and Meng Weng Wong's SPF were merged, sort of. Microsoft's patent claims and the details of the patent license they offered so severely distracted MARID that the merits or lack thereof of Sender-ID didn't get much attention.
Now, Microsoft's Hotmail, which also handles the mail for MSN users, says that they will shortly be checking Sender-ID on all mail to Hotmail and will show a yellow warning box on all mail that doesn't pass. What should senders do? Ironically, for most senders, the best answer is nothing.
12 Jun 2005
We have upgraded our weblog software to allow readers to leave comments. To read comments on a story or leave your own, click on the small comments link at the bottom of each story.
When you leave a comment, you must provide a valid e-mail address to which it will mail a message with a confirmation URL. Your address won't be displayed with the comment unless you check a box that explicitly permits it. (No, we won't add it to a spam list, either.) This avoids the noxious problem of blog spam, large irrelevant comments containing links to sleazy web sites that want to increase their search engine ranking.
If you know what a trackback is (an inter-blog cross-reference), those should work, too.
Marketers are not all stupid or immoral, despite frequent claims to the contrary in the anti-spam community. But when they try their hand at on-line marketing many of them make what seem like obvious mistakes, due to false analogies between e-mail and other media.
I wrote this note over a year ago, but it's just as relevant today, particularly as CAN SPAM encourages some who were sitting on the fence to give it a try.
Brian McWilliams, author of the pulp favorite Spam Kings (which I must admit I tech edited), has a new article in Salon called How Microsoft is losing the war on spam. He interviewed me by e-mail during the research on the article, and here's what I said.
1. Given the amount of spam being sent through Trojaned Windows proxies, do you think it's accurate to say that Microsoft is indirectly responsible for much of the spam problem today?
Definitely.
I got a letter the other day from AOL postmaster Carl Hutzler, about how the Internet community could get rid of spam, if it really wanted to. With his permission, here are some excerpts.
Spam is a completely solvable problem. And it does not take finding every Richter, Jaynes, Bridger, etc to do it (although it certainly is part of the solution).
In fact it does not take email identity technologies either (although these are certainly needed and part of the solution).
The solution is getting messaging providers to take responsibility for their lame email systems that they set up without much thought and continue to not care much about when they become overrun by spammers. This is just security and every admin/network operator has to deal with it. We just have a lot of providers not bothering to care.
We need message providers to implement better security on their networks and take responsibility for their networks being sources of spam. The number of ISPs who don't even authenticate their members is frankly appalling (just for starters).
AOL has implemented the solution to stop spam on our system. We do not send it any more. We even published the solution in the ASTA [the Anti-Spam Technical Alliance, a group of the largest ISPs] technical document. We are again trying to get the info to other messaging providers via the MAAWG.org group.
But no one wanted to listen to one ISP. So we had to apply the set of solutions for every other ISP around the planet for them!
1) The port 25 blocking we do for them (via pattern matches on their dynamic space or getting their actual dynamic IP space from them if their regex set-up is not thought out well)
2) Our Second Received Line rate limits which put reasonable controls on the amount of mail an end user can send through their ISPs mail server.
This is why AOL reported our spam is almost eliminated. Yes, I said it, eliminated. I get so little spam on my AOL business account (the one that has 20 pages of google results, countless newsgroup hits, etc). I think I have gotten 10 spams total in my inbox over the last month and many of them go to the spam folder where they should be. Just think how different everyone's spam problem could be if ISPs did a few of these things, and more simply, took responsibility for their customers/networks. Spam would be gone.
But no one else is reporting success like this? Why? Because every other ISP is building better and better filters to help their system fend off the spam. But the sources of spam are still there and spammers can keep sending to their hearts' content until we stop them at the source.
Why do we all keep building better filters? Because it helps us instead of helping others. And it's easy, as most of these are shrink-wrapped software or services that are easy to apply. Good for Postini and Brightmail and SpamAssassin, but not a solution, just a bandaid. Why do people do this and never try solving the problem? Security for our networks and messaging platforms is much harder to implement, and likely most importantly, it does not usually help the ISP stop spam inbound to its network. So no one does it.
What we need is for providers to do BOTH. You have to implement better filters to survive (we sure do), but we all also have to fix our sources of spam that clog other networks. Eventually as providers do BOTH actions, the problem will go away and everyone will be able to remove the BANDAIDS from the spam wound as we won't need filters and blacklists as much in the future.
A Funny example
If a spammer had a T1 line provided by [a large network], we all would be up in arms that the network is all of a sudden a blackhat ISP hosting known spammers on the Spamhaus ROKSO list, etc, etc. But the fact that that network and many other ISPs are hosting spammers via trojaned and zombied customers and have no security on their network to prevent this situation or manage it at least, does not seem to bother us (messaging providers) as much as it should. Well shame on us.
If you want less spam, then can we all commit to manage our systems better?
Carl then went on to comment on a large web hosting company, which will remain nameless both to protect the guilty and because many other web hosts are just as bad.
They have been spamming the be-jesus out of AOL for months now because they have customers who run insecure formmail and other CGIs. When will these premier hosting companies write a program to find them before the spammers and prevent customers from installing these open relays (cgi scripts) on their network? When will these companies monitor their scomp [AOL's automated spam reporting] complaints and take them off the air without my team having to constantly call them? When will they stop telling their customer service reps to blame AOL for delivery issues their customers are seeing when they can't mail to AOL because we have temporarily blocked them for the 15th time in 2 months?
Should anyone be allowed to operate an email system? Perhaps not. Or perhaps we will find a group of ISPs that band together to create a second email system on top of the current one for email providers that know how to control their networks. And the other people will be on another system, the old one filled with spam.
Everything that Carl says is, largely self-evidently, true. What do we have to do to persuade networks that dealing with their own spam problem, even at significant short term cost, is better for the net and themselves than limping along as we do now?
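The two measures Carl lists, pattern-matching on dynamic address space and rate-limiting what an individual user can send, are simple enough to sketch. What follows is only an illustration, not AOL's implementation: the hostname pattern, the class name, and the limits are all hypothetical, and the originating address is assumed to have been extracted already from the relevant Received header.

```python
import re
from collections import defaultdict, deque
from typing import Optional
import time

# Hypothetical pattern for reverse-DNS names that suggest dynamically
# assigned address space. Real ISPs use many different naming schemes.
DYNAMIC_RDNS = re.compile(
    r"(^|[.-])(dyn|dynamic|dhcp|dsl|cable|ppp|pool|dialup)([.-]|\d)",
    re.IGNORECASE,
)


def looks_dynamic(rdns_name: str) -> bool:
    """Guess whether a reverse-DNS name belongs to dynamic address space."""
    return DYNAMIC_RDNS.search(rdns_name) is not None


class SenderRateLimiter:
    """Sliding-window limit on messages accepted per originating address."""

    def __init__(self, max_messages: int, window_seconds: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)  # origin address -> accepted timestamps

    def allow(self, origin_ip: str, now: Optional[float] = None) -> bool:
        """Return True if one more message from origin_ip fits in the window."""
        now = time.monotonic() if now is None else now
        times = self._seen[origin_ip]
        # Forget timestamps that have aged out of the window.
        while times and now - times[0] >= self.window_seconds:
            times.popleft()
        if len(times) >= self.max_messages:
            return False
        times.append(now)
        return True
```

A receiver might refuse or defer direct connections from hosts where `looks_dynamic` is true, and defer mail once `allow` returns false for a sender, so that a single zombied customer cannot push unlimited mail through its ISP's server.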
For those who've been living in an e-mail free cave for the past year, phishing has become a huge problem for banks. Every day I get dozens of urgent messages from a wide variety of banks telling me that I'd better confirm my account info pronto. Early bank phishes were pretty clumsy, but the crooks have gotten better at it and current phishes can look very authentic. See this archive of recent phishes at antiphishing.org for some examples.
A very common trick is the fake link, in which the link you think you're clicking on isn't the one you're really clicking on, like this:
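In HTML mail, a fake link is typically an anchor whose visible text shows one hostname while its href points somewhere else. A minimal sketch of spotting that mismatch, using only Python's standard library, might look like this. The function and class names are made up for illustration, and the check only fires when the visible text itself looks like a URL or hostname, so it will miss many other tricks.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url_or_text: str) -> str:
    """Return the lower-cased hostname of a URL or bare hostname, or ''."""
    candidate = url_or_text if "://" in url_or_text else "http://" + url_or_text
    return (urlparse(candidate).hostname or "").lower()


def mismatched_links(html: str):
    """Return links whose visible text names a different host than the href."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    suspicious = []
    for href, text in parser.links:
        # Skip ordinary link text such as "click here".
        if " " in text or "." not in text:
            continue
        if host_of(text) != host_of(href):
            suspicious.append((href, text))
    return suspicious
```

For example, an anchor that displays `www.examplebank.com` but whose href is `http://203.0.113.9/login` would be reported, while a link that simply says "click here" would not.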
Why another weblog? Partly vanity publishing, partly to keep track of all the notes I make about the technology of e-mail. These days the biggest topic is spam filtering, but there's a lot more to e-mail than that. We look at approaches for handling real mail to keep the flood under control, applications layered on e-mail and more.
I had a most illuminating conversation with the managers at a large cable ISP a few weeks ago. They told me that they'd been getting constant requests from their users for more control over the spam filters. Then they changed to a new filter system, and the requests stopped. What happened?
The simple difference is that the old filters didn't work very well and the new filters do. When a filter isn't stopping your spam, it's easy to conclude that the problem is that your spam is special, and the filter needs to be adjusted to know about your special spam needs. But when a filter isn't stopping your spam, the reason is really much simpler: it's not a very good spam filter.
Although there are plenty of theoretical arguments about why some people might consider a given message to be spam and others wouldn't, the reality seems to be that we all agree what's spam and what's not. For the few arguments where people differ, the differences are not great and the messages in the grey area tend to be ones that people wouldn't object to receiving, but wouldn't care if they were lost, either. (Think newsletters from companies from which you've bought stuff.)
TOTN, NPR's afternoon call-in show, squeezed in a five-minute segment at the end of today's show about Bill Gates' recent pronouncement that e-postage will solve the spam problem.
So they called me up and asked me to be their expert. I talked really, really, fast, since I had about an hour's worth of stuff to say in 5 minutes, but it came across pretty well. The audio archive is on NPR's web site at http://www.npr.org/rundowns/segment.php?wfId=1751911.
In case you were wondering, I don't think any more highly of Bill's e-postage vaporware than of anyone else's e-postage vaporware.
The country's first criminal trial about spam ended in Leesburg, Virginia earlier this month with a conviction of Jeremy Jaynes, better known under his nom de spam of Gavin Stubberfield. I was an expert witness for the prosecution, the Commonwealth of Virginia.
The case was brought under Virginia's state anti-spam law, not the weaker Federal CAN-SPAM act. Virginia's law makes it a crime to send unsolicited bulk mail using forgery, so the Commonwealth had to show first that Jaynes sent lots of unsolicited mail and second that it was sent using forgery. The mail in question was sent on three days in October to AOL, which is why the case was heard in Leesburg, the county seat of Loudoun County, in which AOL's mail servers are located.
The first few days of the trial, which I didn't attend, largely consisted of AOL employees documenting the volume of mail they'd received from Jaynes' networks on those days (millions), and the number of user complaints (20,000.) Then MCI confirmed that the networks were Jaynes'. The prosecution showed some CD-ROMs found in Jaynes' house, full of vast lists of e-mail addresses, as many as 30 million on one of them, and a nearly full set of AOL addresses on another. The trial went slower than the prosecutor expected, mostly because of a lot of procedural skirmishing with the defense, so I spent the day on which I'd expected to testify waiting outside in the hall reading a book.
They were finally ready for me about 10 AM on the next day. The prosecution asked me to testify as an expert on e-mail technology, specifically to explain why it was utterly implausible that the people on Jaynes' lists could have asked to be on them. The prosecutor had collected some summary numbers about the mail that Jaynes had sent to AOL, showing the IP addresses and the HELO domains he'd used.
I explained that legitimate bulk mailers send mail from a consistent place using consistent names and formats, so that recipients can recognize the mail they've asked for. Jaynes, to put it mildly, didn't do that.
The HELO domains his mail hosts used were obvious forgeries, several being in .bz which I explained was Belize, a nice place to visit but an unlikely place to run an ISP. He sent mail from hundreds of different IP addresses, a good idea if you're trying to disguise the nature of a spam run, but unlike what legitimate mailers do. We repeated this for subsequent spam runs. Then the prosecutor asked whether it was likely that mail sent to all of AOL's addresses was legitimate. "Only if it was AOL who sent it."
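The kind of summary involved here, how many distinct IP addresses a run came from and what the HELO names looked like, is easy to compute from connection logs. Here is a minimal sketch; the record format and the function name are assumptions, and it is not the analysis the prosecution actually performed.

```python
from collections import Counter


def summarize_sources(records):
    """Summarize one mail run given (ip, helo_domain) pairs from a log.

    Returns the number of distinct sending IPs and a count of the
    top-level domains that appeared in the HELO names.
    """
    ips = set()
    tlds = Counter()
    for ip, helo in records:
        ips.add(ip)
        # The TLD is whatever follows the last dot, ignoring a trailing dot.
        tlds[helo.rstrip(".").rsplit(".", 1)[-1].lower()] += 1
    return {"distinct_ips": len(ips), "helo_tlds": dict(tlds)}
```

A legitimate bulk mailer would be expected to show a handful of addresses and one consistent HELO domain; a run spread across hundreds of addresses with HELO names in unrelated domains looks very different in a summary like this.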
Then Jaynes' lawyer cross-examined me, to try to discredit what I'd said. We went off on a detour through click-wrap licenses, in which I agreed with him that few people read them and it was easy to imagine that on page 17 of a license that people hadn't read, there was small print agreeing to receive bulk mail. He then asked what I'd do if I mailed to a list like that. After thinking for about two seconds, I said that since he was talking about a list collected by deceiving people I wouldn't use a list like that, so I couldn't speculate about how someone might use it. Hmmn. Next question.
Then we fenced about the Belize issue. He argued that country domains can be cheap, wasn't that a reason to use them? Since normal domains cost only about $15, I agreed that if someone's business was in such dire shape that five bucks made a big difference, I suppose maybe. (I later checked and found that .bz domains cost $35, which is more than normal .com or .biz.) Since they'd previously established that Jaynes had about $20 million in assets, hmmn, next question. We went through a few more exchanges like that, then the defendant's other lawyer asked whether I ran the ASRG, which I said I did, and we were done.
That was Friday afternoon, so I went home. The case went to the jury early the next week and as has been widely reported in the papers, they sentenced Jaynes to nine years, and gave his sister the largest fine they could under the charge that the judge gave them. Since the court agreed with the prosecution that Jaynes was a flight risk, they put him in jail until they worked out a bail agreement of $1M bail, and confined him to Loudoun County wearing a GPS ankle bracelet until final sentencing. The defense lawyers are making brave noises, but the conviction seems quite solid to me, and the first amendment issues that they want to raise have all been rejected in other cases about spam and junk fax.
While it's certainly satisfying that such a major spamming crook got the jail time he deserves, this case cost the Commonwealth of Virginia a whole lot of time and money for preparation, staff work, and expenses. (We experts don't work for free; we wouldn't be credible if we did.) Going to this level of effort to knock out the top 10 or top 20 spammers is plausible, but going after 100 or a thousand just isn't going to happen. That tells me that we still need more effective civil remedies that individuals or small networks can afford to pursue.
Update: The IETF leadership put the MARID group out of its misery today and shut it down since it was clear that it would never arrive at a consensus recommendation. They suggest that everyone who had a proposal for MARID submit it as an experimental RFC and go out and implement it so we can get some real world experience with all of these putative spam fixes. I heartily concur.
The IETF MARID working group slogged away all summer trying to produce a draft standard about e-mail sender verification. They started with Meng Wong's SPF and Microsoft's Caller ID for E-mail, which got stirred together into a hybrid called Sender ID.
One of the issues hanging over the MARID process has been Microsoft's Intellectual Property Rights (IPR) in Caller ID and Sender ID. The IETF has a process described in RFC 3668 that requires contributors to disclose IPR claims related to their contributions. Microsoft has sent in some oracular IPR disclosures about patent applications relative to Caller ID and Sender ID, but with little detail since the applications hadn't been published. But yesterday, they were published.
I have read the two Microsoft patent applications published yesterday and analyzed the claims. I'm not a patent lawyer or patent agent, but I have read enough patents over the years that I think I can do an adequate job of figuring out what their claims cover.
Visit the USPTO application search page and search for 60/454517, the serial number of the provisional application from which they both derive, or you can download PDF versions from my web site for application number 20040181571 and 20040181585.
Patents consist of a narrative description of the invention, followed by claims. A claim can be independent, standing on its own, or dependent, based on a previous claim. Dependent claims are generally minor tweaks to independent claims, so I'll look at each independent claim and all of its dependent claims as a group.
Application 20040181571 is the less troublesome of the two. About 2/3 of it deals with methods of detecting IP spoofing, which aren't relevant here (or, if you ask me, anywhere else since TCP stacks started randomizing sequence numbers.) The rest describes what is essentially Caller ID.
Claims 1-8 and 10-21 cover anti-IP-spoofing techniques. (There's no claim 9. Oops.)
Claims 22-38 cover Caller ID. Claim 22 says:
22. In a receiving domain that is network connectable to one or more sending domains, the receiving domain including one or more receiving messaging servers configured to receive electronic messages from sending domains, a method for determining if a sending messaging server is authorized to send electronic messages for a sending domain, the method comprising:
an act of receiving an electronic message purportedly sent from the sending domain;
an act of examining a plurality of parameter values of the electronic message to attempt to identify an actual sending side network address corresponding to a sending computer system;
an act of querying a name server for a list of network addresses authorized to send electronic messages for the sending domain;
an act of determining if the actual sending side network address is authorized to send electronic messages for the sending domain;
and an act of providing results of the determination to an message classification module such that the message classification module can make a more reliable decision as to classifying the received electronic message.
Note the word "plurality" in clause 3, which is patent-speak for "more than one". I believe that SPF classic, which only checks a single message parameter, the bounce address, isn't covered here. SPF may also check the HELO domain, but that's not a parameter of the message, it's a parameter of the connection. Yes, this is hair-splitting. Welcome to the wonderful world of patents.
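Stripped of the patent language, the steps in claim 22 (find the sending address, look up the addresses the domain has authorized, compare) are the same shape as an SPF check. As a rough illustration, here is a minimal sketch that evaluates a simplified SPF-style record against a sending IP. It handles only the `ip4`, `ip6` and `all` mechanisms, takes the record as a string instead of querying DNS, and the function name is my own; it is nowhere near a full implementation of the SPF specification.

```python
import ipaddress

# SPF qualifier characters and the result each one produces on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_ip_against_record(record: str, sender_ip: str) -> str:
    """Evaluate a simplified SPF-style record for one sending address.

    Returns 'pass', 'fail', 'softfail', 'neutral', or 'none' when the
    text is not an SPF record at all.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier when none is written
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Mechanisms such as a, mx and include need DNS lookups and are
        # deliberately skipped in this sketch.
    return "neutral"
```

With a record of `v=spf1 ip4:192.0.2.0/24 -all`, mail from 192.0.2.55 passes and mail from any other address fails, which is the whole of the "determination" that the claim then hands to a classification module.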
Claims 39-41 and 42-45 cover anti-IP-spoofing.
Claims 46-49 cover Caller ID again, phrased in a different way. They also refer to a "plurality of parameter values of the electronic message".
The claims in application 20040181585 are breathtakingly broad. Along with a lot of computational puzzles, which we don't care about, they cover a wide class of sender verifications and, as an afterthought, scoring spam filters.
Claims 1-18 cover sender verification. Claim 1 is extremely broad:
1. In a receiving domain that is network connectable to one or more sending domains, the receiving domain including one or more receiving messaging servers configured to receive electronic messages from sending domains, a method for determining a sending domain's electronic message transmission policies, the method comprising:
an act of receiving an electronic message from the sending domain;
an act of receiving one or more electronic message transmission policies corresponding to the sending domain;
an act of parsing relevant electronic message transmission policies from the one or more received electronic message transmission policies;
and an act of providing the relevant electronic message transmission policies to a message classification module such that the message classification module can make a more reliable decision when classifying the received electronic message.
To me, that covers SPF, Caller ID, Sender ID, and any plausible variation on them that calls back to the message domain for advice on handling the message. By some readings it might also cover CSV, but from the narrative text it's clear that they're talking about message domains, not host domains. Paul Vixie's original domain verification proposal was published in May 2002 and he said it wasn't new at that time, so I have my doubts about the novelty of this claim.
Claims 19-20 roughly restate claim 1.
Claims 21-35 and 36-44 cover puzzles.
Claim 45 is similar to claim 1.
Claims 46 and 47 are about puzzles.
Claims 48-49 come out of left field and cover scoring spam filters:
48. A computer program product for use in a receiving domain that is network connectable to one or more sending domains, the receiving domain including one or more receiving messaging servers configured to receive electronic messages from the sending domains, the computer program product for implementing a method for generating inputs to be provided to a message classification module, the computer program product comprising one or more computer-readable media having stored thereon computer executable instructions that, when executed by a processor, cause the receiving domain to perform the following:
receive an electronic message;
utilize one or more of a plurality of different mechanisms for attempting to determine if the received electronic message is an unwanted or an unsolicited electronic message;
and provide results of each of the one or more different mechanisms to a message classification module such that the message classification module can make a more reliable decision when classifying the received electronic message.
Since SpamAssassin 1.0 was published in Sept 2001 and did exactly this, classify messages based on several criteria, I find it hard to understand how they'd claim this as new in a 2003 patent application. Surely they are familiar with SpamAssassin.
Keep in mind that these are just applications, and we don't know whether the USPTO will issue a patent and if so which if any of the claims would be allowed. I assume that their other applications are similar, and we don't know what if anything will issue in other countries.
The issue that concerns me most is that the claims in these applications, particularly in '585, are much broader than what Microsoft's IPR disclosed. Note that since the IPR documents are written by lawyers, not techs, I'm not faulting any of the Microsoft employees who have been participating in MARID and don't get to set their employer's policies.
If '585 issues as a patent in anything like its current form and Microsoft's license doesn't change, it would make SPF or any other similar system legally very risky since the MS license only lets you implement Sender-ID, not other things that are like Sender-ID. Regardless of what the MS IPR said, their patent rights depend on what's in the patent, and if you look at cases where patents were broader than the IPR disclosure in the standards process, the results can be really ugly. Google for RAMBUS JEDEC for a notorious example.
At this point, I see a variety of unappetizing alternatives. One is to wait and see what patents issue, but that could take years. Another is to standardize only what MS is willing to license. Or we could decide that the '585 claims are implausible and ignore them, at our peril.
My personal inclination is to say that none of the domain/IP verification schemes are good enough to be worth this much heartburn, put them all back on the shelf, deal with the less controversial CSV and BATV proposals, and turn our attention to message signatures in the new MASS working group. (Update: the people working on CSV and BATV, of which I am one, are trying to set up a separate group to work on them.)
The FTC Authentication Summit a few weeks ago featured a blizzard of three and four letter abbreviations of proposed authentication schemes. Here's a rundown of the current candidates.
SPF and Sender-ID: The SPF and Sender-ID juggernaut continues to roll on. Sender-ID has evolved considerably from its origin as Caller-ID, almost entirely by replacing parts of the Microsoft design with their SPF equivalents, moving Sender-ID to the point that it's now indistinguishable from SPF other than the modestly important detail of which of the six or seven sender address fields on a message it checks.
Both of them do what's known as path authentication, attempting to validate a message by listing a set of sending computers for each domain. Each time an SMTP server receives a message, it looks up the list of permitted computers for the domain of the sender address that it wants to check, and if the actual sending computer is in that list, the message passes. This works a lot better for some senders than others; a cynic would say that the ones for which it works well tend to be Microsoft customers.
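To make the lookup concrete, here is a minimal sketch of a path authentication check. It is not SPF or Sender-ID syntax; the table PERMITTED_SENDERS, the function path_check, and the addresses are all made up for illustration, and a real receiver would fetch each domain's list from the DNS rather than from a dictionary.

```python
import ipaddress

# Hypothetical published lists: sender domain -> networks allowed to send its mail.
# A real receiver would look these up in the DNS; they are hard-coded here.
PERMITTED_SENDERS = {
    "example.com": ["192.0.2.0/24", "198.51.100.17/32"],
}

def path_check(sender_domain, client_ip):
    """Return 'pass', 'fail', or 'none' for the computer that connected."""
    networks = PERMITTED_SENDERS.get(sender_domain.lower())
    if networks is None:
        return "none"  # the domain publishes no list, so nothing can be said
    addr = ipaddress.ip_address(client_ip)
    for net in networks:
        if addr in ipaddress.ip_network(net):
            return "pass"  # the sending computer is on the domain's list
    return "fail"
```

Note that a message relayed through a forwarding service that isn't on the domain's list comes out as "fail" even though it is perfectly legitimate, which is one way the scheme works better for some senders than others.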
Large ISPs I've talked to say that the only thing they plan to do with SPF or Sender-ID is to whitelist known domains. That is, if the SPF or Sender-ID validates, and the domain is one that they already know, they're more likely to give the message a pass. Otherwise, it's as though there wasn't any SPF. A few small providers through a combination of desperation and overenthusiasm have started rejecting mail that fails SPF, with the predictable result that they lose a lot of valid mail and alienate their users.
Domain Keys: Yahoo's Domain Keys (DK for short) takes an alternative approach of message authentication. Each domain can publish a set of cryptographic validation keys for its mail. When it sends a message, the sending computer adds a header line to the message containing a cryptographic signature based on the contents of the message. The recipient computer looks up the sender's validation key and checks the signature. If the signature passes, the message is good. The important difference between message and path authentication is that message authentication says that the message itself is good, while path authentication only says that the message came from someplace plausible.
In recent months, Yahoo has been working aggressively to get people to try DK, by paying for development of open-source DK libraries and working with mail software vendors to provide DK add-ons. Since DK is somewhat more complex than path authentication, the first few months were spent working out implementation bugs and ensuring that everyone was signing and checking consistently. Yahoo and Google's Gmail are now signing their outgoing mail, and several large ISPs say they're planning to test DK validation to see how useful it is.
Identified Internet Mail: Cisco's Identified Internet Mail (IIM) is about 95% the same as DK, with the differences in minor details. The IIM signature includes a copy of the message headers it signed so a recipient can compare them to the actual received headers and see if they changed ``too much'' (for some definition of too much.) DK and IIM are far more similar than Caller-ID and SPF were, and DK and IIM remain separate more for political reasons than technical ones.
CSV and BATV: CSV and BATV are two simple but useful anti-forgery proposals that were submitted to the IETF MARID group, got lost in all of the noise, and have resurfaced recently.
CSV (Client SMTP Validation) is a simple path authentication scheme that arguably provides 90% of the value of SPF or Sender-ID with about 5% of the work, and none of SPF's false positive problems. Rather than asking whether a particular computer is allowed to send mail for a particular domain, CSV merely asks whether that computer is allowed to send mail to the rest of the net at all. On a typical network, a few computers handle the mail for the whole network, and the rest of the computers shouldn't be sending mail directly to the outside world at all. CSV keys off the HELO name that a computer uses to identify itself when it starts an SMTP mail session. If the HELO is valid (the name is in fact the name of the computer at that IP address), CSV allows a simple lookup to see whether the name is authorized to send mail.
While SPF and Sender-ID check the domain name in the return address, CSV checks the domain name of the computer sending the mail. In practice, the two checks seem likely to be about equally reliable, since a given computer tends to send either all spam or all good mail, and if it sends a mix, at least now you know who to complain to. At the FTC Authentication summit, several network operators expressed interest in testing CSV even though the CSV presentation was rather hard to follow.
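The two CSV steps, checking that the HELO name really belongs to the connecting computer and then asking whether that name may send mail at all, can be sketched as follows. This is only an illustration: the tables HOST_ADDRESSES and AUTHORIZED_MAIL_HOSTS and the function csv_check are invented stand-ins for the DNS lookups a real implementation would do, not the record formats in the CSV drafts.

```python
# Hypothetical data standing in for DNS lookups.
HOST_ADDRESSES = {  # HELO name -> addresses that name resolves to
    "mail.example.com": {"192.0.2.10"},
    "desktop17.example.com": {"192.0.2.117"},
}
AUTHORIZED_MAIL_HOSTS = {"mail.example.com"}  # names allowed to send mail at all

def csv_check(helo_name, client_ip):
    """Classify an SMTP client by its HELO name and its IP address."""
    name = helo_name.lower().rstrip(".")
    # Step 1: is the HELO name really the name of the computer at this address?
    if client_ip not in HOST_ADDRESSES.get(name, set()):
        return "helo-invalid"
    # Step 2: is that computer authorized to send mail to the outside world?
    if name in AUTHORIZED_MAIL_HOSTS:
        return "authorized"
    return "not-authorized"
```

Nothing here looks at the return address on the message, which is the difference from SPF and Sender-ID.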
BATV (Bounce Address Tag Validation) is a very simple trick to deal with what's known as spam bounce blowback. As an example, I operate a service called abuse.net that people can use to look up addresses to send abuse reports to responsible parties. Spammers are not fond of abuse.net, and a couple of Russian spammers out of spite put fake abuse.net addresses on vast amounts of their spam. Since their spam lists are no better than anyone else's, a lot of that spam is undeliverable and bounces back, with the bounces coming back to the real abuse.net. On a bad day, I can get 300,000 bounces even though the real abuse.net typically sends only about 500 messages a day. BATV lets me pick out the handful of real bounces from the mountain of bogus ones. All that BATV does is to put a simple cryptographic signature into the bounce address in mail. If the original bounce address was fred@abuse.net, the simplest form of BATV rewrites that to prvs=fred/0744abcdef@abuse.net, where the prvs stands for private validation signature, and the stuff after the slash is a signature of the bounce address and the date. When a bounce comes back, my server checks the signature and if it's good, it's a real bounce. By good fortune, I did my original BATV prototype about a week before a new virus started forging large amounts of spam from several of the domains hosted here, and it's been a lifesaver, efficiently rejecting tens or occasionally hundreds of thousands of forged bounces every day.
Unlike all of the other proposals, prvs-style BATV can be implemented unilaterally, that is, one mail system can use BATV perfectly well whether or not anyone else does. If a message signing scheme such as DK or IIM becomes widely accepted, a likely extension to BATV will use the same keys as the message signatures do, so that recipient hosts can make a quick check of the bounce addresses in incoming mail to reject obvious forgeries.
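For the curious, here is roughly what prvs-style signing and checking might look like. Treat it as a sketch rather than the BATV specification: the key, the four-digit day number, the six hex digits of HMAC-SHA1, and the seven day lifetime are all assumptions chosen to match the shape of the example address above, and the function names are made up.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical private key, known only to the sending system

def _tag(local, domain, day):
    """Four-digit day number followed by six hex digits of a keyed hash."""
    day_str = "%04d" % (day % 10000)
    text = "%s@%s/%s" % (local, domain, day_str)
    mac = hmac.new(SECRET_KEY, text.encode(), hashlib.sha1)
    return day_str + mac.hexdigest()[:6]

def sign_bounce_address(address, day):
    """Rewrite local@domain to prvs=local/DDDDhhhhhh@domain."""
    local, domain = address.rsplit("@", 1)
    return "prvs=%s/%s@%s" % (local, _tag(local, domain, day), domain)

def is_real_bounce(address, today, max_age_days=7):
    """True only if the bounce address carries a fresh, valid signature."""
    if not address.startswith("prvs=") or "@" not in address:
        return False
    local_part, domain = address[5:].rsplit("@", 1)
    local, slash, tag = local_part.rpartition("/")
    if not slash or len(tag) != 10:
        return False
    try:
        day = int(tag[:4])
    except ValueError:
        return False
    if (today - day) % 10000 > max_age_days:
        return False  # signature too old (or dated in the future)
    expected = _tag(local, domain, day)
    return hmac.compare_digest(tag.encode(), expected.encode())
```

Outgoing mail would use sign_bounce_address("fred@abuse.net", day) as its envelope sender, and a bounce arriving for an address that fails is_real_bounce can be rejected on the spot. Since only the signing system ever checks the tag, nobody else needs to know the key or the format.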
A friend wrote describing his wife's experience activating a new credit card. She had to call an 800 number, which connected to an automated system that made her sit through a long ad for a worthless registration or credit report service before it admitted that her card was good. I think all big banks do that now.
I realize one way that we geeks differ from the range of normal computer users is that we expect to have control over our computers. For most people, their experience of a Windows PC is that strange things happen, they don't know why, they don't know what to do about it. Windows pop up all the time. They used to say "Install oompha flooba greep. OK?" but now since Windows XP SP2 they say "Install oompha flooba greep. This may be very, very, very, dangerous. OK?" and they click OK anyway because as often as not, it's not. (I got that message a few days ago about Microsoft's own applet that figures out which updates to Office you can apply.)
They can't tell the popups that are part of the ESPN or Orbitz web sites from Gator spyware. But we can.
So when I got my most recent credit card, I noticed that I had the option of phone or web activation, so I chose web activation because I knew that I could make the crud go away via a combination of popup blockers and clickthroughs, wresting control back from the droid that would otherwise have made me sit through an ad for credit card insurance.
Since we're used to control, we get particularly bent out of shape at attempts to take it away from us. But most people lost that long ago.
In my roles as postmaster at CAUCE (the Coalition Against Unsolicited Commercial E-mail) and abuse.net, I get a lot of baffled and outraged mail from people who have discovered that someone is sending out spam, often pornographic spam, with their return address on the From: line. ``How can they do that? How do I make them stop?'' The short answers are ``easily'' and ``it's nearly impossible.''
One way that e-mail is very similar to paper mail is that you can scribble any return address you want on an envelope and mail it. With paper mail, just like e-mail, you can imagine ways to make it more difficult to scribble the name of someone you don't like, but the costs of doing so would be huge, and the benefits dubious. ...
You might think that it wouldn't be hard to run an e-mail mailing list, but you'd be sadly mistaken. (By "run", here I mean the technical parts of adding and removing addresses, sending mail, and receiving responses. At Yahoo Groups, it's Yahoo running the lists, not the list managers.)
I've seen a few BCP (Best Current Practices) documents floating around like ISIPP's standards from a meeting last September.
So as not to miss the boat, here's mine.
Some less technical points:
Spam is still a hot topic, and there are a lot of conferences.
E-mail Tech Conference, sponsored by Ironport and others. In San Francisco, June 16-18, 2004. Practical orientation, invited speakers, exhibits.
WSIS Thematic Meeting on Countering Spam sponsored by the International Telecommunications Union. In Geneva, July 7-9, 2004. The ITU is the United Nations agency that coordinates all of the world's national communications regulators for radio, TV, telephony, and the like. This meeting is part of the World Summit on the Information Society. I'll be there, though I'm not sure whether I'll be presenting. Geneva speaks French, but the meeting is in English.
First Conference on Email and Anti-Spam sponsored by Microsoft Research and a host of four-letter organizations. In Mountain View CA, July 30-31, 2004. Research orientation with refereed papers. I'm on the program committee, and some of the papers I've reviewed have looked pretty interesting.
60th IETF sponsored by the Internet Engineering Task Force. In San Diego, Aug 1-6, 2004. The IETF is the group of nerds that manages the technology of the Internet through "rough consensus and running code." Spam is only one topic of many, but more likely than not I'll be giving a sweeping technical overview of the spam situation.
I was listening to Vint Cerf talk at the ETC conference this morning on the history of the Internet and the future of e-mail.
One of the more problematic approaches to spam is a walled garden, an e-mail system that lets its users exchange mail, but doesn't let mail to or from the rest of the world in or out. Walled gardens are easy to keep spam-free, since the management can set rules and eject people who misbehave.
Why didn't the Internet have any spam for the first decade of its existence? Because it was a walled garden. You could only get in if you were an ARPA or later NSF contractor, there were clear rules, and people could be and were ejected for abuse. Has there ever been a system that allowed cheap communication, didn't have an abuse problem, and wasn't a walled garden? I don't think so. As soon as fax machines became cheap, we had a junk fax problem. As soon as robocallers became cheap, we had a robot junk call problem.
The phone system by its nature is considerably more secure than the Internet, and it's very hard to make a phone call that can't be traced. (You may have to subpoena the info from the phone companies, but that's a detail.)
This suggests that the authentication schemes that people are working on, such as SenderID (SPF+Caller ID merged) and Domain Keys will be useful if they can provide a level of authentication comparable to what the phone system has. It also suggests that we still need better anti-spam laws, since the junk faxes and robocalls are only kept in check by rather draconian laws that outlaw them completely. For spam, we can only hope.
A few weeks ago I was at an Industry Canada meeting in Ottawa where we talked about spam and e-mail authentication. They introduced a Stop Spam Here campaign (aussi disponible en français) that tells people how to install virus filters and hide their e-mail addresses.
One of the topics that came up over lunch was an ill-considered bill in Parliament that would have required ISPs to provide spam filtering to all of their customers. While munching on the fussy little hotel sandwiches provided, I had a vision ...
"Ottawans have a problem. Ill-mannered teenagers have been dumping cans of garbage from overpasses onto cars on the 417, the main highway in the area. To deal with this problem, the local police have provided some helpful tips:
You get the idea.
While spam garbage shields are an unfortunate necessity these days, I really think I'd rather put my effort into sending the cops after the miscreants so they don't dump the garbage in the first place.
The CAN SPAM Act of 2003 went into effect a year ago on Jan 1, 2004. As of that date, spam suddenly stopped, e-mail was once again easy and pleasant to use, and Internet users had one less problem to worry about.
Oh, that didn't happen? What went wrong?
There are a few good things about CAN SPAM. It made some arguably fraudulent practices specifically illegal, and set per-spam statutory damages. That allowed a variety of lawsuits such as the one where an Iowa ISP won a billion dollar default judgement against a Florida spammer. It also explicitly ratified ISPs' authority to set and enforce their own stricter policies about e-mail.
But overall, CAN SPAM's weaknesses outweigh its benefits. The biggest problem with CAN SPAM is that it doesn't actually forbid spam, for any normal definition of spam. So long as mail doesn't involve fraudulent elements, and contains specified contact and opt-out information, it's 100% legal until the recipient begs the sender to stop. This has set an extremely low floor for mailers to meet, and far too many now argue that since they comply with CAN SPAM they must be OK. I've gotten spam from the National Council of Churches, who really should know better, to addresses that were clearly scraped from my church's web site and added to the NCC's list without asking permission. When I complained, they pompously assured me that they complied with the letter and spirit of CAN SPAM, an utterly vacuous claim since CAN SPAM says nothing at all about non-commercial e-mail. (The obvious counter-argument is that if they didn't comply with CAN SPAM, they'd be criminals, but they evidently don't see it that way.)
Another problem is that the remedies are cumbersome, since they require filing in Federal court, so they're likely to be useful only to medium and large businesses who get a lot of spam and can bundle many similar complaints into one case. CAN SPAM wiped out a lot of more stringent state laws, but even so, the remaining state laws are at least as useful as CAN SPAM. For example, the criminal conviction of large-scale spammer Jeremy Jaynes was under the Virginia state law, not CAN SPAM.
What does this all portend for the future? A surprising press release from AOL reported that the amount of inbound spam at AOL dropped by 22% compared to a year ago. Other ISPs reported no such drop, so we can only speculate about the causes, but my speculation would be about one part spam filtering, which AOL does well, and four parts legal threats, both the Jaynes criminal case and several civil cases they've filed in the past year. Spammers may turn away from large ISPs and aim more at smaller domains who are less likely to have the resources to sue them. Tune in again next year and find out.
The Federal Trade Commission and NIST had a two-day Authentication Summit on Nov 9-10 in Washington DC. When they published their report explaining their decision not to create a National Do Not Email Registry, the FTC identified lack of e-mail authentication as one of the reasons that it wouldn't work, and the authentication summit was part of their process to get some sort of authentication going. At the time the summit was scheduled, the IETF MARID group was still active and most people expected it to endorse Microsoft's Sender-ID in some form, so the summit would have been mostly about Sender-ID. Since MARID didn't do that, the summit had a broader and more interesting agenda.
As part of the run-up to the summit, the FTC posed a series of questions about authentication. They got a few dozen more or less relevant responses including one that I wrote. They also accepted requests to speak and set up an agenda.
At the conference, I got to go first, laying out what authentication can hope to do and all the ways it could mess up mail if done poorly. After that there was a series of panels with technologists, marketers, lawyers, and a smattering of other interested parties. People's positions fell into largely predictable groups. Microsoft talked about Sender-ID, which they think is wonderful. Since Sender-ID only really works for senders that have full control over their mail stream and send it all from one place, bulk e-mail marketers, who are just about the only mail senders that fit that description, also think that Sender-ID is wonderful. People who run more heterogeneous mail systems with more varied mail sending methods, such as consumer ISPs and universities, think that Sender-ID is considerably less wonderful. The consensus I heard among mail recipients is that while valid Sender-ID can help to whitelist known friendly domains, there are so many ways for legitimate mail to fail Sender-ID that nobody with any sense will ever reject mail based on failure.
A panel consisting largely of lawyers also had unsurprising results. Microsoft's lawyer thinks that the license they offer for Sender-ID is just like every other patent license and is completely adequate for all purposes. Daniel Quinlan from the Apache Software Foundation explained why the open software world can't use it but since he's neither a lawyer nor a lobbyist, he didn't make much headway. Yahoo's Miles Libbey did say that their license for Domain Keys avoids the problems that Microsoft has. The Electronic Frontier Foundation reiterated their usual position against any kind of filtering of political or anonymous mail, but as usual failed to explain how we can tell political from non-political mail and how to deal with the costs of dumping vastly more spam into people's mailboxes. The EFF rep said she manually sorts through 2000 spams a day and apparently believes that is a productive use of her time.
The next few panels described the various technical proposals. The audience expressed considerable interest in trying them all out, in one case despite an utterly baffling presentation.
On the second day, I sat in on the international panel, in place of the ailing Neil Schwartzman, to describe what's happening in Canada (nothing too surprising.) Hadmut Danisch presented a proposal for a country-specific mail sending scheme of debatable practicality, but went on to discuss the severe limits that European privacy laws can put on a reputation system. For example, if a domain belongs to an individual, like my johnlevine.com, information about that domain could be considered personal information subject to privacy laws. As far as I know, there's no case law or regulatory statements to tell how serious an issue this is, but it's not one that can be dismissed out of hand. I commented that reputation systems are likely to be country-specific since the mailers in the US are different from the mailers in Canada and other countries.
I missed the final panel on reputation systems, but I gather it said that they don't exist, and they'll be a challenge to create in ways that are both legal and effective. At the end of the conference, Commissioner Orson Swindle said there'd be another conference next year with the strong implication that we'd better have progress to report Or Else.
According to this interesting article in the Seattle Post-Intelligencer, Richard Clarke, security adviser to four presidents from Reagan to GW Bush, thinks so:
SAN FRANCISCO -- Don't expect Richard Clarke to rely on Microsoft Corp.'s anti-virus or anti-spyware programs to protect his own computer.
"Given their record in the security area, I don't know why anybody would buy from them," the former White House cybersecurity and counterterrorism adviser said yesterday, when asked for his thoughts on Microsoft's forthcoming line of security software. ...
He oughta know. I wouldn't disagree.
What's the point of e-postage? All the e-mail systems I know would have no trouble handling their incoming mail if they only got mail that their users wanted, even with a little unwanted mixed in. This tells me that the goal of e-postage is some combination of compensating recipients and deterring or at least rate limiting senders.
If you're going to compensate recipients, you either need a single central post office (or a cartel of post offices that trust each other, which amounts to the same thing), or else you need settlements to get the money from the sender's stamp provider to the recipient's.
We presumably agree that we don't want a central post office to meter the mail as it's injected into the system. I have a pretty good idea of what it would cost to build a transaction infrastructure big enough to count the stamps to get the raw data needed for settlements, which tells me that settlements aren't going to happen, either.
So let's think about rate limiting or deterring senders. Perhaps I'm suffering from a failure of imagination, but I don't see useful rate limiting without some cooperation from the recipients. This is for two reasons: one is that recipients don't want to rate limit the mail they want, and the other is that without some way for the recipients to audit the rate limiting, naughty senders will lie and claim they're rate limiting when they aren't.
There's a minor exception here to the cooperation rule for ISPs and their customers. If I were an ISP, I would rate limit my customers' outgoing mail since anyone who sends a whole lot of mail and hasn't told me they're running mailing lists is a zombie. Although I can imagine various ways to rate limit by charging by the message, it seems a lot easier and more effective to rate limit by rate limiting, either slowing down senders or just firewalling them until they fix their problem. If you charge, you'll have trouble collecting (credit card companies aren't thrilled about penalties other than their own) and spammees won't find the fact that you made extra money from the spam your users sent them very reassuring.
Recipient rate limiting might take the form of hashcash, although that seems too easily circumvented so long as the bad guys have zombies to do their hashing. With better authentication, recipients might be more able to count mail by sender and tell the overeager senders to back off. With agreements a la Bonded Sender, recipients might demand consideration to accept mail from dubious sources, but then you're back to a cartel or settlements to make the demands stick.
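"Rate limit by rate limiting" could be as simple as a token bucket per customer. The sketch below is one way an ISP might do it; the class name, the default limits, and the idea of passing the clock in as a number of seconds are my assumptions for illustration, not anyone's production policy.

```python
class OutboundLimiter:
    """Token bucket per customer: `burst` messages at once, `per_hour` sustained."""

    def __init__(self, per_hour=100, burst=20):
        self.rate = per_hour / 3600.0  # tokens earned per second
        self.burst = burst
        self.buckets = {}              # customer -> (tokens, time last seen)

    def allow(self, customer, now):
        """Return True if this customer may send one more message at time `now`."""
        tokens, last = self.buckets.get(customer, (self.burst, now))
        # Refill for the time that has passed, but never beyond the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[customer] = (tokens - 1, now)
            return True
        self.buckets[customer] = (tokens, now)
        return False  # defer or firewall the sender; no billing involved
```

A customer who really does run mailing lists gets a bigger bucket after telling the ISP about it; a zombie simply hits the limit and stays there.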
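For readers who haven't seen hashcash, the idea is that the sender burns CPU time finding a stamp whose hash starts with a given number of zero bits, and the recipient checks it with a single hash. Here's a toy version, assuming a simplified stamp of the form resource:counter rather than the real hashcash stamp format; the function names and the default of 12 bits are made up.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest):
    """Count the zero bits at the front of a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def mint(resource, bits=12):
    """Sender's side: try counters until the stamp's SHA-1 has enough zero bits."""
    for counter in count():
        stamp = "%s:%d" % (resource, counter)
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp

def verify(stamp, resource, bits=12):
    """Recipient's side: one hash, no matter how long the sender worked."""
    if not stamp.startswith(resource + ":"):
        return False  # stamp was minted for some other recipient
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

Each extra bit doubles the sender's expected work, which is fine as a deterrent only so long as the sender is paying for the CPU. A spammer with a herd of zombies isn't.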
So where does this leave e-postage? Other than as a clumsy way to get people to fix their zombie machines, nowhere I can see.
The more I think about what identity means in the on-line world, the less I think we're doing a good job with it.
Most on-line identity systems are set up to prove that you're the same person you were last time. For a lot of kinds of e-mail, that's fine, the first time you get mail from someone you can decide pretty quickly whether it's mail you want, and add the person to a whitelist or blacklist. When more mail arrives from the same person, you read it or throw it away.
A more subtle but probably more important kind of identity is proving that people are who you think they are. There's a lot of identity verification that we do in day to day life that doesn't carry over very well into computers. For example, I grow and shave off a beard from time to time. My driver's license picture doesn't have a beard, but people who look at it and at me have no trouble figuring out that even with the beard, it's close enough. Face-to-face, we're good at telling what details matter and what details don't. Online, we aren't. I use a lot of different e-mail addresses, both to keep mail for different roles and jobs separate, and to track mail from correspondents I don't entirely trust, like on-line stores that demand an address when you order something. It's extremely tedious to explain to security software that all of these addresses are equally me, they're just different whiskers. At the least, I need to enter all of the addresses into whatever software creates and verifies certificates, or worse I have to keep a certificate per address and fish out the one that matches whichever address I'm currently using. That rapidly becomes more trouble than it's worth.
Sometimes the exact identity of a person or organization isn't as important as identifying them as a member of a group. For example, when you're looking for a policeman, any real policeman will do, but almost-policemen such as security guards won't. When I want to cash a traveller's check (or these days more likely get a cash advance on my Visa debit card), I need to find a bank but it doesn't much matter which one. Banks are easy to identify, since they have tellers, a vault, and an FDIC sticker on the window. Again, it's not hard to figure out whether a person or an institution is part of a category, using cues so familiar that we often don't know what they are.
This sort of category identity is if anything more important on-line than it is in day to day life. If I get e-mail from CITIBANK-ACCOUNTS.COM, is it really about my account at Citibank? Nope. As spammers and phishers have found, there's an unlimited variety of names that are enough like well-known names to fool people. Current signing schemes like SSL don't help, because they can assure you that mail from CITIBANK-ACCOUNTS.COM or the web site WWW.CITIBANK-ACCOUNTS.COM is really from the owner of the domain CITIBANK-ACCOUNTS.COM, but they don't tell you whether that's Citibank.
The obvious solution is industry specific certification. Banking should be the first industry to do that, both because (in the US at least) there is a clear definition of who's a bank, S&L, credit union, or whatever, and who isn't, and because, well, banks have all our money.
There are two ways one might do the certification. The first is a certificate signing agency, sort of like Verisign and the other agencies that do SSL signing now, but just for banks. The signing part would be technically straightforward to set up, but the hard part would be branding, telling consumers that if it doesn't have a Golden Dollar Sign seal, an e-mail message or web page isn't from a real bank.
The second is an industry specific top-level domain. There are some so-called ``sponsored'' TLDs now that restrict registrants to particular industries, including .museum, .coop, .aero, and .pro. So far, the sponsored domains have all been complete failures. Most of them saw the domain as a marketing gimmick, and provided nothing of value to the registrants that they wouldn't get from the domains they already had in .com and .org. For the most part, security and trust in their domains isn't an issue. (``Oh, no, what a fool I was, they said it was a co-op, but it was really a producers' collaborative!'')
The .pro domain is different. It's supposed to be for licensed professionals, doctors, lawyers, and accountants. Applicants have to verify their credentials at registration time, and they say they'll provide SSL certificates with each registration. Unfortunately, .pro seems permanently stuck in the pre-start-up phase and it has as far as I can tell, no registrants at all yet. Too bad, since it would be nice to be able to depend on mail from my accountant coming from pwc.cpa.pro, ey.cpa.pro, deloitte.cpa.pro, or kpmg.cpa.pro, and be confident that a .pro address isn't a creative phisher in Romania.
Would it make sense to set up .bank, overseen by the FDIC and other bank regulators, with registrations limited to regulated banks and similar financial institutions? Heck, yes. I don't understand why they haven't done it yet.
IBM researcher Nathaniel Borenstein has commented that everyone agrees that spam is bad, and that's a huge impediment to doing anything about it. Having decided that spam is bad, it's tempting to divide the spam problem into smaller problems and try to solve the smaller problems, then put the solutions to the subproblems together and, voilà, no more spam. That would be fine if the combined subproblems were truly equivalent to the spam problem, but that's rarely the case.
A common approach is to divide the spam problem into the authentication problem and the introduction problem. The authentication problem involves ensuring that whoever claims to have sent an e-mail message really did send it (or as a minor variant, that the recipient can detect and reject forgeries.) Authentication has gotten a lot of attention with systems like PGP, S/MIME, SPF, Sender-ID, and Domain Keys. While it's far from solved, it's fairly well understood.
The introduction problem involves vetting mail from people who haven't written before. The idea is that a recipient keeps a list of people who've sent good e-mail. When a message arrives from someone not on the list, the sender does something to indicate good faith or non-spamminess, and is then added to the recipient's list. If the introduction fails, the recipient might put the sender into a bad senders list, or just ignore the message so future mail from the same sender will require another introduction attempt.
The introductory something can be fairly complex and onerous, since each sender only has to introduce himself once to each recipient, and it should be onerous enough that spammers won't go to the effort to do it. In such a system, we'd expect bad guys to try to circumvent the introduction by forging mail from someone already in the recipient's list. That's why the introduction approach is only useful if the authentication is good enough to prevent forgeries.
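The introduction model described above can be sketched in a few lines of Python. This is only an illustration: the class name, the three return values, and the choice to reject unauthenticated mail outright are assumptions of the sketch, not features of any deployed system.

```python
from dataclasses import dataclass, field


@dataclass
class IntroductionFilter:
    """Toy model of an introduction system (hypothetical, for illustration)."""
    good_senders: set = field(default_factory=set)
    bad_senders: set = field(default_factory=set)

    def handle(self, sender, authenticated, passed_introduction=None):
        """Return 'deliver', 'reject', or 'challenge' for one message.

        passed_introduction is None if the sender has not yet attempted
        the introductory something, else True or False.
        """
        if not authenticated:
            # Assumption: without authentication the sender address could be
            # forged, so the lists cannot be trusted for this message.
            return "reject"
        if sender in self.good_senders:
            return "deliver"
        if sender in self.bad_senders:
            return "reject"
        if passed_introduction is None:
            return "challenge"
        if passed_introduction:
            self.good_senders.add(sender)
            return "deliver"
        self.bad_senders.add(sender)
        return "reject"


f = IntroductionFilter()
print(f.handle("reader@example.com", True))        # stranger: challenge
print(f.handle("reader@example.com", True, True))  # introduced: deliver
print(f.handle("reader@example.com", False))       # possible forgery: reject
```

The last call shows why the lists are only as good as the authentication behind them: if the forged message were checked against the good senders list, it would be delivered.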
Viewed in this way, a lot of anti-spam proposals turn out really to be introduction proposals. Challenge/response, hashcash, CAPTCHAs (blurry pictures of words that the user has to retype), and refundable e-postage fall into this category. While some of these proposals are quite clever, and some of them are plausible solutions to the introduction problem, none of them solve the spam problem, because the introduction problem is not the spam problem.
For one thing, the introduction approach doesn't match the way that people really use e-mail very well. Its model is that a stranger will write to you, you'll decide whether you like the stranger's mail, and then add that e-mail address to your accept or reject list. In practice, people visit a vendor's web site, order something, and get order confirmations and (if they ask for them) newsletters from the vendor. What address will the confirmations and newsletters come from? It's rarely possible to predict. We can imagine schemes where as part of the ordering transaction the vendor adds its addresses to the user's good sender list, but even if such schemes could be designed and deployed, they would be a tempting target for bad guys to subvert and stuff their addresses into unwitting users' lists.
For another, the introduction scheme presumes that senders' behavior stays the same, that someone who sends good mail will always send good mail and vice versa. That strikes me as extremely optimistic. In the late 1990s, spammers sent spam through other people's existing mailing lists. They don't spam that way now since other approaches are easier, but if the fastest way into people's good sender lists is to piggyback on other mailing lists, they'll do it again. They'll join the list, possibly sending out an innocuous message or two, then blast out spam to the list until the list owner notices and cuts them off. (Yes, this has happened.)
The introduction approach presumes both that mail from unknown senders is probably spam, and that legitimate senders are interested enough in getting their message delivered to bear the burden of the introductory something. This may be true, or it may not be. I often see someone ask a question on a mailing list or newsgroup, send them an answer to the question, and get back some sort of introductory challenge. Am I going to jump through their hoops to do them a favor? Probably not.
Finally, the spam problem is unwanted bulk mail, regardless of where it comes from, not mail from strangers. I publish contact e-mail addresses in my books, and readers send me a lot of mail. It's from people who haven't written to me before, and it's not spam. An accreditation system (third parties that vouch for senders) would help manage that problem a lot better than an introduction system.
Introduction systems aren't inherently bad, but they're not inherently related to spam, either.
I was doodling on the back of an SMTP envelope trying to figure out what the economics of an e-postage system would look like.
Let's say there's 100 billion pieces of mail a day. Knock out 80% of that as spam that's not going to pay, and half of the remainder as mail from known correspondents and lists that recipients will accept without stamps. (Does anyone have any stats for how much of the non-spam is CE?) So you have 10 billion stamped messages per day, and let's optimistically assume that a stamp costs a penny. That's $100 million. Unless I guess wrong, ISPs are going to demand most of that penny to accept the mail since, as people I know at ISPs have noted, a lot of the stamped mail will be mail that their users aren't thrilled to get, so let's assume that the e-postage bank takes a 10% commission.
So that's $10 million/day, which means that the bank has to handle the valid transactions for a tenth of a cent apiece and the rejected spam transactions for a hundredth of a cent apiece, just to break even, with nothing left for building the system, selling the stamps, remitting the recipients' share, fighting the legal battles, etc. I realize that compute power is cheap but $0.0001 per transaction where each mistake costs real money strikes me as considerably beyond the state of the art.
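For anyone who wants to redo the doodle with different guesses, here is the same arithmetic as a short Python sketch. All the inputs are the rough figures from the paragraphs above; the variable names are just labels for this illustration.

```python
# Back-of-the-envelope e-postage economics, using the guesses from the text.
TOTAL_MESSAGES = 100_000_000_000   # pieces of mail per day
SPAM_PERCENT = 80                  # spam that won't pay for a stamp
STAMP_CENTS = 1                    # optimistic price of one stamp
BANK_COMMISSION_PERCENT = 10       # the e-postage bank's share

spam = TOTAL_MESSAGES * SPAM_PERCENT // 100
non_spam = TOTAL_MESSAGES - spam
stamped = non_spam // 2            # the other half needs no stamp

gross_dollars = stamped * STAMP_CENTS // 100
bank_dollars = gross_dollars * BANK_COMMISSION_PERCENT // 100

# Budget per transaction if only stamped mail cost anything to process,
# and if the rejected spam attempts have to be paid for as well.
per_stamped = bank_dollars / stamped
per_any = bank_dollars / (stamped + spam)

print(f"stamped messages/day: {stamped:,}")
print(f"gross stamp revenue:  ${gross_dollars:,}/day")
print(f"bank commission:      ${bank_dollars:,}/day")
print(f"budget per stamped message:       ${per_stamped:.4f}")
print(f"budget per message incl. rejects: ${per_any:.5f}")
```

Spread over all 90 billion transactions the bank would see, stamped and rejected alike, the commission works out to roughly a hundredth of a cent each.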
It's a Final Ultimate Solution to the Spam Problem. Vernon Schryver of Rhyolite Software coined the term earlier this year in his highly informative web page called You Might Be An Anti-Spam Kook If.... It lists 48 warning signs to look for in any proposed anti-spam system, and particularly in proponents of an anti-spam system.
Schryver is the author of the widely used Distributed Checksum Clearinghouse, a system that counts signatures of mail received at servers all around the world, so participating servers can tell how many times a message they've received has been seen by other DCC users, i.e., whether the message was sent in bulk. DCC currently is counting checksums for about 165 million messages a day, twice what it was doing in January. Based on the counts, about 55% of the mail it's seeing is spam. The DCC site has some horrifyingly informative graphs showing the amount of traffic and fraction of spam.
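The counting idea can be illustrated with a toy clearinghouse in Python. This is not DCC's code or protocol: the class, the threshold, and the use of an exact SHA-256 over whitespace-normalized text are all assumptions made for the sketch, whereas the real DCC uses fuzzy checksums meant to survive small variations between copies.

```python
import hashlib
from collections import Counter


class ChecksumClearinghouse:
    """Toy bulk-mail counter (hypothetical; not the real DCC)."""

    def __init__(self, bulk_threshold=3):
        self.counts = Counter()
        self.bulk_threshold = bulk_threshold  # assumed cutoff for "bulk"

    @staticmethod
    def checksum(body):
        # Crude normalization so trivial case and spacing changes still match.
        normalized = " ".join(body.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def report(self, body):
        """Record one sighting and return how often this body has been seen."""
        key = self.checksum(body)
        self.counts[key] += 1
        return self.counts[key]

    def is_bulk(self, body):
        return self.counts[self.checksum(body)] >= self.bulk_threshold


house = ChecksumClearinghouse(bulk_threshold=3)
for copy in ["Buy now", "buy   NOW", "Buy now\n"]:
    house.report(copy)
print(house.is_bulk("buy now"))  # seen three times, so it counts as bulk
```

Note that the counter only says a message is bulk, not that it is unwanted.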
Designated sender (DS) schemes attempt to list the valid IP addresses that can send mail for a domain. They all suffer from a variety of problems that make them less than ideal anti-spam approaches.
Spam is a hot topic, and there are a lot of conferences.
Messaging Anti-Abuse Working Group (MAAWG) Summit. In Washington DC, May 17-19, 2004. MAAWG consists of large ISPs, mostly telco owned and a few cable ISPs, organized by Openwave. Attendance limited, it helps if you work for one of the members. I'll be moderating a panel on spyware as well as kibitzing at other sessions.
INBOX - The Email Event, organized by the Golden Group, with a long list of sponsors and partners. In San Jose CA, June 2-4, 2004. Big commercial conference with sessions and exhibits. Seems to want to be all things to all people in the e-mail world.
E-mail Tech Conference, sponsored by Ironport and others. In San Francisco, June 16-18, 2004. Practical orientation, invited speakers, exhibits.
First Conference on Email and Anti-Spam sponsored by Microsoft Research and a host of four-letter organizations. In Mountain View CA, July 30-31, 2004. Research orientation with refereed papers. I'm on the program committee, and some of the papers I've reviewed have looked pretty interesting.
The Senate Subcommittee on Communications of the Senate Commerce Committee asked me to come testify at a hearing this Tuesday, March 23. It's at 2:30 PM in the Russell Office Building, Room 253. Should anyone want to come and listen, there's a limited number of seats for the public.
The topic is S.2145, the SPY BLOCK act, which is intended to deal with spyware. They asked me to be the technical expert and to focus on the impact on consumers of spyware, the effectiveness of current and proposed efforts to address consumer concerns, and anything else I think they need to know. Oh, and keep it to five minutes, please.
At their request, I wrote up some written comments with all the stuff I can't say in five minutes.
In my spare time when I'm not dealing with the world of e-mail, I'm a politician so now and then I put on my cynical political hat.
At the FTC Authentication Summit one of the more striking disagreements was about the merits and flaws of SPF and Microsoft's Sender-ID. Some people thought they are wonderful and the sooner we all use them the better. Others thought they are deeply flawed and pose a serious risk of long-term damage to the reliability of e-mail. Why this disagreement over what one might naively think would be a technical question?
SPF does what's known in the mail biz as path authentication, that is, it attempts to check whether the route that a message took to get to the recipient is valid for that kind of message. In particular, SPF provides a very complex scheme through which a domain can publish the IP addresses from which it expects its mail to be sent. Microsoft's Sender-ID works almost identically to SPF, with the only difference being which of several possible return addresses on a piece of e-mail it checks.
If all of a domain's mail is indeed sent from the same place, then SPF or Sender-ID works fairly well. (It still has problems with mail forwarders, but that's a separate issue discussed at great length elsewhere.) On the other hand, if the domain's mail can legitimately come from lots of different places, particularly lots of different places that are hard to predict in advance, SPF and Sender-ID are useless.
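The path-authentication check described above boils down to a lookup like the following Python sketch. The domains, address ranges, and result labels are made up for illustration, and the table stands in for whatever a domain would publish in the DNS; it is not SPF or Sender-ID record syntax.

```python
import ipaddress

# Hypothetical published data: domain -> networks allowed to send its mail.
DESIGNATED_SENDERS = {
    "corp.example": ["192.0.2.0/28"],
    "esp.example": ["198.51.100.10/32", "198.51.100.11/32"],
}


def check_sender(domain, client_ip):
    """Return 'pass', 'fail', or 'none' (the domain publishes nothing)."""
    networks = DESIGNATED_SENDERS.get(domain)
    if networks is None:
        return "none"
    addr = ipaddress.ip_address(client_ip)
    if any(addr in ipaddress.ip_network(net) for net in networks):
        return "pass"
    return "fail"


print(check_sender("corp.example", "192.0.2.5"))    # listed range
print(check_sender("corp.example", "203.0.113.9"))  # unexpected source
print(check_sender("univ.example", "203.0.113.9"))  # nothing published
```

The second call is the interesting one: the check can't tell a forger from a legitimate user of the domain who happens to be sending from an address nobody predicted.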
So what kind of domain sends all its mail from one place? Corporations, mostly. A business will often have a single mail server, or a mail server per branch office, and a policy that all company mail is sent through the company's server. If employees are travelling, they have to connect back to their home network to get and send mail.
A bulk mailing service, known in the biz as an Email Service Provider or ESP, sends all of its mail from its own servers. That's both because that's why the servers exist, and because it's easier to get recipient ISPs to whitelist their mail if the ESP can give the recipients a small set of IP addresses to add to the whitelist.
On the other hand, mail from university domains can come from all sorts of unexpected places. Students and faculty travel, and being clever academics, lash up all sorts of ad-hoc schemes to send and receive their mail. Many universities provide courtesy mail addresses for alumni that the alums can forward to whatever ISP they happen to be using. The alums send their outgoing mail from their own ISP, so mail from the university's domain can originate at any ISP in the world.
Internet Service Providers are in about the same situation as universities. Their customers may check mail from work, and send mail with a personal ISP address via their work servers. Or they might move and keep an old account to avoid changing their e-mail addresses, sending mail with their old ISP address from their new ISP.
Corporations and ESPs run a lot of Microsoft servers. Businesses use Microsoft's Exchange to integrate e-mail and calendar facilities, and ESPs run various integrated mail and database applications. Universities and ISPs are more likely to be running Unix or Linux servers. Universities do so since they've been running Unix since before Windows existed, ISPs because Unix and Linux mail software can support vastly more users per server than Windows mail software can.
So places that run a lot of Microsoft software tend to be set up so that Microsoft's Sender-ID works, and places that don't aren't. Coincidence? You make the call.
Adware is a variety of spyware that shows ads on your computer, nominally in exchange for letting you use a program like KaZaA. Programs like Eudora and Opera that show their own ads when they're running aren't usually considered adware; typical adware pops up ads on web browsers in addition to or instead of existing ads. Often they watch the URLs and try to show ads related to the URL, so if you visited a web site for contact lenses, adware would pop up an ad for a competitor that's paid the adware company to do so. Unfortunately, for a variety of reasons, adware is inherently abusive.
[This doesn't have much to do with e-mail, at least not until the big phone companies take over the Internet market in the US and impose their own Bell-shaped policies on it. So sue me.]
I wish the FCC would revisit the key issue of essential facilities, the bits of the telephone infrastructure that everyone needs to use and nobody can afford to duplicate.
The other day I read a most interesting little book, Lessons from Deregulation, by Alfred Kahn, the architect of airline deregulation in the late 1970s. It was published a year ago and is available either as a printed book or as a PDF.
Executive summary: he still thinks deregulation is swell.
The first half of the book is about airline deregulation, the second half is about telecom deregulation. I found Kahn's analysis of airline deregulation quite persuasive, not surprising since he was in charge of it. The analysis of telecom was much less persuasive. Kahn has been firmly on the side of the Bells in just about every disagreement. He argues, not unreasonably, that forcing competitors to rent facilities to each other below cost, as many state regulators have done, is no way to create a competitive market, and he thinks that cable and wireless will create true competition, but it seemed to me he was missing something critical.
What struck me at the end of the book is how completely opposite the outcomes of the two deregulations have been. In the airline industry, the old incumbents are all in dreadful shape, being walloped by nimble new entrants. In the telecom industry, after a flurry of competition aided by dot.com free money and regulatory pushing, the incumbents are crushing the new entrants and are well on their way to establishing a cozy geographically divided duopoly. What's the difference?
The old-line incumbent airlines (IALs from now on) certainly had their share of both self-inflicted and external injuries, but the old-line phone companies (ILECs, incumbent local exchange carriers, in telecom-ese) did plenty of dumb things, too. The critical difference is that the ILECs owned the essential facilities, and the IALs didn't.
In the airline industry, the essential facilities are airports and air traffic control. Airports are owned by various government agencies and paid for by user fees. ATC has always been Federal and is more or less paid for by ticket taxes.
Imagine a world where the IALs owned the airports. You want to add a route to Dallas? Too bad, American owns one airport, Braniff (which is still in business due to its duopoly profits) owns the other, and neither is willing to sell landing and gate slots at a price anyone else can afford. For a while, the CAB required them to sell Unbundled Flight Elements (UFEs) at a set price, but the IALs all moaned and groaned about how unfair the prices were. Remarkably, despite claims that the UFE prices were below their own costs, none of the IALs ever took advantage of the UFE bargains to invade each other's territory.
When the new entrants complained to the Civil Aeronautics Board that the legacy airports gave the IALs a stranglehold on access to passengers, the CAB said that rapidly changing technology would level the playing field, citing as an example a helicopter service between a parking lot in Philadelphia and an abandoned shopping center in southeast Washington DC. Besides, the IALs are promising to roll out personal jet packs (FTTP, for Flying To The Premises), although a few soreheads pointed out they'd been promising them since the early 1990s and to date they were only available in the financial districts of New York and San Francisco.
Well, enough of that. Nothing like that could ever happen, could it?
I'm hardly the first to advocate separating the ILECs into regulated wire companies and unregulated switch companies, but the more I see of the telecom landscape, the more I believe that we'll never have real competition so long as one party owns a facility that nobody else can afford to replicate. The ILECs have a century of practice assigning costs to infrastructure to show how expensive it is, and they're never going to give anyone else a fair price so long as they can sell it to themselves for funny money.
Industry Canada, the part of the Canadian government roughly equivalent to the U.S. Commerce Department, has had a task force on spam working for the past year or so. I was invited to participate as an unofficial member, since I'm not a Canadian. Yesterday, it wrapped up its work and published its report (aussi disponible en français) to the government. It's quite good, and has a set of 22 recommendations.
The first week in July I went to an acronym-heavy World Summit on the Information Society (WSIS) Thematic Meeting on spam in Geneva.
A few people have reported this as a meeting by "the UN", which it wasn't. Although the International Telecommunication Union is now part of the UN, it dates back to an 1865 treaty to manage international telegraph communication. The ITU is now three pieces: the ITU-T, which handles telephony and related things; the ITU-R, which handles radio spectrum; and the ITU-D, which coordinates telecom-related development in less developed countries (LDCs). The ITU-T coordinates telephone number country codes, standards for interconnecting phone and data networks, and other things to glue the world's phone systems together, and was the main part of the ITU visible at the meeting. The ITU isn't the part of the UN that's supposed to have black helicopters; they would be across the street at the Palais des Nations.
Since most countries have permanent delegations in Geneva or nearby, there were representatives from lots of little countries present as well as most of the big ones. The big country reps tended to be political, so that for example the US delegation was from the State Department, appeared to have no experience or instructions relative to spam, and merely objected to language in the report that might have required that the US do something.
A fair amount of the conference was spent on describing the spam landscape (I discussed the limited standards efforts currently under way) and a bunch of snoozers in which various governments told us that they sure thought it'd be a good idea to do something about spam. We all agreed that from the point of view of the governments represented, the most urgent need is to coordinate laws and law enforcement so they can pursue the crooks who send the bulk of today's spam and frequently use computers in multiple countries to do so. Most countries have laws that the crooks are breaking, about computer fraud and abuse or plain old theft, so the immediate issue is to enforce them. The American Federal Trade Commission and the corresponding British and Australian agencies recently signed a Memorandum of Understanding to cooperate in anti-spam enforcement. There was some sentiment for a MOU that lots and lots of countries could join, which would be administered by the ITU, but I got the impression that the big countries would rather not have the baggage of little countries to deal with.
A topic that came up repeatedly was the disproportionate effect that spam has on LDCs. One problem is that their net connections tend to be slow and expensive, so merely downloading the spam to throw it away costs them a lot of time and money. This could presumably be solved at some cost to national pride by locating inbound mail servers or at least mail proxies in places with better connections so that most of the spam is filtered out before being sent down the expensive connection. A more subtle but more important problem is that all of the spam and phishing and other misbehavior on the net makes LDCs reluctant to use the net at all. People in LDCs are no less smart than people elsewhere, but they rarely have the technical training or experience that their counterparts in developed countries do. The buzzphrase here is human capacity building, something the ITU-D does. The outspoken delegate from Syria made these points quite forcefully.
The last session in the conference was the horse-trading leading to the conference report. (There are audio archives of the whole thing, so if you want, you can listen to the horses being traded.) I'm not sure exactly what this conference accomplished, but it was clear that there's finally a global consensus that spam is a problem that needs to be fixed, and no country (well, except maybe the resurgently exceptionalist US) can do it alone.
© 2005-2018 John R. Levine.
CAN SPAM address harvesting notice: the operator of this website will not give, sell, or otherwise transfer addresses maintained by this website to any other party for the purposes of initiating, or enabling others to initiate, electronic mail messages.