Nontechnical users often wonder why IT pros can't devise a clever solution to the spam problem. But those of us who have been around computers for a while know that the fight against spam is more of a chess game involving endurance and analytical skill than a football game featuring clever plays and stationary goals. As antispam tools improve, spammers find increasingly devious ways to evade them.
Most tools rely on identifying spam after a server has already accepted it for delivery—an approach that's hard on servers and bandwidth, and even harder on the users who have to wade through everything we don't catch. Because most spam is spoofed, a more robust method of identifying spoofed mail would be a big help in our struggle against spam. Ideally, such a system would provide a way for a server to determine whether the sending system is authorized to send mail for the DNS domain of which it claims to be a part. Such a tool would give our mail servers a way to answer a simple question that's very difficult to answer now: Where did this message really come from?
Two similar protocols—Sender Policy Framework (SPF) and Sender ID—provide a partial answer by letting you verify a message's sending domain. Before we get into the details of how these protocols work and what you can do with them, a brief history lesson is in order.
SPF, Caller ID, and Sender ID
In 2003, Meng Weng Wong started giving conference talks about a new antispam protocol called SPF. The protocol was well received by the (mostly UNIX-oriented) audiences and began to gather momentum as it was adopted by a variety of ISPs, businesses, and antispam service providers. Not to be outdone, Microsoft published its Caller ID for E-mail standard in February 2004. Caller ID for E-mail differed technically in one important way: it used XML, rather than plain text, to specify server identities. Despite such technical differences, the two protocols shared the same objective: allowing a mail server to verify the origins of sent messages. Both the SPF community and Microsoft realized that having two systems was less efficient, and less likely to vanquish spammers, than a unified system. So in June 2004, Wong and Microsoft announced the release of an Internet Engineering Task Force (IETF) draft specification for Sender ID, a unified system that combines the features of SPF and Caller ID for E-mail. SPF is already widely deployed, but Sender ID is the wave of the future, especially since it will probably be approved by the IETF as a standard within the next year or so.
Accordingly, this article deals primarily with Sender ID (although in some instances I'll point out significant differences between it and SPF). Unlike SPF and Caller ID for E-mail, SPF and Sender ID attempt to solve somewhat different problems: SPF concentrates solely on efficient spam rejection, whereas Sender ID extends domain verification to try to prevent some kinds of forgery and spoofing (including so-called phishing attacks).
How Sender ID Works
You probably already know of several server-based checks that you can perform to tell whether a message is spam. For example, Exchange Server lets you perform reverse DNS lookups for inbound messages and reject those messages that don't have a DNS PTR record. And of course, you can choose from a whole smorgasbord of tools that examine message headers and content for telltale signs of spam. Microsoft has invested heavily in developing its SmartScreen technology, in use in the Exchange Intelligent Message Filter (IMF), Hotmail, and the Microsoft Office Outlook 2003 Junk E-mail Filter.
SPF and Sender ID, however, take a different approach. Each owner of a DNS domain is supposed to publish a record that specifies where that domain's email comes from. For example, in the simplest case, you might publish a record that says (in effect), "All my mail comes from the servers registered in the MX records for my domains." This record is the counterpart of the MX record: the MX record specifies where to send mail for a particular domain, whereas the new record specifies where that domain's mail is sent from. This system mirrors the way we already judge physical mail. SPF folks like to use the analogy of a letter that claims to be from Amazon.com (which is based in Seattle) but which carries a Nigerian postmark. You would quite rightly be suspicious of such a letter's authenticity. Likewise, you might be suspicious of a message that claims to be from Citibank but originates in China.
The basic process that Sender ID uses is simple. First, you publish a DNS TXT record that contains the Sender ID information for the DNS domains you own. You need do so only once. (I explain this process in the next section.) Users then send email just as they usually would; your Sender ID deployment is invisible to them. When a server that's using SPF or Sender ID receives a message from your domain, that server extracts the Purported Responsible Address (PRA), or in plain English, the mailbox that claims to have submitted the message. The Sender ID specification calls for the receiving server to determine the PRA by checking the message headers in the following order:
1. Check for the Resent-Sender header. If a nonempty Resent-Sender header is present and valid (e.g., contains only one IETF Request for Comments, or RFC, 2822 address) and no other headers are present, use that header as the PRA. If a nonempty Resent-Sender header is present but invalid, reject the message without determining a PRA. If no Resent-Sender header is present or if it appears after the Resent-From and Received or Return-Path headers, don't use the Resent-Sender header. Instead, proceed with the remaining checks until a PRA is determined or the message is rejected completely.
2. Check for the Resent-From header. If a nonempty, valid Resent-From header is present, use it as the PRA. If the Resent-From header is malformed or contains multiple mailboxes, reject the message without determining a PRA.
3. Check for the Delivered-To, X-Envelope-To, and Envelope-To headers. If one of these headers is present and nonempty, check it for validity. If it contains multiple names, reject the message without determining a PRA. Otherwise, return the contents of the first found header as the PRA.
4. Check for the Sender header. If more than one Sender header is present, reject the message without determining a PRA. If only one nonempty Sender header is present but contains more than one mailbox or is otherwise malformed, reject the message without determining a PRA. If only one nonempty Sender header is present and valid, return its contents as the PRA.
5. Check for the From header. If more than one From header exists, reject the message without determining a PRA. If only one nonempty From header is present but contains more than one mailbox or is otherwise malformed, reject the message without determining a PRA. If only one nonempty From header is present and valid, return its contents as the PRA.
As you can see, this process checks headers in a particular order, which is determined by the difficulty of spoofing that particular header (e.g., the From header, which is the easiest to spoof, is checked last). At each step, if the header is either well formed or clearly malformed, it can be used to extract the PRA or reject the message. If the selected header doesn't exist, the next header in the sequence is checked. The net result of this process is that either we get a PRA, or we don't. If we don't, the rest of the Sender ID checks are moot; the Sender ID specification suggests rejecting the message immediately.
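To make the header walk concrete, here is a minimal Python sketch of the five steps. It is an illustration under stated assumptions, not the specification's reference algorithm: the names extract_pra and NoPRA are invented for this example, "valid" is reduced to "exactly one mailbox containing an @ sign," and the header-position rule in step 1 is left out for brevity.

```python
from email import message_from_string
from email.utils import getaddresses


class NoPRA(Exception):
    """Raised when the headers are malformed and no PRA can be determined."""


def _single_mailbox(values):
    """Return the one address found in the header values, or raise NoPRA."""
    addresses = [addr for _name, addr in getaddresses(values) if addr]
    if len(addresses) != 1 or "@" not in addresses[0]:
        raise NoPRA("header is malformed or lists several mailboxes")
    return addresses[0]


def extract_pra(raw_message):
    """Walk the headers in the order of steps 1-5 and return the PRA."""
    msg = message_from_string(raw_message)

    # Steps 1-3: resent and envelope headers; the first nonempty one decides.
    # (Simplification: the position checks from step 1 are not modeled.)
    for name in ("Resent-Sender", "Resent-From",
                 "Delivered-To", "X-Envelope-To", "Envelope-To"):
        values = [v for v in msg.get_all(name, []) if v.strip()]
        if values:
            return _single_mailbox(values[:1])

    # Steps 4-5: Sender, then From; more than one of either is an error.
    for name in ("Sender", "From"):
        values = msg.get_all(name, [])
        if len(values) > 1:
            raise NoPRA("more than one %s header" % name)
        if values and values[0].strip():
            return _single_mailbox(values)

    raise NoPRA("no usable header found")
```

A message that carries both a Sender and a From header yields the Sender address, because Sender is checked first; a From header that lists two mailboxes raises NoPRA, which corresponds to rejecting the message.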
If the receiving server is able to extract a PRA, the server then needs to extract the domain portion of the PRA, which gives us the Purported Responsible Domain (PRD). The server can then use the PRD to perform a DNS query for the PRD's SPF or Sender ID record, more properly known as an E-mail Policy Document. If an E-mail Policy Document exists, the DNS query will return it, giving the receiving mail server a list of IP addresses that are authorized to send mail on behalf of the sending domain.
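The step from PRA to PRD to policy lookup might look like the following Python sketch. The function names are hypothetical, and because the Python standard library has no TXT-record resolver, the DNS query is passed in as a caller-supplied lookup_txt function (an assumption of this example). The query name uses the _ep subdomain convention that the publishing section of this article describes.

```python
def purported_responsible_domain(pra):
    """Return the domain part of a PRA such as 'user@example.com'."""
    local, sep, domain = pra.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a mailbox address: %r" % pra)
    return domain.lower().rstrip(".")


def fetch_policy_document(prd, lookup_txt):
    """Return the domain's E-mail Policy Document, or None if none is published.

    lookup_txt is a hypothetical, caller-supplied function that takes a DNS
    name and returns a list of TXT strings (an empty list if there are none).
    """
    records = lookup_txt("_ep." + prd)
    return records[0] if records else None
```

Passing the resolver in as a parameter keeps the sketch independent of any particular DNS library and makes it easy to exercise with canned answers.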
If the IP address from which the message originates appears in the E-mail Policy Document, the message can be accepted. If not, the message can be accepted but should be subjected to more stringent spam checks because it appears to come from an unlikely source. Some subtleties in this process make it rather interesting. Depending on the results of the lookup process, the Sender ID specification defines six possible result values.
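The address comparison itself is simple. The Python sketch below assumes the authorized addresses and ranges have already been pulled out of the E-mail Policy Document as strings; the function names and the two action labels are invented for illustration, and the sketch models only the accept and accept-with-more-scrutiny outcomes described above, not the specification's full set of result values.

```python
import ipaddress


def is_authorized(sender_ip, authorized_networks):
    """True if sender_ip falls inside any authorized address or range."""
    ip = ipaddress.ip_address(sender_ip)
    return any(ip in ipaddress.ip_network(net, strict=False)
               for net in authorized_networks)


def handle_message(sender_ip, authorized_networks):
    """Map the comparison to one of the two actions described in the text."""
    if is_authorized(sender_ip, authorized_networks):
        return "accept"
    # Not listed: still deliverable, but worth more stringent spam checks.
    return "accept-with-extra-spam-checks"
```

A single address such as 192.0.2.25 is treated as a one-host network, so individual hosts and CIDR ranges can share the same list.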
The goal behind having this hierarchy of values is that a message that clearly passes (because its E-mail Policy Document matches) can bypass other, more resource-intensive antispam checks. This filtering helps reduce the overhead required for existing schemes on servers that accept large volumes of mail, but can slow message processing if external services have to be used to verify PRDs and E-mail Policy Documents.
Publishing Your Sender ID Record
The Sender ID checking process is relatively complicated, but the beauty of it is that the SMTP server has to do all the work. As an administrator, all you have to do is make sure that a server using Sender ID or SPF can find an E-mail Policy Document for your domain. This requirement turns out to be easy to meet, although it is one area in which SPF and Sender ID diverge widely. SPF uses plaintext records (which you can generate easily by using the SPF Wizard available at http://spf.pobox.com), whereas Sender ID uses a more structured XML record (the exact structure is defined in the Sender ID specification at http://www.microsoft.com/mscorp/twc/privacy/spam_senderid.mspx).
Web Listing 1 (http://www.windowsitpro.com/microsoftexchangeoutlook, InstantDoc ID 43917) shows a simple example of a Sender ID E-mail Policy Document. Each E-mail Policy Document must contain an <ep> element, which is the XML container for the record. The optional <out> element specifies the E-mail Policy Document. The <m> element specifies which servers are permitted to send mail for the domain. You can have one or several <m> elements in each <out> element. In the example that Web Listing 1 shows, the only specified element is <mx>, which tells the server that the only authorized sending hosts are the ones registered in the domain's MX records. You can also use the <r> element to specify an IP address range (e.g., 10.10.20.0/24), the <a> element to specify a single IP address, or the <include> element to specify a list of addresses.
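As a rough illustration of how a receiving server might read such a record, the following Python sketch parses a small document built only from the elements named above. The sample XML is an assumption made for this example (the actual Web Listing 1 is not reproduced here, and the specification may require attributes or namespaces that are omitted), summarize_policy is a hypothetical name, and the include element is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical document assembled from the elements described in the text;
# a real record may differ in detail.
EXAMPLE_POLICY = """
<ep>
  <out>
    <m>
      <mx/>
      <r>10.10.20.0/24</r>
      <a>192.0.2.25</a>
    </m>
  </out>
</ep>
"""


def summarize_policy(xml_text):
    """Return what the policy authorizes: MX hosts, IP ranges, single IPs."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "ep":
        raise ValueError("expected an <ep> container element")
    summary = {"use_mx": False, "ranges": [], "addresses": []}
    for m in root.findall("./out/m"):
        if m.find("mx") is not None:
            summary["use_mx"] = True  # MX-registered hosts may send mail
        summary["ranges"] += [r.text.strip() for r in m.findall("r") if r.text]
        summary["addresses"] += [a.text.strip() for a in m.findall("a") if a.text]
    return summary
```

Calling summarize_policy(EXAMPLE_POLICY) reports that MX hosts are authorized and returns the one range and the one single address from the sample.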
Once you've created the XML record, the next step is to publish it. The trick is to create a new subdomain for your domain and name the subdomain _ep.yourdomain.com. This is a handy approach because you can create such subdomains for any or all of the domains in your organization without any outside help (unless you outsource your DNS service). For my home lab, I set up _ep.robichaux.net, which holds one DNS TXT record that contains the E-mail Policy Document for my domain. That's it! After the E-mail Policy Document is published in the correct subdomain for your domain, servers that use Sender ID can validate email sent by your users.
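If you maintain your zone data as text, a small helper can turn the XML into a publishable line. The Python sketch below is one way to do that under stated assumptions: the function names are invented, the output uses the common zone-file "name IN TXT" layout (your DNS server's management tool may expect the value in a different form), and the document is split into pieces of at most 255 characters because a single TXT character-string cannot be longer than that.

```python
def policy_record_name(domain):
    """Name of the subdomain that holds the policy, e.g. _ep.example.com."""
    return "_ep." + domain.strip(".").lower()


def txt_record_line(domain, policy_xml):
    """Format one zone-file-style TXT line for the policy document."""
    # Collapse the XML onto one line by dropping indentation and newlines.
    compact = "".join(line.strip() for line in policy_xml.splitlines())
    # One TXT character-string holds at most 255 bytes, so split if needed.
    chunks = [compact[i:i + 255] for i in range(0, len(compact), 255)] or [""]
    quoted = " ".join(
        '"%s"' % c.replace("\\", "\\\\").replace('"', '\\"') for c in chunks
    )
    return "%s. IN TXT %s" % (policy_record_name(domain), quoted)
```

For a short policy the result is a single quoted string after the record name; a longer document comes out as several quoted strings on the same line.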
Reaching Critical Mass
The first, and most immediate, objection to most schemes that attempt to address SMTP security is that anything that requires changes to a large percentage of SMTP servers won't be widely adopted. Although that statement is certainly true, it's also true that just a few domains—namely, AOL, Hotmail, MSN, and Yahoo!—generate an overwhelmingly large percentage of the email sent over the Internet; thus, any scheme that those domains adopt has a much better chance of being widely adopted by default. Microsoft recently announced that it will start using Sender ID to check inbound messages to Hotmail, MSN, and microsoft.com addresses. This step is an important one, especially because AOL already uses SPF (as do several thousand other mail senders of various sizes). As the tools for implementing and checking Sender ID improve, I expect to see the protocols become much more widely deployed. In any event, the process of adding a Sender ID record for your domain is so simple that there's no good reason why you shouldn't do so immediately.
Objections to Sender ID Deployment
Of course, not everyone agrees. As with every technology developed since the invention of the wheel, Sender ID has its opponents. Some of the criticisms leveled at Sender ID also apply to SPF. I mention them here solely because you'll probably hear them yourself at some point during the Sender ID lifecycle:
Is This the Future?
Will Sender ID eliminate spam? Probably not. However, in conjunction with other tools and techniques (including client- and server-side heuristic filtering, better client security to keep clients from being compromised and used as spam robots, and financial pressure against spammers), it can only help. As Sender ID and SPF become more widely deployed, they'll take their places alongside other technologies as valuable assets in spam fighting. In the meantime, I suggest you obtain more information about these protocols and their deployment. SPF's master Web site (http://spf.pobox.com) includes useful slide presentations that highlight the specific problems SPF is designed to address and a comprehensive archive of the SPF-discussion mailing list. The Sender ID Web site (http://www.microsoft.com/mscorp/twc/privacy/spam_senderid.mspx) provides three interesting documents: an executive overview of Sender ID's purpose, a guide for administrators who want to implement it, and a copy of the draft IETF standard proposal—well worth reading if you really want to understand how Sender ID works.