I'm giving my faithful readers a week off from talking about daylight saving time—although I might have more to say about it next week. Instead, I want to discuss a problem that's recently come to light with Exchange Server 2003 Service Pack 2 (SP2): It doesn't handle greylisting errors gracefully.
If you're not familiar with greylisting, here's a quick primer. You probably know about using whitelists and blacklists for filtering: Whitelists specify senders or connections from which you always want to accept mail, and blacklists (or block lists, as Microsoft calls them) specify senders or connections from which you never want to accept mail. A greylist is in between these two extremes: It's a list of senders or connections from which you might not want to accept mail.
Here's how greylisting works. When the sender establishes an SMTP connection to a receiving server that's using greylisting, the receiver accepts the connection and the message. Then the receiver checks the greylist; if the sender name or IP address is on the greylist, the receiver returns an SMTP 4xx error code. You'll recall that the 4xx error code range indicates temporary or transient errors; the intent of these errors in the SMTP specification is that the sender will resend the message after a waiting period. For example, the Exchange Server 2007 transport engine returns a 452 4.3.1 Insufficient system resources error when it has less than 4 GB of disk space on the queue drive. That code tells the sender it should try again later; so do the error messages generated by greylist filters. A legitimate sending server pays attention to the error and requeues the message for later delivery, but a spammer just blasts out another copy of the message (thus helping the greylist filter decide to block the IP address altogether).
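To make the receiving side concrete, here's a minimal sketch in Python. It isn't how GRYNX or any particular product works; it assumes one common approach, in which the filter remembers when it first saw a sender and temporarily rejects mail until a minimum delay has passed. The `Greylist` class, the `MIN_RETRY_DELAY` value, the (IP address, sender) key, and the reply text are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumed value for illustration; real filters usually make this configurable.
MIN_RETRY_DELAY = 300  # seconds

# Illustrative reply text; actual filters word their 4xx replies differently.
TEMP_REJECT = "451 4.7.1 Greylisted, please try again later"
ACCEPT = "250 2.0.0 OK"


@dataclass
class Greylist:
    """Toy greylist keyed on (client IP, envelope sender)."""
    first_seen: dict = field(default_factory=dict)

    def check(self, client_ip: str, sender: str, now: float) -> str:
        """Return the SMTP reply the receiver would send at time `now`."""
        key = (client_ip, sender.lower())
        seen_at = self.first_seen.get(key)
        if seen_at is None:
            # First contact: remember it and ask the sender to retry later.
            self.first_seen[key] = now
            return TEMP_REJECT
        if now - seen_at < MIN_RETRY_DELAY:
            # Retried too quickly: still a temporary rejection.
            return TEMP_REJECT
        # The sender waited and came back, as a legitimate server should.
        return ACCEPT


def is_transient(reply: str) -> bool:
    """True for 4xx replies, which mean 'try again later'."""
    return reply.startswith("4")
```

In this sketch, a sender that retries after five seconds gets the same 4xx reply again, and only a retry after the delay has elapsed is accepted.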
When Exchange 2003 sends a message to a server using greylisting, it gets back a 4xx "try again later" code. Instead of waiting a reasonable interval, Exchange tries again after only a few seconds. This attempt generally fails too, and Exchange doesn't try again.
When the message isn't delivered due to greylisting, Exchange should try again later. Sometimes the sending Exchange server generates a nondelivery report (NDR) to the sender indicating that the message failed (which is incorrect, because a 4xx code signals only a temporary failure), and sometimes it doesn't. Either way, the message isn't delivered, and it doesn't appear in any queues. Exchange won't try to deliver it again until you restart the SMTP service. The message just disappears, except from the sender's Sent Items folder. That makes it tough to troubleshoot the delivery problem.
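For contrast, here's a rough sketch of how a well-behaved sending server might treat each reply. This is not Exchange's code; `QueuedMessage` and `handle_reply` are hypothetical names, and the 30-minute retry interval and 5-day queue lifetime are illustrative values in line with the general retry guidance in the SMTP specification (RFC 2821).

```python
from dataclasses import dataclass

# Illustrative values, not Exchange settings.
RETRY_INTERVAL = 30 * 60          # wait 30 minutes between attempts
MAX_QUEUE_AGE = 5 * 24 * 60 * 60  # give up after 5 days in the queue


@dataclass
class QueuedMessage:
    queued_at: float          # when the message first entered the queue
    next_attempt_at: float    # earliest time for the next delivery attempt
    state: str = "queued"     # "queued", "delivered", or "failed"


def handle_reply(msg: QueuedMessage, reply_code: int, now: float) -> QueuedMessage:
    """Update a queued message according to the receiver's SMTP reply code."""
    if 200 <= reply_code < 300:
        msg.state = "delivered"
    elif 400 <= reply_code < 500:
        if now - msg.queued_at >= MAX_QUEUE_AGE:
            # Only after the queue lifetime expires does a 4xx become an NDR.
            msg.state = "failed"
        else:
            # Temporary failure: keep the message visible in the queue
            # and schedule another attempt after a reasonable wait.
            msg.state = "queued"
            msg.next_attempt_at = now + RETRY_INTERVAL
    else:
        # 5xx (or anything unexpected): permanent failure, generate an NDR.
        msg.state = "failed"
    return msg
```

The key point is the middle branch: after a 4xx reply the message stays in the queue with a later retry time, rather than vanishing or producing an immediate NDR.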
Luckily, there's a workaround for this problem. Restarting the SMTP service seems to kick stuck messages out for a retry; I've seen several posts in the microsoft.public.exchange.admin newsgroup that talk about scheduling restarts of the SMTP service to ensure that no messages get permanently stuck. This solution is better than nothing, but it's not a good long-term answer.
How would you know if you have this problem in your environment? Well, you might get user reports about messages that are sent but never received, or you might see suspicious NDRs that claim permanent failures from 4xx error codes. If so, you can restart the SMTP service to see if that unblocks the messages. You should also consider opening a support case with Microsoft; doing so will help Microsoft accurately track how prevalent this problem is. If your problem is caused by this particular bug, Microsoft should provide the support at no charge.
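If you collect NDR text for review, a small script can flag the suspicious ones. The following is a rough heuristic only: `transient_codes` is a hypothetical helper, and it assumes the NDR quotes the remote server's reply as a 4xx code followed by a 4.x.x enhanced status code, such as the 452 4.3.1 example mentioned earlier. Real NDR wording varies, so treat any match as a prompt to look closer, not as proof.

```python
import re

# A 4xx reply code followed by a 4.x.x enhanced status code, e.g. "452 4.3.1".
TRANSIENT_REPLY = re.compile(r"\b(4\d{2})[ -](4\.\d{1,3}\.\d{1,3})\b")


def transient_codes(ndr_text: str) -> list[tuple[str, str]]:
    """Return (reply code, enhanced status code) pairs found in the NDR text."""
    return TRANSIENT_REPLY.findall(ndr_text)
```

An NDR that reports a failed delivery but yields a non-empty list here is worth investigating, because a temporary error shouldn't have produced an immediate failure report.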
I wrote about the GRYNX Greylist filter for Exchange 2003 in January ("Troubleshooting with the Fundamentals," January 25, 2007). I'm still a big fan of greylisting as a spam reduction technology, and I hope this small speed bump won't put you off the technology itself.