Anti-spam trick: Grey listing
There is an anti-spam technique called "Grey Listing" which has almost completely eliminated spam from my main server. What's left still goes through my SpamAssassin and Amavis-new filters, but they have considerably less work to do.
The technique is more than a year old, but I only installed a greylist plug-in recently and I'm impressed at how well it works. I hope that by writing this article I'll get other people who have procrastinated to install a greylist system.
(For those who want technical specifics, I'm using Postfix plus Postgrey. If you use FreeBSD, just do "portinstall mail/postgrey", assuming you are already using Postfix. Sendmail users, please post some comments directing people to the Milter equivalent!)
So how does grey listing work?
Well, you know that a "black list" is a list of sites you block, and a "white list" is a list of sites that you always permit. A grey list is somewhere in between.
The basic principle is that spammers don't retry an email that couldn't be delivered. There are two kinds of "can't be delivered" (actually, more than that, but two are important here). One is a "hard failure": the email can't be delivered and nothing is going to fix it. For example, you are trying to send email to an account that doesn't exist. The second type is a "soft failure", which is a problem that is temporary. For example, a disk is full, or there is some kind of system problem that will be fixed soon. If you get a "hard failure" the email is bounced. If you get a "soft failure" the sending server is supposed to wait a while and retry. That's why email stops flowing when you run out of disk space, but when you fix the problem (delete that out-of-control log file or whatever) you suddenly get a flood of backlogged email.
Spammers don't retry sending email whether it is a hard or soft failure. When you are sending email to tens of millions of addresses, it's too difficult to keep track of failure codes. Besides, even if their spam doesn't reach 20% of their list, they're still sending it to millions of addresses. Good enough, eh?
So here's how grey listing works. The first time someone tries to send you email, respond with a "soft failure" result code. If they retry more than 5 minutes later, then actually accept the message. If they are a spammer you'll never get a retry. If they are legitimate then you'll get a retry.
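To make the hard/soft distinction concrete: SMTP reply codes that start with 4 are the temporary ("soft") kind and codes that start with 5 are the permanent ("hard") kind. Here is a minimal Python sketch of that rule. The function name and the "ok" label are just mine for illustration.

```python
def classify_smtp_reply(code):
    """Classify an SMTP reply code by its first digit."""
    if 400 <= code <= 499:
        return "soft"   # temporary failure: the sender should queue and retry
    if 500 <= code <= 599:
        return "hard"   # permanent failure: the sender should bounce the mail
    return "ok"         # 2xx/3xx: not a failure


# 452 is the "insufficient system storage" (disk full) case;
# 550 is the classic "no such mailbox" case.
for code in (250, 452, 550):
    print(code, classify_smtp_reply(code))
```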
Implementing this is extremely simple. When someone tries to send email, gather 3 items of information: the source IP address, the From:, and the To:. Maintain a database of these 3-tuples. If you haven't seen that 3-tuple before, record it and send the "soft failure" code. If you have seen that 3-tuple already and it was more than 5 minutes ago, accept the message. If it was less than 5 minutes ago, send the "soft failure" code again.
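Here is a minimal Python sketch of that 3-tuple logic. It is not how Postgrey is actually written. The class name, the in-memory dict standing in for the database, and the "accept"/"defer" return values are all assumptions made for illustration.

```python
import time

GREYLIST_DELAY = 5 * 60  # the 5-minute wait, in seconds


class Greylist:
    """Toy greylist keyed on (source IP, sender, recipient)."""

    def __init__(self, delay=GREYLIST_DELAY):
        self.delay = delay
        self.first_seen = {}  # triplet -> time of the first delivery attempt

    def check(self, ip, sender, recipient, now=None):
        """Return "accept" or "defer" (i.e. send a soft failure)."""
        now = time.time() if now is None else now
        triplet = (ip, sender.lower(), recipient.lower())
        # Remember when this triplet first showed up; keep the old time if known.
        first = self.first_seen.setdefault(triplet, now)
        if now - first > self.delay:
            return "accept"   # a retry after the delay: looks like a real server
        return "defer"        # first attempt, or a retry that came too soon


greylist = Greylist()
print(greylist.check("192.0.2.1", "a@example.com", "b@example.org", now=0))    # defer
print(greylist.check("192.0.2.1", "a@example.com", "b@example.org", now=301))  # accept
```

The `now` parameter is only there so the delay can be exercised without actually waiting 5 minutes.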
It's amazingly simple yet it seems to be blocking about 80% of my spam right now.
Now, you may be thinking, "I can't have a 5-minute delay on all my email! That's crazy!" Well, don't worry. Systems like Postgrey take this all one step further. For example, if 5 emails from an IP address have gotten through in the last month, Postgrey decides this IP address must be OK and adds it to a whitelist.
Thus, the system tunes itself. Common senders quickly get into the whitelist (Yahoo, Gmail, and so on). Sites that disappear eventually get expired from the list because you don't hear from them in 30 days. That makes the database self-cleaning. All maintenance is automatic.
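Here is a rough Python sketch of that self-tuning idea. It only approximates the behavior described above and is not Postgrey's actual code: the class and method names are made up, and it simply counts accepted emails per IP address and forgets any address it hasn't heard from in 30 days.

```python
import time

DAY = 24 * 60 * 60  # seconds


class AutoWhitelist:
    """Toy per-IP whitelist that tunes and cleans itself."""

    def __init__(self, threshold=5, max_age=30 * DAY):
        self.threshold = threshold  # accepted emails needed to be trusted
        self.max_age = max_age      # silence after which an entry expires
        self.clients = {}           # ip -> [accepted_count, last_seen]

    def record_accept(self, ip, now=None):
        """Call this each time an email from `ip` makes it through."""
        now = time.time() if now is None else now
        entry = self.clients.setdefault(ip, [0, now])
        entry[0] += 1
        entry[1] = now

    def is_whitelisted(self, ip, now=None):
        now = time.time() if now is None else now
        entry = self.clients.get(ip)
        if entry is None:
            return False
        count, last_seen = entry
        return count >= self.threshold and now - last_seen <= self.max_age

    def expire(self, now=None):
        """Drop addresses not heard from within max_age; return how many."""
        now = time.time() if now is None else now
        stale = [ip for ip, (_, last) in self.clients.items()
                 if now - last > self.max_age]
        for ip in stale:
            del self.clients[ip]
        return len(stale)
```

A whitelisted address would skip the greylist check entirely, which is why regular senders stop seeing the delay.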
I can't believe I didn't install this years ago!
--Tom
P.S. I've also added "reject_non_fqdn_hostname" to the Postfix variable "smtpd_helo_restrictions". That means that when an SMTP server issues a "HELO hostname", the email is rejected if "hostname" isn't an FQDN. This rejects about 80% of the spam I'm getting, most of which just sends "HELO friend". I haven't had any complaints from users about false positives since I implemented this a month ago. This technique reduced spam by 80% and Postgrey reduced spam by a different-but-overlapping 80%. When both are enabled, I receive very little spam. What's left is little enough for Amavis-new and SpamAssassin to take care of easily.
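To show roughly what "isn't an FQDN" means here, this is a small Python approximation. It is not Postfix's actual test (Postfix has its own rules, for instance around address literals), and the function name is made up. Also note that Postfix 2.3 and later call this restriction "reject_non_fqdn_helo_hostname".

```python
import re

# One DNS label: letters, digits and hyphens, 1-63 characters,
# not starting or ending with a hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def looks_like_fqdn(helo_name):
    """Rough check: at least two valid dot-separated labels."""
    name = helo_name
    if name.endswith("."):
        name = name[:-1]  # tolerate a single trailing dot
    labels = name.split(".")
    return len(labels) >= 2 and all(_LABEL.match(label) for label in labels)


print(looks_like_fqdn("friend"))            # False: this HELO would be rejected
print(looks_like_fqdn("mail.example.com"))  # True
```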
Posted by tal at April 20, 2006 04:32 PM
Comments
I've been using greylisting for well over 2 years now [almost from the moment it appeared] on all the mail servers I handle [currently 3, but in total about 7]. Its effect doesn't seem to be temporary; spammers still ignore it. I find it extremely effective, and no user has ever complained about it, so the delays are unnoticeable in everyday work [and two of the servers I manage are company servers, one of which belongs to a company whose key business activities rely on e-mail]. Besides, I've been promoting greylisting all over, and I'm glad to see that it's getting more popular at last [but I'm surprised that some have noticed it so late...].
If any of the readers is a qmail user like me, I can recommend cqgreylist, which I'm using. It's a simplified greylist, i.e. it doesn't store triplets, but only the IP address. Once the IP address retries [but not too quickly, which you can set up as well, e.g. to avoid whitelisting spammers who send many mails to your server one after the other], it's whitelisted. The recipient and sender addresses are not noted at all.
This might seem less effective than the original greylisting, but I find it equally effective to be honest. I receive very little spam from zombies.
Posted by: Tomasz Andrzej Nidecki at April 22, 2006 06:10 AM