Rejecting spam with a procmail accept list

Unfortunately, spammers forge return addresses, so this script's verification messages sometimes go to innocent people, and some of them report those messages as spam rather than just ignoring them. The blackhole lists then threaten to block your ISP, which eventually gets fed up and makes you stop using the script. That's a shame, since it's the only really effective way to block spam. At this point, forwarding your mail to Gmail and letting Google filter out the spam is a better solution.

At some point a while back (about when I started thinking about forwarding all my incoming e-mail to a pager) I decided that the only acceptable amount of spam to get is none.

So, I started searching the Web for spam-filtering solutions, and it rapidly became clear that they all relied on recognizing incoming spam by its contents or origin, and that they all did at best a mediocre job of this.

So I wrote my own, in the form of a procmail script which accepts mail only from addresses on its list, but lets real people — with valid reply addresses — add themselves to that list easily.

These days there are a few pretty good spam filters, but I still prefer this approach. It's easy to set up and unobtrusive in operation. It works very well for me, and I invite you to try it yourself.

How it works

When someone sends you mail from an address from which you've never received mail before, procmail will store their mail away without delivering it to you, and send them a message explaining that their mail hasn't been delivered yet. When they reply to that message as instructed, their original message will be delivered instead, and their address will be added to the list of addresses from which to accept mail in the future.

That's really about it — your friends and correspondents won't be inconvenienced much, and you may never get another piece of spam!
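
As a rough illustration of the flow described above, the accept-or-challenge decision can be sketched in plain Bourne shell. This is not the procmail recipe itself: the function names are hypothetical, and the sketch assumes the accept list is a text file with one address per line.

```shell
#!/bin/sh
# Hypothetical sketch of the accept-or-challenge decision; not the actual recipe.
# Assumes the accept list holds one address per line.

# Print "deliver" if the sender is on the accept list, "challenge" otherwise.
classify_sender() {
    sender=$1
    accept_list=$2
    # -F: fixed string, -x: match the whole line, -i: ignore case, -q: quiet
    if [ -f "$accept_list" ] && grep -Fxiq -- "$sender" "$accept_list"; then
        echo deliver
    else
        echo challenge
    fi
}

# Called when a sender answers the challenge: record the address once.
accept_sender() {
    sender=$1
    accept_list=$2
    if [ "$(classify_sender "$sender" "$accept_list")" = challenge ]; then
        printf '%s\n' "$sender" >> "$accept_list"
    fi
}
```

A message classified as "challenge" would be stored in the cache and answered with the explanatory autoreply; once the sender replies, accept_sender records the address so later mail is classified as "deliver".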

Installation and setup

First, make sure that your mail server is configured to use procmail to process your mail. One good guide to this is here.

Then, save my .procmailrc file in your home directory — but name it something else till you're done customizing it, so it doesn't interfere with mail delivery.

You'll need to edit the first five configuration lines at the top of the file, and possibly some of the others. Here's a list of the crucial variables and how to set them:

MY_EMAIL
Your e-mail address, as it should appear on out-going e-mail.
MY_NAME
Your whole name.
BOTS
A regular expression that matches any addresses from which you get lots of mail that you want to archive but not read. These will all get stored in a "notifications" mailbox in your home directory. (There's info on procmail regular expressions in the procmailrc man page.)
KNOWN_DOMAINS
A regular expression that matches all domains from which you know you won't get spammed. Mail from these domains will get delivered to you without the addresses being verified.
LISTS
A regular expression that matches the To: headers on messages to mailing lists to which you've subscribed. Mail to these addresses will get delivered to you without the addresses being verified.

It is very important that you add new mailing lists' addresses to this regular expression when you subscribe to them, or you will spam the list with autoreplies and be despised by the nice people whose interests you share!
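
Because a mistake in these regular expressions can mean autoreplies going to a mailing list, it may be worth checking a candidate expression against a sample header before putting it in the file. The sketch below uses grep -E -i to approximate procmail's egrep-style, case-insensitive matching. The helper name and the sample values are placeholders, and the exact form each variable should take is whatever the comments in the .procmailrc file call for.

```shell
#!/bin/sh
# Hypothetical helper: check whether a candidate regular expression
# matches a sample header line before pasting it into the rc file.
# grep -E -i approximates procmail's egrep-style, case-insensitive matching.
regex_matches() {
    regex=$1
    sample=$2
    printf '%s\n' "$sample" | grep -Eiq -- "$regex"
}

# Placeholder expressions for illustration only (not from the original script)
lists_regex='(procmail-users@lists\.example\.org|announce@lists\.example\.com)'
domains_regex='@(example\.edu|mail\.example\.net)'
```

For example, regex_matches "$lists_regex" 'To: procmail-users@lists.example.org' succeeds, while the same call with 'To: friend@example.org' fails, which is the behavior you want to confirm before subscribing to a new list.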

You should make sure that the "pending_messages" folder exists in your home directory.

If you have an archive of saved mail, you should put the addresses of the senders in a file called ".accept-list", also in your home directory. This will save your existing correspondents from having to validate themselves.
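
One possible way to build that file from mbox-format archives is sketched below. It assumes the accept list wants one bare address per line (check the comments in the script to confirm), and the function name is hypothetical. It is deliberately crude: any body line that happens to start with "From:" will also be picked up, so skim the result before using it.

```shell
#!/bin/sh
# Hypothetical sketch: collect sender addresses from saved mail.
# Assumes the accept list holds one bare address per line.

# Print the unique, lower-cased addresses found on From: lines of the given files.
extract_senders() {
    # Keep From: lines (not the mbox "From " separators), pull out anything
    # that looks like an address, then normalize case and remove duplicates.
    grep -hi '^From:' "$@" |
        grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+' |
        tr '[:upper:]' '[:lower:]' |
        sort -u
}
```

Something like extract_senders ~/mail/* > ~/.accept-list would then seed the list, where ~/mail/ stands in for wherever your saved mail lives.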

Now, rename the control file to ".procmailrc", and you should be in business.

Checking cached mail

Periodically (I do it every couple of days) you should check to see if any "real people" have left messages in your pending_messages cache, either because they didn't see procmail's autoreply or because for some reason they couldn't answer. (Or because they're really a friendly mailbot, like when you've used some Web site's "I forgot my password" page...)

The easy way to do it — and the only way, if you don't have shell access to your mail server — is by mailing commands to the script:

Alternatively, there's a simple Bourne shell script for this here. I call it "ok" and just telnet (well, SSH) to my mail server and run it from the command line. With no arguments, it lists the last thirty addresses from which mail was cached. Give it an e-mail address, and it'll deliver any cached mail from that address and add it to the accept list.
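
For readers who want a feel for the listing half of such a script, here is a minimal sketch. It is not the "ok" script itself. It assumes the cache is a directory holding one message per file, which may not match your setup, and the function name is hypothetical. Delivering the cached mail is left out because that depends on how the cache is written.

```shell
#!/bin/sh
# Hypothetical sketch, not the "ok" script mentioned above.
# Assumes the cache is a directory holding one message per file.

# Print the From: header value of the N most recently cached messages.
list_cached_senders() {
    cache_dir=$1
    count=${2:-30}
    # ls -t sorts newest first; file names are assumed to contain no newlines
    ls -t "$cache_dir" | head -n "$count" | while IFS= read -r name; do
        # Quit at the first blank line, which marks the end of the headers
        sed -n -e '/^$/q' -e 's/^[Ff]rom: *//p' "$cache_dir/$name" | head -n 1
    done
}
```

Calling list_cached_senders "$HOME/pending_messages" would print up to thirty senders, newest first. Pass a second argument to change the count.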

Manually denying addresses

Sometimes, you'll get unsolicited commercial e-mail from real people. If you send yourself a message with the subject "Deny address@domain.tld", future mail from that address will not be delivered.

Tidying up your cache and accept list

I run simple cron jobs like these to clear out old cached messages, and sort the accept list. Skipping this step won't hurt anything, although you'll want to go in and clear out the pending_messages directory manually every so often.
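
A minimal sketch of what such housekeeping might look like follows. The function names, the 14-day retention period, and the crontab times are arbitrary choices for illustration, and the cache is assumed to be a directory of message files.

```shell
#!/bin/sh
# Hypothetical housekeeping helpers; the retention period is an arbitrary choice.

# Delete cached messages last modified more than the given number of days ago.
purge_old_cache() {
    cache_dir=$1
    days=${2:-14}
    find "$cache_dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Sort the accept list in place, dropping duplicate lines.
sort_accept_list() {
    accept_list=$1
    sort -u -o "$accept_list" "$accept_list"
}

# Equivalent crontab entries might look like this (times are arbitrary):
# 15 3 * * *  find "$HOME/pending_messages" -type f -mtime +14 -exec rm -f {} +
# 30 3 * * *  sort -u -o "$HOME/.accept-list" "$HOME/.accept-list"
```

Note that sort -u compares lines exactly, so addresses that differ only in case are kept as separate entries.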

Automatically accepting mail from people to whom you have sent mail

If you can read the mail server's log file, and you can run "daemon" programs that continue after you log out, then you can set up this Perl script to watch the mail log for outgoing messages from you and add their recipients to your .accept-list. It's written to work with Postfix, but you should be able to make it work with any mail server log format by editing the regular expressions in the main while loop.

(Note: the script was written and tested to be run as root. It should work OK for a non-root user as well, but please mail me if you have trouble.)
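
To show the general idea without the Perl script to hand, here is a small shell and awk sketch. It is not that script. It assumes Postfix-style log lines in which a queue ID followed by a colon appears on both the from=<...> line and the later to=<...> delivery lines, and the function name is hypothetical. Other log formats would need different patterns.

```shell
#!/bin/sh
# Hypothetical sketch, not the Perl script described above.
# Reads Postfix-style log lines on stdin and prints the recipients of mail
# whose envelope sender is the given address. The log layout is assumed.
recipients_from_log() {
    me=$1
    awk -v me="$me" '
        # Sender line: remember the queue ID of each message sent by me
        index($0, "from=<" me ">") {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9A-Za-z]+:$/) { mine[$i] = 1; break }
            next
        }
        # Delivery line: print the recipient if the queue ID is one of mine
        / to=<[^>]*>/ {
            for (i = 1; i <= NF; i++)
                if (($i in mine) && match($0, / to=<[^>]*>/)) {
                    # Skip the leading " to=<" (5 chars) and the trailing ">"
                    print substr($0, RSTART + 5, RLENGTH - 6)
                    fflush()
                    break
                }
        }
    '
}
```

Fed from something like tail -F on the mail log (the log's location varies by system), its output could be appended to .accept-list. The sketch never forgets queue IDs, so a long-running version would want to expire them.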

Alternatives

If your mail client runs on a UNIX workstation that has the Python scripting language installed, there is a much more powerful and flexible tool called TMDA. It blocks spam using the same general principles, but it can also munge your outgoing address on a per-message basis, which lets it do a lot of great tricks.

And Nancy McGough has a good list of other spam filters using similar techniques.

Known bugs

Change log

version 070514 — 14 May 2007
29 Nov 2005
version 030131 — 21 Jan 2003
version 021109 — 9 Nov 2002
version 020823 — 23 Aug 2002
version 0.9 — 30 Jul 2002
version 0.8 — 21 Sep 2001
version 0.7 — 9 Sep 2001
version 0.6
version 0.5
version 0.4

Contacting me

If you have any questions, you can mail me at nic@angel.net.