Spam mail bad. Procmail GOOD.

Posted on October 31, 2007 
Filed Under Mail, Unix

Porkchop <porkchop@example.com> wrote:

>I'm looking to get a procmail thingie setup. I spent a
>few minutes looking at the manpage...seems to be
>written using a non-roman alphabet...!
>Took a quick run on search engines with no luck...so does
>anyone know of a 'procmail for idiots' type webpage?



I learned Procmail by grabbing the examples that other people posted to news.admin.net-abuse.email, then working backwards from their explanations into the man pages. The main problem is that it doesn’t just use normal “regular expressions,” which are bad enough, but an older variation that has died out just about everywhere else.

At example.com, Porkchop and I have shell access; that’s getting rather rare, unfortunately, but on your own Linux box you can have shell access, and even root access. So, let’s start with the man page for procmail; following the instructions there, I made it take effect by creating a .forward file in my home directory:

   $ cat .forward  # "cat" = display or type [my] .forward file
   |/usr/bin/procmail

That’s all there is to it — just that one line. The vertical bar is the Unix “pipe” symbol, and in this case it means “pipe all incoming mail to the /usr/bin/procmail program.”

However, procmail doesn’t do anything interesting until it finds a .procmailrc (procmail “run commands”) file with instructions. (Note the leading dot; like many configuration files, .procmailrc is not listed unless you specifically ask for “hidden” files with the ls -a command.) Procmail instructions are actually written in a little programming language that tells procmail what things to look for in a mail message, and what action to take (where to put the message) based on what it sees. Test, act on a match. Test, act on a match.

A rule-and-action pair in procmail is called a “recipe” — here’s a useful one:

# Too many recipients
:0
* ^(To|Cc):.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@
	/dev/null

This says “Look at the lines that begin with To: or Cc: and count the number of @ signs. If the message is addressed to 12 people (or more), throw it away unread.” Remind you of anyone you know?
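If you want to see what that pattern does without risking real mail, here is a minimal shell sketch that feeds made-up header lines to grep -E. The function names and sample addresses are invented for illustration, and it assumes grep's extended syntax behaves the same as procmail's for this particular pattern.

```shell
#!/bin/sh
# Same pattern as the recipe: a To: or Cc: line holding at least 12 "@" signs.
PATTERN='^(To|Cc):.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@.*@'

# too_many_recipients HEADER_LINE
# Succeeds (exit 0) when the header line matches the pattern.
too_many_recipients() {
    printf '%s\n' "$1" | grep -E -q -e "$PATTERN"
}

# make_header N: print a To: line with N made-up addresses.
make_header() {
    n=$1
    line='To:'
    i=1
    while [ "$i" -le "$n" ]; do
        line="$line user${i}@example.com,"
        i=$((i + 1))
    done
    printf '%s\n' "$line"
}

if too_many_recipients "$(make_header 12)"; then
    echo "12 recipients: would be discarded"
fi
if ! too_many_recipients "$(make_header 3)"; then
    echo "3 recipients: would be kept"
fi
```

To raise or lower the limit, add or remove one “.*@” group per recipient.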

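The line that begins with # is just a comment. The recipe itself begins with :0. The zero is there for historical reasons: as far as I know it is a leftover from older versions of procmail, which expected the number of conditions in that spot. Today you always write 0.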

The line that tells procmail “Look for this!” is the next line, which starts with a star and a space. Now we finally get to tell procmail what we’re after.

The ^ mark means “the start of the line.” The (a|b) syntax using the pipe symbol means “a or b,” so (To|Cc) matches either header. The .* means any number of any character. The : and @ are literal, meaning that they stand for themselves. If you want to test for a character that has a special meaning to procmail, like a period (which means “any one character”), you put a backslash before it. An unescaped “example.com” matches “example,” then any one character, then “com,” so it would also match examplexcom, exampleycom, or examplezcom. So, to make sure you always receive mail sent from “cannon@example.com” without also matching those lookalikes, you would write the domain name as “example\.com”, like so:

# Accept all mail from that nice cannon chap
# (the second colon asks procmail to lock the mailbox while it writes)
:0:
* ^From.*cannon@example\.com
	${DEFAULT}
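The difference the backslash makes is easy to check from the shell. This is only a sketch: it uses grep -E as a stand-in for procmail's own matcher, and the helper name “matches” is made up for the example.

```shell
#!/bin/sh
# matches PATTERN TEXT: succeed when TEXT contains a match for PATTERN.
matches() {
    printf '%s\n' "$2" | grep -E -q -e "$1"
}

# Try the escaped and unescaped patterns against a real and a lookalike address.
for addr in cannon@example.com cannon@examplexcom; do
    for pattern in 'cannon@example.com' 'cannon@example\.com'; do
        if matches "$pattern" "$addr"; then
            echo "$pattern matches $addr"
        else
            echo "$pattern does not match $addr"
        fi
    done
done
```

Only the unescaped pattern should report a match for the lookalike address.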

And if you want to receive mail from me even though I sent it to twelve other people, you would put this test BEFORE the one for “too many recipients.” Once a recipe ‘delivers’ the message, whether to a mailbox or to the wastebasket, no more tests will be performed.


Here’s an advanced trick that is immediately useful, and which I didn’t see in the “procmail examples” man page: Can you create an explicit whitelist or blacklist for procmail to use, without constantly editing your .procmailrc file to update it?

# whitelist
FGREP=/usr/bin/fgrep
FROM=`formail -x From:`
DEFAULT=/var/mail/cannon

# The trailing colon locks the mailbox while the message is written
:0E:
* ? (echo "$FROM" | $FGREP -i -f $HOME/ok)
  ${DEFAULT}

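The first three lines set variables. FGREP is the path to fgrep (“fixed-string grep,” which matches plain text rather than regular expressions). FROM holds the From: header, which formail, a helper program that comes with procmail, extracts from the incoming message. DEFAULT is my regular mailbox. The rule or “recipe” that uses them begins with :0 just as before; the E after it means “else,” so the recipe is only tried if the one before it did not match. The test part of the recipe begins with * and in this case it is an unusual one: the ? tells procmail to run a command and look at its exit code instead of matching a pattern. Here we echo the From: line to fgrep, which scans the ‘ok’ file for a match and returns a success code if it finds one.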

This is fairly typical Unix behavior — we link together several small programs, each of which returns a code to tell you that it did (or did not) find something. Each program in succession uses that code to decide whether to do (or not do) something as a result of what has been found so far.
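The same exit-code test can be tried by hand. The sketch below assumes a throwaway list file created with mktemp and a made-up helper named is_listed; it uses grep -F, which is the modern spelling of fgrep.

```shell
#!/bin/sh
# is_listed ADDRESS LISTFILE
# Succeeds when any line of LISTFILE appears (ignoring case) in ADDRESS.
is_listed() {
    printf '%s\n' "$1" | grep -F -i -q -f "$2"
}

# A temporary list standing in for the real file.
list=$(mktemp)
printf '%s\n' 'mamacita@example.com' 'fred@flintstone.org' > "$list"

if is_listed 'Fred Flintstone <FRED@flintstone.org>' "$list"; then
    echo "on the list"
fi
is_listed 'stranger@example.net' "$list" || echo "not on the list"

rm -f "$list"
```

Nothing is printed by grep itself here; the -q flag keeps it quiet, so the exit code is the only thing that gets passed along.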

The ‘ok’ file is just a list of email addresses I want to hear from, one email address per line:

mamacita@example.com
fred@flintstone.org
wilma@bedrock.net

It is important to realize that you do not want any blank lines in this file — if you do have one or more, fgrep will ALWAYS match and so accept (or reject!) ALL incoming mail. There is no way to enter a ‘comment’ in this file. It must contain ONLY addresses or domain names which you want to match.
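Here is a small demonstration of the blank-line problem, plus one way to guard against it. It is a sketch with made-up file contents and a hypothetical helper, strip_blank_lines; nothing in it touches your real list.

```shell
#!/bin/sh
# strip_blank_lines FILE: print FILE without empty or whitespace-only lines.
strip_blank_lines() {
    grep -v '^[[:space:]]*$' "$1"
}

# A list with a blank line in the middle.
list=$(mktemp)
printf '%s\n' 'mamacita@example.com' '' 'wilma@bedrock.net' > "$list"

# The empty line is an empty pattern, and every input line contains it.
if printf '%s\n' 'stranger@example.net' | grep -F -i -q -f "$list"; then
    echo "blank line present: a stranger matches"
fi

# The same list with the blank line removed.
clean=$(mktemp)
strip_blank_lines "$list" > "$clean"
if ! printf '%s\n' 'stranger@example.net' | grep -F -i -q -f "$clean"; then
    echo "blank line removed: the stranger no longer matches"
fi

rm -f "$list" "$clean"
```

Running the list through a filter like this after each hand edit is cheap insurance.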

To add a new entry to the ok file, I just type:

echo porkchop@example.com >> $HOME/ok

Now he’s on my whitelist, which means he can send me anything. To blacklist someone, it’s just as easy:

# Dangerous blacklist method
:0E
* ? (echo "$FROM" | $FGREP -i -f $HOME/ng)
        /dev/null

You have to be careful, because it’s very easy to lose mail you want this way. Instead of /dev/null, it’s better to write it to a file somewhere; to me, just moving it out of my mailbox into a ‘junk’ file is enough to lower my blood pressure:

# Safer blacklist method
# (the trailing colon locks the junk file while the message is written)
:0E:
* ? (echo "$FROM" | $FGREP -i -f $HOME/ng)
        $HOME/junk

I like these two recipes because the first one lets me be aggressive about killing off junk mail without cutting off my friends, and the second means that I can cut off a pest at once without having to update (and test!) new procmail recipes.

The other nice thing about these lists is that it’s much easier to enter a single new email address than it is to deal with the procmail syntax. Once you get the procmail bit working reliably, you don’t have to mess with it just to add a few names to your list.


Of course, spammers change their names as often as they change ISPs, so a simple blacklist isn’t much good against them. You have to look for signs they can’t change so easily. Fortunately, you can test any header, not just “To” and “From”. So if you’re being harassed by someone at spam-r-us.com, let’s say, here’s a good way to kill them off:

:0
* ^Received:.*spam-r-us\.com
	/dev/null

You can even test the body of the message, but that’s more ‘expensive’ and should be a last resort. To do that, replace the :0 above with :0 B (the B flag tells procmail to search the body instead of the headers).

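If you simply discard mail that doesn’t have a “To:.*porkchop” header (and isn’t from a list you subscribed to, of course), that will kill 90% of spam, according to some reports. Be aware that this also catches legitimate mail on which you were only Cc’d or Bcc’d, which is one more reason to file it in ‘junk’ rather than /dev/null. To write that test, use an exclamation point, which is the procmail way of saying “NOT!”: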

# The second colon locks the junk file while the message is written
:0:
* !^To:.*porkchop@example\.com
	$HOME/junk
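To get a feel for which messages a negated test like this selects, you can imitate it in the shell. This is a rough sketch, assuming grep -E -i is close enough to procmail's matching (procmail ignores case by default); the helper name not_to_me and the sample headers are invented.

```shell
#!/bin/sh
# not_to_me ADDRESS_REGEX   (headers are read from standard input)
# Succeeds when no To: line mentions the address, which is the
# situation the "!" condition in the recipe selects.
not_to_me() {
    ! grep -E -i -q -e "^To:.*$1"
}

me='porkchop@example\.com'

# Addressed directly to me: the negated test fails, so the mail stays put.
printf '%s\n' 'From: a@example.net' 'To: Porkchop <porkchop@example.com>' |
    not_to_me "$me" || echo "addressed to me: stays in the inbox"

# Only Cc'd: the To: test finds nothing, so this would be filed as junk.
printf '%s\n' 'From: b@example.net' 'To: someone@example.org' 'Cc: porkchop@example.com' |
    not_to_me "$me" && echo "only Cc'd: goes to junk"
```

The second case is worth noticing: a message that only lists you on the Cc: line is treated the same as one that does not mention you at all.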

ANY time you make a change to your .procmailrc, I strongly urge you to add a “VERBOSE=on” line (procmail variable names are case-sensitive, so it must be in capitals) and send yourself a message, then immediately remove the “VERBOSE=on” line and check the log file for error messages. (Oh, yes, the log file! Make sure your .procmailrc starts with this line:

	LOGFILE=$HOME/pm.log

Every once in a while, you’ll want to check that to make sure that things are going where you expect them to go, and also keep it from filling up your disk. I use a small shell script to do that easily.)
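A minimal sketch of such a script follows. It is not necessarily the one mentioned above; the function name trim_log and the 500-line default are assumptions, and the demonstration runs on a throwaway file rather than a real log.

```shell
#!/bin/sh
# trim_log FILE [LINES]
# Keep only the last LINES lines of FILE (default 500). The file is rewritten
# in place, so its owner and permissions stay the same.
trim_log() {
    file=$1
    keep=${2:-500}
    [ -f "$file" ] || return 0
    tmp=$(mktemp) || return 1
    tail -n "$keep" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demonstration on a temporary file holding 1000 numbered lines.
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "line " i }' > "$demo"
trim_log "$demo" 10
echo "lines left: $(wc -l < "$demo" | tr -d ' ')"
rm -f "$demo"
```

For the real thing you would call trim_log "$HOME/pm.log" after reading the tail of the log. A message that arrives at the exact moment the file is rewritten could lose its log entry, which seems an acceptable risk for a log.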

I’m sure you’ll have more questions; I know I did! But I hope that’s enough to get you going.
