Justin's Linklog

Spam: Bram shares a spam-filtering tip — ‘most of the viruses I get have a Message-Id tacked on by the local mailserver. A little bit of messing with procmail and suddenly my junk mail level is under control.’

This is what the SpamAssassin rule MSGID_FROM_MTA_SHORT does. It gets:

  OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  4.432   6.7680   0.0560    0.992   0.94    3.67  MSGID_FROM_MTA_SHORT

The rule hits 6.7680% of spam, but it also hits 0.0560% of ham, so 99.2% of its hits are spam (the 0.992 figure above). By default in SpamAssassin 2.6x, it gets a score of 3.67 points.
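The 99.2% figure can be reproduced from the two hit rates. This is a small sketch, assuming the accuracy figure is simply the spam hit rate divided by the sum of the spam and ham hit rates; the function name is made up for illustration, and the numbers quoted above are consistent with that assumption.

```python
def spam_over_overall(spam_pct: float, ham_pct: float) -> float:
    """Share of a rule's hits that are spam, given hit rates in percent.

    Assumption: both rates are percentages of their own corpus, so the
    result does not depend on how much spam or ham was collected.
    """
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("the rule hit nothing, so the ratio is undefined")
    return spam_pct / total


# Hit rates quoted for MSGID_FROM_MTA_SHORT
ratio = spam_over_overall(6.7680, 0.0560)
print(f"{ratio:.3f}")  # 0.992
```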

There’s a lot of divergence between people’s corpora — for instance, I currently have no ham mails that hit this, so it’s 100% accurate for my current mail collection; but some other people have an 80% hit-rate.

This is because some large-scale legitimate mass-mailers — for no apparent reason — also omit the Message-ID when they send the message across the internet. This isn’t quite a contravention of RFC 2822, but that RFC strongly recommends using the header:

  Though optional, every message SHOULD have a ‘Message-ID:’ field.

(See RFC 2119 for what ‘SHOULD’ means: it’s a strong recommendation, to be ignored only for a valid reason.)

The moral for legit senders: make sure you read the RFCs before you start sending mail over SMTP; otherwise you’ll look like a spammer.
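For senders, one way to follow that advice is to set the header yourself before the message leaves your software. This is a minimal sketch using Python's standard `email` package; the helper name `ensure_message_id`, the addresses, and the `example.com` domain are placeholders chosen for illustration.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def ensure_message_id(msg: EmailMessage, domain: str) -> EmailMessage:
    """Add a Message-ID header if the message does not already carry one."""
    if not msg.get("Message-ID"):
        # make_msgid() returns a unique id of the form <...@domain>
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


# Hypothetical newsletter message
msg = EmailMessage()
msg["From"] = "news@example.com"
msg["To"] = "reader@example.net"
msg["Subject"] = "Monthly update"
msg.set_content("Hello")

ensure_message_id(msg, "example.com")
print(msg["Message-ID"])
```

Calling the helper a second time leaves the existing header alone, so it is safe to run just before handing the message to the mail server.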

The moral for spamfilter developers: watch out for the legit bulk mail senders; some of them do really bizarre things with SMTP. ;)
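For filter developers, the kind of check discussed here might be sketched roughly as follows. This is not SpamAssassin's implementation. It assumes, purely for illustration, that a Message-ID whose domain part names your own receiving mailserver was added on arrival; the function name and the hostname `mx.example.org` are hypothetical.

```python
from email import message_from_string


def msgid_looks_mta_added(raw_message: str, local_hosts) -> bool:
    """Guess whether the Message-ID was tacked on by a local mailserver.

    Assumption: an id whose domain part is one of our own mail hosts
    was generated on receipt, not by the original sender.
    """
    msg = message_from_string(raw_message)
    msgid = msg.get("Message-ID")
    if not msgid:
        return False  # no header at all, so nothing was added locally
    domain = msgid.strip().strip("<>").rpartition("@")[2].lower()
    return domain in {host.lower() for host in local_hosts}


# Two hypothetical messages: one stamped by our own server, one not
stamped = (
    "From: someone@sender.example\n"
    "Message-ID: <200401011200.i01C0x12345@mx.example.org>\n"
    "Subject: hi\n\nbody\n"
)
normal = (
    "From: someone@sender.example\n"
    "Message-ID: <abc123@sender.example>\n"
    "Subject: hi\n\nbody\n"
)

print(msgid_looks_mta_added(stamped, ["mx.example.org"]))  # True
print(msgid_looks_mta_added(normal, ["mx.example.org"]))   # False
```

Mail submitted by your own users through the same server would also match, and so would the legit bulk senders mentioned above, so a check like this is better treated as one scored signal than as a hard block.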

