SAY2K10 Doh

Happy new year! Or maybe not. Doh.

Over a year ago, Lee Maguire noticed that a contributed SpamAssassin rule, FH_DATE_PAST_20XX, was naively written — simply to match any date in the year 2010 or later — and would start to false-positive on all mail in 14 months. We made the trivial fix to avoid this (for at least 10 years, by which point the rule would have obsoleted itself through normal means), and I committed it to SVN.
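As a rough illustration of the failure mode (a sketch only, not the rule text as shipped), here is what "match any year from 2010 onward" looks like next to a pattern that pushes the problem out by a decade. Both patterns and the helper name `looks_far_future` are hypothetical, made up for this example.

```python
import re

# Hypothetical patterns for illustration; not copied from the SpamAssassin ruleset.
NAIVE_FUTURE_YEAR = re.compile(r"\b20[1-9][0-9]\b")    # treats 2010..2099 as "future"
PATCHED_FUTURE_YEAR = re.compile(r"\b20[2-9][0-9]\b")  # treats 2020..2099 as "future"


def looks_far_future(date_header: str, pattern: re.Pattern) -> bool:
    """Return True if the Date header contains a year the pattern flags."""
    return pattern.search(date_header) is not None


new_years_day = "Fri, 01 Jan 2010 00:05:00 +0000"

# The naive pattern fires on perfectly ordinary mail sent in 2010...
naive_hit = looks_far_future(new_years_day, NAIVE_FUTURE_YEAR)
# ...while the patched pattern stays quiet until 2020.
patched_hit = looks_far_future(new_years_day, PATCHED_FUTURE_YEAR)
```

Note that the patched pattern has the same flaw with a later deadline, which is why the fix only buys "at least 10 years".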

Problem solved, right? Nope. I’d committed to trunk, but in a moment of inattention had forgotten to backport the fix to the stable release branch, 3.2.x, as well. Nobody else noticed the mistake, and several months later, boom:

Bugger.

Annoyingly, the GA had assigned this rule 3.5 points in the 3.2.0 rescoring run. This meant that the effective default threshold had been lowered from 5.0 points to 1.5, which produced a 2% false positive rate during the first 13 hours of the new year.
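The arithmetic behind that "effective threshold" can be sketched as follows. The 5.0 and 3.5 figures come from the numbers above; the `is_spam` helper and the assumption that a message is flagged when its total score meets or exceeds the threshold are simplifications for illustration.

```python
DEFAULT_THRESHOLD = 5.0  # default spam threshold, as stated above
RULE_SCORE = 3.5         # score the GA assigned to FH_DATE_PAST_20XX


def is_spam(other_rules_score: float, date_rule_fired: bool,
            threshold: float = DEFAULT_THRESHOLD) -> bool:
    """Hypothetical scorer: sum the rule scores and compare to the threshold."""
    total = other_rules_score + (RULE_SCORE if date_rule_fired else 0.0)
    return total >= threshold


# With the date rule firing on every message, the other rules only need
# to contribute this much for a message to be flagged:
effective_threshold = DEFAULT_THRESHOLD - RULE_SCORE

# A legitimate message scoring a harmless 1.5 on other rules:
flagged_in_2010 = is_spam(1.5, date_rule_fired=True)
flagged_normally = is_spam(1.5, date_rule_fired=False)
```

In other words, any legitimate mail that picked up 1.5 points from unrelated rules went straight to the spam folder until the update arrived.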

After that point, the fix was pushed to the sa-update channel, and anyone who runs sa-update regularly (as they should!) was brought back to normal filtering behaviour.

The rule is superfluous anyway, since it overlaps with a better-written “eval” rule, DATE_IN_FUTURE_96_XX. Accordingly, the most likely scenario is that it’ll be removed.

Personally, I see a few lessons from this:

  • Obviously, I need to pay more attention. This is easier said than done though, since SpamAssassin has nothing to do with my day job anymore; it’s a spare-time thing nowadays, and that’s a rare resource, unfortunately. :( But still, a chastening result, and I’m very sorry for my part in this screwup.

  • We need more active committers on Apache SpamAssassin. If we’d had more eyes, the fact that I’d forgotten to backport the fix might have been spotted. We’re definitely in a better situation now in this regard than we were 6 months ago, so that’s good.

  • IMO, this is a good demonstration of how too many simple rules are risky; without careful vetting and moderation, it’s easy for a bad one to slip past. Perhaps we need to move more towards a DNSBL/network-rule driven approach, although this has its downsides too. Still thinking about this.

  • It’d be good to fix the GA so that it wouldn’t assign such high points to simple rules like this, without some indication that a human has vetted them and believes them trustworthy.

Daryl posted a good comment on /.:

Clearly we dropped the ball on this one. As far as I know it’s our first big rule screw up in the project’s 10 years. If you’re going to screw up you might as well do it well.

+1 to that!

And to everyone who had to clean up the fallout and spend a holiday recovering lost mails from spam folders… sorry :(


4 Comments

  1. Christopher
    Posted January 4, 2010 at 02:03

    SpamAssassin has been happily and silently working away for me for so many years, so you’re quite right.. one breakage in ten years is prettay prettaaay impressive. It’s an absolutely superb project and the few lessons to be learned will be sorted soon enough (or already have been)!

    Plus this story got picked up and spread so quickly, I was running sa-update pretty quickly after the event. Not that anything got FP’d anyway (from a brief look at the logs).

    So thanks to all who work on SpamAssassin, and I look forward to another ten years. :)

  2. Posted January 4, 2010 at 12:40

    hey Christopher –

    I think the story got picked up so quickly because it was a slow news Friday; the fact that it was ironic on two points (“Y2K10” and the “grossly in the future” snarky rule description) also helped.

    thanks for the kind words, btw. ;)

  3. James
    Posted January 6, 2010 at 01:09

    Thanks for getting the update out quickly – by the time I’d read about it, sa-update had already done its thing :)

  4. ben
    Posted January 11, 2010 at 22:26

    Oh, I just encountered this today (?) … I ran sa-update, I’m now at v 3.002005, and I restarted postfix — will that do the trick?