I’ve just found Gary
Robinson’s blog, which is a bit silly of me, as he’s the primary source
after Paul Graham’s ‘A Plan For
Spam’ paper for modern Bayesian spamfiltering techniques. I’d only read
Gary’s
page describing the Robinson-combining technique, but he’s been doing
a good job of blogging the anti-spam world in general recently. Hence,
he’s made the blogroll ;)
Some choice links from his blog:
First off — Jon Udell
points out why reply-to-whitelist systems are Bad:
The email thread that provoked this message will soon dissolve.
Including x@y.com might have been useful, but the moment has passed. If
I urgently need to contact x@y.com , I may have to grit my teeth and
register to do so. But no ad-hoc communication is going to make it over
that activation threshold.
And a different kind of whitelist — the IronPort Bonded Sender type, from
Whitelists: the weapon of choice against spam (ZDNet):
After a one and half months of testing, IronPort identified hundreds of
thousands of false-positives. At that rate, the mail generated by
IronPort’s customers alone, which make up a small percentage of the
total amount of e-mail that traverses the Internet, is resulting in
over one million false-positives per year.
Hmm. Well, I’m not 100% convinced here — I did see Amazon.FR, who are
apparently Bonded Sender customers, send a promotional mail to a mailing
list. I also saw several reports from other places regarding the same
mail. How often does a mailing list order goods from an e-commerce site?
(But, having said that, that’s the only Bonded Sender issue I’ve seen in
about 6 months — so let’s put that down to teething issues, or someone on
the list who decided to act up when ordering some goods.)
Spamland.org, a new Wiki for
spamfiltering.
Debra Bowen, a California State Senator, is proposing a hardcore new anti-spam
bill. “It would bar unsolicited e-mail advertising and allow people
who receive it to sue the senders for $500 per transmission. A judge could
triple the penalty if he or she decided the violation was intentional. …
‘The ($500) fine’s really intended to get a whole generation of
computer-savvy folks to help us do the enforcement,’ Bowen says. ‘Getting
rid of spam is never going to be the district attorney’s first priority
and it shouldn’t be.’” She also notes that she’s “seen estimates that it
could grow to 50 percent in the next five years.” Too late — it’s
already there, as far as I can tell.
FWIW, I like the sound of this — she’s requiring that commercial e-mail
senders have an existing verified-opt-in relationship beforehand. Sounds
good to me.
And finally, a very
interesting set of tests on Robinson-combining strategies. Very
interesting, that is, if you’re implementing a Bayesian spam filter.
Otherwise quite boring. ;)
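For anyone who hasn't read Gary's page, here is a minimal sketch of Robinson-combining, assuming the geometric-mean form (P, Q and S) he describes there. The function name, the log-space arithmetic and the final (1 + S) / 2 rescaling to a 0..1 score are my own choices for illustration, and none of this is taken from the tests linked above.

```python
import math

def robinson_combine(probs):
    """Combine per-token spam probabilities into one score in (0, 1).

    Hypothetical helper for illustration. Each value in `probs` must be
    strictly between 0 and 1; a result above 0.5 leans spam, below leans ham.
    """
    if not probs:
        return 0.5  # no evidence either way

    n = len(probs)
    # Sum logs instead of multiplying, so long messages don't underflow.
    log_prod_p = sum(math.log(p) for p in probs)
    log_prod_not_p = sum(math.log(1.0 - p) for p in probs)

    # P: "spamminess", one minus the geometric mean of (1 - p).
    P = 1.0 - math.exp(log_prod_not_p / n)
    # Q: "hamminess", one minus the geometric mean of p.
    Q = 1.0 - math.exp(log_prod_p / n)

    # S runs from -1 (ham) to +1 (spam); rescale it to 0..1.
    S = (P - Q) / (P + Q)
    return (1.0 + S) / 2.0

if __name__ == "__main__":
    print(robinson_combine([0.9, 0.8, 0.6]))
    print(robinson_combine([0.1, 0.2, 0.4]))
```

The nice property is that one extreme token can't swamp the rest the way it can with a straight product of probabilities, which is, as far as I understand it, the point of the technique.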