An interesting article on blog-spam countermeasures: ‘Google’s
embarrassing mistake’. Quote:
I think it’s time we all agreed that the ‘nofollow’ tag has been a complete
failure.
For those of you new to the concept, nofollow is a tag that blogs can add to
hyperlinks in blog comments. The tag tells Google not to use that link in
calculating the PageRank for the linked site. [...]
Since its enthusiastic adoption a year and a half ago, by Google, Six Apart,
Wordpress, and of course the eminent Dave Winer, I think we can all agree
that nofollow has done — nothing. Comment spam? Thicker than ever. It’s had
absolutely no effect on the volume of spam. That’s probably because comment
spammers don’t give a crap, because the marginal cost of spamming is so low.
Also, nofollow-tagged links are still links, which means that humans can
still click on them — and if humans can click, there’s a chance somebody might
visit the linked sites after all.
I agree. At the time, I pointed at this comment from Mark Pilgrim:
Spammers have it in their heads now that weblog comments are a vector to
exploit. They don’t look at individual results and tweak their software to
stop bothering individuals. They write generic software that works with
millions of sites and goes after them en masse. So you would end up with
just as much spam, it would just be displayed with unlinked URLs.
Spammers don’t read blogs; they just write to them.
I still think he was spot on.
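For anyone who hasn't seen it in markup: nofollow is a value of the link's `rel` attribute. Here's a rough sketch of how a blog engine might emit a commenter's link with it. The `comment_link` helper is hypothetical, made up for illustration, and not taken from any particular blog package.

```python
from html import escape


def comment_link(url: str, text: str) -> str:
    """Render a commenter-supplied link with rel="nofollow".

    Hypothetical helper for illustration only. Both values are
    HTML-escaped so the commenter cannot break out of the attribute
    or inject extra markup.
    """
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )


print(comment_link("http://example.com/?a=1&b=2", "my site"))
```

Note that the link still renders as an ordinary clickable link, which is exactly the point made in the quote above.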
However, one part of the ‘Google’s embarrassing mistake’ article is a red
herring: I think the chilling effect on “nonspam links” is not worth worrying
about. As Jeremy Zawodny said, life’s too short to worry about dropping links
purely in the hopes of giving yourself PageRank. I don’t know if I really want
links that people are leaving purely for that reason. ;)
In fact, I wouldn’t be surprised to hear that Google’s crawler starts treating
“nofollow” links as mildly non-spammy in a future revision, due to their wide
use in wikis, blogs etc.
To be honest, though, blog spam isn’t much of a problem for me anymore.
As I said here:
[Weblog] comment spam should be a lot easier to deal with than SMTP spam. …
With weblog comments, you control the protocol entirely, whereas with SMTP
you’re stuck with an existing protocol and very little “wiggle room”.
On my WordPress weblog [i.e. here] (which, admittedly, gets only about 1/4 of
the traffic plasticbag.org does) I’ve instituted a very simple check stolen
from Jeremy Zawodny. I simply include a form field
which asks the comment poster for my first name, and if they fail to supply
that, the comment is dropped. In addition, I’ve removed the form fields to
post directly, requiring that all comments are previewed; this has the nice
bonus of increasing comment quality, too.
Those are the only antispam measures I’m using there, and as a result of
those two I get about 1 successful spam posted per week, which is a one-click
moderation task in my email. That’s it.
The key is to not use the same measures as everyone else — if every weblog
has a different set of protocols, with different form fields asking different
simple questions, the only spammers that can beat that are the ones that
write custom code for your site — or use human operators sitting down to an
IE window.
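To make those two measures concrete, here is a minimal Python sketch. It is not the actual WordPress code. The field names `first_name` and `preview_token`, the placeholder answer, and the HMAC-signed token used to show that a comment really passed through the preview page are all assumptions for illustration; the description above only says that a name question is asked and that the direct-post fields were removed.

```python
import hashlib
import hmac

# Hypothetical values for illustration; pick your own per site.
CHALLENGE_ANSWER = "alice"          # expected answer to "what's my first name?"
SECRET_KEY = b"change-me-per-site"  # server-side secret, never sent to clients


def preview_token(author: str, body: str) -> str:
    """Token embedded in the preview page, bound to the previewed text."""
    msg = (author + "\0" + body).encode("utf-8")
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()


def accept_comment(form: dict) -> bool:
    """Return True only if the comment passes both checks."""
    # Check 1: the site-specific question. Generic spam software that
    # fills in the standard fields will leave this blank or wrong.
    answer = form.get("first_name", "").strip().lower()
    if answer != CHALLENGE_ANSWER:
        return False

    # Check 2: the comment must carry a token that is only handed out
    # by the preview page, and that matches the text being posted.
    expected = preview_token(form.get("author", ""), form.get("body", ""))
    supplied = form.get("preview_token", "")
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    form = {
        "author": "Bob",
        "body": "Nice post",
        "first_name": "Alice",
        "preview_token": preview_token("Bob", "Nice post"),
    }
    print(accept_comment(form))
    print(accept_comment({**form, "first_name": ""}))
```

The specific checks matter less than the fact that they differ from site to site: changing the question or renaming the field is enough to break software written against everyone else's form.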
Trackbacks, however: turn them off. The protocol was designed poorly, with
insufficient thought given to its abuse potential; there’s no point keeping it
around, now that it’s a spam vector.
Finally, a “perfect” solution to blog spam, while allowing comments, is
unachievable. There will always be one guy who’s going to sit down at a real
web browser to hand-type a comment extolling the virtues of some product or
another. The goal is to get it to a level where you get one of those per
week, and it’s a one-click operation to discard them.
(Update: This story got Slashdotted!
The poor server’s been up and down repeatedly; looks like it needs an upgrade. In the meantime,
WP-Cache has proven worth its weight in gold; recommended…)