Links for 2008-08-05

Why San Francisco’s network admin went rogue: an “eyewitness account” with allegations about the SF network admin in question (no documentation, passwords kept to himself instead of shared with his team, the entire network maintained by one person, never took holidays, bad-tempered and stubborn). Sounds like a recipe for a classic BOFH disaster.

Yehrin Tong Illustration: cool, hyper-detailed hand-drawn tiling patterns

working around installation bug in File::Scan::ClamAV: running the test suite results in “ERROR: Can’t open/parse the config file clamav.conf”; looks like File::Scan::ClamAV is now unmaintained :(

ALIFE Conference to reveal bio-inspired spam detection: ‘this bio-inspired spam detection algorithm, based on the cross-regulation model of T-cell dynamics, is equally as competitive [sic] as state-of-the-art spam binary classifiers and provides a deeper understanding of the behaviour of T-cell cross-regulation systems.’



Planet Antispam unborked

Those of you who visit Planet Antispam may have noticed that it hadn’t updated for a few days. Somehow or other, the Planet software had corrupted its cache, and was dying with this error:

Traceback (most recent call last):
  File "planet.py", line 167, in ?
    main()
  File "planet.py", line 160, in main
    my_planet.run(planet_name, planet_link, template_files, offline)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 240, in run
    channel = Channel(self, feed_url)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 527, in __init__
    self.cache_read_entries()
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 569, in cache_read_entries
    item = NewsItem(self, key)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 845, in __init__
    self.cache_read()
  File "/home/planet/antispam/planet-2.0/planet/cache.py", line 74, in cache_read
    self._type[key] = self._cache[cache_key + " type"]
  File "/usr/lib/python2.3/bsddb/__init__.py", line 116, in __getitem__
    return self.db[key]
KeyError: 'tag:blogger.com,1999:blog-9336495.post-117499582419244211 feedburner_origlink type'

Ah, Berkeley DB, always good for the infrequent, inscrutable, yet fatal error. A wipe of the contents of the cache directory, and it seems to be working again.
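
For the curious, a less drastic option than wiping everything might be to hunt down the damaged records first. Here is a minimal sketch in Python, assuming (going only by the traceback above) that each cached value key is expected to have a companion key ending in " type". A plain dict stands in for the Berkeley DB handle, and find_incomplete_entries is a made-up helper name, not part of Planet:

```python
def find_incomplete_entries(cache):
    """Return the value keys whose companion '<key> type' record is missing.

    `cache` is any dict-like mapping of string keys to values. The
    '<key>' / '<key> type' pairing is an assumption inferred from the
    traceback, not a documented description of Planet's cache format.
    """
    missing = []
    for key in cache:
        # Skip the companion records themselves.
        if key.endswith(" type"):
            continue
        if key + " type" not in cache:
            missing.append(key)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical cache contents: the second entry has lost its type record.
    demo = {
        "entry-1 title": "Hello",
        "entry-1 title type": "string",
        "entry-1 feedburner_origlink": "http://example.com/post",
    }
    for key in find_incomplete_entries(demo):
        print("incomplete:", key)
```

Any keys it reports could then be deleted individually, leaving the rest of the cache intact. This is untested against a real Planet cache.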

Unfortunately, I had to drop the RSS feed for Aunty Spam; it seems the domain has lapsed, and I can’t seem to find an RSS feed that contains just the spam-related Aunty Spam posts any more.



Planet Antispam Update

Hey, some Planet Antispam updates. I’ve upgraded to Planet 2.0, and that seems to have solved some of the weirdness with consuming Atom feeds.

Also, there are two new antispam weblogs added to the subscription list:

Welcome guys!

(btw, if you’re wondering what happened to the music post — I moved it over here, to the mp3 blog where it was supposed to be posted in the first place, duh ;)



Blog Spam, and a ‘nofollow’ Post-Mortem

An interesting article on blog-spam countermeasures — Google’s embarrassing mistake. Quote:

I think it’s time we all agreed that the ‘nofollow’ tag has been a complete failure.

For those of you new to the concept, nofollow is a tag that blogs can add to hyperlinks in blog comments. The tag tells Google not to use that link in calculating the PageRank for the linked site. [...]

Since its enthusiastic adoption a year and a half ago, by Google, Six Apart, Wordpress, and of course the eminent Dave Winer, I think we can all agree that nofollow has done — nothing. Comment spam? Thicker than ever. It’s had absolutely no effect on the volume of spam. That’s probably because comment spammers don’t give a crap, because the marginal cost of spamming is so low. Also, nofollow-tagged links are still links, which means that humans can still click on them — and if humans can click, there’s a chance somebody might visit the linked sites after all.

I agree. At the time, I pointed at this comment from Mark Pilgrim:

Spammers have it in their heads now that weblog comments are a vector to exploit. They don’t look at individual results and tweak their software to stop bothering individuals. They write generic software that works with millions of sites and goes after them en masse. So you would end up with just as much spam, it would just be displayed with unlinked URLs.

Spammers don’t read blogs; they just write to them.

I still think he was spot on.

However, one part of the ‘Google’s embarrassing mistake’ article is a red herring — I think the chilling effect on “nonspam links” is not to be worried about; as Jeremy Zawodny said, life’s too short to worry about dropping links purely in the hopes of giving yourself Page Rank. I don’t know if I really want links that people are leaving purely for that reason. ;)

In fact, I wouldn’t be surprised to hear that Google’s crawler starts treating “nofollow” links as mildly non-spammy in a future revision, due to their wide use in wikis, blogs etc.

To be honest, though — I don’t see the problem of blog-spam much anymore. As I said here:

[Weblog] comment spam should be a lot easier to deal with than SMTP spam. … With weblog comments, you control the protocol entirely, whereas with SMTP you’re stuck with an existing protocol and very little “wiggle room”.

On my WordPress weblog [ie. here] — which, admittedly, gets only about 1/4 of the traffic plasticbag.org does — I’ve instituted a very simple check stolen from Jeremy Zawodny. I simply include a form field which asks the comment poster for my first name, and if they fail to supply that, the comment is dropped. In addition, I’ve removed the form fields to post directly, requiring that all comments are previewed; this has the nice bonus of increasing comment quality, too.

Those are the only antispam measures I’m using there, and as a result of those two I get about 1 successful spam posted per week, which is a one-click moderation task in my email. That’s it.

The key is to not use the same measures as everyone else — if every weblog has a different set of protocols, with different form fields asking different simple questions, the only spammers that can beat that are the ones that write custom code for your site — or use human operators sitting down to an IE window.
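
As a rough illustration of how small such a check can be, here is a sketch in Python (the weblog itself runs WordPress, so this is not the actual code). The field names owner_first_name and previewed, the function should_accept, and the answer "alice" are all hypothetical:

```python
def should_accept(form, expected_answer):
    """Accept a comment only if the challenge answer matches and it was previewed.

    `form` is a dict of submitted form fields. Both field names used here
    are invented for this sketch; a real site would pick its own.
    """
    # Check 1: the poster must supply the weblog owner's first name.
    answer = form.get("owner_first_name", "").strip().lower()
    if answer != expected_answer.lower():
        return False
    # Check 2: direct posting is disabled, so the comment must have
    # passed through the preview step.
    return form.get("previewed") == "1"


if __name__ == "__main__":
    good = {"owner_first_name": "Alice", "previewed": "1", "comment": "Nice post"}
    bot = {"comment": "cheap pills"}
    print(should_accept(good, "alice"))
    print(should_accept(bot, "alice"))
```

Changing the question, the field name, or both is all it takes to make one site's check differ from the next.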

Trackbacks, however — turn that off. The protocol was designed poorly, with insufficient thought given to its abuse potential; there’s no point keeping it around, now that it’s a spam vector.

Finally, a “perfect” solution to blog spam, while allowing comments, is unachievable. There will always be one guy who’s going to sit down at a real web browser to hand-type a comment extolling the virtues of some product or another. The goal is to get it to a level where you get one of those per week, and it’s a one-click operation to discard them.

(Update: This story got Slashdotted! The poor server’s been up and down repeatedly; looks like it needs an upgrade. In the meantime, WP-Cache has proven worth its weight in gold; recommended…)



Poll: keep ‘Fixing Email Weblog’ in Planet Antispam?

I added the Fixing Email weblog to Planet Antispam a while back. However, I’m not entirely sure at this stage that its content (which seems to be primarily news syndication) fits with the “planet” concept (which is primarily intended for first-person posts).

So — quick poll. Let me know what you think, pro or con, Planet readers: should I remove the Fixing Email feed from that site?

Update: that was a pretty resounding ‘yes’. Done!



Planet Antispam at abuse.net

Planet Antispam now has a better URL — http://planet.spam.abuse.net/ . Much better!



The ‘Overseas Spammers’ and ‘Do Not Mail List’ Fallacies

Declan McCullagh: A modest proposal to end spam. Good article on Larry Lessig’s ‘spam bounties’ proposal.

Lofgren’s plan won’t give everyone who gets spammed new rights to sue (although spam victims may already have some rights under state antispam or other laws). Instead, it states that people sending unsolicited commercial e-mail must label it with ‘ADV:’ in the subject line or run the risk of being sued by the Federal Trade Commission. If you are the first to report an unlabeled spam-o-gram to the government, you will get a bounty of ‘not less than 20 percent’ of the fine the spammer pays, assuming it can ever be collected.

There are problems with this. As far as I know, the FTC is not having a problem collecting spam; the figures I’ve seen (can’t recall them right now) indicate that they get hundreds of megs a day. (Even the SpamAssassin.org spamtraps get over 100MB a day.)

The difficulty is chasing down the perpetrator, and prosecuting. That takes law-enforcement manpower, and that’s just not there right now — because, let’s face it, spam is not a serious offence like rape or murder.

Anyway, Declan says that the major problem is that the spammers are offshore:

For one thing, an increasing percentage of it comes from overseas, and you can be certain that offshore bulk mailers will gleefully thumb their noses at Congress. Ken Schneider, chief technical officer of antispam company Brightmail, estimates that 30 percent to 50 percent of the spam his company tracks comes from outside the United States. ‘It’s a big number,’ Schneider said. ‘It’s a global economy, and spammers are certainly taking advantage of it.’

This is a frequent misapprehension. It’s true that much spam is relayed through machines in Asia and South America, but the originators (the people who are writing the spam and sending it to compromised relay machines and proxies) are US-based. In fact, a vast quantity of ‘em seem to be based in Florida. (This is the thing about country-code blacklists. In reality, if we could track a message all the way back to the origin, a state-code blacklist for FL would probably work much better ;)

In other news from the same article:

… Sen. Chuck Schumer, D-N.Y., is expected to introduce a bill this week to create a national ‘do not e-mail’ list–an idea that the New Democrats touted earlier this month.

OK, while I’m here, let’s debunk ‘do not mail’ lists too. ;) ‘Do not call’ lists work well for telephones, since you typically have only one phone number. But for email:

In summary, I’m not confident a ‘do not mail’ list could actually be operable.

Finally — The SBL’s answer to the EMarketersAmerica.org SLAPP lawsuit.

