Slides from Toorcon 2004

Spam: my slides from the presentation I gave at Toorcon 2004, ‘Spam Forensics: Reverse-Engineering Spammer Tactics’, are now up. Hope they prove enlightening ;)


BitTorrent

Net: Great NYTimes article interviewing Bram Cohen about BitTorrent (u: sitescooper p: sitescooper). Good to see that it landed him a job with Valve, but let’s hope that’s not the last piece of free software from Bram…

One of the best things about the article, BTW, is that it does take notice that BT isn’t a tool for piracy. Refreshing, given how these things are often covered.


Spam Composition

defective yeti: Spam Composition. Let’s hope this never happens — those misspellings are SpamAssassin’s bread and butter!


1 January 1659/60 (Lord’s Day)

Samuel Pepys has a weblog:

This morning (we living lately in the garret,) I rose, put on my suit with great skirts, having not lately worn any other clothes but them. Went to Mr. Gunning’s chapel at Exeter House, where he made a very good sermon.

Anyway, still recovering from the holidays. Hope you all had a good one.


MAPS gets the TCR treatment, a public corpus, and a wedding

Found on Paul Graham’s site: “according to a recent study, the MAPS RBL, probably the best known blacklist, catches only 24% of spam, with 34% false positives. It would take a conscious effort to write a content-based filter with performance that bad.”

The “recent study” is by David Nelson at Giga Information Group, sometime last year.

For the sake of it, I’ve checked out how the MAPS figures stack up using TCR (total cost ratio), Ion Androutsopoulos’ metric for measuring spam filter performance. TCR is a very nice single-figure metric, which takes into account the “inconvenience factor” of misfiled mails, based on a “lambda” setting indicating what action is taken when a mail is classified as spam. For MAPS, I’m assuming a lambda of 9, the guideline figure for systems which bounce mail back to the sender, instead of 1 for simple tagging, or 999 for outright deletion with no notification.

So: using a lambda of 9, MAPS gets a TCR of 0.0912, a Spam Recall of 24%, and a Spam Precision of 17%. It’s worth noting that the baseline figure for TCR is 1.0, which represents no filtering whatsoever, i.e. all the spam comes right into your mailbox.

In other words, using MAPS is more inconvenient all-round than not filtering your mail at all, if these figures are to be believed ;)
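For anyone who wants to run the same sort of numbers on their own filter, here is a minimal Python sketch of TCR, Spam Recall and Spam Precision, using the usual definition of TCR: the number of spam messages divided by (lambda times the legitimate mails misfiled as spam, plus the spam that got through). The function names and the message counts are made up for illustration; they are not the counts from the Giga study, so don't expect the output to match the MAPS figures above.

```python
def tcr(n_spam, n_legit_as_spam, n_spam_as_legit, lam):
    """Total cost ratio: cost of no filtering divided by cost with the filter.

    With no filter, every one of the n_spam messages reaches the inbox,
    so the baseline cost is n_spam. With the filter, each legitimate mail
    misfiled as spam costs `lam`, and each missed spam costs 1.
    """
    cost_with_filter = lam * n_legit_as_spam + n_spam_as_legit
    if cost_with_filter == 0:
        return float("inf")  # a perfect filter has no cost at all
    return n_spam / cost_with_filter


def spam_recall(n_spam, n_spam_as_legit):
    """Fraction of all spam that the filter actually caught."""
    return (n_spam - n_spam_as_legit) / n_spam


def spam_precision(n_spam, n_spam_as_legit, n_legit_as_spam):
    """Fraction of the mail flagged as spam that really was spam."""
    caught = n_spam - n_spam_as_legit
    flagged = caught + n_legit_as_spam
    return caught / flagged if flagged else 0.0


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    n_spam = 100           # spam messages in the test set
    n_spam_as_legit = 76   # spam the filter let through
    n_legit_as_spam = 10   # legitimate mails wrongly flagged

    for lam in (1, 9, 999):
        print(lam, tcr(n_spam, n_legit_as_spam, n_spam_as_legit, lam))
    print("recall", spam_recall(n_spam, n_spam_as_legit))
    print("precision", spam_precision(n_spam, n_spam_as_legit, n_legit_as_spam))
```

A filter that flags nothing at all lets all the spam through with no false positives, so its TCR comes out at exactly 1.0, which is the baseline mentioned above. Anything under 1.0 is costing you more than it saves.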

More spam: I’ve just assembled a totally-public corpus of spam and non-spam mail, to allow spamfilter developers to compare and contrast results using the same data. Let’s hope it proves useful.

Not spam: finally, I’m off to Chester for a wedding tomorrow morning; my good mates Kitty and Gerry are tying the knot, in Chester Zoo, no less. Let’s hope this horrible cold I’ve had all week dies down before Saturday…


(Untitled)

One thing I should note — World New York is possibly my best news source for WTC-related commentary, especially for the eyewitness reports. A great site. It was great before the WTC, too — let’s hope things get back to normal pretty soon…
