Open source v closed-source spam filtering

Spam: I’m quoted in New Scientist! w00t!

SlashDot picked it up pretty quickly. One comment there misses the point, though:

This is interesting and promising technology. But like all antispam techniques, spammers will find a way around it. Once spammers get a copy of the software, they can create and test countermeasures in the comfort of their own sleazy lairs.

It’s worth talking about this. Newsflash: spammers have no difficulty testing their spam against closed-source spam filters, even when they can’t ‘get a copy’ and test them in ‘their sleazy lairs’.

How do they do it? Easy: just set up an account at a site that uses that filter (for AOL, Yahoo!, Hotmail, and GMail, it’s pretty obvious how to do that; for other closed-source filters, find an ISP that uses it). Then send ‘test mails’ repeatedly to that account, and apply trial and error to see what gets past the filter. Eventually, they figure out what works for that filter, and what doesn’t.

How did I figure this out? Well, I came across the manual for the Send-Safe ratware on-line. It noted that the ‘hashbuster’ randomisation technique, which we in the SpamAssassin team had long assumed was intended to block hash matches by DCC, Pyzor and Razor, was in fact intended to block AOL’s implementation of that system. The open source ones weren’t even mentioned.

Update: found it — from their FAQ:

Mime Encoded content

If you want to get into AOL… use it.

MIME encoders allow you to send documents written within a specific application through email without causing readability or formatting problems. For example, you can send a letter created in MSWord with and be certain that it arrives at its destination in the same format by encoding it with MIME first. The recipient then decodes it back into the original MSWord format.

That isn’t why we use it though.

We use it to cause ‘uniqueness’.

When you put a rotate tag at the beginning of a MIME encoded email, it causes everything after that point (including checksums) to be ‘different’ in every message.

Why is that important?

Because it throws off filters that look for many copies of the same message to nuke.
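To make the ‘uniqueness’ trick concrete, here is a minimal Python sketch of why it hurts a filter that counts identical copies. It assumes a deliberately naive filter that fingerprints each body with a plain SHA-256 digest; the function names are made up for illustration, and real systems such as DCC, Pyzor and Razor use fuzzier signatures than this, so treat it as a toy model rather than a description of any of them.

```python
import hashlib
import random
import string


def body_checksum(body: str) -> str:
    """Exact-match fingerprint of a message body (hypothetical naive filter)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def with_random_token(body: str, rng: random.Random) -> str:
    """Prepend a random token, standing in for the FAQ's 'rotate tag'."""
    token = "".join(rng.choices(string.ascii_lowercase, k=12))
    return f"{token}\n{body}"


rng = random.Random(0)
body = "Identical bulk message text"

# 100 untouched copies collapse to a single checksum...
plain = {body_checksum(body) for _ in range(100)}
# ...while 100 'hashbusted' copies each get their own.
busted = {body_checksum(with_random_token(body, rng)) for _ in range(100)}

print(len(plain))   # 1
print(len(busted))  # distinct for practically every copy
```

A filter that only counts exact checksum matches sees one heavily repeated message in the first case and a pile of apparently unrelated ones in the second, which is the effect the FAQ is describing.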


Seldom-Asked Questions About Japan

Japan: This is fantastic; full of odd little facts about Japan. Here’s one I really like:

  1. ‘(How do you explain) the frequency of Japanese people (usually women) running or jogging for no apparent reason. In the travel agency, ‘let me get you a copy’ and she runs away. In my office a woman runs to the bathroom (can be explained) and then runs back to her desk (huh?). Most of the teachers I work with wait for the bell in the teacher’s room, and then practically sprint to their classes. Do you know why all this running is going on? Fitness? Service? An Edo-era leftover?’–Question submitted by Ben Schwartz
  2. I once teasingly asked a female with whom I worked why she always did a sort of feigned jog to and from the copier, especially since her jog was slower than her walk. The humour wasn’t lost on her, but she explained that many Japanese do this at work because the appearance of urgency is important in more traditional office environments. You don’t have to truly run around frantically, but just offer the gesture.–Answer kindly submitted by Lou C.

Another good one — it seems Bob the Builder had to have a finger added for the Japanese market, in order to not look like a yakuza.


On Copy Protection and DRM

Security: Dan Bricklin writes:

As I pointed out in ‘Copy Protection Robs The Future’, the only reason I have a copy (of VisiCalc) that can still work is that someone kept a ‘bootleg’ uncopyprotected copy around. The original disks may not have worked on a Longhorn machine. Just copying the files from the original 5 1/4″ floppy to a 3 1/2″ one that would fit in today’s machines certainly would result in a non-working copy, because of copy protection. We will regret ‘Digital Restriction/Rights Management’ in the future.

Here’s the essay he mentions: Copy Protection Robs The Future:

Copy protection, like poor environment and chemical instability before it for books and works of art, looks to be a major impediment to preserving our cultural heritage. Works that are copy protected are less likely to survive into the future. The formal and informal world of archivists and preservers will be unable to do their job of moving what they keep from one media to another newer one, nor will they be able to ensure survival and appreciation through wide dissemination, even when it is legal to do so.

This ties in nicely with The Long Now Foundation and the importance of the public domain.


For reference: email usability

I was clearing out my mail last night, and came across a message that referenced a mail I sent a few years back; it’s a selection of feature requests I made at the start of development of Evolution, the GNOME mail reader/contact manager/Outlook clone. (Not sure if any got implemented BTW ;)

Since I still think some of these are killer ideas that would really improve email readers, and since the only copy is sitting in a mailing list archive, I’ll take a local copy here by posting it.

Worth noting that the reason it came up was a quick mail exchange with Kaitlin ‘Duck’ Sherwood, who’s the queen of email usability, and will be working on the OSAF’s Chandler PIM (and mail) application. Not only had she read the CHI’96 paper in question, she noted it as a ‘profound influence’! Cool — and bodes well for Chandler!

Kaitlin also replied with some excellent plans for folder-overview presentation; I can’t wait to see the results in Chandler, personally. If you want an idea of this stuff, her page on the Perfect Email Client lives here.

Quick top tip: filtering or colorizing messages based on how you’re addressed in the headers is immediately beneficial. Quoting Ducky:

My pet view also color-codes messages based on how you were addressed.
  • to me and only me
  • to me and other people
  • cc me and only me
  • cc me and other people
  • bcc me

Most people who have implemented the above techniques (you can do it with either Outlook or Eudora, though it’s somewhat painful to set up) tell me they’ve saved between 25% and 50% of their prior email time.

She’s right, too!
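For anyone who wants to try this outside Outlook or Eudora, the five categories above can be sketched in a few lines of Python using the standard `email` package. The category names and the `addressing_category` function are invented for this example, and ‘only me’ is taken to mean ‘I am the sole address in that header’, which is just one reading of the list. Anything that reaches you without your address in To or Cc is lumped in with bcc, so mailing-list traffic will land there too.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def _addresses(msg, header):
    """Lower-cased set of bare addresses found in one header."""
    pairs = getaddresses(msg.get_all(header, []))
    return {addr.lower() for _name, addr in pairs if addr}


def addressing_category(msg, me):
    """Classify a message by how `me` was addressed (hypothetical labels)."""
    me = me.lower()
    to = _addresses(msg, "To")
    cc = _addresses(msg, "Cc")
    if me in to:
        return "to-me-only" if to == {me} else "to-me-and-others"
    if me in cc:
        return "cc-me-only" if cc == {me} else "cc-me-and-others"
    # Not named in To or Cc: assume we were bcc'd (or reached via a list).
    return "bcc-me"


msg = EmailMessage()
msg["To"] = "Me <me@example.org>, bob@example.org"
print(addressing_category(msg, "me@example.org"))  # to-me-and-others
```

A mail reader could then map each label to a colour or a filter rule; the mapping is left out here because it is purely a matter of taste.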

From: Justin Mason (spam-protected)
Date: Fri, 02 Jun 2000 12:11:56 +0100
Subject: CHI’96 paper on mail usability and some thoughts

Hi guys,

Dunno if you’ve seen this, it’s a good paper on email usability and some recommendations to improve same…

http://www.acm.org/sigchi/chi96/proceedings/papers/Whittaker/sw_txt.htm

Basically it says:

  1. heavy mail users use incoming mail as a to-do list and appointment tracker (I personally would add “as a reference bookshelf” as well in my case);

  2. filing into folders doesn’t work in a lot of cases; once it’s out of the inbox it’s off the radar and soon forgotten about; and folder names are hard to pick and remember;

  3. users quite often do not delete mails in case they become valuable context for an ongoing discussion, resulting in inbox bloat and an interleaved stack of messages from threads filling up the inbox;

  4. inbox bloat means important mails from a day or two ago soon scroll out of the “main” window and are lost in the noise.

to fix these:

  • it recommends threading (makes sense, and we know that). This reduces the visual impact of inbox bloat and sorts 3. and 4.

  • close links to PIM functions such as todo and datebook would be good to help with 1. (that’s the plan isn’t it!)

  • vfolders should deal with 2.

A few ideas I came up with myself during reading it:

  • I previously added some code to ExMH to colorise messages, and used the colours as a way of differentiating “todo low-priority”, “todo high-pri”, “support mails”, “pals chatting”, etc. This worked very well as a way to scan a lot of mails and immediately work out the rough categorisation without having to read and parse the from and subject. (unfortunately the code stopped working in the next ver of ExMH and my Tk knowledge wasn’t good enough to fix it!) Helps with problem 4 and aids scanning.

  • up to now there’s been essentially 3 states for mail messages: “unread”, “read” and “deleted” (ie. not there anymore). I would like to see another state, “saved_as_context”, which would be similar to deleted; ie. the mail would not be visible to the user at all. However, if another mail came in that referenced the “saved_as_context” mail, it would be possible (probably through hitting a “view context thread” button) to see all of that new msg’s context mails. This sorts out problem 3 in a nice way IMHO. BTW it may even be better to use “saved_as_context” instead of “deleted”, ie. keep deleted msgs around for possible context use, and purge them periodically.

  • Retitling mails (ie. changing their subjects after they’ve been received) would help deal with problem 1 as well; e.g. changing a mail from “Re: help” to “How to fix the latest Outlook worm” is obviously handy for future visual message retrieval ;)

  • It would be handy if an incoming mail can be converted into a To-Do list item in the PIM interface; ie. right-click on mail, select “add to to-do list”, and that mail (and/or thread!) would be visible in the To-Do PIM interface in some way (even just as a “see this mail” link a la the “note” attached to Palm To-Do list items). It’d also be cool if this went both ways so the To-Do list position/priority of a mail was visible in the inbox view.
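(Not part of the original mail: a rough Python sketch of how the “saved_as_context” state from the list above might work. The `Mail` and `Mailbox` classes are hypothetical, and it assumes each message carries the Message-IDs of its ancestors, as the References header normally does. It is one possible shape for the idea, not how Evolution or any other mail reader does it.)

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    UNREAD = "unread"
    READ = "read"
    SAVED_AS_CONTEXT = "saved_as_context"  # hidden, but kept for later


@dataclass
class Mail:
    message_id: str
    subject: str
    references: list = field(default_factory=list)  # ancestor Message-IDs
    state: State = State.UNREAD


class Mailbox:
    def __init__(self):
        self._by_id = {}

    def add(self, mail):
        self._by_id[mail.message_id] = mail

    def inbox(self):
        """Everything the user normally sees: context-only mails are hidden."""
        return [m for m in self._by_id.values()
                if m.state is not State.SAVED_AS_CONTEXT]

    def context_thread(self, mail):
        """The 'view context thread' button: hidden mails this one refers to."""
        found = (self._by_id.get(ref) for ref in mail.references)
        return [m for m in found
                if m is not None and m.state is State.SAVED_AS_CONTEXT]


box = Mailbox()
old = Mail("<1@example.org>", "help", state=State.SAVED_AS_CONTEXT)
new = Mail("<2@example.org>", "Re: help", references=["<1@example.org>"])
box.add(old)
box.add(new)

print([m.subject for m in box.inbox()])              # ['Re: help']
print([m.subject for m in box.context_thread(new)])  # ['help']
```

Treating “deleted” the same way would only need a periodic purge of old `SAVED_AS_CONTEXT` entries, as the mail suggests.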

Anyway, these are some ideas I thought I’d throw in. I’m pretty excited by the possibilities of Evolution, and I’m looking forward to trying it out; after reading that paper, I just had to share ;)

BTW I haven’t used MS Outlook, so forgive me if Outlook sorts out these problems and I just didn’t notice — ditto for Evolution too, I haven’t had the time to get it compiling yet! ;)

–j.


greenish foul-smelling gravy

While trekking in Nepal, I had a copy of the Lonely Planet Guide to Trekking in the Himalayas, borrowed from our mates Caolan and Barbara. It was especially notable for its incredible medical section, which contained lots of info on what drugs to use to treat various diseases, described symptomatically (of course, in most of the world, most of the common illnesses boast symptoms similar to “I have greenish foul-smelling gravy squirting from both ends of my body”. But it’s good to be able to tell them apart).

It was also notable because anyone who had a copy knew all about altitude sickness, and was indescribably paranoid. The ones who were charging up the trails as fast as they could generally did not have a copy, and no doubt half of them came back down again in slightly nasty circumstances.

Anyway, it was the best medical info I’ve ever read. Reading the paper today, I came across a reference to e-med.co.uk, which claims to offer medical info, including treatment details, for people who might be far away from a doctor. The perfect resource for a know-it-all who doesn’t want to spend money and time on a doctor, just to be told to go home and take an aspirin! Unfortunately it seems to be a “consultation by email” service, rather than a “look it all up” one. Ah well.

Caolan and Barbara should be somewhere around Oz by now. I must see if I can dig up the URL of their travelogue site, it’s great fun.
