Daily Archives: February 20, 2006

TREC Spam Corpus

Some news from TREC’s Gordon Cormack: The TREC 2005 Corpus (92,000 messages – 42,000 ham; 50,000 spam) is now available for self-serve download. TREC Spam Evaluation is a NIST program to develop methods to measure spam filter accuracy and performance. More details here. The corpus can be picked up at Gordon’s site. As far as […]

Posted in Uncategorized | Comments closed

Four Things

I don’t do silly blog antics much, but I got tagged by Mat for the Four Things meme. Looking around, it is indeed a bit more interesting than things like the usual LJ quiz, so why not! I wrote this on the plane from LA to Dublin, which may have affected some of the selections […]


The Return of Sneakernet

Keith Dawson sent this on — an interview with Jim Gray, head of Microsoft’s Bay Area Research Center and winner of the ACM Turing Award, talking about new transmission systems for truly massive data collections. Very interesting: [One] option is to send whole computers. …. We’re now into the 2-terabyte realm, so we can’t actually […]



Boing Boing has an interesting case today: “I filled out a web form for a contest from Miller using a throwaway junk email address and then, months after I dumped the throwaway account, I got this to my main account! Not sure I like the idea of companies tracking me down like this.” I sent […]
