links for 2008-07-03

2 Comments

  1. Steve Loughran said,

    July 4, 2008 @ 8:27 am

    Justin, if you have petabytes of spam data and want to work with Mahout or Hadoop core, then get on the list and start talking about what you have and what you need. Someone should be able to sort out some time on one of their clusters, even if it isn’t one of the big Yahoo! ones.

  2. Justin said,

    July 4, 2008 @ 12:19 pm

    Steve — thanks for the tip!

    We only have GBs, not petabytes, but we do have a need to re-run our mass-checks (mass scans of GBs of mail using SpamAssassin) over the entire collection on a very frequent (currently daily) basis. I’d say that might be a problem, but I should really get off my ass and get over there to find out ;)
