links for 2008-07-03


  1. Posted July 4, 2008 at 08:27 | Permalink

    Justin, if you have petabytes of spam data and want to work with Mahout or Hadoop core, then get on the list and start talking about what you have and what you need. Someone should be able to sort out some time on one of their clusters, even if it isn’t one of the big Yahoo! ones.

  2. Posted July 4, 2008 at 12:19 | Permalink

    Steve — thanks for the tip!

    We only have GBs, not petabytes, but we do have a need to re-run our mass-checks (mass scans of GBs of mail using SpamAssassin) over the entire collection on a very frequent — currently daily — basis. I’d say that might be a problem, but I should really get off my ass and get over there to find out ;)