SpamAssassin as an EC2 service

I had a bit of an epiphany while chatting to Antoin about the qpsmtpd/EC2 idea. Craig had the same thoughts.

Here’s the thing — there’s actually no need to offload the SMTP part at all. That stuff is tricky, since you’ve got to build in a lot of fault tolerance, quality-of-service, uptime, etc. to ensure that the MX really is reachable. Since an EC2 instance will lose its “disks” once rebooted/shut down, you need to store your queues in Amazon S3 — which has differing filesystem semantics from good old POSIX — so things get quite a bit hairier. On top of that, it requires a little RFC-breakage; there are issues with using CNAMEs in MX records, reportedly.

However, if we offload just the spamd part, it becomes a whole lot simpler. The SPAMD protocol will work fine across long distances, securely, with SSL encryption active, and SpamAssassin will work fine as a filtering system in an entirely stateless mode, with no persistent-across-reboots storage. (What about the persistent-storage aspects of spamd operation? There’s just the auto-whitelist, which can be easily ignored, and I haven’t trained a Bayes database in 2 years, so I doubt I’ll need that either ;)
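As a sketch, the stateless setup on the EC2 side could look something like this (flag names from SpamAssassin 3.1-era docs; the paths, key filenames, and IP are hypothetical):

```
# Hypothetical local.cf on the EC2 spamd instance: nothing to persist.
use_bayes           0
use_auto_whitelist  0

# Hypothetical spamd startup: SSL on the wire, no per-user config,
# and only the MX's address allowed to connect:
#   spamd --ssl --server-key key.pem --server-cert cert.pem \
#         --allowed-ips=192.0.2.10 --nouser-config --username=spamd
```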

If the spamd server is down or uncontactable, spamc will handle this and retry with another server, or eventually give up and pass the message through, safely intact (though unscanned).

Given that there’s a cool third-party ClamAV plugin now available for SpamAssassin, this system can offload the virus-scanning work, too.

So here’s the new plan: run the MTA, MX, and the super-lean “spamc” client on the normal MX machine — and offload the “spamd” work to one or more EC2 machines.

Basically, there would be a CNAME record in DNS, listing the dynamic DNS names of the EC2 spamd instances. Then, spamc is set to point at that CNAME as the spamd host to use. As EC2 instances are started/removed, they are added/removed from that CNAME list and spamc will automatically keep up.
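On the MX side, the spamc end of that could be as small as a config file (a sketch; `spamd.example.com` is the hypothetical rotating name, and the spamc.conf mechanism assumes SpamAssassin 3.1 or later):

```
# Hypothetical /etc/mail/spamassassin/spamc.conf on the MX.
# -d: the DNS name tracking the live EC2 spamd instances
# -S: use SSL to spamd
# -s: max message size to ship (bytes); larger mails pass through unscanned
-d spamd.example.com
-S
-s 256000
```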

Pricing is reasonably affordable — don’t send over-large messages to the EC2 spamd; rate-limit total incoming SMTP traffic in the MTA; and use the SPAMD protocol’s REPORT verb to reduce the bandwidth consumption of mails in transit by ensuring that the mail messages are only transmitted one-way, MX-to-EC2, instead of both MX-to-EC2 and EC2-to-MX. That will keep the bandwidth pricing down.
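For illustration, here’s roughly what a REPORT request looks like on the wire — a sketch from the protocol as I remember it, not a reference implementation; the version number, `User:` header, and framing details should be double-checked against the spamd protocol docs:

```python
def build_report_request(message: bytes, user: str = "nobody") -> bytes:
    """Frame a mail message as a SPAMC REPORT request.

    With REPORT, the full message goes MX-to-EC2, but only a short
    textual report comes back -- not the rewritten message itself.
    """
    headers = (
        f"REPORT SPAMC/1.5\r\n"
        f"User: {user}\r\n"
        f"Content-length: {len(message)}\r\n"
        f"\r\n"
    ).encode("ascii")
    return headers + message

req = build_report_request(b"Subject: test\r\n\r\nhello\r\n")
```

The response direction is the whole point: a spam report is a few hundred bytes, regardless of how large the scanned message was.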

Recent figures indicate that I got about 90MB of mail per day, at peak, over the past weekend (which nearly DOS’d my server and caused some firefighting) — 68MB of spam, and 13MB of blowback. At 20 cents per GB, that’s 1.8 cents per day for traffic. Plus the $0.10 per instance hour, that’s $2.42 per day to run a single EC2 instance to handle DDOS spikes. Of course, that can be shut down when load is low.
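The arithmetic above checks out, assuming the 2006 prices quoted ($0.20/GB transfer, $0.10 per instance-hour):

```python
# Back-of-the-envelope cost check, using the figures from the post.
mail_gb_per_day = 0.090                    # ~90 MB of inbound mail/day at peak
transfer_per_day = mail_gb_per_day * 0.20  # one-way traffic, MX-to-EC2 only
instance_per_day = 24 * 0.10               # one instance running all day
total_per_day = transfer_per_day + instance_per_day
print(f"${total_per_day:.2f}/day")         # -> $2.42/day
```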

Yep, this is looking very promising. Now when are Amazon going to let me onto the beta program for EC2?…



  1. Matt Sergeant
    Posted November 30, 2006 at 16:42 | Permalink

    MessageLabs is cheaper :-)

  2. Posted November 30, 2006 at 16:53 | Permalink

What is it, UKP1 per mailbox per month? I have 1 mailbox. But I doubt they’d talk to me ;)

  3. Matt Sergeant
    Posted November 30, 2006 at 17:46 | Permalink

I think it’s about UKP2 or maybe 3 per month – I’m not really sure, and our entire system is set up so that you have to go through a sales droid, unfortunately.

    IM me though – we may be able to set something up :)

  4. Posted November 30, 2006 at 18:29 | Permalink

spamc/spamd needs to add support for gzip compression alongside the SSL encryption; that’ll reduce the bandwidth even further.

  5. Posted November 30, 2006 at 20:25 | Permalink

    Craig: good idea.

  6. Posted December 3, 2006 at 15:50 | Permalink

Well, compression would be easy: ‘ssh -C’ ;) How about a cpushare’d spamd? C.

  7. Posted December 4, 2006 at 15:23 | Permalink

    I don’t see any sign that cpushare actually is workable, yet!

  8. Posted December 7, 2006 at 21:13 | Permalink

    You could lease a server with a faster processor and more memory than the EC2 instance for a dollar a day, stick it under your desk and be done with it. Surely you’ve got a few dozen spare GB/month of transfer allowance on your high speed connection at home.

  9. Posted December 8, 2006 at 12:00 | Permalink

Daryl, unfortunately my ISP (like many European ISPs) caps the bandwidth I’m allowed to consume. An additional 27GB of traffic per month is definitely over that limit. :(

  10. Posted December 20, 2006 at 02:11 | Permalink

The solution to the disk problem is the S3 service: everything that needs to be stored should go in S3. Very simple, and S3 <-> EC2 traffic is free.

  11. Posted January 7, 2007 at 19:57 | Permalink

    I was going through the httpd logs for my SARE sa-update channels, looking for the /24s with the most hits, and noticed that somebody’s been running sa-update from a few EC2 hosts since December 15th. On Jan 5th they had (I’m assuming it’s the same person as it’s all hits for the same channels in the same order) 7 hosts run sa-update all around 10:45am EST.

    So I guess somebody’s gone to the trouble of doing it… shouldn’t really be any problems as they’re just Xen guest instances.

  12. Posted January 7, 2007 at 21:32 | Permalink

Daryl — yeah. It’d be better with the “-z” switch I’ve just added on a branch (adds compression to the wire protocol), and another useful feature would be a new verb in spamc/spamd to send only the rewritten headers back over the wire in responses…

I haven’t hacked on it much, though. Still no sign of my EC2 account — they’ve been very slow :( It’s unlikely those will be merged into 3.2.0 unless I can actually real-world test them.

  13. Neil Gunder
    Posted April 8, 2008 at 20:43 | Permalink

    I’ve been using – works really well on all of our email. We use it for our marketing and transactional email and they monitor the inbox delivery. Even integrates with google analytics.

    Took us about 3 hours to integrate too, hard to say how much of an improvement it was as we went from zero information to information overload! It’s interesting to see though as our authentication emails were going into the spam folder on and we didn’t know it.

    Neil Gunder