Email authentication is not anti-spam

There’s a common misconception about spam, email, and email authentication. Matt Cutts is the most recent promulgator, asking ‘Where’s my authenticated email?’ in a post whose comment thread largely treats this as an anti-spam question.

Here’s the thing — email these days is authenticated. If you send a mail from GMail, it’ll be authenticated using both SPF and DomainKeys. However, this alone will not help in the fight against spam.

Put simply — knowing that a mail was sent by ‘jm3485 at’, is not much better than knowing that it was sent by IP address, unless you know that you can trust ‘jm3485 at’, too. Spammers can (and do) authenticate themselves.
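To make the distinction concrete, here is a minimal sketch of a receiver that separates "who sent this?" from "do I trust them?". Everything in it is an assumption for illustration: the `AuthResult` structure, the `REPUTATION` table, the example domains, and the 0.8 threshold are all hypothetical and do not come from any real filter.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AuthResult:
    """Outcome of an SPF/DomainKeys/DKIM check (hypothetical structure)."""
    identity: str  # the identity the check vouches for, e.g. a domain
    passed: bool   # True if the identity was validated


# Hypothetical reputation table: identity -> score between 0.0 and 1.0.
REPUTATION: Dict[str, float] = {
    "friend.example": 0.95,
    "spammer.example": 0.02,
}


def trust_decision(auth: AuthResult,
                   reputation: Dict[str, float],
                   threshold: float = 0.8) -> str:
    """Authentication only establishes identity; reputation decides trust."""
    if not auth.passed:
        return "unauthenticated"
    score = reputation.get(auth.identity)
    if score is None:
        # We know who sent it, but nothing about them: no better than an IP.
        return "authenticated-unknown"
    return "trusted" if score >= threshold else "untrusted"
```

A spammer who signs mail correctly still lands in "untrusted" or "authenticated-unknown"; the authentication check on its own never yields "trusted".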

Authentication is just a step along the road to reputation and accreditation, as Eric Allman notes:

Reputation is a critical part of an overall anti-spam, anti-phishing system but is intentionally outside the purview of the DKIM base specification because how you do reputation is fundamentally orthogonal to how you do authentication.

Conceptually, once you have established an identity of an accountable entity associated with a message you can start to apply a new class of identity-based algorithms, notably reputation. … In the longer term reputation is likely to be based on community collaboration or third party accreditation.

As he says, in the long term, several vendors (such as Return Path and Habeas) are planning to act as accreditation bureaus and reputation databases, undoubtedly using these standards as a basis. Doubtless Spamhaus have similar plans, although they’ve not mentioned it.

But there’s no need to wait — in the short term, users of SpamAssassin and similar anti-spam systems can run their own personal accreditation list, by whitelisting frequent correspondents based on their DomainKeys/DKIM/SPF records, using whitelist_from_spf, whitelist_from_dkim, and whitelist_from_dk.
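The idea of a personal accreditation list can be modelled in a few lines. The sketch below is not SpamAssassin code and does not use its configuration syntax; it is a Python illustration of the concept, and the `WHITELIST` table, the addresses, and the `is_whitelisted` helper are hypothetical. The key point is that an entry only counts when the mail passed the specific authentication mechanism the entry was listed under.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Hypothetical personal accreditation list: mechanism -> address patterns.
WHITELIST: Dict[str, List[str]] = {
    "spf": ["*@lists.example.org"],
    "dkim": ["alice@example.com"],
    "dk": ["*@mail.example.net"],
}


def is_whitelisted(sender: str,
                   passed_mechanisms: Iterable[str],
                   whitelist: Dict[str, List[str]] = WHITELIST) -> bool:
    """Return True only if the sender matches an entry for a mechanism
    that this particular message actually passed."""
    sender = sender.lower()
    for mechanism in passed_mechanisms:
        for pattern in whitelist.get(mechanism, []):
            if fnmatchcase(sender, pattern.lower()):
                return True
    return False
```

A forged message claiming to be from alice@example.com fails the DKIM check, so it passes no mechanism and the whitelist entry is never consulted.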

Hopefully more ISPs and companies will deploy outbound SPF, DK and DKIM as time goes on, making this easier. All three technologies are useful for this purpose (although I prefer DKIM, if pushed to it ;).

It’s worth noting that the upcoming SpamAssassin 3.2.0 can be set up to run these checks upfront, “short-circuiting” mail from known-good sources with valid SPF/DK/DKIM records, so that it isn’t put through the lengthy scanning process.
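The short-circuiting idea can be sketched as follows. This is an illustration of the control flow only, not SpamAssassin's implementation: the rule functions, their scores, the message fields, and the `scan` helper are hypothetical, and the 5.0 threshold is an assumed value.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Message = Dict[str, Any]


def mentions_pills(message: Message) -> float:
    """Stand-in for an expensive body rule (hypothetical score)."""
    return 3.0 if "pills" in str(message.get("body", "")).lower() else 0.0


def shouting_subject(message: Message) -> float:
    """Stand-in for an expensive header rule (hypothetical score)."""
    return 2.5 if str(message.get("subject", "")).isupper() else 0.0


RULES = [mentions_pills, shouting_subject]


def authenticated_and_whitelisted(message: Message) -> bool:
    """Cheap upfront check: passed authentication AND a known correspondent."""
    known = {"alice@example.com"}  # hypothetical whitelist
    return bool(message.get("auth_pass")) and message.get("sender") in known


def scan(message: Message,
         is_known_good: Callable[[Message], bool],
         expensive_rules: Iterable[Callable[[Message], float]],
         spam_threshold: float = 5.0) -> Tuple[str, int]:
    """Return (verdict, number of expensive rules that were run)."""
    if is_known_good(message):
        return ("ham", 0)  # short-circuit: skip the lengthy scan entirely
    score = 0.0
    rules_run = 0
    for rule in expensive_rules:
        score += rule(message)
        rules_run += 1
    return ("spam" if score >= spam_threshold else "ham", rules_run)
```

The second element of the result shows the saving: mail from an authenticated, whitelisted sender costs zero expensive rules, while a forgery of the same address falls through to the full scan.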

That’s not to say Matt doesn’t have a point, though. There are questions about deployment — why can’t I already run “apt-get install postfix-dkim-outbound-signer” to get all my outbound mail transparently signed using DKIM signatures? Why isn’t DKIM signing commonplace by now?



  1. Posted January 10, 2007 at 17:28 | Permalink

    I can tell you why DK and DKIM haven’t taken off: they have no penetration. I had to google for “DKIM” to find out what it even was, and from looking at, I can’t tell you what problem it is trying to solve.

    The site says it “provides a method for validating an identity that is associated with a message, during the time it is transferred over the Internet”. I have no idea how I might go from there to “now, thanks to this tech, I have less spam”, which is the outcome I’m interested in.

    I’m not entirely ignorant of these issues, but SPF clearly wasn’t going to solve anything when I looked at it, so DK is tainted by association in my mind.

  2. Posted January 10, 2007 at 18:23 | Permalink

    thanks for the comment, Bryan. interesting! You’d heard of DomainKeys, though, right?

    Sounds like DKIM needs to do a lot of PR work, then. Given that it’s likely to pass through the IETF relatively intact, it should be more widely known by now.

  3. Posted January 10, 2007 at 18:27 | Permalink

    oh — also — for the benefit of future readers, I’ve added some definition links for DK, DKIM and SPF. ;)

  4. Posted January 10, 2007 at 18:43 | Permalink

    I’d heard of DK, but was dismissing it without much thought as “gratuitously not SPF, while also not solving anything I care about” :-)

  5. Posted January 11, 2007 at 02:39 | Permalink

    I’m working on a trust system called Konfidi. Our plan is to support DKIM and SPF authentication, although it’s due for a big refactoring since it only supports OpenPGP signatures at this point. The basic idea is that people declare who they trust, and publish that. Then when you get an authentic email from somebody, you can query the trustserver to see if you implicitly trust them (via your friends and friends of friends, etc). In a sense, it’s a way to do distributed, personalized whitelists. I think this might be more accurate than reputation systems, although it (like the authentication systems) requires adoption to be very useful.

    I’m currently working on getting it to work as a SpamAssassin plugin.

  6. Posted January 11, 2007 at 11:47 | Permalink

    Dave —

    great news! I have lots of ideas for this stuff, but spamassassin keeps me busy enough to block much work on them. ;)

    I (and quite a few others) briefly tried a whitelisting system several years ago: . In fact, I’m still publishing a very out-of-date file at:

    However, that broke down due to one WOT member whitelisting over-aggressively; iirc, they whitelisted their employer’s dialup pools, resulting in botnet-relayed spam being whitelisted. IMO the correct way to fix that would have been to allow levels of “outgoing” trust to be specified by a WOT file owner (so he could assert “only a little spam is emitted by this IP range”, for example), but the maintainer wasn’t keen on the idea. There’s still not much agreement on how to deal with someone N levels away from you over-whitelisting, resulting in spam getting through your filter…

    Anyway, that project deals with whitelisting of IP addresses, suitable for MTA-to-MTA whitelisting. Since then, SPF/DKIM have emerged, and both look very promising as a way to whitelist safely using domain names, so you’re on the right track there!

    I’ll keep an eye on Konfidi — sounds great.

  7. danny
    Posted January 11, 2007 at 16:25 | Permalink

    Apache James is trying to accept a contribution of DK; it’s plagued with lawyers at the moment because of the licence. We already have an SPF implementation. :-)

  8. Posted January 11, 2007 at 21:43 | Permalink

    fwiw, “whitelist_from_spf, whitelist_from_dkim, and whitelist_from_dk” is a user nightmare. How about one list called “whitelist_if_authenticated”? Users don’t have any frigging idea if the other end is SPF, DKIM or DK. Actually, they probably don’t know about authentication. How about “whitelist_from”, applying the whitelist only if the mail is authenticated? Actually, that latter thing is probably bad: in the case where the sender isn’t doing auth, the receiver will think something’s broken if whitelisting doesn’t work… Maybe whitelist_from and whitelist_authenticated is the best… ultimately probably moving to just whitelist_from, where it only applies to auth’d mail, once more senders are auth’ing.

  9. Posted January 15, 2007 at 14:40 | Permalink
  10. Doug Otis
    Posted January 22, 2007 at 03:06 | Permalink

    An email server might handle outbound mail for thousands of domains and yet obtain 8 IP addresses. SPF is not an authentication scheme, it is an authorization scheme. An SPF script may compile perhaps thousands of IP addresses transmitting mail for a specific domain, and is prone to DDoS exploits, in addition to making spoofs more believable. One should not execute scripts from strangers. SPF puts DNS at risk without asking bad actors to expend much of their own resources.

    DKIM or DK does not sign the message envelope. This exclusion allows a signed message to be replayed from any other location. DKIM still requires a means to associate originating domains with the signing domain as well. Unfortunately DKIM has chosen to restrict the ability to link email-address domains with that of the signing domain. DKIM expects that you’ll be happy to give your provider your private keys or perhaps delegate a portion of your domain (at some ongoing cost of course).

    Both SPF and DKIM suffer from the same basic problem. Providers don’t want to identify their outbound SMTP clients and be expected to handle complaints. SPF and DKIM go to great lengths in hiding the identity of the SMTP client. So much for anyone wanting to stop spam, but can we prevent fraud at least?

    A simple SHA-1 base32 representation of the provider’s signing domain could be placed into any originating email domain to confirm a relationship. There could be SHA-1 base32 tags of the SMTP clients placed within the signing domain to ensure a message is not being replayed (but that won’t happen). Using an associative technique, DKIM could ensure DSNs as well. If your message is blocked, you’ll want to know about it.

    Associative techniques would allow email-address domain owners a means to autonomously confirm the providers that sign on their behalf (at no cost). Providers offering premium accounts could place a DKIM key into customer-specific domains to ensure other users are unable to spoof. (A scalable solution that can replace SPF.) This would be completely transparent for recipients, as the DKIM key is not seen.

    A low cost application of DKIM and the associative technique would be to annotate email-addresses found in your address book confirmed as being signed by an associated signing domain. Security for the masses! Any provider that signs using DKIM could allow their customers to utilize vanity domains and have it verified by publishing a simple associative tag.

  11. Posted November 3, 2009 at 02:36 | Permalink

    The problem with trusted systems is that they rely on users to not stuff them up, hackers to not infiltrate them, ISPs to support them, vendors to implement them, etc. The only way to effectively filter spam is by having an appliance that filters the spam before it hits the mail server, and uses multiple tests to accurately distinguish spam from ham.

  12. Posted November 3, 2009 at 03:45 | Permalink

    You are right about no provider being perfect. The first step would be to estimate whether email might be emitted by an 0wned system. These systems often transition to different IP addresses daily. The next step is to use non-linear profiles to estimate normal behavior for a particular source. Profiles need to account for things like challenge response when attempting to measure what can be gleaned from backscatter, and the diversity of the source.

    Once baselines are established, detections by filter programs then indicate whether acceptance from a particular source is still desired. I have been able to process about 1 trillion messages from 80 million sources in about 6 minutes using a three year old dual core server. It is surprising how much can be discovered by examining just the consistency of information gleaned, of things like hostnames associated with a particular IP address. It is rare for hostnames from sources that can be trusted to not be properly configured, or to change more than a few times over a long period of time.

    Those that depend upon SPF as a means to confirm a relationship with Mail Froms are likely allowing otherwise valuable information to be lost, because they think SPF has provided actionable information. It does not, and it is risky to use. My advice is to ignore SPF, and you’re better off, because then you’ll be forced into paying attention. And do not include message bodies in your DSN when this can not be avoided. Often, they can and should be.

    It would be nice to have a system where domains can publish which other domains they trust to handle their mail stream. This could work something like Google’s search rating system. Things like mailing lists will be easily identified, and dependence upon IP address rating lists might become a thing of the past, once everyone learns to watch what is being presented to their incoming MTAs. Don’t depend only on filtering. Bad actors are getting better at defeating these techniques.

    The url describes a proposed method for voting trust and establishing relationships without a centralized authority.

  13. Posted November 3, 2009 at 11:19 | Permalink

    “It would be nice to have a system where domains can publish which other domains they trust to handle their mail stream”

    I’ve long believed that we could make a real difference if mail domains could publish rules which could be consulted by other “well behaved” domains before wasting time sending mail that was always destined for /dev/null.

    For unknown senders there would be a tar-pit process which would allow the recipient time to perform checks and publish new rules. Perhaps using 4.?.? “Performing background checks, try again in 30000 milliseconds” and 5.?.? “Message fails rule a.b.c” to update rules during transport.

    I admit that I haven’t thought it through in much detail, but I think the idea of publishing rules may have legs.

  14. Posted November 3, 2009 at 18:29 | Permalink

    Sorry, but the URL was linked to the name, and did not appear separately.

    The problem in dealing with DKIM-related policy is not the end-to-end issues confronted by SPF, but the case where a message has been modified and still carries the same From email address, as is common with mailing lists. Allowing the originating domains to declare their trusted third-party handlers would offer more trustworthy information than anything the third party could say about themselves. This approach could use a Google-like rating scheme to help identify which third-party signers are trustworthy, and would be less easily gamed than attempting to use anything directly published by the entity in question.

    We provide a 4xx temp error service based upon detection of spam being emitted from otherwise unknown IP addresses, which does slow down spam runs. Eventually, spam will return to being sent through normal outbound servers. In this case, the provider needs to pay attention to the number of errors generated by their user accounts.

    As the number of domains and IP addresses increases, named relationships will provide more meaningful information that is manageable without a need for super computers. A simple shared 2 TB drive would be large enough to handle 10 years of history for the entire Internet. When properly structured, this approach can provide fast answers. IMHO, this is something that can be done by everyone.
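Several comments above (the friends-of-friends lookup in comment 5, the "levels of outgoing trust" suggestion in comment 6, and the published trust relationships in comments 12 to 14) describe variations on a trust graph. The sketch below is one possible reading of that idea, not the design of Konfidi or of any system mentioned in the thread: the `TRUST` table, the names, the weights, and the `inferred_trust` function are all hypothetical. Each published statement carries a weight, and trust along a path is the product of its weights, so trust decays with distance.

```python
from typing import Dict

# Hypothetical published trust statements:
# truster -> {trustee: weight between 0.0 and 1.0}.
TRUST: Dict[str, Dict[str, float]] = {
    "me": {"alice": 0.9, "bob": 0.5},
    "alice": {"carol": 0.8},
    "bob": {"carol": 0.9, "dialup-pool": 1.0},
    "carol": {},
}


def inferred_trust(graph: Dict[str, Dict[str, float]],
                   source: str,
                   target: str,
                   max_hops: int = 3) -> float:
    """Best product of edge weights over any path of at most max_hops edges.

    Returns 0.0 if the target cannot be reached within max_hops.
    """
    best = {source: 1.0}
    frontier = {source: 1.0}
    for _ in range(max_hops):
        next_frontier: Dict[str, float] = {}
        for node, trust in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                candidate = trust * weight
                if candidate > best.get(neighbour, 0.0):
                    best[neighbour] = candidate
                    next_frontier[neighbour] = candidate
        frontier = next_frontier
    return best.get(target, 0.0)
```

With these example weights, bob's unconditional whitelisting of "dialup-pool" is capped by how far "me" trusts bob, which is the kind of damping comment 6 argues was missing when one member over-whitelisted.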