Open APIs, Open Source, And Giving Away The Crown Jewels

Tech: Bit of a long essay, this one.

World+dog have been linking to this interview with Flickr’s Stewart Butterfield on the O’Reilly Network, so I wasn’t going to bother. But I came across a great illustration of what I think is a very important point:

Koman: In the write-up for your web services session at ETech, you say, ‘Capturing the creative energy of the hive can be scary. It requires giving up some control, and eliminating lock-in as a strategy.’ Tell me some more about that.

Butterfield: Ofoto is a pretty good example. I don’t want to pick on them too much, but they create a pretty artificial kind of lock-in. When you upload your pictures to them, you might upload a three- or four-megapixel image, but all you can get back from them is a 600-pixel image; if you want to get the original back, you have to buy it on a CD. There’s no way to get it out because if you got it out, then your friends and family could get it out and print it out at home, and they’re in competition with Lexmark and HP as well as the other online photo services. So that’s one aspect of it.

There’s also a tendency to want to capture all the value that’s being generated or will potentially be generated by new business. What I mean by that is, we don’t explicitly allow commercial uses of the API yet, but we definitely plan to. And we know that there are people working on products based on our API that we want to do, but outside developers will get to it first. What letting go in that context means is letting go of all the control you have over users by being the one who owns the database, because other developers can generate businesses and products that hook into you, and that takes some value away.

This is a point that still, to this day, most people miss.

The traditional viewpoint is that, if you’ve got something, you hoard it, and ensure you’re the guy who makes the money from it. So you do what Ofoto do — you keep the full-resolution images, and charge for access to them; or you don’t publish APIs, and keep the data to yourself; or in the world of source code, you hold onto the source so no-one else can see it, because it’s your ‘crown jewels’. Then, the idea goes, you can ensure that you’re the only one who can do prints, or add a feature to the source, or whatever.

But the problem is, you’re not always the one with the idea; or alternatively, every feature request has to go through you, and be implemented by you, on your time. And in the meantime, your users are considering the big question — ‘do I want to get locked in, here? what if he goes out of business? am I a small customer who’s going to be ignored?’

In fact, I’ve been guilty of this myself. When I started writing open-source software, I used the GPL as a license, which prohibits commercial use (mostly) — except by myself or through my explicit permission. I had no intentions of making it available for commercial use, because I couldn’t see the commercial uses.

But that was me being short-sighted: soon, people started asking if they could license the code for commercial use, or hire me. I realised that I didn’t have the time, or inclination, to go the whole hog, and risk my livelihood on a piece of software, which seemed especially risky since I didn’t think that software alone could support me.

So when I wrote SpamAssassin, I picked the Perl dual license, a license that did permit commercial use, while still being an open-source license. By now, there are quite a few commercial versions of SpamAssassin, all making money (I hope!), I’m getting paid to work on SpamAssassin, and everyone’s happy ;)

Perhaps I should have kept commercial rights to myself. But I have no doubt that doing so would have ensured SpamAssassin remained a small-time solution, and that it would not have attracted the number of contributors, committers, and patches it has by now. (For example, Matt Sergeant, who was a SpamAssassin committer, joined the project explicitly to use that code in MessageLabs’ product.)

Plus, at the time, there were already quite a few commercial competitors, and there’s a lot more to being a commercial success than the simple things required to be an open-source success. I’d be dubious that SpamAssassin would have been able to compete as a purely commercial play, and I’m not sure I’d have been keen to risk my livelihood to do so. (I’m not really dot-com CTO material, anyway. I like hacking code too much.)

I think things have worked out well: the software’s better, I’m earning a livelihood from open-source software regardless, and the software’s usable for more people. As usual, Larry Wall was right ;)


‘Shooting The Messenger’

Yoz does a great job rounding up some Plan For Spam links. First off, he links to a great essay, Shooting The Messenger, which nicely rebuts the idea that to deal with spam, we need an SMTPng. Recommended. (He goes a bit overboard with some hard-ass filtering recommendations at the end IMO, though…)

Secondly, Yoz links to a couple more posts. The first is a friendly-fire incident involving the SpamCop DNS blacklists, illustrating the dangers of peer-to-peer ‘this is spam’ reporting. There’s a related issue with the SpamCop DNSBL, in that it’s over-sensitive; one report can sometimes be enough to get a site BLed, which is not good. The problems with SpamCop’s hair-trigger thresholds are well-documented, and — hopefully — Julian will fix them soon.

The second is a mail from John Gilmore to Politech. He says ‘a simple rule for anti-spam measures that preserves non-spammers’ freedom to communicate is: No anti-spam measure should ever block a non-spam message. But there isn’t a single anti-spam organization that actually follows this rule.’

Wrong. That’s exactly the SpamAssassin angle. If the user says it’s not spam, it’s not spam, and we have to figure out a way to get our scoring system to return that result, if at all possible. And yes, it gets it wrong about 0.1% of the time. That’s why we advise users to avoid blocking, bouncing or deleting spam wherever possible; just mark it ‘possible spam’, divert it to another folder, and always let a human take a look to verify that decision.
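To make the ‘mark it, don’t block it’ policy concrete, here is a minimal sketch in Python. It is not SpamAssassin’s actual code: the rule names, the weights and the threshold of 5.0 are all made up for illustration. The point is only that the filter’s sole outputs are a score and a folder, so a wrong guess costs the user a look in another folder rather than a lost message.

```python
# Hypothetical rules and weights, for illustration only; these are
# not real SpamAssassin rules or scores.
RULES = {
    "subject_all_caps": (lambda m: m["subject"].isupper(), 2.0),
    "mentions_free_money": (lambda m: "free money" in m["body"].lower(), 3.5),
    "sender_in_address_book": (lambda m: m["sender_known"], -4.0),
}

THRESHOLD = 5.0  # assumed cut-off; a real deployment would tune this


def score(message):
    """Return the total score and the names of the rules that fired."""
    hits = [name for name, (test, _weight) in RULES.items() if test(message)]
    total = sum(RULES[name][1] for name in hits)
    return total, hits


def route(message):
    """Tag and file the message. Nothing is ever deleted or bounced."""
    total, hits = score(message)
    folder = "possible-spam" if total >= THRESHOLD else "inbox"
    return {"folder": folder, "score": total, "rules": hits}


stranger = {"subject": "FREE MONEY NOW",
            "body": "Claim your free money today",
            "sender_known": False}
friend = dict(stranger, sender_known=True)

print(route(stranger))  # filed under possible-spam, still readable
print(route(friend))    # negative-scoring rule pulls it back to the inbox
```

The negative weight on the hypothetical address-book rule is one way of getting the score to agree with the user about what is not spam.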

Given the nature of the spam problem, and the nuisance it poses to virtually everybody trying to use email, that’s the best that can be done at this point.

And yes, something has to be done. Spam is a massive problem. If it’s not dealt with somehow, and kept out of our day-to-day inboxes, people will stop using mail. Before spam filters became ubiquitous, I talked to many casual internet users who (a) closed down their email address every 6 months to escape the flood, or (b) gave up reading their mail because of it. (And why did spam filters become ubiquitous?)

It comes down to: what’s better for the internet: a mislabelled email in your ‘spam bucket’ folder, or no email at all?


(Untitled)

Some vague web musing: while reading Cory Doctorow’s “Metacrap” essay on metadata, I noticed this:

Certain kinds of implicit metadata is awfully useful, in fact. Google exploits metadata about the structure of the World Wide Web: by examining the number of links pointing at a page (and the number of links pointing at each linker), Google can derive statistics about the number of Web-authors who believe that that page is important enough to link to, and hence make extremely reliable guesses about how reputable the information on that page is.

He’s right, of course; that’s how Google works. But while reading this, it occurred to me that this implicitly rewards websites that consist of a small number of large pages over those with a large number of short pages. If your site has a page for every sub-heading (think of a Linux HOWTO document here), and each linker points at the page relevant to what they’re talking about, the inbound links are spread across many URLs, so your Google ranking will be lower than if you keep the document all on one page and use named anchors.
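A toy illustration of the effect, in Python. This is a plain inbound-link tally, not Google’s actual ranking algorithm, and it assumes (as the argument above does) that a link to a named anchor is credited to the page itself. The example.org URLs are invented.

```python
from collections import Counter
from urllib.parse import urldefrag


def inbound_counts(links):
    """Count inbound links per page, ignoring any #fragment part."""
    return Counter(urldefrag(url).url for url in links)


# Four linkers, each pointing at the section they care about.
# Site A splits the document into one page per sub-heading.
split_site = [
    "http://example.org/howto/install.html",
    "http://example.org/howto/install.html",
    "http://example.org/howto/configure.html",
    "http://example.org/howto/troubleshoot.html",
]

# Site B keeps everything on one page and uses named anchors.
single_page = [
    "http://example.org/howto.html#install",
    "http://example.org/howto.html#install",
    "http://example.org/howto.html#configure",
    "http://example.org/howto.html#troubleshoot",
]

print(inbound_counts(split_site))   # best page gets 2 of the 4 links
print(inbound_counts(single_page))  # the one page gets all 4
```

Same four linkers, same content, but the single page collects every vote while the split site’s best page collects only half of them.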

Personally, despite what Jakob Nielsen thinks, I prefer the all-in-one page mode myself. It’s quicker to download (overall), easier to print or read offline, and I’m not afraid to use a scrollbar. Interesting to see Google (accidentally) recommends it too ;)

The rest of the essay is spot on, in my opinion.

BTW, Cory also writes for Boing Boing, one of the coolest mags I used to read back when, and now a top-quality weblog.
