Technorati bloginfo API weirdness

For the benefit of other Technorati API users…

In a comment on this entry, Pádraig Brady mentioned that his blog had mysteriously disappeared from the Irish Blogs Top 100 list.

I investigated, and found something odd: it seems Technorati has changed its bloginfo API. It now returns some weblogs with their ‘rank’, but with important metadata such as ‘inboundblogs’ and ‘inboundlinks’ left empty, and with a ‘lastupdate’ time set to the epoch (1970-01-01 00:00:00 GMT). Here’s an example:

<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://www.pixelbeat.org</url>
        <weblog>
            <name>Pádraig Brady</name>
            <url>http://www.pixelbeat.org</url>
            <rssurl></rssurl>
            <atomurl></atomurl>
            <inboundblogs></inboundblogs>
            <inboundlinks></inboundlinks>
            <lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
            <rank>74830</rank>
        </weblog>
    </result>
</document>
</tapi>

Compare that with this lookup result, on my own blog:

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://taint.org</url>
        <weblog>
            <name>taint.org: Justin Mason’s Weblog</name>
            <url>http://taint.org</url>
            <rssurl>http://taint.org/feed</rssurl>
            <atomurl>http://taint.org/feed/atom</atomurl>
            <inboundblogs>143</inboundblogs>
            <inboundlinks>227</inboundlinks>
            <lastupdate>2008-02-12 11:48:10 GMT</lastupdate>
            <rank>43404</rank>
        </weblog>
        <inboundblogs>143</inboundblogs>
        <inboundlinks>227</inboundlinks>
    </result>
</document>
</tapi>

This bug had caused a number of blogs to be dropped from the list: the empty values were being treated as zero, and I was using “inboundblogs and inboundlinks == 0” as an indication that a blog was not registered with Technorati.

It’s now worked around in my code, although a side-effect is that blogs affected by this will appear with question marks in the ‘inboundblogs’ and ‘inboundlinks’ columns, and will (unsurprisingly) perform poorly in the ‘ranked by inbound link count’ table.
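For other API users hitting the same thing, here is a minimal Python sketch of one way to handle it. This is not the code behind the Top 100 list; the function names (`int_or_none`, `parse_bloginfo`, `display`) are made up for illustration, and the rule "a numeric rank means the blog is registered" is an assumption based only on the two responses above. The key point is to keep "empty" distinct from "zero" when reading the counts.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def int_or_none(text: Optional[str]) -> Optional[int]:
    """Return an int for a numeric string, or None for empty/missing text."""
    text = (text or "").strip()
    return int(text) if text.isdecimal() else None


def parse_bloginfo(xml_text: str) -> dict:
    """Extract rank and inbound counts from a bloginfo response."""
    weblog = ET.fromstring(xml_text).find("document/result/weblog")
    if weblog is None:
        # No <weblog> element at all: treat the blog as unknown.
        return {"registered": False, "rank": None,
                "inboundblogs": None, "inboundlinks": None}
    rank = int_or_none(weblog.findtext("rank"))
    return {
        # Assumption: a numeric rank means the blog is registered,
        # even when the inbound counts come back empty.
        "registered": rank is not None,
        "rank": rank,
        "inboundblogs": int_or_none(weblog.findtext("inboundblogs")),
        "inboundlinks": int_or_none(weblog.findtext("inboundlinks")),
    }


def display(count: Optional[int]) -> str:
    """Render an unknown count as a question mark in a table column."""
    return "?" if count is None else str(count)


# Cut-down version of the first response above (empty counts, valid rank).
sample = (
    '<tapi version="1.0"><document><result><weblog>'
    "<inboundblogs></inboundblogs><inboundlinks></inboundlinks>"
    "<lastupdate>1970-01-01 00:00:00 GMT</lastupdate><rank>74830</rank>"
    "</weblog></result></document></tapi>"
)
info = parse_bloginfo(sample)
print(info["registered"], display(info["inboundblogs"]), info["rank"])
```

With this approach an affected blog stays in the list, and the unknown counts are shown as question marks rather than as zeros.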

I’ve posted a query to the support forum — let’s see what the story is.


Flickr as a ‘TypePad service for groups’

Web: a while back, I posted some musings about a web service to help authenticate users as members of a private group, similarly to how TypeKey authenticates users in general.

Well, Flickr have just posted this draft authentication API which does this very nicely — it now allows third-party web apps to authenticate against Flickr, TypeKey-style, and perform a limited subset of actions on the user’s behalf.

This means that using Flickr as a group authentication web service is now doable, as far as I can see…


Open APIs, Open Source, And Giving Away The Crown Jewels

Tech: Bit of a long essay, this one.

World+dog have been linking to this interview with Flickr’s Stewart Butterfield on the O’Reilly Network, so I wasn’t going to bother. But I came across a great illustration of what I think is a very important point:

Koman: In the write-up for your web services session at ETech, you say, “Capturing the creative energy of the hive can be scary. It requires giving up some control, and eliminating lock-in as a strategy.” Tell me some more about that.

Butterfield: Ofoto is a pretty good example. I don’t want to pick on them too much, but they create a pretty artificial kind of lock-in. When you upload your pictures to them, you might upload a three- or four-megapixel image, but all you can get back from them is a 600-pixel image; if you want to get the original back, you have to buy it on a CD. There’s no way to get it out because if you got it out, then your friends and family could get it out and print it out at home, and they’re in competition with Lexmark and HP as well as the other online photo services. So that’s one aspect of it.

There’s also a tendency to want to capture all the value that’s being generated or will potentially be generated by new business. What I mean by that is, we don’t explicitly allow commercial uses of the API yet, but we definitely plan to. And we know that there are people working on products based on our API that we want to do, but outside developers will get to it first. What letting go in that context means is letting go of all the control you have over users by being the one who owns the database, because other developers can generate businesses and products that hook into you, and that takes some value away.

This is a point that still, to this day, most people miss.

The traditional viewpoint is that, if you’ve got something, you hoard it, and ensure you’re the guy who makes the money from it. So you do what Ofoto do — you keep the full-resolution images, and charge for access to them; or you don’t publish APIs, and keep the data to yourself; or in the world of source code, you hold onto the source so no-one else can see it, because it’s your ‘crown jewels’. Then, the idea goes, you can ensure that you’re the only one who can do prints, or add a feature to the source, or whatever.

But the problem is, you’re not always the one with the idea; or alternatively, every feature request has to go through you, and be implemented by you, on your time. And in the meantime, your users are considering the big question — ‘do I want to get locked in, here? what if he goes out of business? am I a small customer who’s going to be ignored?’

In fact, I’ve been guilty of this myself. When I started writing open-source software, I used the GPL as a license, which prohibits commercial use (mostly) — except by myself or through my explicit permission. I had no intentions of making it available for commercial use, because I couldn’t see the commercial uses.

But that was me being short-sighted: soon, people started asking if they could license the code for commercial use, or hire me. I realised that I didn’t have the time, or inclination, to go the whole hog and risk my livelihood on a piece of software, which seemed especially risky since I didn’t think that software alone could support me.

So when I wrote SpamAssassin, I picked the Perl dual license, a license that did permit commercial use, while still being an open-source license. By now, there are quite a few commercial versions of SpamAssassin, all making money (I hope!), I’m getting paid to work on SpamAssassin, and everyone’s happy ;)

Perhaps I should have kept commercial rights to myself. But I have no doubt that doing so would have ensured SpamAssassin remained a small-time solution, and it would not have attracted the number of contributors, committers, and patches it has by now. (For example, Matt Sergeant, who was a SpamAssassin committer, joined the project explicitly to use that code in MessageLabs’ product.)

Plus, at the time, there were already quite a few commercial competitors, and there’s a lot more to being a commercial success than the simple things required to be an open-source success; I doubt that SpamAssassin would have been able to compete as a purely commercial play, and I’m not sure I’d have been keen to risk my livelihood to do so, anyway. (I’m not really dot-com CTO material. I like hacking code too much.)

I think things have worked out well: the software’s better, I’m earning a livelihood from open-source software regardless, and the software’s usable for more people. As usual, Larry Wall was right ;)


The Web-App generation

Software: Mark Twomey, in response to all the Win32 API stuff recently:

We now have a generation of computer users … who have never received or sent email from a so-called ‘rich client’, never had to send a postal order off to order something from some distant vendor, and are not amazed by something like a search engine. ….

Those (‘rich client’) people remind me of minicomputer users who crapped on the ‘crummy little operating systems’ used on ‘crummy little desktop computers.’

He’s right, you know — for de yoot, Windows is generally just a way to access Hotmail.
