Technorati bloginfo API weirdness

For the benefit of other Technorati API users…

In a comment on this entry, Padraig Brady mentioned that his blog had mysteriously disappeared from the Irish Blogs Top 100 list.

I investigated and found something odd: it seems Technorati has changed their bloginfo API so that some weblogs are now listed with their ‘rank’ but without some of the important metadata. ‘inboundblogs’ and ‘inboundlinks’ come back empty, and the ‘lastupdate’ time is set to the epoch (1970-01-01 00:00:00 GMT). Here’s an example:

<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://www.pixelbeat.org</url>
                    <weblog>
                <name>Pádraig Brady</name>
                <url>http://www.pixelbeat.org</url>
                <rssurl></rssurl>
                <atomurl></atomurl>
                <inboundblogs></inboundblogs>
                <inboundlinks></inboundlinks>
                <lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
                <rank>74830</rank>
            </weblog>
                            </result>
</document>
</tapi>

Compare that with this lookup result, on my own blog:

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://taint.org</url>
                    <weblog>
                <name>taint.org: Justin Mason’s Weblog</name>
                <url>http://taint.org</url>
                <rssurl>http://taint.org/feed</rssurl>
                <atomurl>http://taint.org/feed/atom</atomurl>
                <inboundblogs>143</inboundblogs>
                <inboundlinks>227</inboundlinks>
                <lastupdate>2008-02-12 11:48:10 GMT</lastupdate>
                <rank>43404</rank>
            </weblog>
                            <inboundblogs>143</inboundblogs>
                            <inboundlinks>227</inboundlinks>
            </result>
</document>
</tapi>

This bug had caused a number of blogs to be dropped from the list, since I was using “inboundblogs and inboundlinks == 0” as an indication that a blog was not registered with Technorati.

It’s now worked around in my code, although a side-effect is that blogs affected by this bug will appear with question marks in the ‘inboundblogs’ and ‘inboundlinks’ columns, and (unsurprisingly) will perform poorly in the ‘ranked by inbound link count’ table.
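For other API users hitting the same thing, here is a rough sketch of one way to tell “count is missing” apart from “count is zero” when reading a bloginfo response. It is not the code behind the Top 100 list; the function names `parse_bloginfo` and `is_registered` are made up for illustration, and treating a present ‘rank’ as the sign that Technorati knows the blog is an assumption drawn only from the two responses shown above.

```python
import xml.etree.ElementTree as ET

# The placeholder timestamp seen in the degraded response above.
EPOCH = "1970-01-01 00:00:00 GMT"


def parse_bloginfo(xml_text):
    """Parse a bloginfo response; missing counts come back as None, not 0."""
    root = ET.fromstring(xml_text)
    weblog = root.find("./document/result/weblog")
    if weblog is None:
        return None  # no <weblog> record in the response at all

    def count(tag):
        # An empty element such as <inboundblogs></inboundblogs> yields None.
        text = (weblog.findtext(tag) or "").strip()
        return int(text) if text.isdigit() else None

    lastupdate = (weblog.findtext("lastupdate") or "").strip()
    return {
        "inboundblogs": count("inboundblogs"),
        "inboundlinks": count("inboundlinks"),
        "rank": count("rank"),
        # Treat the epoch placeholder as "unknown" rather than a real date.
        "lastupdate": None if lastupdate in ("", EPOCH) else lastupdate,
    }


def is_registered(info):
    # Assumption: a numeric rank means the blog is known to Technorati,
    # even when the inbound counts are empty.
    return info is not None and info["rank"] is not None


# Trimmed-down copy of the degraded response shown earlier.
SAMPLE = """<tapi version="1.0"><document><result>
<url>http://www.pixelbeat.org</url>
<weblog>
  <inboundblogs></inboundblogs>
  <inboundlinks></inboundlinks>
  <lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
  <rank>74830</rank>
</weblog>
</result></document></tapi>"""

info = parse_bloginfo(SAMPLE)
```

With something like this, a `None` count can be shown as a question mark in the table instead of causing the blog to be dropped.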

I’ve posted a query to the support forum — let’s see what the story is.


2 Comments

  1. ugh
    Posted December 15, 2008 at 21:08

    Thanks for the post… I see Technorati helpfully jumped to your aid by answering your support question… /sarcasm. 0 responses and it was posted 10 months ago.

    I’ve just started using the API to find blog posts related to a term/tag and it only works about 10% of the time. The other 90% of the time it just returns no results. If I echo out the URL being retrieved and paste it into my browser it will suddenly work… and then won’t, and won’t, and will. Completely random.

    I’m guessing I’m hitting some malfunctioning server that is giving me no results?? Who knows… annoying and sucky nonetheless.

  2. ugh
    Posted December 15, 2008 at 21:21

    I think my guess was right (maybe a problem with caching?) and I think I’ve found a solution. It might be coincidence, but appending a random unique string to each request seems to produce the correct output. In PHP I added:

    $url = 'http://api.tec […]?technorati=whatever&' . md5(microtime());

    Results seem to be returned every time now. Yay..
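
    The same cache-busting trick can be sketched in Python for anyone not using PHP. This is only an illustration: the helper name `add_cache_buster`, the parameter name `_nocache`, and the `api.example.invalid` URL are placeholders, not part of the Technorati API, and whether the random parameter really sidesteps a cache is the commenter's guess rather than something confirmed.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="_nocache"):
    """Return url with an extra, random query parameter appended."""
    parts = urlsplit(url)
    # Keep the existing query parameters, including blank ones, in order.
    query = parse_qsl(parts.query, keep_blank_values=True)
    # A fresh random value per call makes every request URL unique.
    query.append((param, uuid.uuid4().hex))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute the real API request URL here.
first = add_cache_buster("http://api.example.invalid/search?query=foo")
second = add_cache_buster("http://api.example.invalid/search?query=foo")
```

    Each call yields a different URL, so two identical lookups never share the same request string.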