Another script:

A quick hack: find out where your site’s Google juice comes from

This script will grind through your web site’s “access.log” file (which must be in the “combined” log format). It’ll pick out the top 100 Google searches found in the referer field, re-run those searches, and determine which ones are giving your website all the linky Google love — in other words, the searches that your site ‘wins’ on.

The output is in plain text and a chunk of HTML.
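To make the first half of that concrete, here is a minimal sketch of the log-grinding step. It is not the Perl script itself, just an illustration in Python. The regex, the function names (`google_query`, `top_google_searches`) and the rule that any `google.*` referer with a `q` parameter counts as a search are all my own assumptions. It does not re-run the searches against Google; that part needs the API key.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# "combined" format: host ident user [time] "request" status bytes "referer" "agent"
# Simplification: assumes no escaped double quotes inside the quoted fields.
COMBINED_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"'
)


def google_query(referer):
    """Return the normalised search terms from a Google referer, or None."""
    try:
        parts = urlsplit(referer)
        host = parts.hostname or ""
    except ValueError:
        return None  # malformed URL in the referer field
    # Assumption: any google.* host carrying a "q" parameter is a search.
    if not re.search(r"(^|\.)google\.", host):
        return None
    terms = parse_qs(parts.query).get("q")
    if not terms:
        return None
    # Lower-case and collapse whitespace so equivalent searches group together.
    query = " ".join(terms[0].lower().split())
    return query or None


def top_google_searches(lines, limit=100):
    """Count Google searches in the referer field; return the top `limit`."""
    counts = Counter()
    for line in lines:
        match = COMBINED_RE.match(line)
        if not match:
            continue  # not a combined-format line, skip it
        query = google_query(match.group("referer"))
        if query:
            counts[query] += 1
    return counts.most_common(limit)
```

Passing `sys.stdin` (or any open log file) as `lines` gives a list of `(query, hits)` pairs, most frequent first.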

usage: sitehost google-api-key < access.log > out.html


cat /var/www/logs/* | \ 0xb0bd0bb5yourgoogleapikeyhere0xdeadbeef | tee out.html

NOTE: this script requires the SOAP::Lite module to be installed. Install it with apt-get install libsoap-lite-perl or cpan SOAP::Lite. It also requires a Google API key.

For example, here are the current results for this site. You can immediately see some interesting stuff that isn’t obvious otherwise, such as my site being the top hit for [beardy justin] ;)

Download here (5 KiB perl script).


  • if you see a lot of “502 Bad Gateway” errors, it’s probably over-zealous anti-bot ACLs on Google’s side. Try from another host.

  • Read the comments for notes on a bug in recent releases of SOAP::Lite; please let me know if you hear of them getting fixed ;)



  1. Posted March 2, 2006 at 15:03 | Permalink

    Useful, thank you.

  2. Posted March 2, 2006 at 16:03 | Permalink

    more useful now that I’ve uploaded it ;)

  3. Posted March 2, 2006 at 19:29 | Permalink


    I don’t speak Perl, so I don’t know if

    useprefix has been deprecated. if you wish to turn off or on the use of a default namespace, then please use either ns(uri) or defaultns(uri) at /usr/lib/perl5/vendor_perl/5.8.8/SOAP/ line 858, <> line 31029.

    is a problem within your code or with my system’s Perl installation.

  4. Posted March 3, 2006 at 10:30 | Permalink

    Sebastian —

    It looks like this is a bug in the latest version of SOAP::Lite (0.67); if you google for [useprefix SOAP::Lite defaultns] there are quite a few reports, including:

    I would suggest maybe downgrading to an earlier version, if you can — perhaps using the distribution copy via “apt-get” instead of loading it via CPAN.

    Disappointing! Sorry about that! Damn external dependencies: live by the CPAN, die by the CPAN ;)

  5. Posted July 13, 2006 at 14:53 | Permalink

    SOAP::Lite v0.68 seems to work fine out of the box.