Another script: goog-love.pl

A quick hack –

goog-love.pl - find out where your site’s google juice comes from

This script will grind through your web site’s “access.log” file (which must be in the “combined” log format). It’ll pick out the top 100 Google searches found in the referer field, re-run those searches, and determine which ones are giving your website all the linky Google love — in other words, the searches that your site ‘wins’ on.
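For the curious, the log-grinding half of that is easy to sketch. What follows is not the Perl from goog-love.pl; it's a minimal Python illustration, assuming a well-formed "combined" log line and a Google referer that carries its search terms in a "q" parameter. The names COMBINED_RE, google_query and top_google_searches are made up for this example, and the "re-run those searches" step is left out entirely.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Combined log format:
#   host ident user [time] "request" status bytes "referer" "user-agent"
# Assumption: no escaped quotes inside the quoted fields; lines that
# do not fit this simple pattern are skipped.
COMBINED_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"'
)


def google_query(referer):
    """Return the normalised search terms from a Google referer, or None."""
    try:
        parts = urlsplit(referer)
    except ValueError:
        return None  # malformed URL in the referer field
    host = (parts.hostname or "").lower()
    # Assumption: any host with a "google" label counts (www.google.ie, etc.).
    if "google" not in host.split("."):
        return None
    terms = parse_qs(parts.query).get("q")
    if not terms:
        return None
    # Lower-case and collapse whitespace so near-identical searches merge.
    return " ".join(terms[0].lower().split()) or None


def top_google_searches(lines, limit=100):
    """Count Google searches in the referer field; return the top `limit`."""
    counts = Counter()
    for line in lines:
        match = COMBINED_RE.match(line)
        if not match:
            continue
        query = google_query(match.group("referer"))
        if query:
            counts[query] += 1
    return counts.most_common(limit)
```

Passing sys.stdin as the lines argument would give the same "pipe your access.log in" shape as the real script; each (query, count) pair that comes back is a candidate search to re-run.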

The output is plain text plus a chunk of HTML.

usage:

goog-love.pl sitehost google-api-key < access.log > out.html

e.g.

cat /var/www/logs/taint.org.* | goog-love.pl \
  taint.org 0xb0bd0bb5yourgoogleapikeyhere0xdeadbeef | tee out.html

NOTE: this script requires the SOAP::Lite module to be installed. Install it using “apt-get install libsoap-lite-perl” or “cpan SOAP::Lite”. It also requires a Google API key.

For example, here are the current results for this site. You can immediately see some interesting stuff that’s not obvious otherwise, such as my site being the top hit for [beardy justin] ;)

Download here (5 KiB Perl script).

Notes:

  • if you see a lot of “502 Bad Gateway” errors, it’s probably over-zealous anti-bot ACLs on Google’s side. Try from another host.

  • Read the comments for notes on a bug in recent releases of SOAP::Lite; please let me know if you hear of it getting fixed ;)


5 Comments »

  1. Yoav Shapira said,

    March 2, 2006 @ 3:03 pm

    Useful, thank you.

  2. Justin said,

    March 2, 2006 @ 4:03 pm

    more useful now that I’ve uploaded it ;)

  3. Sebastian Bergmann said,

    March 2, 2006 @ 7:29 pm

    Justin,

    I don’t speak Perl, so I don’t know if

    useprefix has been deprecated. if you wish to turn off or on the use of a default namespace, then please use either ns(uri) or defaultns(uri) at /usr/lib/perl5/vendor_perl/5.8.8/SOAP/Lite.pm line 858, <> line 31029.

    is a problem within your code or with my system’s Perl installation.

  4. Justin said,

    March 3, 2006 @ 10:30 am

    Sebastian –

    It looks like this is a bug in the latest version of SOAP::Lite (0.67); if you google for [useprefix SOAP::Lite defaultns] there are quite a few reports, including:

    http://rt.cpan.org/Public/Bug/Display.html?id=16780
    http://rt.cpan.org/Public/Bug/Display.html?id=16898

    I would suggest maybe downgrading to an earlier version, if you can — perhaps using the distribution copy via “apt-get” instead of loading it via CPAN.

    disappointing! sorry about that! Damn external dependencies — live by the CPAN, die by the CPAN ;)

  5. Yoav Shapira said,

    July 13, 2006 @ 2:53 pm

    SOAP::Lite v0.68 seems to work fine out of the box.
