About the title change

The eagle-eyed may have spotted a change that took place a month or two ago in the taint.org configuration — I ditched the old weblog tagline.

Previously, this weblog was titled “taint.org: Happy Software Prole”. This title had been in place since around October 2003, when Daniel Lyons wrote a particularly idiotic article for Forbes entitled “Linux’s Hit Men”, which I took umbrage at:

Here we go again — the old ‘free software is communism’ line [...] The article goes on to bemoan how software companies who write proprietary extensions into GPL-licensed software, have to comply with the terms of the license. It’s all a bit of an obvious dig — but I am looking forward to the follow-up article — that’s the one where the author bemoans how commercial software companies send out their ‘enforcers’ to extort money from companies who don’t bother paying the royalties and runtime license fees their licenses require.

As a free/open-source-software guy, I happily adopted ‘happy software prole’ as an absurd tagline, in the spirit of detournement. Fast-forward to 3.5 years on, however, and I’d say most people can’t even remember the Forbes article, or that Daniel Lyons guy! So that tagline was a bit old and busted, really.

On top of this, I’d noticed something I do in my weblog reading — I’ve started renaming blogs in the feed reader from their fancy title, to simply the name of the author.

I’ve found that when reading blogs, I’m interested in who’s writing. When skimming through the feeds of a morning, having to spend 5 seconds to recall that “ByteSurgery.com” is Robin Blandford is just a wee bit superfluous, sorry Robin. ;)

As a favour for readers, I’ve saved them the trouble, and renamed the blog to be quite explicit about who’s writing; the taint.org tagline is now just “taint.org: Justin Mason’s Weblog”. Let’s face it — it’s a bit functional. Hopefully it’s helpful, though!

(And finally, it gives me the edge in the ongoing Google war against the non-me “Justin Masons” out there… and against a heart surgeon and a Texan basketball player, I need it. ;)


Retroactive Tagging With TagThe.Net

Hacky hack hack.

Ever since I enabled tags on taint.org, I’ve been mildly annoyed by the fact that there were thousands of older entries deprived of their folksonomic chunky goodness. A way to ‘retroactively tag’ those entries somehow would be cool.

Last week, Leonard posted a link on his linkblog to TagThe.net, a web service which offers a nifty REST API; simply upload a chunk of text, and it’ll suggest a few tags for that text, like this:

echo 'Hi there, I am a tag-suggesting robot' | curl "http://tagthe.net/api/?text=`urlencode`"
<?xml version="1.0" encoding="UTF-8"?>
<memes>
  <meme source="urn:memanage:BAD542FA4948D12800AA92A7FAD420A1" updated="Tue May 30 20:20:39 CEST 2006">
    <dim type="topic">
      <item>robot</item>
    </dim>
    <dim type="language">
      <item>english</item>
    </dim>
  </meme>
</memes>

This looked promising.

Anyway, I’ve now implemented this, and it worked great! If you’re curious, here are the details of how I did it. It’s a bit hacky, since I’m only going to be doing this once (and very UNIXy and perlish, because that’s how I do these things), but maybe somebody will find it useful.

How I Retroactively Tagged taint.org

This weblog runs WordPress, so all the entries are stored in a MySQL database. I took a MySQL dump of the tables, and a quick script worked out that, of 1600-ish posts, 1352 came from the pre-tag era and needed tag inference. A mail to the TagThe.Net team established that they were happy with this level of usage.

I grepped the post IDs and text out of the SQL dump, threw those into a text file using the simple format ‘id=NNN text=SQLHTMLSTRING’ (where SQLHTMLSTRING was the nicely-escaped HTML text taken directly from the SQL dump), and ran them through this script.

That rendered the first 2k of each of those entries as a URL-encoded string, invoked the REST API with that, got the XML output, and extracted the tags into another UNIXy text-format output file. (It also added one tag for the ‘proto-tag’ system I used in the early days, where the first word of the entry was a single tag-style category name.)
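For the curious, that step boils down to something like the following. This is a minimal Python sketch, not the original Perl script: the function names and the exact 2048-character cut-off are assumptions for illustration, while the endpoint and the XML shape are taken from the curl example above. It builds the request URL and pulls the topic tags out of a response, leaving the HTTP call itself out.

```python
import urllib.parse
import xml.etree.ElementTree as ET

API_URL = "http://tagthe.net/api/"  # endpoint as shown in the curl example
MAX_CHARS = 2048  # "the first 2k" of each entry; exact cut-off is an assumption


def build_request_url(text, limit=MAX_CHARS):
    """Return the API URL with the first `limit` characters URL-encoded."""
    return API_URL + "?" + urllib.parse.urlencode({"text": text[:limit]})


def extract_tags(xml_text, dim_type="topic"):
    """Collect the <item> values found under <dim type="topic">."""
    if isinstance(xml_text, str):
        # Parse as bytes so the encoding declaration in the prolog is honoured.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    tags = []
    for dim in root.iter("dim"):
        if dim.get("type") == dim_type:
            tags.extend(item.text.strip() for item in dim.iter("item") if item.text)
    return tags


# The response from the example earlier in this post.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8"?>
<memes>
  <meme source="urn:memanage:BAD542FA4948D12800AA92A7FAD420A1" updated="Tue May 30 20:20:39 CEST 2006">
    <dim type="topic">
      <item>robot</item>
    </dim>
    <dim type="language">
      <item>english</item>
    </dim>
  </meme>
</memes>"""

if __name__ == "__main__":
    print(build_request_url("Hi there, I am a tag-suggesting robot"))
    print(extract_tags(SAMPLE_RESPONSE))
```

Only the "topic" dimension is treated as tags here; the "language" dimension is ignored unless asked for explicitly.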

Next, I ran this script, which in turn took that intermediate output and converted it to valid PHP code, like so:

cat suggestedtags | ./taglist-to-php.pl  > addtags.php
scp addtags.php my.server:taint.org/wp-admin/

The generated page ‘addtags.php’ looks like this:

<?php
  require_once('admin.php');
  global $utw;
  $utw->SaveTags(997, array("music","all","audio","drm-free",
      "faq","lunchbox","destination","download","premiere","quote"));
  [...]
  $utw->SaveTags(998, array("software","foo","swf","tin","vnc"));
  $utw->SaveTags(999, array("oses","eek","longhorn","ram",
    "winsupersite","windows","amount","base","dog","preview","system"));
?>
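The conversion step is simple enough to sketch. What follows is a hedged Python approximation, not the real taglist-to-php.pl: the intermediate line format ‘id=NNN tags=a,b,c’ and both function names are assumptions made up for this example. The `$utw->SaveTags(...)` call and the page skeleton are copied from the generated page shown above.

```python
def php_string(value):
    """Quote a tag as a PHP double-quoted string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("$", "\\$")
    return '"' + escaped + '"'


def render_addtags(lines):
    """Turn 'id=NNN tags=a,b,c' lines (assumed format) into addtags.php source."""
    out = ["<?php", "  require_once('admin.php');", "  global $utw;"]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the intermediate file
        id_part, tags_part = line.split(" ", 1)
        post_id = int(id_part.split("=", 1)[1])
        tags = [t for t in tags_part.split("=", 1)[1].split(",") if t]
        quoted = ",".join(php_string(t) for t in tags)
        out.append("  $utw->SaveTags(%d, array(%s));" % (post_id, quoted))
    out.append("?>")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(render_addtags(["id=998 tags=software,foo,swf,tin,vnc"]), end="")
```

Escaping the quote, backslash and dollar characters matters because the tags come from a third-party service and end up inside executable PHP.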

Once that page was in place, I just visited it in my (already logged in) web browser window, at http://taint.org/wp-admin/addtags.php, and watched as it gronked for a while. Eventually it stopped, and all those entries had been tagged. (If I wasn’t so hackish, I might have put in a little UI text here — but I didn’t.)

The results are very good, I think.

A success: http://taint.org/tag/research has picked up a lot of the interesting older entries where I discussed things like IBM’s Teiresias pattern-recognition algorithm. That’s spot on.

A minor downside: it’s not so good at proper nouns. This entry talks about Silicon Valley and geographical insularity, and mentions “Silicon Valley” prominently. One or both of those words would seem to be a good thing to tag with, but it missed them.

Still, that’s a minor issue — the tags it has suggested are generally very appropriate and useful.

Next, I need to find a way to auto-generate titles for the really old entries ;)


Four Things

I don’t do silly blog antics much, but I got tagged by Mat for the Four Things meme. Looking around, it is indeed a bit more interesting than things like the usual LJ quiz, so why not!

I wrote this on the plane from LA to Dublin, which may have affected some of the selections in 4 places I would rather be right now at least ;)

4 jobs I’ve had:

  • I was Iona Technologies’ first employee, and stayed there for no less than 7 years. I got to see the company grow from a handful of people, most of whom weren’t getting paid (hence how I wound up as the first employee ;), all the way up to a 300-strong multinational, while the company itself formed a core of Ireland’s mini dot-com boom. That was fantastic fun, and educational to boot.

  • my Dad’s gun/fishing/sporting-goods shop. Was it really a good idea to have a teenager working near firearms? At least I wasn’t the one who unplugged the fridge where the maggots were kept, so that they all hatched over the course of one weekend…

  • A horrible teenage job — picking tomatoes. I can still feel the orange dust under my fingernails every time I smell fresh tomatoes :( I didn’t last very long at that at all.

  • writing an Amiga-based kiosk system for virtually no pay whatsoever, at the age of 18 or 19. Ah, exploitation.

4 movies I can watch over and over:

  • Koyaanisqatsi — it’s dating a little now, since every ad agency through the 90s ripped it off. But still, the invention of a new format. I remember looking at the 405 freeway in LA, and thinking “looks like something out of Koyaanisqatsi” — of course, it was.

  • Princess Mononoke — either that, or Nausicaa. I just love the way the characters are coloured in shades of grey, rather than black and white.

  • the Lord of the Rings trilogy — oh dear I’m a hopeless Tolkien fanboy.

  • Spinal Tap — pure genius.

4 places I’ve lived:

  • Melbourne, Australia; around the time of the annoying TV drama, The Secret Lives Of Us;

  • Newport Beach, CA; around the time of the annoying TV drama, The O.C.;

  • Dublin, Ireland; no annoying TV drama — so far

  • University of California Irvine, CA; while Irvine itself is the most soulless suburban hellhole I’ve ever visited, living on the UCI campus is quite fun by comparison. Take about 1000 grad students, post-docs and lecturers from around the world; put them all in the same square mile or so; remove all fun (and bars!) from the surrounding areas; watch them make their own entertainment, or go mad.

4 tv shows I love:

4 places I’ve vacationed:

  • Annapurna Base Camp, Nepal; we trekked our way up to there, then trekked back down again. Unforgettable. I really want to do another Nepal trek as a result

  • car-camping around the Australian state of Victoria; they have some fantastic national park campsites, which most tourists overlook

  • learning how to dive in Ko Tao, Thailand; great setting, great dive sites, pretty cheap too!

  • Yosemite; amazing, world-class natural beauty. Californians don’t realise just how lucky they’ve got it ;)

4 of my favourite dishes:

  • A good Thai green curry

  • Laos-style green papaya salad with sticky rice

  • a good meaty cassoulet, from Fandango in San Luis Obispo. At least, that was the tastiest meal I’ve had in recent months ;)

  • Mangosteen — the queen of fruit, according to the Thais. I could, and probably have, eaten hundreds of these

4 places I would rather be right now:

  • spending New Year’s Day with a bunch of friends in rural West Cork or County Galway; until I moved to the US, this was one of my favourite annual traditions.

  • the Stag’s Head Bar, Dublin, in the snug, again with a bunch of friends

  • sitting on the grass outside the Pavilion bar in TCD, on a sunny summer’s day (hmm, that’s a lot of bars!)

  • Chiang Mai, Thailand

4 sites I visit daily:

4 people I’m tagging:


Lean’s got a weblog

Friends: the ex-Iona readers, and those with an interest in urban design, might like to go take a look at citynoise.blogspot.com — Lean Doody’s new urban design weblog.


Project management, deadlines etc.

Work: I took a look over at Edd Dumbill’s weblog recently, and came across this posting on planning programming projects. He links to another article and mentions:

My recent return to managing a team of people has highlighted for me the difficulties of the arbitrary deadline approach to project management. Unfortunately, it’s also the default management approach applied by a lot of people, because the concept is easy to grasp.

The arbitrary deadline method is troublesome because of the difficulty of estimation. As John’s post elaborates, you can never foresee all of the problems you’ll meet along the way. The distressing inevitability of 90% of the effort being required by 2% of the deliverable is frequently inexplicable to developers themselves. Never mind the managers remote from the development!

I’ve been considering why my experience of working with open source seems generally preferable to commercial work, and this may be one of the key elements. Commercial software development is deadline-driven, whereas most open source development has not been, in my experience; ‘it’s ready when it’s ready’.

Edd suggests that using a trouble-ticket-based system for progress tracking and management is superior. I’m inclined to agree.


Bad Blogger.com Security Model

Security: Hey user auth systems! If you’re going to require me to sign in, and publish my login as a signature to prove that I’m ‘me’, please do me a favour — don’t delete the account if it’s been ‘inactive’, and allow anyone to re-register that name without my knowledge!

I just tried to leave a comment on a Blogger.com weblog, to find that my user account at Blogger had been deleted. Re-creating a new account with the same name wasn’t a problem – the previous account data had been simply deleted outright. (Presumably they don’t do this to people with a Blogger.com weblog — I hope.)

The risks of this are pretty clear; given that I’d already established an identity (at least in comments on certain Blogger weblogs) as ‘justinmason23’, if an attacker were to have re-registered that identity before I did, they could impersonate me.
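One way a service could avoid this is to keep expired names reserved rather than releasing them. A minimal sketch of that idea follows, assuming a toy in-memory registry; the class and method names are hypothetical and have nothing to do with how Blogger actually stores accounts.

```python
class AccountRegistry:
    """Toy username registry that never hands out a previously used name."""

    def __init__(self):
        self._active = {}      # lower-cased username -> owner's email
        self._retired = set()  # names that were once registered, now reserved

    def register(self, username, email):
        """Register a name; refuse if it is in use or was ever used before."""
        key = username.lower()
        if key in self._active or key in self._retired:
            return False
        self._active[key] = email
        return True

    def expire(self, username):
        """Remove an inactive account but keep its name reserved."""
        key = username.lower()
        if self._active.pop(key, None) is not None:
            self._retired.add(key)


if __name__ == "__main__":
    registry = AccountRegistry()
    registry.register("justinmason23", "owner@example.org")
    registry.expire("justinmason23")
    # An attacker trying to grab the expired name is turned away.
    print(registry.register("justinmason23", "attacker@example.org"))
```

A real system would also want some way for the original owner to reclaim the name, which this sketch leaves out.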


RFID Scan Detector

RFID: Over on Adam Shostack’s weblog, in a comment on an entry regarding the plans to mandate remotely-readable RFID passports, Martin Forssen brings up a great idea:

What I want is a device which beeps every time somebody scans me for RFID-tags. I assume this would be fairly easy to construct since the scanner must send a signal of some strength to activate the chip.

I wonder if that’d work? A keyfob, for example, something similar in size to the dinky Chrysalis Wifi Seeker I have on my keyring, would be perfect. It’d be probably pretty cheap to make, would make a great geek toy, and be quite educational too. ;)


Xmas hols

Meta: I’m back in Dublin for a couple of weeks over xmas, so I won’t be updating this weblog very much. See you in January!

BTW I flew back via Chicago, which is obviously the stopover of choice to Dublin from Silicon Valley — surrounded by 1 iBook per every 8 passengers. ;)

PS: looks like they forgot Poland!


If taint.org was spam

Spam: ever wondered what this weblog would look like if it was spam? wonder no more. (via crummy.com)


GMail Usability

Web: Check out GMail’s ‘thread history’ built into the message display, dubbed ‘collapsable history’ and ‘cards’. Very, very nice email usability!

More at Kevin Fox’ weblog, fury.com.


Abusable Tech

Tech: ATAC: Abusable Technologies Awareness Center. Great panel weblog, with some of the big names in the research field, dealing with several security issues quite nicely.

Found from a link to Simon Byers’ 2003 roundup of information leakage, which notes an interesting case I hadn’t heard of – a TechTV presenter accidentally posting topless photos of herself, due to a bug in Photoshop!

(link via Liudvikas Bukys)


Exploding Monitors pt. II

Hardware: This weblog is jinxed!!

That’s the only explanation I can come up with. The day before yesterday, I blogged about exploding monitors and various halt-and-catch-fire software instructions. Last night, my monitor made a popping noise, emitted a faint burning-plastic smell, and shrank the display into a thin stripe down the middle of the screen.

Great. It’s dead as a doornail — I’m working from Catherine’s iBook for now. Quite a step down from the lovely 21-inch CRT. Argh :(

BTW, needless to say, I wasn’t running any scary apps — not even Freedom: First Resistance — the only possible display-hosing culprits were Firebird, KDE, ExMH or gvim ;)


Clay Shirky on Complex Software Systems

Software: Shirky on the Semantic Web. Great snippet:

it turns out that people can share data without having to share a worldview, so we got the meta-data without needing the ontology. Exhibit A in this regard is the weblog world. In a recent paper discussing the Semantic Web and weblogs, Matt Rothenberg details the invention and rapid spread of ‘RSS autodiscovery’, where an existing HTML tag was pressed into service as a way of automatically pointing to a weblog’s syndication feed.

About this process, which went from suggestion to implementation in mere days, Rothenberg says:

Granted, RSS autodiscovery was a relatively simplistic technical standard compared to the types of standards required for the environment of pervasive meta-data stipulated by the semantic web, but its adoption demonstrates an environment in which new technical standards for publishing can go from prototype to widespread utility extremely quickly. …

This, of course, is the standard Hail Mary play for anyone whose technology is caught on the wrong side of complexity. People pushing such technologies often make the ‘gateway drug’ claim that rapid adoption of simple technologies is a precursor to later adoption of much more complex ones. Lotus claimed that simple internet email would eventually leave people clamoring for the more sophisticated features of CC:Mail (RIP), PointCast (also RIP) tried to label email a ‘push’ technology so they would look like a next-generation tool rather than a dead-end, and so on.

Here Rothenberg follows the script to a tee, labeling RSS autodiscovery ‘simplistic’ without entertaining the idea that simplicity may be a requirement of rapid and broad diffusion. The real lesson of RSS autodiscovery is that developers can create valuable meta-data without needing any of the trappings of the Semantic Web. Were the whole effort to be shelved tomorrow, successes like RSS autodiscovery would not be affected in the slightest.

Another good line: ‘There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the Semantic Web.’
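To show just how simple RSS autodiscovery is: the ‘existing HTML tag’ is a `<link rel="alternate" type="application/rss+xml">` element in the page head, and finding it takes a few lines. The Python below is a rough sketch of a consumer, assuming the standard-library HTML parser is good enough for the page; the class name, function name and example.org URL are made up for illustration.

```python
from html.parser import HTMLParser

FEED_TYPE = "application/rss+xml"


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at RSS."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rels and mime == FEED_TYPE and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html):
    """Return every autodiscoverable RSS feed URL found in the page."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds


# Hypothetical page head, for demonstration only.
PAGE = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://example.org/index.rss">
</head><body></body></html>"""

if __name__ == "__main__":
    print(find_feeds(PAGE))
```

No ontology required, which is rather Shirky’s point.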


Microtution spam warning

Just received a mail from a bunch called ‘microtution’, looking to write a collaborative political weblog. More details here.

But hold on there — this was an out-and-out spam, sent via an open proxy, using a spam tool, with faked headers, to a spamtrap address they scraped from one of my sites. Anyone considering helping out on this collaborative weblog might like to consider who they’re helping.

The mail was sent from 213.176.81.230, direct to my MX, from ‘Fredericka’ <promiseman@promiseman.com>, Subject ‘need help with political blog’.


Lessons from history

I’ve been reading Crooked Timber recently; a good literate weblog. Today’s interesting post, from Kieran Healy: Frustration is not a Strategy. Well worth a read for some context on today’s Middle East, and the fundamental problem with those ‘kill ‘em all’ proposals that keep cropping up from the hawks.

Blogs: Nathan Cochrane, Aussie journalist for The Age and writer of a very interesting weblog — has won quite a lot of money on a TV gameshow! I think the term is ‘goodonyamate’, if I recall correctly ;)

(Pity he couldn’t have fixed the BlogShares listing first though.)


Good tech-politics blog

Nathan Cochrane has a weblog. He’s a clueful journo who writes about technology for The Age, the Melbourne newspaper – thumbs up for that; I read plenty of The Age during my sojourn in Melbourne, it’s the best newspaper in Oz. (Plus it recommends using Sitescooper and Plucker in their Handheld Howto page, so that’s always going to get a +1 from me ;)

But anyway, a very clueful weblog; lots of good journalism straight from the source. Recommended.

LinMagAU.org: Integrating SpamAssassin with MailMan. I really must get around to getting our server upgraded to MailMan 2.1 so we can apply this; I have one list that’s getting about 5-10 spams a day, and even with ‘subscriber posting only’ set, MM 2.0’s admin interface is very clunky for dealing with that.

Does anyone know if there’s a usable tool to automate Mailman admin BTW? Or give it a good UI?


Tim Bray on Drugs

Tim Bray’s weblog is a great read; I’ve added it to my daily list. Today, he’s provided a fantastic article about the drugs problem in Vancouver’s Downtown Eastside.

Dublin has historically had a series of up-and-down swings with a heroin problem; at one stage, it was one of the worst in Europe. It improved quite a lot during the 90’s, but it’s going downhill again, apparently; maybe the legislators need to read this article.

(The big problem as far as I can see is that treatment centres are horrifically underfunded, it being a lot easier, and — while not cheaper – at least already budgeted for, to ship the junkies off to prison. Business as usual. Of course, while they’re there, they’re (a) off the streets (out of sight, out of mind), and (b) learning all the latest criminal techniques, and getting well hooked on all the cheap heroin in there.)

(BTW did you know that one reason heroin is massively popular in prisons, is due to drug-testing? Apparently, marijuana can be detected a month after use, whereas heroin is undetectable 48 hours afterwards. So prison drug-testing regimes indirectly encourage heroin use. Oops!)

Linux: Linux Journal: report from LinuxWorld Ireland. Sounds like a great talk from maddog and Michael Meeks. And if you look carefully at the photo on that article page, you can see Proinnsias in the background!

Mind you, I would probably have just done my ‘incomprehensible question about software patents’ schtick with the IBM guy again…

What with this and GUADEC coming to Dublin, I’m missing all the good piss-ups^Wevents it seems ;)


Comment links back again

The (discuss) links are back, and about time too; things were getting quiet. Anyway, it’s a unified comments forum now. All posts go into one forum, instead of creating a new forum for each weblog posting. Having comments pages for each story just didn’t work for a small-scale blog, and it was impossible to see if there were any new posts across all those individual forums.


1 January 1659/60 (Lord’s Day)

Samuel Pepys has a weblog:

This morning (we living lately in the garret,) I rose, put on my suit with great skirts, having not lately worn any other clothes but them. Went to Mr. Gunning’s chapel at Exeter House, where he made a very good sermon.

Anyway, still recovering from the holidays. Hope you all had a good one..


(Untitled)

Looking at whump.com’s moreLikeThis weblog, I found a link to a static web CMS using DocBook and XSLT: XM (XSLT Make), and the concept of site maps as a design pattern for websites. Both are similar to what I’m trying to do with WebMake, but unfortunately there are no links to WebMake anywhere there. Gotta do more hyping, I reckon ;)
