
Month: June 2004

Microsoft 0wnz ‘http’

Web: Back in 2002, it occurred to someone to check the Google search results for ‘http’, to figure out what the most popular sites were.

Looks like it’s changed — here’s the top five results from a Google search for ‘http’ now:

  • 1: Microsoft
  • 2: AltaVista (!!)
  • 3: Yahoo!
  • 4: My Excite
  • 5: Google

My guess: older links are getting good PageRank, using whatever new tweaked algorithm they’re using. But AltaVista beating Google? ;)

RTE’s Bush Interview

TV: RTE’s ‘Prime Time’ secured a fantastic interview with GWB, with Carole Coleman asking a few very pointed questions. Watch it with RealPlayer, or listen to the audio in MP3 (2.7Mb).

There’s a pretty accurate transcript here:

Let me finish! How many times do I have to tell you how to do your job? See, I gotta insult France at least once. Then I gotta claim ‘merica to be the most generous nation in the whole wide world, even though it’s not true. And listen, let me mention that democracy in Pakistan, too. And guess what? I’m the first president to ever call for a Palestinian state and I’m damn proud of it – just look at the size of my smirk now. Listen, as long as I keep repeating myself and mouthing empty platitudes, you won’t have a chance to call me on any of the bullshit coming out of my mouth.

OK, the official one is here.

It appears that the White House just dropped the ball on this one; reportedly, they had her list of questions three days in advance, but given that they suggested that she ‘ask him a question on the outfit that Taoiseach Bertie Ahern wore to the G8 summit’ (!!!), they weren’t paying attention, and expected some kind of giggling moronic schoolgirl, or something.

Hilariously, the White House has since complained to RTE, the Irish Embassy, the Irish Government, and the reporter herself. Probably God, too. I doubt Prime Time will ever get a White House interview again, but given what they clearly expect from the poodles in the White House press corps, that’s hardly much of a loss.

(I’d love to see what’d happen if he had to deal with Paxman ;)

Also, went to see Fahrenheit 9/11. Fantastic movie, and best of all, incredibly well-attended.

My favourite moment: the reminder of just how easily the US news media sold itself out during the war. Seeing Katie Couric blurting ‘Navy Seals rock!!’ like some kind of starstruck 5-year-old with an Action Man toy, was a classic. It’s good to see that this will be immortalized in celluloid, as it was truly shocking at the time. (Not much has changed; Judith Miller is still writing for the NYT.)

Samuel L. Jackson’s ‘Irish’ comment

Ireland: Here’s a hot UL (urban legend) that’s floating around the Irish web right now:

In a British program about Samuel L Jackson and Colin Farrell’s latest movie SWAT presented by British presenter, Kate Thornton, the following exchange occurred:
  • Thornton: What was it like working with Colin (Farrell), cos he is just so hot in the U.K. right now?
  • Jackson: He’s pretty hot in the U.S. too.
  • Thornton: Yeah, but he is one of our own.
  • Jackson: Isn’t he from Ireland?
  • Thornton: Yeah, but we can claim him cos Ireland is beside us.
  • Jackson: You see that’s your problem right there. You British keep claiming people that don’t belong to you. We had that problem here in America too, it was called slavery.

… yeah, right. ;)

(Update: Actually, believe it or not, that’s more or less how it really went. Here’s the transcript.)

Some commentary at TheReggaeBoyz.com (quote: ‘I NEARLY DEAD TO RASS!!!!’) and Kuro5hin.

It looks like the TV programme does exist; no scripts online, unfortunately, so we’ll never figure out if this one really happened, I think.

IMO, it’s made up for sure. That last line is just a little too harsh for a primetime schmooze-a-gram, at the very least. Plus, it’s the kind of thing only an Irishman would give a shit about — the perpetual adoption of Irish celebs and worthies by the UK media is a continual source of irritation for the Irish — as Dervala puts it:

‘No, Oscar Wilde was ours. You put him in jail, though. And Shaw was ours. And Yeats. And Johnny Rotten.’

Announcing a new script

Web: Minor software announcement — after some time using HTMLThumbnail, album, and even WebMake to build photo galleries, I finally got peeved enough, and gave in to the temptation of ‘not invented here’. ;)

Presenting Uffizi, a CSS- and template-driven, themable perl script to generate photo galleries. Quoting the POD:

  • it’s very self-contained, apart from dependencies on Image::Size and the ImageMagick convert command
  • fast, efficient incremental rebuilding
  • generates full CSS-styled, templated and valid HTML
  • every part of the generated HTML can be modified through the templates
  • generates reasonably-sized images as well as thumbnails, with a link to the full-sized image
  • secure — all pages are static HTML, so your webserver won’t get r00ted through a silly photo album script

I am, of course, using it on my own photo pages, and I’m very happy with it; it’s been a while since I had to hack it. (I need to get it to thumbnail MPEGs as well, but apart from that it’s teh nifty IMO.)

SpamAssassin now an Apache TLP!

Spam: SpamAssassin is now officially an Apache top-level project! InternetNews.com coverage:

The Apache Software Foundation is taking the spam fight to a new level — literally — with the promotion of its Spam Assassin project to top-level status.

Hooray ;)

The ‘humans are 99.84% accurate’ figure

Spam: ‘The spam-classifying accuracy of a human being is 99.84%’. This statement has passed into Slashdot lore as the gospel truth, so time for some debunking.

First off, that’s not what Bill Yerazunis said in the CRM114 paper, Sparse Binary Polynomial Hashing and the CRM114 Discriminator. Here’s the real quote:

the human author’s measured accuracy as an antispam filter is only 99.84% on the first pass

Here’s a copy of the original mail:

I manually classified the same set of 1900 messages twice, and found three errors in my own classifications, hence I have a 99.84% success rate.

(my emphasis). In other words, the author sat down and ran through 1900 messages manually, then ran through them again, and checked to see how many messages in the first batch disagreed with the second.

Let’s consider an alternative situation, where a user is presented with one message, and asked to take their time, give it a full examination and some thought, and then classify the message. I would expect that message to be more likely to be classified correctly, since fatigue will not be an issue (after 1900 messages, I’m pretty tired of eyeballing), and neither will time pressure (taking 20 seconds on each of 1900 mails would require about 10.5 hours, and would be excruciatingly boring to boot).
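For anyone who wants to check the sums, here’s a rough back-of-the-envelope sketch. The function names are made up for illustration; the inputs are just the figures quoted above (1900 messages, three errors, 20 seconds per mail).

```shell
# Hypothetical helpers; only the input figures come from the text above.
accuracy_pct() {
  # $1 = messages classified, $2 = errors found
  awk -v n="$1" -v e="$2" 'BEGIN { printf "%.2f\n", (n - e) / n * 100 }'
}

hours_needed() {
  # $1 = messages, $2 = seconds spent on each one
  awk -v n="$1" -v s="$2" 'BEGIN { printf "%.2f\n", n * s / 3600 }'
}

accuracy_pct 1900 3    # prints 99.84
hours_needed 1900 20   # prints 10.56
```

So three disagreements in 1900 mails is where the 99.84% comes from, and a careful 20-second look at each mail would indeed eat most of a working day and a half.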

In addition, the study wasn’t clear on exactly how much information from each mail was presented. Too little (just the subject line) or too much (every header and raw HTML), and a human will be more likely to make mistakes than if the mail is rendered fully, and the extraneous header info hidden. In my experience, I’ve never hand-classified 1900 messages purely through either method, because it’s just too tiring, and I know I’ll make quite a few mistakes. The UI for this work is important.

And finally, the figure is derived from a study with one user performing a task once. There’s no way you could use that figure in a serious setting — it’s not valid statistical science. Here’s Henry’s comment:

Yerazunis’ study of “human classification performance” is fundamentally flawed. He did a “user study” where he sat down and re-classified a few thousand of his personal e-mails and wrote down how many mistakes he made. He repeats this experiment once and calls his results “conclusive.” There are several reasons why this is not a sound methodology:
  • a) He has only one test subject (himself). You cannot infer much about the population from a sample size of 1.
  • b) He has already seen the messages before. We have very good associative memory. You will also notice that he makes fewer mistakes on the second run which indicates that a human’s classification accuracy (on the same messages) increases with experience. For this very reason, it is of the utmost importance to test classification performance on unseen data. After all, the problem tends towards “duplicate detection” when you’ve seen the data before hand.
  • c) He evaluates his own performance. When someone’s own ego is on the line, you would expect that it would be very difficult to remain objective.

So, to correct the statement:

‘The spam-classifying accuracy of this one guy, when classifying nearly two thousand mails by hand, was 99.84%, once.’

Cormack and Lynam’s study on supervised spam detection

Spam: or, ‘Slashdot spam drama’. So, a few days ago, I forwarded a link to a paper I’d been sent. It’s a great paper, and I’m not just saying that because SpamAssassin did well; it really tests some of the popular open-source spam filters comprehensively, and correctly. (The authors have 24 years of information retrieval research between them.)

The results have been pretty incendiary. ;) Here’s a timeline with links, in case you were wondering where we are right now:

A UNIX shell tip

UNIX: I’ve just made the first change to my core bash configuration in years, adding -b to the set command line (so bash reports finished background jobs straight away, rather than waiting for the next prompt). It triggered some thinking about when the last one was.

It turns out that, apart from frequently writing scripts and aliases, I haven’t changed my command-line UI in any respect for about 2 years. By contrast, I’ve been hacking about with GUI settings continually: new desktop backgrounds, themes, colours, etc. Odd!

Anyway, here’s the tip — it’s very handy, I find.

I changed to using a 2-line prompt, with the first line containing the time and the full working directory, in a ‘magic’ cut-and-pasteable format:

        : exit=0 Thu Jun 24 17:55:29 PDT 2004; cd /home/jm/DL
        : jm 1203...; 

Note that the prompt starts with “:”, the shell’s do-nothing command, which means that bash/sh treats everything up to the “;” as arguments to be thrown away. The end result is that the entire line evaluates to “cd /home/jm/DL” when pasted. Hey presto, cd’ing several terminals to the same dir just involves triple-clicking in one, and middle-button-pasting into the others. Nifty! Similarly, the second line has a little bit of prompt, but that snippet will be ignored when cut and pasted.

Having the exit status of the last command (bash var: $?) is useful too. The code:

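  do_prompt () {
    # $1 is the exit status of the previous command, passed in by
    # PROMPT_COMMAND; %q escapes the directory so that the line still
    # pastes correctly when the path contains spaces
    printf ': exit=%s %s; cd %q\n' "$1" "$(date)" "$PWD"
  }
  PROMPT_COMMAND='do_prompt $?'   # executed before every prompt
  do_prompt 0                     # set up first prompt
  PS1=": $(whoami) \!"
  PS2="... >>; "            # continuation prompt
  PS1="$PS1...; "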
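To see why the pasted line behaves like a plain cd, here’s a small sketch that builds a line in the same format and feeds it back to the shell. The prompt_line helper and the scratch directory are hypothetical, just for the demo, and eval stands in for the middle-button paste.

```shell
# Sketch only: prompt_line is a made-up helper, not part of my config.
prompt_line() {
  # $1 = exit status to show, $2 = directory to embed.
  # Single quotes keep paths with spaces pasteable; a path that itself
  # contains a single quote would need real escaping.
  printf ": exit=%s %s; cd '%s'\n" "$1" "$(date)" "$2"
}

scratch=$(mktemp -d)               # stand-in for the dir shown in terminal one
line=$(prompt_line 0 "$scratch")   # what terminal one would display
cd /                               # terminal two starts somewhere else
eval "$line"                       # same effect as pasting the line
pwd                                # now prints the scratch directory
```

One caveat: the shell still expands the arguments it hands to “:”, so this relies on the date output not containing anything the shell treats specially.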

The Web-App generation

Software: Mark Twomey, in response to all the Win32 API stuff recently:

We now have a generation of computer users … who have never received or sent email from a so called ‘rich client’, never had to send a postal order off to order something from some distant vendor, and are not amazed by something like a search engine. ….

Those (‘rich client’) people remind me of minicomputer users who crapped on the ‘crummy little operating systems’ used on ‘crummy little desktop computers.’

He’s right, you know — for de yoot, Windows is generally just a way to access Hotmail.

Ahmad Chalabi and Iran’s encryption

Security: some crypto drama.

Ahmad Chalabi apparently told the Iranian government that the NSA had broken their secret code, according to ‘US intelligence officials’: NYTimes: Chalabi Reportedly Told Iran That U.S. Had Code. This story is still running — Bruce Schneier has just posted his expert opinion, as has Ross Anderson. As I noted on Eric Rescorla’s weblog, here’s my (non-expert) theory ;)

It’s known that the Iranians used Crypto AG equipment up until about 1992, and it’s been widely reported that Crypto AG’s systems were backdoored by the NSA and traffic routinely decrypted. (also, Baltimore Sun story, 1995)

Reportedly, the Anglo-Irish discussions of 1985 were a rather one-sided affair, because the Irish government used Crypto AG machines to communicate between their Embassy in London and Dublin, and intercepts of their reports were fed back to the UK government.

In addition, according to this article (backup), the NSA also provided Iraq with intercepts of Iranian secret traffic, while Iraq was a US ally — which could explain why Chalabi would have known about it.

It also speculates as to how it was done:

‘Knowledgeable sources indicate that the Crypto AG enciphering process, developed in cooperation with the NSA and the German company Siemans, involved secretly embedding the decryption key in the cipher text. Those who knew where to look could monitor the encrypted communication, then extract the decryption key that was also part of the transmission, and recover the plain text message. Decryption of a message by a knowledgeable third party was not any more difficult than it was for the intended receiver. (More than one method was used. Sometimes the algorithm was simply deficient, with built-in exploitable weaknesses.)’

So my opinion is that Chalabi’s claim was very old news from the 80’s and early 90’s — which pretty much fits in with the rest of his tip-offs to everyone else ;)

“Vice-President Hunter Thompson”

Politics: Kerry in Colorado:

“Just to put your minds all at ease, I have four words for you that I know will relieve you greatly,” Kerry told the fund-raiser. “How does this sound? Vice President Hunter Thompson.”

Travel: Great posting on culture shock and ‘going native’ at Yankee Fog.

Hacks: Dan Kaminsky’s LayerOne presentation hits Slashdot. Definitely one of the highlights of that conference.

Spam: confession for two: a spammer spills it all. Interesting — especially since the spammer winds up earning less than he would have working for Starbucks.

It’s also worth noting this posting from Gary Smith on the sa-users list, in which Gary filled out a spam form with some not-entirely-valid info — with hilarious results!

So I did talk to some of these lenders. Apparently they buy leads from www.lendergateway.com . One guy that I talked to was irritated because it costs him $100 per lead they sell him and it’s supposed to only be sold to him. He apologized quite a bit and was nice enough to give me the information on who sold him the names. The number he gave me goes to voicemail which I’m going to try later. A couple other people told me what I can do with myself and one lady kept saying that she couldn’t give me information on who provided her with my information.

The stupid thing is each time I talk to them I tell them I’m on a cell and that I need their name and number and I’ll call them right back. They give it to me… So when they hang up I start calling again and again. I’ve been irritating the hell out of them…

Anyways, that’s the fun story of what happens when these forms are filled out.

$100 per spurious ‘lead’ would make a serious dent, if enough spurious leads showed up… ;)

WINW

Net: WINW Is Not WASTE: ‘WINW is a small worlds networking utility. It was inspired by WASTE … (WINW) has diverged from its original mission to create a clean-room WASTE clone. Today, the WINW feature set is different from that of WASTE, and its protocol is incompatible with WASTE’s protocol. However, WINW and WASTE achieve similar goals: they allow people who trust each other to communicate securely.’

Not quite there yet — just a Windows version with no sharing — but actively under development. One to keep an eye on…

Great Economist article on UNIX

Software: Economist: Unix’s founding fathers (via sourcefrog.net). A very good article on Thompson, Kernighan and Ritchie’s amazing achievement, with some new details I hadn’t heard before:

AT&T was required under the terms of a 1958 court order in an antitrust case to license its non-telephone-related technology to anyone who asked. And so Unix and C were distributed, mostly to universities, for only a nominal fee. When one considers the ineptness of AT&T’s later attempts to commercialise Unix — after the court order ceased to be applicable because of another antitrust case which broke up AT&T in 1984 — this restriction, an accidental boost to what would later become known as the open-source movement, becomes even more crucial.

So that’s how that happened. Just think — if it wasn’t for that court case, we’d probably all be hacking on VMS. ;)

Also at sourcefrog, mbp points out that the Sulston reverse-engineering story is ‘remarkably similar to that of Richard Stallman several years earlier, when the frustration of closed-source printer software helped motivate him to start the GNU project’.

Patents: yet another sourcefrog link, this time to a CNet story with a hilarious quote regarding software patents and the GIF/PNG debacle:

But Unisys credited its exertion of the LZW patent with the creation of the PNG format, and whatever improvements the newer technology brought to bear.

‘We haven’t evaluated the new recommendation for PNG, and it remains to be seen whether the new version will have an effect on the use of GIF images,’ said Unisys representative Kristine Grow. ‘If so, the patent situation will have achieved its purpose, which is to advance technological innovation. So we applaud that.’

Wow. Presumably by the same logic, they applaud al-Qaeda for improving airline security innovation, too…

What’s wrong with DRM, and ‘better support’

Copyright: Cory Doctorow’s DRM talk presented to MS research yesterday. This is a fantastic introduction to the issues regarding DRM; if you know someone who isn’t convinced that DRM is A Bad Thing, this is the argument they need to read.

OSes: /.: France Considers Open Source. The usual arguments are going on in the comments, but some people still insist that they get better support from MS than from Linux vendors.

What planet are they on? Because it would have been handy for me to live there, on the occasions in the past where I’ve had to develop code on MS platforms, and administer networks of Windows PCs. In my experience, you do not get support from Microsoft. Instead, you do what you do with Linux — go searching on Google, read MSDN, or post in the MSDN forums.

As far as I can see, there’s zero difference between doing that with Windows, and doing exactly the same thing with Red Hat — except in the latter case, you can turn up debug logging through a documented API or switch, use the source and fix it yourself, find the original developers and post a message to their core -dev list, or even ask them personally.

Where’s this amazing support? Maybe the companies I’ve worked for just weren’t paying enough, and therefore weren’t significant blue-chip customers. Or maybe it’s because we weren’t based in the US, and so got support from less-skilled, less high-priority staff in a regional office. But I’ve certainly never experienced the support these advocates claim MS offers, which makes me think it’s FUD as usual.

Bloomsday!

Literature: Happy Bloomsday Centenary! Google agrees:

[Image: Google Bloomsday logo]

You can have a read of Joyce’s masterpiece online at online-literature.com, although this is certainly one text that works better on paper, to be pored over and parsed slowly. But regardless of whether it’s readable on-screen or not, the legality of that copy is dubious, anyway.

As this Telegraph article notes, the copyright situation on Ulysses is, sadly, a total mess. Even 84 years after it was written, and promptly banned in the US, UK and Ireland for ‘obscenity’, Ulysses remains a thorny legal subject.

The novel was first published in 1922, and as such, fell into public domain in the UK in 1992, but was apparently ‘pulled back’ in 1996. According to this mail, due to recent copyright term extensions, the 1922 text will now remain in copyright in the EU until the end of 2011, and may not expire until 2032 in the US. And this Irish Times article notes that in Ireland, ‘copyright on Joyce’s works ran out on December 31st, 1991, 50 years after his death. However, EU regulations revived copyright from July 1995 when it extended the lifetime of copyright to 70 years.’

Reportedly, the Dail even had to pass emergency legislation last week to prevent an exhibition at Dublin’s National Library from being sued by the Joyce Estate:

The threat to the exhibition has been caused by the 2000 Copyright Act which creates a doubt about its ability to display manuscripts bought by the State because the Joyce estate still holds copyright.

Hilarious. Recent overzealous copyright extension legislation snares governments too! But they get to rewrite the laws in emergency session to fix it ;)

All very ironic, considering Ulysses’ structure was deliberately derived from The Odyssey in the first place.

Making a Bootable CD from a Floppy Image

Tech: Troubleshooters: Making a bootable CD from a bootable floppy image.
Making a note of this for future reference — it should be handy next time I need to do a BIOS or firmware upgrade on my Thinkpad.

I ran into the need for this recently when trying to upgrade the BIOS on my Thinkpad running Linux, so hibernation would work. IBM don’t provide BIOS upgrade tools for Linux, so you have to keep a Windows partition around. (Yes, I pay the Windows Tax — I’ve been bitten by proprietary firmware upgrades requiring it in the past, as in this case.)

Amazingly, however, even after paying the Tax, the ‘non-diskette’ BIOS upgrade (ie. the standalone Windows app) doesn’t work from Windows XP! Instead, you get a hard hang when it tries to bring the machine down from XP to a single-app mode to perform the upgrade. Running from DOS similarly fails, because the BIOS upgrade app is a WIN32 application. Clever.

Eventually, I wound up reformatting my Windows partition, installing Windows 98 (!), and running the BIOS upgrade app from that, which worked fine. But next time around, I should be able to save myself a few hours of MCSE imitation by using this floppy-to-CD trick… here’s hoping. ;) PCs Are Hard.

‘Precision’ bombing, and iTMS Europe

War: A couple of war links, I’ll keep it short. ;)

High-profile air strikes ‘killed only civilians’. ‘The American military launched some 50 air strikes designed to kill specific targets during the Iraq war, it emerged yesterday, but none of them found its mark. Instead the air strikes had a high civilian toll, according to military officials serving at the time.’ Still, it sounded good, like as if CSI were doing all the war strategification and stuff ;)

And: the Pentagon ‘Torture Memos’ took some tips from the torture techniques used in Northern Ireland in the 1970s.

Music: Licensing row mars iTunes launch. UK indie labels report that ‘where Apple has spoken to labels the terms on offer have been commercial suicide’, and as a result, they won’t be selling their tunes via iTMS Europe.

I agree with Mark Twomey on this one — bad move. This (and the prices!) reduce the Euro-iTunes offering to about the usefulness of whatever that one is that Real.com have (you know, the one you can’t even remember the name of) — and nobody in Europe buys major-label music online anyway.

First Nobel Prizewinner forced to reverse-engineer?

Software: This mail contains a fantastic anecdote from The Common Thread: Science, Politics, Ethics and the Human Genome, by John Sulston, head of the Sanger Centre, and a joint winner of the Nobel Prize for Medicine. I’ll reproduce some bits here:

Once the first fluorescence sequencing machines arrived, it became clear that we had to take control of the software. The machines worked well, but ABI (jm: the vendor) wanted to keep control of the data analysis end by forcing their customers to use their proprietary software. …

I could not accept that we should be dependent on a commercial company for the handling and assembly of the data we were producing. The company even had ambition to take control of the analysis of the sequence, which was ridiculous. …

So, one hot summer Sunday afternoon, I sat on the lawn at home with printouts spread all around me and decrypted the ABI file that stored the trace data. … Within a very few days, Rodger and his group had written display software that showed the traces – and there we were. The St Louis team joined in, and they all went to decrypt more of the ABI files, so that we had complete freedom to design our own display and analysis systems. It transformed our productivity. Previously we’d only been able to get the traces as printouts, which we bound together in fat notebooks ….

I certainly feel that between us we did push ABI back a bit and denied to them complete control of this downstream software. It was the first experience of the kind of battle for control of information that I seem to have been fighting with commercial companies ever since: a foretaste of the much larger battles that would later surround the human genome.

Amazing. Was John Sulston the first Nobel Prize-winner to have to reverse-engineer a proprietary file format in the course of his research?

And would his actions be legal in the UK in a few years, once the IPR Enforcement Directive is transposed into law there?

LayerOne

Conferences: LayerOne was seriously great! Got to meet up with some really interesting people; discuss some nifty stuff; and get some new angles on the whole hacking scene.

Seriously, that was well worthwhile, especially in terms of potential new ways to deal with spam, and issues to watch out for in terms of spammer techniques in future. A great techie conf, and the boozing^Wsocialising was pretty good too ;)

I’m actually giving some thought to going to Defcon after that…

German neo-nazi UBE, and CAN-SPAM

Spam: Reg: German hate mail spam attack stuns experts: ‘Mailboxes in Germany and the Netherlands were flooded yesterday with spam containing German right-wing propaganda. Spammers used the Sober.G virus – a mass mailing worm that sends itself to email addresses harvested from infected computers – to spread their messages as widely as possible.’

The one good thing about this is that it might help some people realise that spam isn’t all about porn and commercial email; any kind of mail can be spam, including political speech.

However, this may be a bit late for the US, since CAN-SPAM explicitly does not regulate political spam. ah well, you live and learn, I suppose. ;)

Updated European Election voting guide for Ireland

Patents: Ciaran O’Riordan just posted a message to ILUG, regarding how concerned voters in Ireland can use their votes in tomorrow’s European elections to prevent legalising patenting of software ideas in Europe. Here’s the scoop:

Area        Vote #1           Vote #2
East        Avril Doyle       Eoin Dubsky
South       Brian Crowley     Gerard Collins
North West  Sean O’Neactain
Dublin      Patricia McKenna  Ivana Bacik

Note the main thing I got wrong — some sitting MEPs from Fianna Fail and FG actually voted the right way! So a vote for FF in this case, is a vote against software patents. (I never thought I’d be saying that, but there you go ;)

TaintBochs, and oil

Security: A very interesting security paper — Understanding Data Lifetime via Whole System Simulation. It combines virtual machines with data-flow tracking (a la perl’s ‘taint’ mechanism, after which this site is named ;)

By modifying the Bochs VM to support tracking ‘tainted’ data, they found several cases in popular apps (Mozilla, emacs, and MSIE) where passwords entered from the keyboard are retained in memory, and thereby wind up on disk due to swapping.

This has been a known issue for a long time — see the source for passwd.c from the ‘shadow’ package — but aside from security-naive developers, several other factors have made it more complex recently:

  • Recent too-smart compilers will optimise away memset() buffer-zeroing unless you’re careful (oops!)
  • Input buffers and event queues are a problem; password data from the keyboard will often persist in the kernel, window system, and application event queue buffers.
  • Abstractions cause many needless copies of tainted strings. Mozilla’s abstraction layers even include a string-copy to the heap to perform a string comparison operation, ouch ;)

In general, they suggest more use of buffer zeroing, even for low-level buffers that might not seem to require it (such as the X server’s event queue, and the kernel input buffers).

BTW, a similar system they didn’t mention is the Sidewinder firewall appliance, which uses what they call ‘Type Enforcement’ — effectively, tainting the data based on which network interface it arrived on.

Overall, a very nifty paper. I wonder if Tal Garfinkel is related to Simson? ;)

Oil: a MeFi gem: expert opinion on depletion of the oil reserves. ‘Simmons, Campbell, even the Iranian Bakhtiari agreed that the real situation of Saudi reserves is very bad. … Not a rosy picture, even for optimists.’

Patents: Transcript of the rms talk from a couple of weeks ago.

MS’ latest patent

Patents: Oh, come on. USPTO: task list window for use in an integrated development environment. Here’s claim 1:

  1. A computer-implemented method for managing development-related tasks, the method comprising:

    during an interactive code development session, evaluating source code to determine whether a comment token is present;

    in response to determining that the source code contains a comment token, inserting a task into a task list; and

    in response to completion of a task, modifying the task list during the interactive code development session to indicate that the task has been completed.

There’s 74 more claims that are about up to that standard, including the usual ‘an input module connected to the knee-bone’ mumbo-jumbo that means it ‘isn’t a software patent’.

This is just quite simply absurd. Are we really supposed to believe that nobody had thought of what is essentially a list of tickboxes, displaying the output of ‘grep TODO *.c’, before March 6, 2000? You have got to be kidding. This /. comment suggests that Delphi 5 (released 1999) did it.

(update: looks like there was a provisional patent application, so that may have to be Mar 5 1999.)
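Just to show how little is involved, here’s a rough sketch of that ‘list of tickboxes’ built from grep output. The todo_tasks name and the checkbox layout are invented for illustration; this is not taken from the patent or from any IDE.

```shell
# Hypothetical helper: one unticked box per line that mentions TODO.
todo_tasks() {
  # /dev/null forces grep to print the file name even for a single file;
  # sed then turns "file:line:text" into "[ ] file:line  text".
  grep -n 'TODO' "$@" /dev/null |
    sed 's/^\([^:]*\):\([0-9]*\):[[:space:]]*/[ ] \1:\2  /'
}

# Small demo on a scratch file rather than real source.
demo=$(mktemp -d)
printf 'int main(void) {\n  /* TODO: handle errors */\n  return 0;\n}\n' > "$demo/main.c"
todo_tasks "$demo"/*.c
```

Marking a task as done is then a matter of swapping ‘[ ]’ for ‘[x]’, which is about as novel as it sounds.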

William Chiles, Anders Hejlsberg, Randy Kimmerly and Peter Loforte should be ashamed of themselves for filing this joke. And the USPTO examiner who granted it should be fired.

(PS: a factoid from the slashdot comments: IBM receives (note: not even ‘files for’) nearly 10 patents every day.)

Invasion of the spambots

Spam: Good Salon article on the new forms of spamming, such as Wiki and referrer-log spamming etc. Here’s a good quote:

‘The adult industry will likely be married to spam and its attendant distribution methods long past the evolution of man into beings of pure energy,’ jokes Domenic Merenda, vice president of business development for Edge Productions, a company that operates adult-media properties.

There’s a good deal of crossover — I’ve seen both email and referrer-log spam advertising the same porn sites.

Nigritude Ultramarine

Web: the June part of the contest is over, but given that there’s a July part still to go, here’s a ‘Nigritude Ultramarine’ link to Anil Dash.

I wasn’t really bothered at all about this, until I came across this guy, whose technique involved spamming third-party Wiki sandboxes with backlinks. His excuse? ‘A Sandbox (is) a part of a system in which everybody is urged to play around freely. Usually for testing purposes. You can post headings, paragraphs, lists and links here. The content in return will be indexed by Google.’

As this forum thread points out: ‘The SandBox page is there for a purpose: to allow users of the wiki to learn to use the software. It is not meant to be “a place where anyone can create backlinks.”’

Sorry, that’s spam in my book.

GMail Invites

Mail: GMail users, check your mail; if mine was anything to go by, you should have three new invites to give out.

Irish Dating Site, and TheyWorkForYou.com

Web: Bernie Goldbach points to a site that’s news to me: AnotherFriend.com. It’s an Irish dating site.

I’ve had the odd discussion comparing dating culture in the US (organised ‘dating’) and Ireland and the UK (where it’s a lot more casual), and I must say, I was really convinced that the Friendster/craigslist-style organised, web-mediated dating just wouldn’t fly.

Seems I was wrong! Right now, there’s 157 people online on the site, with a good half of those being logged-in, chatting users, and about 75% of those in turn being premium, paying members. Wow, not bad.

Politics: TheyWorkForYou.com is a triumph. The most incredibly detailed, and web-aware, hypertextual database of political activity I’ve seen yet. The web-awareness — full of scraping, links, RSS and even community — is what makes it amazing; the concept of being able to read news of your representative’s latest speeches and voting record in your RSS aggregator is incredible. We need to get this out there for every country in the world.

It certainly beats Today in Parliament, that’s for sure ;)

Aside: nice choice of username for the ‘Site News’ weblog:

Some sites linking to this entry

An error occurred: Connection error: Access denied for user: ‘fawkesmt’@’localhost’ (Using password: YES)

Weird: Incredible footage (WMV stream) of a guy who went nuts, converted a caterpillar earthmover into what is essentially a tank, and went on a GTA-style rampage through the streets of Granby, 15 miles west of Denver, Colorado. In the process, he destroys the local bank, the newspaper, and several stores, seemingly working on the basis of (several) personal grudges.

Action Replay

Hacking: Amazing — the Action Replay cartridge is still around!

To be honest, I’m quite surprised that the PS2 hardware platform allows any of this stuff without some mod-chip-style soldering… but then, it’s pretty clear Datel have the technology to figure these things out. Impressive.

Aside: in my teens, I wrote demos on the Commodore 64 entirely in the Action Replay’s built-in monitor. I tried using compilers that supported such luxuries as symbolic labels, variable names, etc., but the ability to halt the entire machine and debug extensively, with a single button press, was just too nifty ;)

Irish MEP Candidates

Patents: lyranthe.org notes that the EU elections are coming up this Thursday, 11th June. Accordingly, here’s a single-issue roundup of the candidates, from what I’ve heard:

  • The Labour Party, sadly, haven’t yet come up with a concrete policy on the issue — but the Dublin candidate, Ivana Bacik, has (verbally) stated her opposition.
  • The Greens, however, are actively campaigning against software patents. Their candidates clearly understand the issue and have communicated with voters on it in the past, and the cross-Europe party policy is clearly stated.
  • Eoin Dubsky is an independent candidate, standing on a primarily anti-war platform. He’s stated his opposition to software patenting clearly and publicly. He’s also a total techie, with RSS feeds and a Redbrick account! ;)
  • FG‘s position is totally unclear, as usual… ;)
  • And in the other corner: FF and the PDs are whole-heartedly supporting software patenting; in fact, they’re the ones running the EU Council which just pushed through software patenting law despite the democratic mandate from the European Parliament. boo.

(PS: these are my opinions, not those of my employer. ;)

(updated: I’d left out Eoin Dubsky! my bad, now fixed.)

Easy-peasy web scraping: HTTP::Recorder

Perl: I’ve been writing a few convenience web-scrapers recently using WWW::Mechanize, with great success.

So the latest development, HTTP::Recorder, looks very nifty too:

HTTP::Recorder is a browser-independent recorder that records interactions with web sites and produces scripts for automated playback. Recorder produces WWW::Mechanize scripts by default (see WWW::Mechanize by Andy Lester), but provides functionality to use your own custom logger.

… Simply speaking, HTTP::Recorder removes a great deal of the tedium from writing scripts for web automation. If you’re like me, you’d rather spend your time writing code that’s interesting and challenging, rather than digging through HTML files, looking for the names of forms and fields, so that you can write your automation scripts. HTTP::Recorder records what you do as you do it, so that you can focus on the things you care about.

There’s no SSL support yet, as far as I can see, but for simple scraping (or as a good starting point for a more complex Mechanize script) it looks like it’ll work great.
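For a flavour of the chore being automated away, here is a rough sketch of ‘digging through HTML files, looking for the names of forms and fields’. It is written in Python using only the standard library’s html.parser, and it is not how HTTP::Recorder or WWW::Mechanize work internally; the FormFieldFinder class and list_forms function are hypothetical names invented for this example.

```python
from html.parser import HTMLParser


class FormFieldFinder(HTMLParser):
    """Collect each form's name, action and named fields from an HTML page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "name": attrs.get("name"),
                "action": attrs.get("action"),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            # Only named fields can be filled in by a script, so skip the rest.
            if attrs.get("name"):
                self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def list_forms(html):
    """Return a list of dicts describing the forms found in the given HTML."""
    finder = FormFieldFinder()
    finder.feed(html)
    finder.close()
    return finder.forms
```

Feeding a saved page to list_forms() yields the form and field names that a hand-written scraper would then need to fill in and submit.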

ISPs, AUPs, and RIRs

Spam: Kasia raises a very interesting question. Here it is, in a nutshell:

Should the quality of an ISP’s enforcement of its Acceptable Use Policy be a condition of its contract with its Regional Internet Registry, and therefore affect whether it can be assigned new network address space?

  • Are there that many ISPs with lax or virtually nonexistent spam-related AUP enforcement? Yes, definitely.
  • Is spam that much of a problem? Speaking personally, I would say yes, but then, I would ;)
  • Who would judge whether an ISP is doing enough, or too little?

Head on over to her weblog if you have a comment on this.

Don’t look for it, and you won’t find it

Health: USDA orders silence on mad cow in Texas: ‘The U.S. Department of Agriculture has issued an order instructing its inspectors in Texas, where federal mad cow disease testing policies recently were violated, not to talk about the cattle disorder with outside parties … The order … was issued in the wake of the April 27 case at Lone Star Beef in San Angelo, in which a cow displaying signs of a brain disorder was not tested for mad cow disease despite a federal policy to screen all such animals.’

Great idea: if you want to avoid finding mad cow cases, just don’t bother looking for them! The beef rendering plant in question supplies beef to McDonald’s, reportedly.

Press: LWN: A look at SpamAssassin 3.0 (article is subscriber-only until next week).

OSes: Kernelthread.com: Making an Operating System Faster. Great article on some OS-level optimisations Apple used in MacOS X — including a nifty boot-time read-ahead system which reportedly more than doubles the speed of OS X reboots. nice!

Wildlife: here’s another critter we encountered last weekend — a baby Western Diamondback rattlesnake, hiding in a crevice.

Spamometer

Spam: The Spamometer; a 1997-vintage spamfilter along the lines of filter.plx. Interestingly, I hadn’t seen this before — who knows, if I had, SpamAssassin could have used a (0.0, 1.0) scoring system instead of the ‘5 point threshold’. ;) (Thanks, Gary!)
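To make the two scoring styles concrete, here is a toy sketch in Python. The rules, their point values and the function names are all invented for illustration and are not real SpamAssassin or Spamometer rules. The logistic squash in unit_score() is just one hypothetical way to land in the (0.0, 1.0) range; I am not claiming that is how the Spamometer does it.

```python
import math

# Hypothetical rules: (name, test, points). Made up for this example only.
RULES = [
    ("subject_all_caps", lambda m: m["subject"].isupper(), 2.0),
    ("mentions_free_money", lambda m: "free money" in m["body"].lower(), 3.5),
    ("many_exclamations", lambda m: m["body"].count("!") >= 3, 1.0),
]

THRESHOLD = 5.0  # the '5 point threshold' style of cut-off


def score(message):
    """Add up the points of every rule that matches the message."""
    return sum(points for _, test, points in RULES if test(message))


def is_spam(message):
    """Threshold style: spam if the total reaches the threshold."""
    return score(message) >= THRESHOLD


def unit_score(message):
    """Assumed alternative: squash the total into (0.0, 1.0), with 0.5 at the threshold."""
    return 1.0 / (1.0 + math.exp(-(score(message) - THRESHOLD)))
```

Both functions rank messages identically; the only real difference is whether the cut-off is written as ‘5 points’ or as ‘0.5’.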

Going to LayerOne

Conferences: I’m going to LayerOne; it looks interesting, and I’ve been hoping to bump into Danny O’Brien (who’s there doing his Life Hacks talk) for a couple of drinks and a blather for quite a while. Other speakers look similarly interesting, in an ‘offbeat hacker conference’ way, so I think it’ll be fun.

It conflicts with The Streets playing the Wiltern, but c’est la vie ;)

Desert camping, and Dr. Strangelove’s all-zeroes password

Life: I’ve learned one thing this weekend — humans are not designed to function in the desert. I went bush-camping in the Anza-Borrego Desert state park with a few mates, and we quite simply baked in the 45C/113F degree heat. Walking 3 miles in that heat was easily equivalent to 15 miles in normal temperatures.

We did manage to catch a good look at one of the endangered bighorn sheep that live there — the poor sheep was clearly trying to get to some water, but those damn humans kept getting in the way!

On the way back, we passed the aftermath of a forest fire near Temecula. Scorched earth.

Security: via IP — a very scary article at Bruce Blair’s Nuclear Column — apparently, the secret unlocking codes on the launch control mechanisms of Minuteman nuclear missiles were deliberately set to ‘00000000’ throughout the height of the cold war, because the Strategic Air Command ‘remained far less concerned about unauthorized launches than about the potential of these safeguards to interfere with the implementation of wartime launch orders.’

Green: A couple of good /. comments on renewable power sources: one from a wind farm designer, and some anti-FUD figures for solar panels.

Music: The full text of The Timelords’ The Manual (How To Have a Number One the Easy Way) is online:

        THE JUSTIFIED ANCIENTS OF MU MU
      REVEAL THEIR ZENARCHISTIC METHOD USED
        IN MAKING THE UNTHINKABLE HAPPEN.

                  KLF 009B
          1988 (YOU KNOW WHAT'S GONE)