Month: January 2005

Building a Freevo

Freevo: so I’m planning to build myself a PVR, of the home-built, running Linux with MythTV or Freevo, mini-ITX variety.

So far I’m still at the hardware planning stages, but the price looks good — around $455 (plus shipping) for a working, thoroughly hackable, silent, set-top PVR system.

(Silence is a key aim here — last thing I want is something noisy taking over the room. But silence typically seems to cost the dollars, once you get into Shuttle gear and the like.)

If anyone wants to follow along, or provide some tips — I’m going to track progress (very slowly) on this wiki page. Like all wiki pages, it’s editable — although you’ll need to create an account to edit pages there (sorry, anti-spam measure).

BTW, lately, there’s been a lot of talk about using a Mac mini as a media center. So I took a quick look, but wow, it’s pricey! $499 + $329 for an EyeTV 200 tuner? Dude, that’s over 800 dollars, not including shipping or sales tax. Given whatever extras turn out to be appropriate, I wouldn’t be surprised if it hits double the mini-ITX’s price.

January 24th: a day of partition table misery

Tech: January 24th, besides being the date the first Apple Macintosh went on sale, is supposedly the day of maximal post-xmas misery. Well, it certainly was for me today.

I decided to power on my old desktop to set it up as a back-room fileserver, and twiddled the partition table accordingly to nuke a few unused Windows partitions and maximise usable space.

Somehow or other, some component of my system decided that it would henceforth be non-bootable. It seems some BIOSes don’t like partition tables where a high-numbered logical partition has a lower starting sector than a boot logical partition, or something… GRUB just errored out with an obscure ‘Error 17’, which apparently means that it couldn’t find its boot partition any more.

OK, so I needed a boot disk. But I had 1 laptop with a CD/DVD drive but no floppy drive, and a desktop with a floppy drive but no CD drive (due to hardware failure)… and the original linux boot floppy was long gone, seeing as I’d hardly booted this machine in the duration of two house moves. Argh.

A dinky little Cruzer mini 128MB USB flash drive saved the day. (R)ecovery (I)s (P)ossible is a tiny Linux distro that fits into 27MB, well inside the USB drive’s limits; it has an exceptionally helpful and detailed README explaining exactly what needs to be done to create a bootable USB flash drive from its ISO image, using just the generic Linux toolchain.

Together with fdisk and parted’s ‘rescue a lost partition’ mode, I was able to get the mangled partition table back into shape, mount the boot disk, change the fstab and grub configuration file, and reboot into a working system. Phew!

Many thanks to Kent Robotti, who’s done a great job with RIP.

On the other hand, no thanks to whoever came up with the arcane rules behind the IDE partition table… argh.

OpenStreetMap.org

Map: much interesting geowankery going on in London, where they suffer under the same Ordnance Survey monopoly as we do in Ireland.

This message to their mailing list notes a quote from IKONOS of $1,172.50 USD plus shipping for a 1m Color Geo referenced satellite image of central London, covering 67 square kilometers.

Given ‘enough processing’, data extracted from that map becomes a Derived Work, and has no copyright restrictions. ‘Processing’ includes ‘vector extraction, classification, etc.’

Now, I worked it out: Dublin city centre covers about 3km x 4km, or 12 square kilometers. At the named rates for London (about $17.50 per square kilometer), that works out at an inexpensive $210! Looks like it was imaged in September 2003.
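To make that arithmetic explicit, here is a small sketch of the calculation. It assumes the imagery is priced linearly by area, which the quoted message does not actually confirm; the figures are the ones given above.

```python
# Quoted IKONOS price for the central London image (from the mailing-list message).
LONDON_PRICE_USD = 1172.50
LONDON_AREA_KM2 = 67.0

# Rough extent of Dublin city centre, as estimated above.
DUBLIN_WIDTH_KM = 3.0
DUBLIN_HEIGHT_KM = 4.0


def estimate_cost(area_km2: float) -> float:
    """Scale the London quote by area. Linear pricing is an assumption."""
    rate_per_km2 = LONDON_PRICE_USD / LONDON_AREA_KM2
    return rate_per_km2 * area_km2


dublin_area = DUBLIN_WIDTH_KM * DUBLIN_HEIGHT_KM
dublin_cost = estimate_cost(dublin_area)
print(f"{dublin_area:.0f} km^2 -> ${dublin_cost:.2f}")
```

Minimum order sizes or shipping could easily change the real figure.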

There’s something interesting for a local geohacker to add to their list of projects ;)

(There’s also some old Landsat-7 data that may be usable.)

‘Spam Kings’ review

Spam: Before xmas, I received a copy of Brian McWilliams‘ new book, Spam Kings.

It’s a great book — full of behind-the-scenes details on how the spammers operate, how they get away with it on the sending end, how they try to evade filters on the receiving end, and how they’re fundamentally running the usual simple scams that have been around since before email spam came into existence. Well worth reading.

In addition, Brian’s continuing to write about spam and spammers at the Spam Kings weblog, and will be giving a talk at this year’s MIT Spam Conference, tomorrow.

Anyway, pick up a copy if you’re interested in the spam problem — this is one of the best books I’ve read on the subject, and this kind of information is essential for an understanding of the people we’re up against.

Echo chamber goes crazy about ‘nofollow’

Blogs: Just to expand on a linkblog posting I made yesterday: Google’s search team have announced that their crawlers will ignore links carrying a rel="nofollow" attribute when calculating PageRank, the idea being that spammers will stop blog-spamming once they can’t get PageRank out of it.

The blog world has been all aflutter:

BurningBird is right, to a degree. In fact, it’s been solved before.

Here’s a taint.org posting from November 2003 where I point out that by using a trivial Javascript URL one can link to another page without conferring PageRank. The format is:

javascript:document.location=target

The result looks like this, and works in any browser with a basic JS engine, from IE 3.02 and Netscape Navigator 2 onwards. I’ve been using it for my referrer logs, among other things, for over a year. I wrote a patch that implemented it for external links in the Moin Moin wiki software.
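As a rough illustration of the format, this sketch generates such a link from a target URL. The function name js_link and the escaping approach are my own assumptions for the example; this is not the Moin Moin patch.

```python
import html
import json


def js_link(target_url: str, label: str) -> str:
    """Build an anchor whose href is a javascript: URL pointing at target_url.

    Hypothetical helper for illustration only.
    """
    # json.dumps yields a double-quoted JavaScript string literal,
    # with quotes and backslashes in the URL escaped.
    js = "javascript:document.location=" + json.dumps(target_url)
    # html.escape makes the result safe inside a double-quoted HTML attribute.
    return '<a href="{}">{}</a>'.format(html.escape(js, quote=True), html.escape(label))


print(js_link("http://example.com/", "example"))
```

The browser decodes the attribute and runs the script on click, while a crawler that does not execute JS sees no ordinary link to follow.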

Amazingly, despite my plugging this idea at virtually every opportunity, it seems nobody noticed! At least, nobody among the people who (it would seem) should be looking into comment spam, thinking about how to deal with it, etc.

Disappointing — the echo chamber keeps talking to itself, once again. Maybe I’ll stick with dealing with email spam instead ;)

Ah, whatever. Anyway, this is a nicer fix; relying on JS isn’t a good thing. So nice work, Google.
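For comparison, the Google approach needs no script at all, just an extra attribute on the link. A minimal sketch of how blog software might emit it for commenter-supplied links (the helper name nofollow_link is hypothetical):

```python
import html


def nofollow_link(url: str, label: str) -> str:
    """Render an untrusted link with rel="nofollow" (hypothetical helper)."""
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        html.escape(url, quote=True), html.escape(label)
    )


print(nofollow_link("http://example.com/?a=1&b=2", "a commenter's site"))
```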

(PS: worth noting that while this is a good plan, comment spam won’t be going away any time soon, as Mark Pilgrim noted. Still, here’s hoping it’ll help in the long term…)

IPC::DirQueue 0.04 released

Perl: at last, a perl-related posting! I’ve released IPC::DirQueue 0.04; details of what’s changed (summary, a couple of bugs fixed) are at that link.

(BTW, thanks to Ask and Robert at perl.org, who are providing free SVN repository and list hosting for CPAN modules! And don’t overlook the fact that the mailing lists/newsgroups each have their own RSS feed, woot!)

Prescient tsunami spam

Spam: I was just looking back through the archives here on taint.org, and noticed this entry from December 2 last year:

A huge 300 ft. high ocean wave is moving towards your continent. Your and many other cities are in a real danger. Approximate wave moving speed is 700 km/h. cmoym eaaa yypbzz

Please read more about this catastrophe here: (link)

We are strongly urging you to evacuate yourself and your family as soon as possible, even though you may live far away from your city. The tsunami will reach the continent in approximately FOUR hours.

It appears that the spam was a phish attack — the site in question is full of Internet Exploder exploits. It was ‘targeted’, at least as well as such things ever are, at Australian readers. AUSCERT issued a warning about it at the time.

But how’s about that for timing? Spooky! What did those phishers know?

eWeek’s ‘Spammers Upending DNS’ article

Spam: eWeek recently published an article entitled ‘Spammers’ New Tactic Upends DNS’ , which notes that:

One .. technique finding favor with spammers involves sending mass mailings in the middle of the night from a domain that has not yet been registered. After the mailings go out, the spammer registers the domain early the next morning.

By doing this, spammers hope to avoid stiff CAN-SPAM fines through minimal exposure and visibility with a given domain. The ruse, they hope, makes them more difficult to find and prosecute.

The scheme, however, has unintended consequences of its own. During the interval between mailing and registration, the SMTP servers on the recipients’ networks attempt Domain Name System look-ups on the nonexistent domain, causing delays and timeouts on the DNS servers and backups in SMTP message queues.

This had me stumped when I read it. An email from a nonexistent domain is a pretty reliable spamsign: SpamAssassin’s NO_DNS_FOR_FROM rule checks for exactly that, hits about 2% of spam, and has been in the default ruleset for several years. On top of that, there’s no sign of the behaviour the article describes in our spam traps.
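To illustrate the kind of check involved, here is a simplified sketch of a "does the From domain exist in DNS" test. It is not SpamAssassin’s implementation: the function names are made up for the example, and it only tries an address lookup through the system resolver, whereas a real mail check would also need to consider MX records.

```python
import socket
from email.utils import parseaddr
from typing import Callable


def sender_domain(from_header: str) -> str:
    """Extract the lower-cased domain from a From: header, or '' if none."""
    _name, addr = parseaddr(from_header)
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def from_domain_has_dns(
    from_header: str,
    lookup: Callable[[str], object] = socket.gethostbyname,
) -> bool:
    """Return True if the sender's domain resolves.

    The lookup parameter is injectable so the logic can be exercised
    without touching the network.
    """
    domain = sender_domain(from_header)
    if not domain:
        return False
    try:
        lookup(domain)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
```

A message whose From domain fails this test is suspicious, which is why mailing from a not-yet-registered domain would be a strange tactic for a spammer.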

After some discussion, Suresh Ramasubramanian came up with this explanation of what’s really happening:

Verisign now allows immediate (well, within about 10 minutes) updates of .com/.net zones (also same for .biz) while whois data is still updated once or twice a day. That means if spammer registers (a) new domain he’ll be able to use it immediatly (sic) and it’ll not yet show up in whois (and so not be immediatly identifiable to spam reporting tools) – and spammers are in fact using this “feature” more and more!

That does sound a much more likely explanation, and matches what’s been seen in the traps.

So: WHOIS, not DNS.

IBM Pledges 500 U.S. Patents to Open Source

Patents: wow, this is amazing news! ‘IBM today pledged open access to key innovations covered by 500 IBM software patents to individuals and groups working on open source software. IBM believes this is the largest pledge ever of patents of any kind and represents a major shift in the way IBM manages and deploys its intellectual property (IP) portfolio.’

Even better, they are hoping to begin a ‘patent commons’ for other companies to join, and the OSI definitions of which licenses are judged ‘open’ apply.

More details:

Of course, it would be better if it were also safe for commercial software development. But this is a valuable bulwark against Microsoft-style patent tactics.

Web-browser style history for the command line

Code: Here’s something I came up with recently — it’s actually an evolution of the idea of pushd and popd, as included in BASH. To quote the POD docs:

cdhistory is a perl script used to implement web-browser style “history” for UNIX shells; as you use the cd command to explore the filesystem, your moves are remembered, and you can go “back” through history, and “forward” again, as you like.

Download the perl script here.
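The back/forward model is the same one a browser uses: two stacks around a current location. The sketch below shows that idea in Python; it is an illustration of the concept only, not how the cdhistory perl script is actually written, and the class and method names are invented for the example.

```python
class DirHistory:
    """Browser-style back/forward history of directories (illustrative only)."""

    def __init__(self, start: str) -> None:
        self.current = start
        self._back: list = []     # directories we can go "back" to
        self._forward: list = []  # directories we can go "forward" to

    def cd(self, path: str) -> str:
        """Move to a new directory; this discards any forward history."""
        self._back.append(self.current)
        self._forward.clear()
        self.current = path
        return self.current

    def back(self) -> str:
        """Step back one directory, if there is one."""
        if self._back:
            self._forward.append(self.current)
            self.current = self._back.pop()
        return self.current

    def forward(self) -> str:
        """Step forward again after going back, if possible."""
        if self._forward:
            self._back.append(self.current)
            self.current = self._forward.pop()
        return self.current


h = DirHistory("/home")
h.cd("/tmp")
h.cd("/etc")
print(h.back())     # /tmp
print(h.forward())  # /etc
```

Unlike pushd and popd, going back does not lose the place you came from, so you can move in both directions.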

Annoying anti-arab Republican talking points, pt. xxviii

Politics: This moronic comic from Pat Oliphant came up in my comics page the other day, and, after a few days of hearing this particular talking point through the usual propaganda channels, I just saw it again. It pissed me off enough that I took a look at the stats.

Naturally, it’s bullshit. Among the top 50 governments pledging tsunami aid, ranked per GDP:

  • Qatar (#2)
  • UAE (#5)
  • Kuwait (#9)
  • Bahrain (#10)
  • Saudi Arabia (#15)

Given that the USA’s at #29, and the UK at #22, I think the arab states are coming up with a pretty good result there.

I guess it’s hard to look beyond today’s talking points when you’re still drawing cartoons at the age of 70.

A Firefox Extension plug

Web: Urgh, I still have this damn cold I picked up in Ireland… sniffle cough etc. More vitamin C needed!

Anyway, just a quick plug for a very deserving Firefox extension, one I haven’t seen mentioned widely. It’s pretty common, when you wish to print out a web page, that you wish you could get rid of the obnoxious extra-wide sidebar tables, gigantic ads, or other extraneous parts of the page. Well, now you can:

Nuke Anything is a Mozilla/Firefox extension which offers two great features in the right-click context menu:

  • Remove this object: this will remove the object you’ve right-clicked on — a table TD, paragraphs, images, IFRAMEs, etc.
  • Remove selection: more usefully, this allows you to select exactly what you want to remove with a left-button drag, then right-click to remove it.

It’s really useful. I almost never print anything out these days without scrubbing off a few unwanted sidebars ;)

HOWTO: invalidate a patent application with prior art

Patents: here’s an interesting technique I heard recently. (credit: I’m not sure who told me about it, but I think it may have come from or via John Levine.)

If you become aware of a patent application (note: not an issued patent!) for which you are aware of possible prior art, you may be able to help invalidate it, or at least ensure any resulting patent is narrow enough to be relatively sane. Here’s how.

  • If you have knowledge of techniques that you believe may be prior art, you can send them on to the filers or the patent examiner. At this stage, the onus is on them to prove that the technique is not prior art for the application (once it’s granted, the onus would be on you to prove that it is).
  • The filer must also indicate, during filing, any techniques they are aware of that may be prior art; so CC’ing a public forum with a copy of whatever you send to them may, at some point in the future, help show that they did not do this.

Of course, you have to go find the patent application number, the contact addresses of the filers, and the contact address for the patent examiner to do this ;) But it beats posting a whinge to Slashdot.

An unnamed patent agent comments:

‘I believe an examiner is not under obligation to review art sent directly to them, but certainly the applicant and his agents are required to report any art they come across. That means the inventor as well as the law firm representing them.

You should include a cover letter that you saw their application (give details), and that you believe that what you are sending them is prior art, and that now that they have it, they are obligated to report it to the PTO. The same can be done to their counsel.

Probably, anything sent should be sent with some sort of delivery confirmation, and to make sure that the sending of the prior art is of public record, create a Web site where all sent art is listed, along with destination and confirmation information. This would help show inequitable conduct should the patent later be asserted and the art you provided not be shown as of record in the examination.

Mind you – I have not heard of these being done before (bombarding listed inventors and their agents with prior art, forcing them to have to disclose it), but I think it’s a great idea. One caution – if you send too much, you over inundate the examiner, and then really good art could get overlooked during examination.

Separately, please keep in mind that the claims in a published application have probably not yet even been seen by the examiner at the PTO. These are the claims that the applicant would love to have the examiner accept, but until prosecution of the application actually commences (and completes), there’s no way to know what claims will ultimately result.’

Update: some good additional points:

‘The prior art must have been published or been publicly available at least as early as the earliest priority date of the patent. The priority date is either the filing date, or the filing date of a parent application. This information can be found on the cover page of a patent.

A patent’s scope is covered by the claims. The claims define what the invention is. All other material in the patent is supporting material, and usually non-binding. In order to be anticipatory (the best kind) prior art for a particular claim, the piece of art must contain or describe every element of the claim you are seeking to invalidate. Note that dependent claims add additional elements that the prior art needs to contain if you want to invalidate the dependent claims as well.

Prior art which is not anticipatory may be used in combination with other art or knowledge at the time to show obviousness. This type of art may have some impact during prosecution of a patent, but if a patent has already been issued, obviousness is a real uphill battle to fight in the courts. Few patents have been invalidated because of obviousness in trials.’

Another attorney notes: ‘You can actually send it anonymously if you want. Just keep the certified receipt to prove they got it. As long as they know it exists, the onus is on them to disclose it to the PTO.’

‘It’s best to send them something printed out or on tangible media, along with a brief note explaining what it is and most importantly, when it was first publicly available. Certified means using certified mail or FedEx or something where you have a valid receipt.

As far as (discovering) who the (filer’s patent lawyers) are … it’s usually listed on the patent applications. you can search the USPTO website for them.’

And a report that this technique is now in use: ‘some patent attorneys are reporting that this approach is a valid one that people have started using.’

Update 2: More assent from another unnamed patent lawyer:

‘Anyone who wishes to do so can send a letter to the Patent Office letting them know of any prior art of which they are aware. The Patent Office will then place it in the application file. Anyone who cares about this patent will surely order up a copy of the application file from the Patent Office, and will come into possession of whatever you sent.

Later you can see whatever you sent them. Go to http://portal.uspto.gov/external/portal/pair and plug in the serial number (for the desired patent). Click on “image file wrapper”.

It’s the right thing to do for any patent or patent application.’

Verizon.net blocks the world

Spam: I’m still catching up, but this is just plain hilarious. Pure, solid-gold insanity. Verizon.net, the ISP branch of the US telco, has decided that the easiest way to fix their spam problems (uh, spam-receiving problems, that is) is to block inbound email from non-U.S. IP ranges:

A little birdie with insider knowledge has confirmed that Verizon is blocking all international IP space from RIPE, APNIC, and more, and is only unblocking specific domains, based on their IP address, when complaints are made and escalated.

According to the source ‘the security team management thinks this is going to stop their inbound spam problems.’

Well, it may stop their inbound spam problem, but it’s also going to stop that pesky ‘wanted email making it to their customers’ problem.

A quick check from my Ireland-hosted colo box does indeed indicate that this is still the case, and I can’t connect to relay.verizon.net (206.46.170.12):

  : jm ftp 1...; telnet 206.46.170.12 25
  Trying 206.46.170.12...
  telnet: Unable to connect to remote host: Connection timed out
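The same reachability test can be scripted, which is handy for checking from several hosts. This is a minimal sketch using only the Python standard library; the function name can_connect and the 10-second default timeout are arbitrary choices for the example.

```python
import socket


def can_connect(host: str, port: int = 25, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, unreachable hosts
        return False
```

Calling can_connect("206.46.170.12") repeats the telnet test above. Note that it only tells you whether the TCP connection opens, not whether the server would go on to accept mail.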

Back, in the flurry of a mini-tornado

Meta: Back. Not even ‘mini-tornados’ at Dublin Airport can keep me away, although they gave it a damn good try, with a 3 hour delay, a missed connection, and an overnight stay in Chicago. Arggh.

Mail: I generally leave the laptop at home when on vacation, to do some proper winding down. Not sure it was a great idea this time, since I was joe-jobbed by some pretty extensive spam runs recently, resulting in over 30,000 bounces sitting unread in my email when I got back.

Thankfully, Tim Jackson’s bogus-virus-warnings.cf SpamAssassin ruleset (with a few updates) got most of them, with only a few hundred getting past. I should really hack on making those more complete, but some of the bounces are really obscure; along the lines of ‘Hi from J Random Luser, Esq.! I no longer use this address because it gets too much spam! Please send to this new one instead: [email protected]!’, generally without any obvious identifying headers that indicate it’s an autoresponse.

Sigh — each of those messages is just utterly random, and I can’t see much recourse but to come up with some nasty phrase-based content filtering rules, which I was hoping to avoid. But 29,500 hits isn’t bad ;)

I’m not sure they’d be suitable yet for use as default SpamAssassin rules, since they now generally just match any kind of bounce message, not specifically joe-job or virus-forgery blowback. But that suits me just fine — I can live without bounces, as long as I don’t have to suffer the bounce blow-back.
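For the curious, the "nasty phrase-based content filtering" would look something like the sketch below. The phrases are hypothetical examples based on the sample autoresponse quoted above, not rules from bogus-virus-warnings.cf or any real ruleset, and a real SpamAssassin rule would be written in its own rule syntax rather than Python.

```python
import re

# Hypothetical phrases, for illustration only.
AUTORESPONSE_PATTERNS = [
    re.compile(r"\bno longer use this address\b", re.IGNORECASE),
    re.compile(r"\bplease send to (this|my) new (one|address)\b", re.IGNORECASE),
    re.compile(r"\bgets too much spam\b", re.IGNORECASE),
]


def autoresponse_score(body: str) -> int:
    """Count how many of the known phrases appear in the message body."""
    return sum(1 for pattern in AUTORESPONSE_PATTERNS if pattern.search(body))


def looks_like_autoresponse(body: str, threshold: int = 1) -> bool:
    """Flag the message if at least `threshold` phrases match."""
    return autoresponse_score(body) >= threshold
```

The weakness is obvious: every new wording needs a new phrase, which is exactly why I was hoping to avoid this.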

Science: Good news from New Scientist: they’re opening up their archives! NS consistently has the best science journalism around, and I’ve been a subscriber for years. But until recently, they had a lousy approach to their website: most of the useful stuff, like the archives, was walled off as subscriber-only features; a classic case of missing the Clue Train. Well, here’s an archive search for ‘spam’. Pretty impressive: most of the short articles are available in full, with only the full text for features and opinion pieces requiring a login.

In addition, they’ve added a massive batch of RSS feeds. Sadly, no full article text excerpts, however. But still — getting the clue, eventually — this way they may actually get links on the web, in place of the mangled and chinese-whispered versions of their articles republished in the UK newspapers…

Ireland: Due to monopolistic pricing of Irish GIS data, consumer GPS maps of Ireland’s road system are appalling, and this page collects a few great demos — for example, MS Autoroute quintuples the distance from Galway to Roundstone! That’s a major tourist route, BTW. I knew it was bad, but not that bad…

Anyway, I’m still waaay behind, but slowly catching up.