Month: February 2009

Links for 2009-02-27

Blackout Ireland – a response to IRMA’s censorship demands

As Adrian noted last week, IRMA are demanding that Eircom block the Pirate Bay — first on a list of websites they don’t like — on pain of being sued. On top of that, they intend for the other Irish ISPs to follow suit — here’s a key line from the letter they sent to Blacknight MD Michele Neylon:

in the event of a positive response to this letter it is proposed to make practical arrangements with Blacknight of a like nature to those made with eircom.

If that comes to pass, this will be an appalling situation for Irish internet users, and we need to act to ensure it doesn’t happen. Digital Rights Ireland:

The net effect of this scheme, if it is allowed to go into effect, will be to impose an internet death penalty on two groups. On users, who will be cut off on the allegation of a private body, with no court involvement, and on websites, which could be blocked to Irish users based on a court hearing where only one side is heard.

As Mulley puts it:

So first they’ll start with the Pirate Bay. Then comes Mininova, IsoHunt, then comes YouTube (they have dodgy stuff, right?), how long before we have Boards.ie because someone quoted a newspaper article or a section of a book?

Digital Rights Ireland have posted an excellent document detailing the following plan of action for Irish internet users concerned about this:

  • Contact your ISP and let them know that this is a key issue for you, as their customer.

  • Join up with your fellow netizens. Subscribe to the Blackout Ireland blog. Follow the #blackoutirl hashtag on Twitter. Join the Blackout Ireland Facebook group. It looks likely that there’ll be a week-long blackout campaign starting next Thursday, March 5th.

  • Contact politicians. This is likely to cause irreparable damage to the Irish internet, so our pols should be very worried. See the DRI post for details on getting in touch with Minister for Communications Eamonn Ryan.

New Zealand is running their own blackout campaign right now, so that may help our planning.

International readers — make no mistake, you’re next. IRMA in this case is acting as the local delegate of IFPI, which stated in 2007 that this was one of the 3 technical options for ISPs to control piracy:

Here’s some other interesting coverage:

Fantastic interview with BitBuzz CEO Alex French:

If ISPs, including Eircom, agree not to oppose blocking access to The Pirate Bay and other similar websites, is this not an agreement to web censorship? “I don’t think there is any other way to interpret it,” said French.

“They are essentially agreeing to censor certain websites at the behest of the recording industry, without these websites ever having necessarily shown to be illegal in the Republic of Ireland. I would have a huge concern over what other websites may be blocked and what other industries will pile in now that the precedent has been set.”

Some sample letters:

And further discussion — here’s a massive boards.ie discussion thread, now closed in favour of this newer thread.

Update: here’s the letter I sent to the Minister, if you’re curious or need inspiration.

Links for 2009-02-26

Links for 2009-02-25

Ubuntu to bundle Eucalyptus

Introducing Karmic Koala, Ubuntu 9.10:

What if you want to build an EC2-style cloud of your own? Of all the trees in the wood, a Koala’s favourite leaf is Eucalyptus. The Eucalyptus project, from UCSB, enables you to create an EC2-style cloud in your own data center, on your own hardware. It’s no coincidence that Eucalyptus has just been uploaded to universe and will be part of Jaunty – during the Karmic cycle we expect to make those clouds dance, with dynamically growing and shrinking resource allocations depending on your needs.

A savvy Koala knows that the best way to conserve energy is to go to sleep, and these days even servers can suspend and resume, so imagine if we could make it possible to build a cloud computing facility that drops its energy use virtually to zero by napping in the midday heat, and waking up when there’s work to be done. No need to drink at the energy fountain when there’s nothing going on. If we get all of this right, our Koala will help take the edge off the bear market.

AWESOME — exactly where the Linux server needs to go. Eucalyptus is the future of server farms. Really looking forward to this…

Links for 2009-02-24

Blimey, I won

Somehow or other, I seem to have won the 2009 Irish Blog Award for Best Technology Blog/Blogger! To be honest, for the last year I haven’t been spending as much time on the blog as before, due mainly to a rather compelling distraction, so I’m doubly grateful for winning.

Unfortunately, I was out of the country, at Nishad and Janet’s wedding, so missed my chance to get up on stage and thank my fellow bloggers in person — but I asked John to do so instead. Seems he in turn got stage fright and delegated to his missus, who picked up the trophy. Thanks Fiona! That’s probably just as well, since I’m pretty incoherent in that kind of situation myself.

Cheers to my fellow nominees, Eoghan, Robin, Michele and Pat. One of you guys should totally have won ;)

And last of all — cheers to BitBuzz for sponsoring the category, and Mulley for the whole bash. I definitely have to turn up next year!

Now I need to put more time in this year to really earn that award…

Links for 2009-02-16

Plenty of money for Dublin’s bikes

So it seems that JC Decaux have been complaining about the costs of running the Velib scheme in Paris:

Since the scheme’s launch, nearly all the original bicycles have been replaced at a cost of 400 euros each.

Of course, this won’t be a problem in Dublin. Going by Newstalk’s estimate of what the advertising space would have cost, had it not been given to JC Decaux for free in exchange for the (as yet nonexistent) 450 bikes, each bike comes at a public cost of 111,000 Euros. That should cover a lot of “velib extreme”.

(OK, that may be overestimating it. The Irish Times puts a more sober figure on it, EUR 1m per year, which works out as roughly EUR 2,200 per bike per year. That should still cover a few broken bikes.)
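For anyone who wants to check the sums, here is a quick sketch of the per-bike arithmetic. It uses only the figures quoted above; the variable names are mine, and the overall total for the Newstalk estimate is back-calculated from the 111,000 figure rather than being a number Newstalk published.

```python
# Rough per-bike arithmetic, using only figures quoted in this post.
BIKES_PROMISED = 450
VELIB_REPLACEMENT_COST = 400  # EUR per replaced bike in Paris

# Irish Times figure: advertising space worth about EUR 1m per year.
irish_times_per_year = 1_000_000
per_bike_per_year = irish_times_per_year / BIKES_PROMISED

# How many Paris-priced replacement bikes that would buy, per bike, per year.
replacements_covered = per_bike_per_year / VELIB_REPLACEMENT_COST

# Newstalk-derived figure: EUR 111,000 per bike implies this overall total
# (back-calculated here, not a published number).
newstalk_per_bike = 111_000
implied_total = newstalk_per_bike * BIKES_PROMISED

print(f"Irish Times: ~EUR {per_bike_per_year:,.0f} per bike per year")
print(f"Replacement bikes covered per bike per year: ~{replacements_covered:.1f}")
print(f"Newstalk: implies ~EUR {implied_total:,} in total")
```

Even on the sober figure, that is enough to replace each bike several times a year at Paris prices.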

A quick reminder:

Paris | Dublin
20,000 bikes | 450 promised
~1,600 billboards | ~120 installed
~12.5 bikes per billboard | ~3.8 bikes per billboard
10km range (from 15e to 19e arrondissement) | 4km range (from the Mater Hospital to the Grand Canal)

And, of course, there’s no sign of the bikes here yet… assuming they ever arrive. Heck of a job, Dublin City Council.

BTW, here’s the rate card for advertising on the “Metropole” ad platforms, if you’re curious, via the charmingly-titled Go Ask Me Bollix.

Links for 2009-02-13

Fixing the Gmail Tasks window bug

Hey Gmail users! If you’re using Tasks, there’s a slightly annoying bug in Gmail right now — you may see the “Use this link to open Tasks” tip window appear every time you access the inbox page.

Several other people have reported it, and apparently the Google guys are ‘working to resolve it’ at the moment. In the meantime, here’s a way to work around the issue without losing Tasks (you will, unfortunately, lose the Offline Gmail functionality). Simply disable Offline Gmail (Settings -> Offline -> “Disable Offline Gmail for this computer”), and the bug no longer manifests itself.

You can allow Gmail to keep the stored mail on your computer if you like, which will be handy for when the bug is fixed and Offline can be re-enabled — hopefully sooner rather than later.

Continuous deployment

This is awesome, if a little insane. Continuous Deployment at IMVU: Doing the impossible fifty times a day:

Continuous Deployment means running all your tests, all the time. That means tests must be reliable. We’ve made a science out of debugging and fixing intermittently failing tests. When I say reliable, I don’t mean “they can fail once in a thousand test runs.” I mean “they must not fail more often than once in a million test runs.” We have around 15k test cases, and they’re run around 70 times a day. That’s a million test cases a day. Even with a literally one in a million chance of an intermittent failure per test case we would still expect to see an intermittent test failure every day. It may be hard to imagine writing rock solid one-in-a-million-or-better tests that drive Internet Explorer to click ajax frontend buttons executing backend apache, php, memcache, mysql, java and solr. I am writing this blog post to tell you that not only is it possible, it’s just one part of my day job.
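The arithmetic in that quote is easy to reproduce. This is a minimal sketch using the numbers as quoted (15k test cases, 70 runs a day); it assumes test executions fail independently, which is my simplification and not something the IMVU post claims.

```python
# Expected intermittent failures per day, from the figures quoted above.
TEST_CASES = 15_000   # "around 15k test cases"
RUNS_PER_DAY = 70     # "run around 70 times a day"


def flake_stats(p_flake: float) -> tuple[float, float]:
    """Return (expected flaky failures per day, P(at least one per day)).

    Assumes every test execution flakes independently with probability
    p_flake (an assumption made for this sketch).
    """
    executions = TEST_CASES * RUNS_PER_DAY
    expected = executions * p_flake
    p_at_least_one = 1 - (1 - p_flake) ** executions
    return expected, p_at_least_one


executions_per_day = TEST_CASES * RUNS_PER_DAY
one_in_a_million = flake_stats(1e-6)
one_in_a_thousand = flake_stats(1e-3)

print(f"{executions_per_day:,} test executions per day")
print(f"1-in-a-million: {one_in_a_million[0]:.2f} expected flakes/day, "
      f"P(at least one) = {one_in_a_million[1]:.2f}")
print(f"1-in-a-thousand: {one_in_a_thousand[0]:.0f} expected flakes/day")
```

At one in a thousand, the suite would throw up around a thousand spurious failures a day, which is why "fails once in a thousand runs" is nowhere near good enough at this scale.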

OK, so far, so sensible. But this is where it gets really hairy:

Back to the deploy process, nine minutes have elapsed and a commit has been greenlit for the website. The programmer runs the imvu_push script. The code is rsync’d out to the hundreds of machines in our cluster. Load average, cpu usage, php errors and dies and more are sampled by the push script, as a basis line. A symlink is switched on a small subset of the machines throwing the code live to its first few customers. A minute later the push script again samples data across the cluster and if there has been a statistically significant regression then the revision is automatically rolled back. If not, then it gets pushed to 100% of the cluster and monitored in the same way for another five minutes. The code is now live and fully pushed. This whole process is simple enough that it’s implemented by a handfull of shell scripts.

Mental. So what we’ve got here is:

  • phased rollout: automated gradual publishing of a new version to small subsets of the grid.

  • stats-driven: rollout/rollback is controlled by statistical analysis of error rates, again on an automated basis.
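To make those two ideas concrete, here is a rough sketch of a stats-driven phased rollout. It is not IMVU's code (theirs is a handful of shell scripts, and the post doesn't say which statistical test they use). The `Sample` shape, the `sample_after` callback, the 5% first stage and the one-sided two-proportion z-test are all assumptions of mine, picked as one plausible way to define a "statistically significant regression".

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """Request and error counts over one sampling window (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def rate(self) -> float:
        return self.errors / self.requests


def is_regression(baseline: Sample, canary: Sample, z_threshold: float = 2.33) -> bool:
    """One-sided two-proportion z-test: is the canary's error rate
    significantly higher than the baseline's? 2.33 is roughly the 1% level."""
    if baseline.requests == 0 or canary.requests == 0:
        raise ValueError("cannot compare empty samples")
    pooled = (baseline.errors + canary.errors) / (baseline.requests + canary.requests)
    se = math.sqrt(pooled * (1 - pooled) * (1 / baseline.requests + 1 / canary.requests))
    if se == 0:
        return False  # both samples are all-success (or all-error): no difference
    z = (canary.rate - baseline.rate) / se
    return z > z_threshold


def phased_rollout(baseline: Sample, sample_after, stages=(0.05, 1.0)) -> str:
    """Push to each stage in turn; roll back at the first significant regression.

    sample_after(fraction) is a hypothetical callback that makes the new code
    live on that fraction of the cluster, waits, and returns a fresh Sample.
    """
    for fraction in stages:
        if is_regression(baseline, sample_after(fraction)):
            return "rolled back"
    return "fully pushed"


baseline = Sample(requests=100_000, errors=100)        # 0.1% error rate
healthy = phased_rollout(baseline, lambda f: Sample(5_000, 5))
broken = phased_rollout(baseline, lambda f: Sample(5_000, 50))
print(healthy, "/", broken)
```

A real push script would sample several metrics (load average, CPU, PHP errors) rather than a single error count, but the control flow would look much the same.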

Worth noting some stuff from the comments. MySQL schema changes break this system:

Schema changes are done out of band. Just deploying them can be a huge pain. Doing an expensive alter on the master requires one-by-one applying it to our dozen read slaves (pulling them in and out of production traffic as you go), then applying it to the master’s standby and failing over. It’s a two day affair, not something you roll back from lightly. In the end we have relatively standard practices for schemas (a pseudo DBA who reviews all schema changes extensively) and sometimes that’s a bottleneck to agility. If I started this process today, I’d probably invest some time in testing the limits of distributed key value stores which in theory don’t have any expensive manual processes.

They use an interesting two-phased approach to publishing of the deploy file tree:

We have a fixed queue of 5 copies of the website on each frontend. We rsync with the “next” one and then when every frontend is rsync’d we go back through them all and flip a symlink over.
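The symlink flip is worth a sketch of its own. The comment doesn't say how IMVU perform the flip, so what follows is simply one common way to do it atomically on a POSIX system: create the new link under a temporary name, then rename it over the old one. The paths, slot naming and function names are hypothetical.

```python
import os

SLOTS = 5  # "a fixed queue of 5 copies of the website on each frontend"


def next_slot(current: int) -> int:
    """Pick the slot the next release gets rsync'd into (round-robin)."""
    return (current + 1) % SLOTS


def flip_symlink(link_path: str, target: str) -> None:
    """Atomically repoint link_path at target.

    The new symlink is created under a temporary name and then renamed over
    the old one. rename() is atomic on POSIX filesystems, so a reader sees
    either the old release or the new one, never a missing link.
    """
    tmp = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(target, tmp)
    os.replace(tmp, link_path)
```

Phase one is rsyncing the new tree into `next_slot(current)` on every frontend; phase two is calling something like `flip_symlink` on each one. Since the previous copies are still on disk, rolling back is just another flip.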

All in all, this is very intriguing stuff, and way ahead of most sites. Cool!

(thanks to Chris for the link)

Links for 2009-02-11

Config management as cookery

Interesting to see Chef, a configuration management framework that uses cooking as a metaphor.

Back in the early ’90s in Iona, I wrote a user/group synchronization tool called “greenpages” which used a cooking metaphor; “spice” (data) was added to “raw” (template) files to produce “cooked” output. Great minds, eh!

Links for 2009-02-09

IR book recommendation

Thanks to Pierce for pointing me at this review of an interesting-sounding book called Introduction to Information Retrieval. The book sounds quite useful, but I wanted to pick out a particularly noteworthy quote, on compression:

One benefit of compression is immediately clear. We need less disk space.

There are two more subtle benefits of compression. The first is increased use of caching … With compression, we can fit a lot more information into main memory. [For example,] instead of having to expend a disk seek when processing a query … we instead access its postings list in memory and decompress it … Increased speed owing to caching — rather than decreased space requirements — is often the prime motivator for compression.

The second more subtle advantage of compression is faster transfer of data from disk to memory … We can reduce input/output (IO) time by loading a much smaller compressed posting list, even when you add on the cost of decompression. So, in most cases, the retrieval system runs faster on compressed postings lists than on uncompressed postings lists.

This is something I’ve been thinking about recently: CPU speed has now so far outstripped disk I/O speed and network bandwidth that pervasive compression may be worthwhile. It’s simply worth keeping data compressed for longer, since CPU is cheap. There’s certainly little point in not compressing data travelling over the internet, anyway.
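As a toy illustration of the space side of that trade-off, here is a sketch that stores a postings list as gaps between document IDs and then deflates it. The doc IDs are made up, and zlib is only a convenient stand-in for the purpose-built postings codecs the book describes, so the ratio it prints says nothing about real-world indexes.

```python
import struct
import zlib


def pack_postings(doc_ids):
    """Store a sorted postings list as gaps between doc IDs, then deflate it."""
    gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])] if doc_ids else []
    raw = struct.pack(f"<{len(gaps)}I", *gaps)  # 4 bytes per gap, little-endian
    return zlib.compress(raw)


def unpack_postings(blob):
    """Inflate the blob and turn the gaps back into absolute doc IDs."""
    raw = zlib.decompress(blob)
    gaps = struct.unpack(f"<{len(raw) // 4}I", raw)
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


postings = list(range(1000, 400_000, 7))  # made-up, very regular doc IDs
blob = pack_postings(postings)
uncompressed_bytes = 4 * len(postings)

print(len(blob), "bytes compressed vs", uncompressed_bytes, "uncompressed")
```

The point is that `blob` is what stays on disk and in cache; `unpack_postings` only runs when a query actually needs the list, which is the "keep it compressed for longer" idea in miniature.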

On other topics, it looks equally insightful; the quoted paragraphs on Naive Bayes and feature selection algorithms are both things I learned myself, “in the field”, so to speak, working on classifiers — I really should have read this book years ago I think ;)

The entire book is online here, in PDF and HTML. One to read in that copious free time…

Good reasons to host inelastically on EC2

Recently, there’s been a bit of discussion online about whether or not it makes sense for companies to host server infrastructure at Amazon EC2, or on traditional colo infrastructure. Generally, these discussions have focussed on one main selling point of EC2: its elasticity, the ability to horizontally scale the number of server instances at a moment’s notice.

If you’re in a position to gain from elasticity, that’s great. But it is still worth noting that even if you aren’t in that position, there’s another good reason to host at an EC2-like cloud: it makes it easy to deploy another copy of the app, either from a different version-control branch (dev vs staging vs production deployments), or to run separate apps with customizations for different customers. These cases aren’t scaling an existing app up; they’re creating new copies of the app, and EC2 works nicely for this.

If you can deploy a set of servers with one click from a source code branch, this is entirely viable and quite useful.

Another reason: EC2-to-S3 traffic is extremely fast and cheap compared to external-to-S3. So if you’re hosting your data on S3, EC2 is a great way to crunch on it efficiently. Update: Walter observed this too on the backend for his Twitter Mosaic service.

Ice Cycling

I seem to have invented a new extreme sport on the way into work: Ice Cycling. The roads were like an ice-skating rink. Scary stuff :(

Here’s some advice for anyone in the same boat:

  • use a high gear: avoid using low gear if possible, even when starting off. Low revs mean you’re more likely to get traction.

  • try to avoid turns: keep the bike as upright as possible.

  • try to avoid braking: braking is very likely to start a skid in icy conditions.

  • use busy roads: ride where the cars have been, since the traffic will have melted the ice.

  • ride away from the gutters: they’re more likely to be iced over than the centre of a lane. Again, ride where the cars have been.

  • avoid road markings: it seems these were much icier than the other parts of the road; possibly because their high albedo meant the ice on them hadn’t been melted by the sun yet. So look out for that.

Here’s a good thread on cyclechat.co.uk, and don’t miss icebike.org: ‘Whether commuting to work, or just out for a romp in the woods, you arrive feeling very alive, refreshed, and surrounded with the aura of a cycling god. You will be looked upon with the smile of respect by friends and co-workers. – – – Or was that the sneer of derision…no matter, ICEBIKING is a blast!’ o-kay.

Their recommendations are pretty sane, though. ;)

Links for 2009-02-05

Links for 2009-02-03