Cory Doctorow at Boing Boing links to an article at TechCrunch that lists
“Better and Cheaper Online File Storage” as a product that needs to be made.
However, Ben Laurie did the sums on online storage as a useful backup medium,
and found them not exactly compelling (e.g. 100GB of data will take roughly
75 days to upload over a 128Kbps link).
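That figure is easy enough to sanity-check. Here is a minimal Python sketch of the arithmetic; the `upload_days` helper is my own name for illustration, and it assumes the link runs flat out with no protocol overhead, which is optimistic. With decimal gigabytes it comes out at a bit over 72 days, and with 2^30-byte gigabytes at nearly 78, so "75 days" is the right ballpark either way.

```python
def upload_days(size_bytes: float, link_bits_per_sec: float) -> float:
    """Days needed to push size_bytes through a link, ignoring all overhead."""
    seconds = size_bytes * 8 / link_bits_per_sec
    return seconds / 86_400  # seconds in a day


GB = 10**9  # decimal gigabyte; swap in 2**30 for the binary flavour

if __name__ == "__main__":
    # The 1Mbps and 10Mbps rows are extra comparison points, not from Ben's post.
    for label, rate in [("128Kbps", 128_000), ("1Mbps", 1_000_000), ("10Mbps", 10_000_000)]:
        print(f"100GB over {label}: {upload_days(100 * GB, rate):.1f} days")
```

Even at much faster uplink speeds the initial sync of a large fileset is measured in days, not hours.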
I tend to agree. An online host isn’t great as a backup host, since, in my
experience, there are two types of backups required:
- The important small files (for example: encrypted password lists, my address book, my ~/bin directory)
- The massive big filesets (for example: MP3s, photos)
The first kind of fileset is amenable to an online backup-storage service, at
first glance. However, in my opinion you’re better off going the whole hog
for these files, and using the distributed, versioned backup method of putting
them in a good networked revision control system, and checking them out
everywhere, so you can also make
changes and check in from any host; otherwise, you face the perils of trying
to sync up a single backup from multiple “writers” without conflicts. So far,
none of the online file storage services offers SVN as an access method, so a
shell account at a colo server still seems more useful on that count.
The second kind of fileset, as Ben notes, will take donkey’s years to upload
and sync as a backup mechanism; and the economics are hardly compelling
for the service provider.
I think I prefer Brad Templeton’s idea for
dealing with large-data backups:
I propose a software RAID-5, done over a LAN with 3 to 5 drives scattered
over several machines on the LAN.
Slow as hell, of course, having to read and write your data out over the LAN
even at 100mbits. Gigabit would obviously be better. But what is it we have
that’s taking up all this disk space? It’s video, music and photos. Things
which, if just being played back, don’t need to be accessed very fast. If
you’re not editing video or music, in particular, you can handle having it on
a very slow device. (Photos are a bigger issue, as they do sometimes need
fast access when building thumbnails etc.)
This could even be done among neighbours over 802.11g, with suitable
encryption. In theory.
As a commenter notes, Linux has support for this already, in the form
of software RAID and the network block device.
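If you haven't looked at how RAID-5 survives a dead drive, the trick is just XOR parity: one block in each stripe holds the XOR of the others, so any single missing block can be recomputed from the survivors. Here is a toy Python sketch of that idea. The function names are mine, the "drives" are just byte strings, and for simplicity the parity always sits in the last slot, whereas real RAID-5 rotates it across the drives. It is not how the Linux md driver is implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (kept last in this toy)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    # Three "drives" of data scattered around the LAN, plus one for parity.
    stripe = make_stripe([b"mp3s", b"pics", b"vids"])
    # Pretend the machine holding block 1 has died.
    print(rebuild(stripe, 1))
```

The catch for the over-the-LAN version is that rebuilding a lost drive means reading every surviving drive in full across the network, which is where the "slow as hell" part really bites.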
So: take an external IDE enclosure, add a GumStix board
running Linux with software RAID, LVM, and nbd, and add
wifi. Then add
DAV, SMB and NFS export of the disk, and some decent UI code to organise the
volumes into a single exported RAID volume (hopefully automatically!), and it’d
be a pretty compelling product, in my opinion!
(hey Craig! I said GumStix! ;)