A tool written by Facebook to ease the pain of online MySQL schema-change migrations.
Some ALTER TABLE statements take too long from the perspective of some MySQL users. The fast index create feature for the InnoDB plugin in MySQL 5.1 makes this less of an issue but this can still take minutes to hours for a large table and for some MySQL deployments that is too long. A workaround is to perform the change on a slave first and then promote the slave to be the new master. But this requires a slave located near the master. MySQL 5.0 added support for triggers and some replication systems have been built using triggers to capture row changes. Why not use triggers for this? The openarkkit toolkit did just that with oak-online-alter-table. We have published our version of an online schema change utility (OnlineSchemaChange.php aka OSC).
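To make the trigger idea concrete, here is a rough sketch of the general pattern: build a shadow table with the new schema, let triggers mirror live writes into it, backfill the old rows, then swap names. It uses Python's built-in sqlite3 instead of MySQL so it runs anywhere, and the `users` table, the `email` column and the trigger names are all made up for illustration. It is not a description of how OSC or oak-online-alter-table actually work internally.

```python
import sqlite3

def demo():
    """Toy trigger-based schema change on SQLite (a stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');

    -- 1. Shadow table with the new schema (hypothetical extra column).
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, name TEXT, email TEXT);

    -- 2. Triggers mirror every write on the live table into the shadow table.
    CREATE TRIGGER users_ins AFTER INSERT ON users BEGIN
      INSERT OR REPLACE INTO users_new (id, name) VALUES (NEW.id, NEW.name);
    END;
    CREATE TRIGGER users_upd AFTER UPDATE ON users BEGIN
      DELETE FROM users_new WHERE id = OLD.id;
      INSERT OR REPLACE INTO users_new (id, name) VALUES (NEW.id, NEW.name);
    END;
    CREATE TRIGGER users_del AFTER DELETE ON users BEGIN
      DELETE FROM users_new WHERE id = OLD.id;
    END;
    """)

    # 3. Application traffic keeps arriving while the migration is in progress.
    conn.execute("INSERT INTO users VALUES (4, 'dee')")
    conn.execute("UPDATE users SET name = 'bobby' WHERE id = 2")
    conn.execute("DELETE FROM users WHERE id = 1")

    # 4. Backfill rows the triggers have not touched. OR IGNORE keeps the
    #    fresher copy of any row a trigger already wrote.
    conn.execute(
        "INSERT OR IGNORE INTO users_new (id, name) SELECT id, name FROM users"
    )

    # 5. Swap the tables and clean up.
    conn.executescript("""
    DROP TRIGGER users_ins;
    DROP TRIGGER users_upd;
    DROP TRIGGER users_del;
    ALTER TABLE users RENAME TO users_old;
    ALTER TABLE users_new RENAME TO users;
    DROP TABLE users_old;
    """)
    return conn.execute("SELECT id, name, email FROM users ORDER BY id").fetchall()
```

On a real MySQL table the backfill would be done in chunks and the final rename would need to be atomic; the sketch skips both.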
There will come a time in the life of most systems serving data, when there is a need to migrate data to [another] data store while maintaining or improving data consistency, latency and efficiency. This document explains the data migration technique we used at Netflix to migrate the user’s queue data between two different distributed NoSQL storage systems [SimpleDB to Cassandra].
nice enough, but a lot of moving parts. It would be nice to see a simpler ZK+Graphite setup using the ‘mntr’ verb
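For what it's worth, the simpler setup could be as small as the sketch below: send the 'mntr' four-letter word to a ZooKeeper server, keep the numeric fields, and forward them in Graphite's plaintext format ("path value timestamp"). The host names, the metric prefix and the function names are my own placeholders, and newer ZooKeeper releases may need 'mntr' to be explicitly allowed in the server config.

```python
import socket
import time

def zk_mntr(host="localhost", port=2181, timeout=5.0):
    """Send the 'mntr' four-letter word and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")

def mntr_to_graphite(raw, prefix="zookeeper.host1", now=None):
    """Turn tab-separated 'mntr' output into Graphite plaintext lines."""
    now = int(time.time()) if now is None else int(now)
    lines = []
    for line in raw.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue
        value = value.strip()
        try:
            float(value)
        except ValueError:
            continue  # skip string fields such as zk_version, zk_server_state
        lines.append(f"{prefix}.{key} {value} {now}")
    return lines

def send_to_graphite(lines, host="localhost", port=2003, timeout=5.0):
    """Ship the lines to a Carbon plaintext listener (2003 is the usual port)."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)

# Typical use from cron, assuming both services are reachable:
#   send_to_graphite(mntr_to_graphite(zk_mntr("zk1.example.com")))
```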
includes “429 Too Many Requests”, for rate limits
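A client-side sketch of what handling a 429 might look like: a response with this status may carry a Retry-After header, which is either a number of seconds or an HTTP date. The function name, the plain-dict `headers` argument (exact key case assumed) and the one-second fallback are my own choices, not anything the status code definition prescribes.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_delay(status, headers, now=None, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    value = headers.get("Retry-After")
    if value is None:
        return default  # header is optional, so fall back to a guess
    value = value.strip()
    if value.isdigit():
        return float(value)  # delta-seconds form
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable header
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

A caller would sleep for the returned delay and retry, ideally with an upper bound on both the delay and the number of attempts.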
good +1 for using Netflix’ Curator ZK client library
a high-level API that greatly simplifies using ZooKeeper. It adds many features that build on ZooKeeper and handles the complexity of managing connections to the ZooKeeper cluster and retrying operations. Some of the features are: Automatic connection management: There are potential error cases that require ZooKeeper clients to recreate a connection and/or retry operations. Curator automatically and transparently (mostly) handles these cases. Cleaner API: simplifies the raw ZooKeeper methods, events, etc.; provides a modern, fluent interface Recipe implementations (see Recipes): Leader election, Shared lock, Path cache and watcher, Distributed Queue, Distributed Priority Queue
some pretty interesting lessons, it turns out: a ‘take what you need’ vacation policy means nobody takes vacations (unsurprising); Yammer actively work to avoid employee burnout (good idea); Yammer A/B test every feature; and Yammer mgmt try to let their devs work autonomously.
Some really cool-looking UNIX command line utils, packaged in Debian (and therefore in Ubuntu too). A few of these I’ve reimplemented separately, but it’s always good to replace a hack with a more widely available “official” tool. Thanks, Joey Hess!
sponge: accept input, wait til EOF, then rewrite a file; chronic: runs a command quietly unless it fails; combine: combine the lines in two files using boolean operations; ifdata: get network interface info without parsing ifconfig output; ifne: run a program if the standard input is not empty; isutf8: check if a file or standard input is utf-8; lckdo: execute a program with a lock held; mispipe: pipe two commands, returning the exit status of the first; parallel: run multiple jobs at once; pee: tee standard input to pipes; sponge: soak up standard input and write to a file; ts: timestamp standard input; vidir: edit a directory in your text editor; vipe: insert a text editor into a pipe; zrun: automatically uncompress arguments to command
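As an example of the kind of hack these replace, here is roughly what a home-grown sponge looks like in Python: read everything up to EOF first, and only then write the target file, so a pipeline can safely read from and write to the same file. This is my own sketch of the idea (including the temp-file-and-rename step), not a description of how the moreutils version is implemented.

```python
import os
import tempfile

def sponge(stream, path):
    """Soak up all of a binary stream, then write it to path in one go."""
    data = stream.read()  # wait for EOF before touching the target file
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename over the target.
    # Note: the temp file gets restrictive permissions, unlike the original.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(data)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise

# As a script, this would be called with sys.stdin.buffer and sys.argv[1],
# e.g.  sort notes.txt | python sponge.py notes.txt
```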
The book introduces “Infrastructure as Code,” test-driven development, Chef, and cucumber-chef, and then proceeds to a simple example using Chef to provision a shared Linux server. The recipes for the server are developed test-first, demonstrating both the technique and the workflow.
Neat demo of using ptrace to inject into a running process, just like the good old days ;)
Some time ago I ran into a production issue where the init process (upstart) stopped behaving properly. Specifically, instead of spawning new processes, it deadlocked in a transitional state. […] What’s worse, upstart doesn’t allow forcing a state transition and trying to manually create and send DBus events didn’t help either. That meant the sane options we were left with were: restart the host (not desirable at all in that scenario); start the process manually and hope auto-respawn will not be needed. Of course there are also some insane options. Why not cheat like in the old times and just PEEK and POKE the process in the right places? The solution used at the time involved a very ugly script driving gdb which probably summoned satan in some edge cases. But edge cases were not hit and majority of hosts recovered without issues.