A Google SRE annotates the Google SRE book with his own thoughts. The source material is great, and the commentary improves on it, particularly for the error budget concept. Also: when did “runbooks” become “playbooks”? I don’t particularly care either way, but needless renaming is annoying.
Good advice. See also http://www.teenvogue.com/story/how-to-keep-messages-secure (via Zeynep Tufekci)
Unfortunately, a bug was recently introduced into the allocator which made it sometimes not try hard enough to free kernel cache memory before giving up and invoking the OOM killer. In practice, this means that at random times, the OOM killer would strike at big processes when the kernel tries to allocate, say, 16 kilobytes of memory for a new process’s thread stack, even when there are many gigabytes of memory in reclaimable kernel caches!
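To get a feel for the mismatch described above (a 16-kilobyte request failing while gigabytes sit in reclaimable caches), here is a minimal sketch that parses `/proc/meminfo`-style text and adds up two fields. The helper names `parse_meminfo` and `reclaimable_kb` and the sample numbers are made up for illustration, and summing `Cached` and `SReclaimable` is only a rough approximation (`Cached` includes shared memory that cannot simply be dropped), not the kernel's own accounting.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def reclaimable_kb(fields):
    """Rough estimate of reclaimable cache: page cache plus reclaimable slab.

    Assumption for illustration only; the kernel's reclaim logic is more subtle.
    """
    return fields.get("Cached", 0) + fields.get("SReclaimable", 0)


# Invented sample values, shaped like real /proc/meminfo output.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:          120000 kB
Cached:          6000000 kB
SReclaimable:    2000000 kB
"""

THREAD_STACK_KB = 16  # the allocation size mentioned in the quote

info = parse_meminfo(SAMPLE)
estimate = reclaimable_kb(info)
print(f"reclaimable estimate: {estimate} kB, request: {THREAD_STACK_KB} kB")
```

On a Linux box, the same functions can be pointed at the real file with `parse_meminfo(open("/proc/meminfo").read())`.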