Horrible, horrible postmortem doc. This is the kicker:
So in other words, out of 5 backup/replication techniques deployed none are working reliably or set up in the first place. Reddit comments: https://www.reddit.com/r/linux/comments/5rd9em/gitlab_is_down_notes_on_the_incident_and_why_you/
This is simply amazing:
Intercom is a dual-citizen company of a sort. We’ve had two offices from day zero. I moved to San Francisco from Ireland in 2011 and now hold a green card and live here. I set up our headquarters here, which contains all of our business functions. My cofounders set up our Dublin office, where our research and development teams are based. And we have over 150 people in each office now. We’d like to use this special position we’re in to try help anyone in our industry feeling unsafe and hurt right now. If you’re in tech, and you’re from one of the newly unfavored countries, or even if you’re not, but you’re feeling persecuted for being Muslim, we’d like to help you consider Dublin as a place to live and work. [….] – If you decide you want to look into moving seriously, we’ll retain our Dublin immigration attorneys for you, and pay your legal bills with them, up to €5k. We’ll do this for as many as we can afford. We should be able to do this for at least 50 people.
The Google SRE book is now online, for free
wow. Stratus fault-tolerant systems ftw. ‘This is a fault tolerant server, which means that hardware components are redundant. Over the years, disk drives, power supplies and some other components have been replaced but Hogan estimates that close to 80% of the system is original.’ (via internetofshit, which this isn’t)
generating a near-optimal external dictionary for Zlib deflate compression
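As a rough illustration of what an external (preset) dictionary buys you, here is a minimal Python sketch using the standard library's zlib `zdict` parameter. The sample messages and the hand-written dictionary are assumptions made up for this example; the linked tool derives its dictionary from a training corpus, which is not reproduced here.

```python
import zlib

# Hypothetical sample data: small messages that share a lot of boilerplate.
messages = [
    b'{"user": "alice", "action": "login", "status": "ok"}',
    b'{"user": "bob", "action": "logout", "status": "ok"}',
]

# Hand-written dictionary of substrings expected to recur (an assumption for
# this sketch; a real dictionary would be generated from sample data).
zdict = (
    b'{"user": "", "action": "logout", "status": "ok"}'
    b'{"user": "", "action": "login", "status": "ok"}'
)

def compress(data, zdict=None):
    """Deflate data, optionally priming the compressor with a preset dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict) if zdict else zlib.compressobj(level=9)
    return c.compress(data) + c.flush()

def decompress(blob, zdict=None):
    """Inflate blob; the same dictionary used for compression must be supplied."""
    d = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return d.decompress(blob) + d.flush()

for m in messages:
    plain = compress(m)
    primed = compress(m, zdict)
    print(len(m), len(plain), len(primed))
```

Both sides have to agree on the dictionary out of band, which is why it is called external: it is never sent along with the compressed stream.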
a mere $33.31
People who enjoy playing the cult post-apocalyptic game franchise Fallout are surely familiar with “Nuka Cola”. For those who don’t know, Nuka-Cola is a fictional soft drink that is omnipresent throughout the game. It glows with a sickly radioactive glow, and it satirizes America’s fascination with radium from the beginning of the 20th century. It may seem downright crazy, but a radioactive energy drink actually existed in the 1920s and people believed in its magical properties. [….] “RadiThor”, an energy drink produced from 1918 to 1928 by the Bailey Radium Laboratories in East New Jersey. William J. A. Bailey, a Harvard dropout, created the drink by simply dissolving ridiculous quantities of radium in water.
‘a specific type of flow diagram, in which the width of the arrows is shown proportionally to the flow quantity. Sankey diagrams put a visual emphasis on the major transfers or flows within a system. They are helpful in locating dominant contributions to an overall flow. Often, Sankey diagrams show conserved quantities within defined system boundaries. [….] One of the most famous Sankey diagrams is Charles Minard’s Map of Napoleon’s Russian Campaign of 1812. It is a flow map, overlaying a Sankey diagram onto a geographical map.’
The most important thing to understand is that not all miles are the same. Most miles that we drive are very easy, and we can drive them while daydreaming or thinking about something else or having a conversation. But some miles are really, really hard, and so it’s those difficult miles that we should be looking at: How often do those show up, and can you ensure on a given route that the car will actually be able to handle the whole route without any problem at all? Level 5 autonomy says all miles will be handled by the car in an autonomous mode without any need for human intervention at all, ever. So if we’re talking to a company that says, “We can do full autonomy in this pre-mapped area and we’ve mapped almost every area,” that’s not Level 5. That’s Level 4. And I wouldn’t even stop there: I would ask, “Is that at all times of the day, is it in all weather, is it in all traffic?” And then what you’ll usually find is a little bit of hedging on that too. The trouble with this Level 4 thing, or the “full autonomy” phrase, is that it covers a very wide spectrum of possible competencies. It covers “my car can run fully autonomously in a dedicated lane that has no other traffic,” which isn’t very different from a train on a set of rails, to “I can drive in Rome in the middle of the worst traffic they ever have there, while it’s raining,” which is quite hard. Because the “full autonomy” phrase can mean such a wide range of things, you really have to ask the question, “What do you really mean, what are the actual circumstances?” And usually you’ll find that it’s geofenced for area, it may be restricted by how much traffic it can handle, for the weather, the time of day, things like that. So that’s the elaboration of why we’re not even close.
‘This document is intended to help those with a basic knowledge of machine learning get the benefit of best practices in machine learning from around Google. It presents a style for machine learning, similar to the Google C++ Style Guide and other popular guides to practical programming. If you have taken a class in machine learning, or built or worked on a machine-learned model, then you have the necessary background to read this document.’ Full of good tips, if you wind up using ML in a production service.
Dictator-friendly censorship tools? no probs!
This is fascinating, re “authenticity” of food:
The objection that curry house food was inauthentic was true, but also unfair. It’s worth asking what “authenticity” really means in this context, given that people in India – like humans everywhere – do not themselves eat a perfectly “authentic” diet. When I asked dozens of people, while on a recent visit to India, about their favourite comfort food, most of them – whether from Delhi, Bangalore or Mumbai – told me that what they really loved to eat, especially when drinking beer, was something called Indian-Chinese food. It is nothing a Chinese person would recognise, consisting of gloopy dishes of meat and noodles, thick with cornflour and soy sauce, but spiced with green chillis and vinegar to please the national palate. Indian-Chinese food – just like British curry house food – offers a salty night away from the usual home cooking. The difference is that Indian people accept Indian-Chinese food for the ersatz joy that it is, whereas many British curry house customers seem to have believed that recipe for their Bombay potatoes really did come from Bombay, and felt affronted to discover that it did not.
We raised the issue of discrimination in 2011 with one of the banks and with the Commission for Racial Equality, but as no-one was keeping records, nothing could be proved, until today. How can this discrimination happen? Well, UK rules give banks a lot of discretion to decide whether to refund a victim, and the first responders often don’t know the full story. If your HSBC card was compromised by a skimmer on a Tesco ATM, there’s no guarantee that Tesco will have told anyone (unlike in America, where the law forces Tesco to tell you). And the fraud pattern might be something entirely new. So bank staff end up making judgement calls like “Is this customer telling the truth?” and “How much is their business worth to us?” This in turn sets the stage for biases and prejudices to kick in, however subconsciously. Add management pressure to cut costs, sometimes even bonuses for cutting them, and here we are.
Agreed, this is a big issue.
If artificial intelligence takes over our lives, it probably won’t involve humans battling an army of robots that relentlessly apply Spock-like logic as they physically enslave us. Instead, the machine-learning algorithms that already let AI programs recommend a movie you’d like or recognize your friend’s face in a photo will likely be the same ones that one day deny you a loan, lead the police to your neighborhood or tell your doctor you need to go on a diet. And since humans create these algorithms, they’re just as prone to biases that could lead to bad decisions—and worse outcomes. These biases create some immediate concerns about our increasing reliance on artificially intelligent technology, as any AI system designed by humans to be absolutely “neutral” could still reinforce humans’ prejudicial thinking instead of seeing through it.
Much of my professional work for the last 10+ years has revolved around handling, importing and exporting CSV files. CSV files are frustratingly misunderstood, abused, and most of all underspecified. While RFC4180 exists, it is far from definitive and goes largely ignored. Partially as a companion piece to my recent post about how CSV is an encoding nightmare, and partially an expression of frustration, I’ve decided to make a list of falsehoods programmers believe about CSVs. I recommend my previous post for a more in-depth coverage on the pains of CSV encodings and how the default tooling (Excel) will ruin your day. (via Tony Finch)
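Two of the commonest falsehoods ("one line is one record" and "splitting on commas gives you the fields") are easy to demonstrate. This is a small Python sketch with a made-up input string, comparing a naive split against the standard library's csv module; it says nothing about the encoding problems the post also covers.

```python
import csv
import io

# Made-up input: a quoted field containing a comma, and another containing
# doubled quotes plus an embedded newline.
raw = 'id,name,note\r\n1,"Smith, Jane","said ""hi""\nthen left"\r\n'

# Naive approach: treat each line as a record and each comma as a separator.
naive = [line.split(",") for line in raw.splitlines()]

# csv.reader understands quoting, so commas and newlines inside quotes survive.
# newline="" stops StringIO from translating line endings before the parser sees them.
parsed = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive), "naive rows;", len(parsed), "parsed rows")
print(parsed[1])
```

The naive version reports three rows and splits the name in two; the csv module reports two rows with three fields each.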
Pretty amazing, particularly for this revelation:
Tetsuya Nomura (Character and battle visual director, Square Japan): OK, so maybe I did kill Aerith. But if I hadn’t stopped you, in the second half of the game, you were planning to kill everyone off but the final three characters the player chooses! Yoshinori Kitase (Director, Square Japan) No way! I wrote that? Where? Tetsuya Nomura (Character and battle visual director, Square Japan) In the scene where they parachute into Midgar. You wanted everyone to die there!
in 1977, Jet Propulsion Lab (JPL) scientists packed a Reed-Solomon encoder in each Voyager, hardware designed to add error-correcting bits to all data beamed back at a rate of efficiency 80 percent higher than an older method also included with Voyager. Where did the hope come in? When the Voyager probes were launched with Reed-Solomon encoders on board, no Reed-Solomon decoders existed on Earth.
Using jemalloc to instrument the contents of the native heap and record stack traces of each chunk’s allocators, so that leakers can be quickly identified (GZIPInputStream in this case). See also https://gdstechnology.blog.gov.uk/2015/12/11/using-jemalloc-to-get-to-the-bottom-of-a-memory-leak/ .
If you’ve always loved Hello Kitty but wish she also came with a deep well of rage, Sanrio has introduced just the character for you: Aggretsuko. An adorable 25-year-old red panda who works as an office associate, Aggretsuko is constantly taken advantage of and bothered by her boss and co-workers. So she deals with it by pounding beers and screaming death-metal karaoke.
This documentation covers parts of the PagerDuty Incident Response process. It is a cut-down version of our internal documentation, used at PagerDuty for any major incidents, and to prepare new employees for on-call responsibilities. It provides information not only on preparing for an incident, but also what to do during and after. It is intended to be used by on-call practitioners and those involved in an operational incident response process (or those wishing to enact a formal incident response process). This is a really good set of processes, quite similar to what we used in Amazon for high-severity outage response.
Dr. Kelly, desperate to become intoxicated while maintaining The Pledge, realized that not only could ether vapors be inhaled, but liquid ether could be swallowed. Around 1845 he began consuming tiny glasses of ether, and then started dispensing these to his patients and friends as a nonalcoholic libation. It wasn’t long before it became a popular beverage, with one priest going so far as to declare that ether was “a liquor on which a man could get drunk with a clean conscience.” In some respects ingesting ether is less damaging to the system than severe alcohol intoxication. Its volatility – ether is a liquid at room temperature but a gas at body temperature – dramatically speeds its effects. Dr. Ernest Hart wrote that “the immediate effects of drinking ether are similar to those produced by alcohol, but everything takes place more rapidly; the stages of excitement, mental confusion, loss of muscular control, and loss of consciousness follow each other so quickly that they cannot be clearly separated.” Recovery is similarly rapid. Not only were ether drunks who were picked up by the police on the street often completely sober by the time they reached the station, but they suffered no hangovers. Ether drinking spread rapidly throughout Ireland, particularly in the North, and the substance soon could be purchased from grocers, druggists, publicans, and even traveling salesmen. Because ether was produced in bulk for certain industrial uses, it could also be obtained quite inexpensively. Its low price and rapid action meant that even the poorest could afford to get drunk several times a day on it. By the 1880s ether, distilled in England or Scotland, was being imported and widely distributed to even the smallest villages. Many Irish market towns would “reek of the mawkish fumes of the drug” on fair days when “its odor seems to cling to the very hedges and houses for some time.”
Can’t help feeling danah boyd is hitting the nail on the head here:
The Internet has long been used for gaslighting, and trolls have long targeted adversaries. What has shifted recently is the scale of the operation, the coordination of the attacks, and the strategic agenda of some of the players. For many who are learning these techniques, it’s no longer simply about fun, nor is it even about the lulz. It has now become about acquiring power. A new form of information manipulation is unfolding in front of our eyes. It is political. It is global. And it is populist in nature. The news media is being played like a fiddle, while decentralized networks of people are leveraging the ever-evolving networked tools around them to hack the attention economy.
per Difford’s Guide — Amaretto Sour, Margarita, Bramble, Espresso Martini, Old-Fashioned, Negroni, White Lady and Manhattan up there.
Instead of discussing recent site visits or photographs we’ll be looking at a recent controversy sparked by comments about the reconstruction of Newgrange and, in particular, three claims made in the media by an Irish archaeologist; 1. That the “roof-box” at Newgrange may not be an original feature, instead it was “fabricated” and has “not a shred of authenticity” 2. That two vitally important structural stones, both decorated with megalithic art, from Newgrange were lost after the excavation and 3. That the photographic evidence that backs up the existing restoration is either inaccessible or never existed at all. I hope to show why we can be sure none of these claims are sustainable and that in fact the winter solstice phenomenon at Newgrange is an original and central feature of the tomb.
Google offers public NTP service with leap smearing — I didn’t realise! (thanks Keith)
The root cause of the bug that affected our DNS service was the belief that time cannot go backwards. In our case, some code assumed that the difference between two times would always be, at worst, zero. RRDNS is written in Go and uses Go’s time.Now() function to get the time. Unfortunately, this function does not guarantee monotonicity. Go currently doesn’t offer a monotonic time source. So the clock went “backwards”, s1 – s2 returned < 0, and the code couldn't handle it (because it's a little known and infrequent failure case). Part of the root cause here is cultural -- Google has solved the leap-second problem internally through leap smearing, and Go seems to be fundamentally a Google product at heart. The easiest fix in general in the "outside world" is to use "ntpd -x" to do a form of smearing. It looks like AWS are leap smearing internally (https://aws.amazon.com/blogs/aws/look-before-you-leap-the-coming-leap-second-and-aws/), but it is a shame they aren't making this a standard part of services running on top of AWS and a feature of the AWS NTP fleet.
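The failure mode is easy to sketch. The snippet below is Python rather than Go (RRDNS itself is not shown here), and the function names and timestamps are invented for illustration: it shows a wall-clock difference going negative when the clock steps back, a defensive clamp, and the usual alternative of timing with a monotonic clock.

```python
import time

def naive_elapsed(start, end):
    """Difference of two wall-clock readings; negative if the clock stepped back."""
    return end - start

def safe_elapsed(start, end):
    """Same difference, clamped so downstream code never sees a negative duration."""
    return max(0.0, end - start)

def timed(fn):
    """Time a call with time.monotonic(), which is unaffected by wall-clock steps."""
    t0 = time.monotonic()
    result = fn()
    return result, time.monotonic() - t0

# Simulated readings around a backwards step (made-up numbers, in seconds).
start, end = 100.0, 99.2
print(naive_elapsed(start, end) < 0)
print(safe_elapsed(start, end))

result, took = timed(lambda: sum(range(1000)))
print(result, took >= 0)
```

Clamping only hides the symptom; where a duration is what you actually want, a monotonic source avoids the problem rather than papering over it.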
via twitter: “interesting conversation between author of a parenting book and the guy who introduced the concept of “flow”” — summary, family life is interrupt-driven (via nagging) and fundamentally hard to align with “flow”
The recent movement to get all traffic encrypted has of course been great for the Internet. But the use of encryption in these protocols is different than in TLS. In TLS, the goal was to ensure the privacy and integrity of the payload. It’s almost axiomatic that third parties should not be able to read or modify the web page you’re loading over HTTPS. QUIC and TOU go further. They encrypt the control information, not just the payload. This provides no meaningful privacy or security benefits. Instead the apparent goal is to break the back of middleboxes. The idea is that TCP can’t evolve due to middleboxes and is pretty much fully ossified. They interfere with connections in all kinds of ways, like stripping away unknown TCP options or dropping packets with unknown TCP options or with specific rare TCP flags set. The possibilities for breakage are endless, and any protocol extensions have to jump through a lot of hoops to try to minimize the damage.
Paper from Google describing one of their internal building block services:
A general purpose sharding service. I normally think of sharding as something that happens within a (typically data) service, not as a general purpose infrastructure service. What exactly is Slicer then? It has two key components: a data plane that acts as an affinity-aware load balancer, with affinity managed based on application-specified keys; and a control plane that monitors load and instructs application processes as to which keys they should be serving at any one point in time. In this way, the decisions regarding how to balance keys across application instances can be outsourced to the Slicer service rather than building this logic over and over again for each individual back-end service. Slicer is focused exclusively on the problem of balancing load across a given set of backend tasks; other systems are responsible for adding and removing tasks. Interesting.
a competing-consumer messaging queue that is durable, fault-tolerant, highly available and scalable. We achieve durability and fault-tolerance by replicating messages across storage hosts, and high availability by leveraging the append-only property of messaging queues and choosing eventual consistency as our basic model. Cherami is also scalable, as the design does not have single bottleneck. […] Cherami is completely written in Go, a language that makes building highly performant and concurrent system software a lot of fun. Additionally, Cherami uses several libraries that Uber has already open sourced: TChannel for RPC and Ringpop for health checking and group membership. Cherami depends on several third-party open source technologies: Cassandra for metadata storage, RocksDB for message storage, and many other third-party Go packages that are available on GitHub. We plan to open source Cherami in the near future.
This is scary shit. It’s amazing how Russia has weaponised transparency, but I guess it’s not new to observers of “kompromat”: https://en.wikipedia.org/wiki/Kompromat
good preso from Percona Live 2015 on the messiness of MySQL vs UTF-8 and utf8mb4
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means. The t-digest algorithm is also very parallel friendly, making it useful in map-reduce and parallel streaming applications. The t-digest construction algorithm uses a variant of 1-dimensional k-means clustering to produce a data structure that is related to the Q-digest. This t-digest data structure can be used to estimate quantiles or compute other rank statistics. The advantage of the t-digest over the Q-digest is that the t-digest can handle floating point values while the Q-digest is limited to integers. With small changes, the t-digest can handle any values from any ordered set that has something akin to a mean. The accuracy of quantile estimates produced by t-digests can be orders of magnitude more accurate than those produced by Q-digests in spite of the fact that t-digests are more compact when stored on disk. Super-nice feature is that it’s mergeable, so amenable to parallel usage across multiple hosts if required. Java implementation, ASL licensing.
good hardware recommendations
Good advice — let’s hope it doesn’t come to this. Example: ’17. Watch out for the paramilitaries: When the men with guns who have always claimed to be against the system start wearing uniforms and marching around with torches and pictures of a Leader, the end is nigh. When the pro-Leader paramilitary and the official police and military intermingle, the game is over.’
Why doesn’t Irish tech startup activity show up in EU-wide comparisons? Turns out we tend to transition to a US-based model, with US-based management and EU-based operations and engineering, like $work does:
Successful Irish tech companies have a skewed geographic profile. This presents a data gathering problem for the data companies but it’s also a strong indicator of the market reality for Irish startups. The size of the local market and a focus on software business in particular means many Irish startups are transitioning to the US (some earlier and with more commitment than others), and getting backed by a spectrum of local and international VCs. Correcting for this put Ireland’s tech venture investment in the second half of 2014 at $125m, midway between Sweden and Finland, 8th in Europe overall.
ooh, Lascaux 4 is finally opening:
St-Cyr added: “It’s impossible for anyone to see the original now, but this is the next best thing. What is lost in not having the real thing is balanced by the fact people can see so much more of the detail of the wonderful paintings and engravings.”
Johanson said it’s possible to use an RFID “gate antenna” — two electronic readers spanning a doorway, similar to the anti-theft gates in retail stores — to scan the credit cards of people passing through. With enough high-powered gates installed at key doorways in a city or across the country, someone could collect comprehensive information on people’s movements, buying habits and social patterns. “These days you can buy a $500 antenna to mount in doorways that can read every card that goes through it,” Johanson said. Amazingly, these seem to be rife with holes: they still use the legacy EMV protocol, do not require online verification with backend systems, and allow replay attacks. A Journal.ie article today claims that attackers are sniffing EMV data, then replaying it against card readers in shops in Dublin; that claim may not be true, but the attack certainly seems viable…
rather dramatic differences
Donald Trump’s media strategy as a form of Surkovian control via post-truth ‘destabilised perception’, through deliberate flooding with fake news:
By attacking the very notion of shared reality, the president-elect is making normal democratic politics impossible. When the truth is little more than an arbitrary personal decision, there is no common ground to be reached and no incentive to look for it. To men like Surkov, that is exactly as it should be. Government policy should not be set through democratic oversight; instead, the government should “manage” democracy, ensuring that people can express themselves without having any influence over the machinations of the state. According to a 2011 openDemocracy article by Richard Sakwa, a professor of Russian and European politics at the University of Kent, Surkov is “considered the main architect of what is colloquially known as ‘managed democracy,’ the administrative management of party and electoral politics.” “Surkov’s philosophy is that there is no real freedom in the world, and that all democracies are managed democracies, so the key to success is to influence people, to give them the illusion that they are free, whereas in fact they are managed,” writes Sakwa. “In his view, the only freedom is ‘artistic freedom.’”
remove RFID from a payment card with a single drilled hole
Nice comparison of a counting Bloom filter and a Cuckoo Filter, implemented in Python:
This post provides an update by exploring Cuckoo filters, a new probabilistic data structure that improves upon the standard Bloom filter. The Cuckoo filter provides a few advantages: 1) it enables dynamic deletion and addition of items 2) it can be easily implemented compared to Bloom filter variants with similar capabilities, and 3) for similar space constraints, the Cuckoo filter provides lower false positives, particularly at lower capacities. We provide a python implementation of the Cuckoo filter here, and compare it to a counting Bloom filter (a Bloom filter variant).
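For a feel of how the structure works, here is a minimal cuckoo filter sketch in Python. It is not the implementation from the linked post: the class name, the SHA-256-based hashing, the 16-bit fingerprints and the default sizes are all assumptions chosen to keep the example short. Each item is stored only as a fingerprint in one of two candidate buckets, and the second bucket is derived from the first by XOR with a hash of the fingerprint, so an evicted fingerprint can always be relocated without knowing the original item.

```python
import hashlib
import random

class CuckooFilter:
    """Illustrative cuckoo filter (hypothetical class, not the linked post's code)."""

    def __init__(self, num_buckets=1024, bucket_size=4, max_kicks=500, seed=0):
        if num_buckets & (num_buckets - 1):
            raise ValueError("num_buckets must be a power of two")
        self.mask = num_buckets - 1          # power of two keeps the XOR trick reversible
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item):
        # 16-bit fingerprint in the range 1..65535
        return self._hash(b"fp:" + item.encode()) % 65535 + 1

    def _index(self, item):
        return self._hash(item.encode()) & self.mask

    def _alt_index(self, index, fp):
        # Partial-key cuckoo hashing: the other bucket depends only on (index, fp)
        return (index ^ self._hash(fp.to_bytes(2, "big"))) & self.mask

    def _candidates(self, item):
        fp = self._fingerprint(item)
        i1 = self._index(item)
        return fp, i1, self._alt_index(i1, fp)

    def insert(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: evict a random resident and move it to its alternate.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter is effectively full; the last evicted fingerprint is dropped

    def __contains__(self, item):
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False

f = CuckooFilter()
inserted = [f.insert(f"item-{n}") for n in range(100)]
print(all(inserted), "item-7" in f, f.delete("item-7"))
```

Deletion is what a plain Bloom filter cannot do; here it just removes one matching fingerprint. As with any such filter, lookups can return false positives, and deleting an item that was never inserted can remove someone else's fingerprint.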
Football Manager includes what is effectively a parallel universe, so they modelled the effects of Brexit on the UK Premier League: ‘In my own current “save”, Brexit kicked in at the end of season three. Unfortunately I got one of the hard options, where all non-homegrown players are now going through a work permit system, albeit one that’s slightly relaxed. It means I can no longer bring in that 19-year-old Italian keeper I’d been eyeing up as one for the future. Instead I have to wait for him to break into the Italian squad, and play 30% of their fixtures over the next two years. Then he’ll be mine. Meanwhile, my TV revenue has just dropped by a few million. Let’s hope that doesn’t continue, or I won’t even be able to afford him.’
It was recently discovered that some surprising operations on Rust’s standard hash table types could go quadratic. Quite a nice unexpected accidental detour into O(n^2)
This is intriguing — using Jupyter notebooks to embody data analysis work, and ensure it’s reproducible, which brings better rigour similarly to how unit tests improve coding. I must try this.
Reproducibility makes data science at Stripe feel like working on GitHub, where anyone can obtain and extend others’ work. Instead of islands of analysis, we share our research in a central repository of knowledge. This makes it dramatically easier for anyone on our team to work with our data science research, encouraging independent exploration. We approach our analyses with the same rigor we apply to production code: our reports feel more like finished products, research is fleshed out and easy to understand, and there are clear programmatic steps from start to finish for every analysis.
neat — aggregation of histograms for Datadog statsd
auditd -> go-audit -> elasticsearch at Slack
Eir ship vulnerable firmware images AGAIN. ffs
Amazing virtuoso performance — be sure to scroll up all the way to Chapter 1
good call — new EMR feature
LMAX’ approach to acceptance/system-testing time-dependent code. We are doing something similar in Swrve too, so finding that LMAX have taken a similar approach is a great indicator
scumbags. Attempting to pass off their pissy beer under alternative names to con consumers into buying it! ‘There will be no sanctions against Heineken for passing off non-craft beer as “locally produced”, the Food Safety Authority of Ireland (FSAI) has said. The FSAI and HSE launched a joint investigation last month after it emerged that Heineken Ireland had sold some of its products, including Foster’s lager, under craft-type names such as Blasket Blonde and Beanntrai Bru. Two well-known stouts, Beamish and Murphy’s, were also sold under craft-type names by the international brewing giant. C&C, a Tipperary-based drinks company, was also investigated after it admitted selling its Clonmel 1650 lager under a different name, Pana Cork, in Cork.’
great, I’ve looked for this so many times. Only tricky limit I can spot is the 300 tps limit, and it’s US-East/US-West only for now
good intro to Airflow usage preso
by John Allspaw, Morgan Evans and Daniel Schauenberg; the Etsy blameless postmortem style crystallized into a detailed 27-page PDF ebook
‘bike-shedding’, or needless arguing about trivial issues, actually dates back to 1957 as C. Northcote Parkinson’s ‘law of triviality’
simple usage of Docker, blue/green deploys, and AWS ALBs
ICRs are the perfect material for blackmail, which makes them valuable in a way that traditional telephone records are not. And where potentially large sums of money are involved, corruption is sure to follow. Even if ICR databases are secured with the best available technology, they are still vulnerable to subversion by individuals whose jobs give them ready access. This is no theoretical risk. Just one day ago, it emerged that corrupt insiders at offshore call centres used by Australian telecoms were offering to sell phone records, home addresses, and other private details of customers. Significantly, the price requested was more if the target was an Australian “VIP, politician, police [or] celebrity.”