A causal profiler for C++.
Causal profiling is a novel technique to measure optimization potential. This measurement matches developers’ assumptions about profilers: that optimizing highly-ranked code will have the greatest impact on performance. Causal profiling measures optimization potential for serial, parallel, and asynchronous programs without instrumentation or special handling for library calls and concurrency primitives. Instead, a causal profiler uses performance experiments to predict the effect of optimizations. This allows the profiler to establish causality: “optimizing function X will have effect Y,” exactly the measurement developers had assumed they were getting all along.

I can see this being a good technique for stochastically discovering race conditions and concurrency bugs, too.
This is the version with the superfast petabyte-sort record:
Spark 1.2 includes several cross-cutting optimizations focused on performance for large-scale workloads. Two new features Databricks developed for our world-record petabyte sort with Spark are turned on by default in Spark 1.2. The first is a re-architected network transfer subsystem that exploits Netty 4’s zero-copy IO and off-heap buffer management. The second is Spark’s sort-based shuffle implementation, which we’ve now made the default after significant testing in Spark 1.1. Together, we’ve seen these features give as much as a 5X performance improvement for workloads with very large shuffles.
As the 1 January deadline gallops towards them, EU microbusinesses desperate to stay open without breaking the law are trying to find out: “Can I email stuff out instead?” Well… yes. No. It depends. And simultaneously yes AND no, according to Schrödinger’s VAT. So that’s clear, then.
Nice work, EU