Bit of detail into Twitter’s TSD metric store.
There are separate online clusters for different data sets: application and operating system metrics, performance critical write-time aggregates, long term archives, and temporal indexes. A typical production instance of the time series database is based on four distinct Cassandra clusters, each responsible for a different dimension (real-time, historical, aggregate, index) due to different performance constraints. These clusters are amongst the largest Cassandra clusters deployed in production today and account for over 500 million individual metric writes per minute. Archival data is stored at a lower resolution for trending and long term analysis, whereas higher resolution data is periodically expired. Aggregation is generally performed at write-time to avoid extra storage operations for metrics that are expected to be immediately consumed. Indexing occurs along several dimensions–service, source, and metric names–to give users some flexibility in finding relevant data.
I didn’t clearly explain that there’s an enormous continuum between, on the one hand, a full break of RSA or Diffie-Hellman (which still seems extremely unlikely to me), and on the other, “pure side-channel attacks” involving no new cryptanalytic ideas. Along that continuum, there are many plausible places where the NSA might be. For example, imagine that they had a combination of side-channel attacks, novel algorithmic advances, and sheer computing power that enabled them to factor, let’s say, ten 2048-bit RSA keys every year. In such a case, it would still make perfect sense that they’d want to insert backdoors into software, sneak vulnerabilities into the standards, and do whatever else it took to minimize their need to resort to such expensive attacks. But the possibility of number-theoretic advances well beyond what the open world knows certainly wouldn’t be ruled out. Also, as Schneier has emphasized, the fact that NSA has been aggressively pushing elliptic-curve cryptography in recent years invites the obvious speculation that they know something about ECC that the rest of us don’t.
Built into the HotSpot JVM [in JDK version 7u40] is something called the Java Flight Recorder. It records a lot of information about/from the JVM runtime, and can be thought of as similar to the Data Flight Recorders you find in modern airplanes. You normally use the Flight Recorder to find out what was happening in your JVM when something went wrong, but it is also a pretty awesome tool for production time profiling. Since Mission Control (using the default templates) normally don’t cause more than a per cent overhead, you can use it on your production server.I’m intrigued by the idea of always-on profiling in production. This could be cool.