New major-version track for protobuf, with some interesting new features: Removal of field presence logic for primitive value fields, removal of required fields, and removal of default values. This makes proto3 significantly easier to implement with open struct representations, as in languages like Android Java, Objective C, or Go. Removal of unknown fields. Removal of extensions, which are instead replaced by a new standard type called Any. Fix semantics for unknown enum values. Addition of maps. Addition of a small set of standard types for representation of time, dynamic data, etc. A well-defined encoding in JSON as an alternative to binary proto encoding.
Interesting priority-queue algorithm optimised for caching data on SSD
Users download and install an Illuminate Daemon using a simple installer which starts up a small stand alone Java process. The Daemon sits quietly unless it is asked to start gathering SLA data and/or to trigger a diagnosis. Users can set SLA’s via the dashboard and can opt to collect latency measurements of their transactions manually (using our library) or by asking Illuminate to automatically instrument their code (Servlet and JDBC based transactions are currently supported). SLA latency data for transactions is collected on a short cycle. When the moving average of latency measurements goes above the SLA value (e.g. 150ms), a diagnosis is triggered. The diagnosis is very quick, gathering key data from O/S, JVM(s), virtualisation and other areas of the system. The data is then run through the machine learned algorithm which will quickly narrow down the possible causes and gather a little extra data if needed. Once Illuminate has determined the root cause of the performance problem, the diagnosis report is sent back to the dashboard and an alert is sent to the user. That alert contains a link to the result of the diagnosis which the user can share with colleagues. Illuminate has all sorts of backoff strategies to ensure that users don’t get too many alerts of the same type in rapid succession!
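The trigger described above (moving average of latency crossing the SLA value) can be sketched roughly as follows. This is only an illustration, not Illuminate's code: the `SlaMonitor` class, the window size and the sample values are all assumptions made up for the example.

```python
from collections import deque


class SlaMonitor:
    """Hypothetical helper: tracks a moving average of latencies against an SLA."""

    def __init__(self, sla_ms, window=5):
        self.sla_ms = sla_ms
        # deque with maxlen keeps only the newest `window` samples
        self.samples = deque(maxlen=window)

    def record(self, latency_ms):
        """Add one measurement; return True if a diagnosis should be triggered."""
        self.samples.append(latency_ms)
        average = sum(self.samples) / len(self.samples)
        return average > self.sla_ms


# Example run with a 150ms SLA and a 3-sample window (made-up numbers).
monitor = SlaMonitor(sla_ms=150, window=3)
results = [monitor.record(ms) for ms in (100, 120, 400, 90, 80)]
```

Note how a single 400ms outlier keeps the average above the SLA for the following two samples; that is the kind of repeated firing the backoff strategies mentioned above would need to damp down.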
Binary message marshalling, client/server stubs generated by an IDL compiler, bidirectional binary protocol. CORBA is back from the dead! Intro blog post: http://googledevelopers.blogspot.ie/2015/02/introducing-grpc-new-open-source-http2.html Relevant: Steve Vinoski’s commentary on protobuf-rpc back in 2008: http://steve.vinoski.net/blog/2008/07/13/protocol-buffers-leaky-rpc/
This is a great exposition of why it’s in a company’s interest to engage with open source. Not sure I agree with ‘engineers are the artists of our generation’ but the rest are spot on
MQTT definitely has a smaller size on the wire. It’s also simpler to parse (let’s face it, Huffman isn’t that easy to implement) and provides guaranteed delivery to cater to shaky wireless networks. On the other hand, it’s also not terribly extensible. There aren’t a whole lot of headers and options available, and there’s no way to make custom ones without touching the payload of the message. It seems that HTTP/2 could definitely serve as a reasonable replacement for MQTT. It’s reasonably small, supports multiple paradigms (pub/sub & request/response) and is extensible. It’s also supported by the IETF (whereas MQTT is hosted by OASIS). From conversations I’ve had with industry leaders in embedded software and chip manufacturing, they only want to support standards from the IETF. Many of them are still planning to support MQTT, but they’re not happy about it. I think MQTT is better at many of the things it was designed for, but I’m interested to see over time if those advantages are enough to outweigh the benefits of HTTP. Regardless, MQTT has been gaining a lot of traction in the past year or two, so you may be forced into using it while HTTP/2 catches up.
I like this
Well said — Amazon had a good story around this btw
Ugh, ZK is a bear to work with.
Apache Curator is open source software which is able to handle all of the above scenarios transparently. Curator is a Netflix ZooKeeper Library and it provides a high-level API, CuratorFramework, that simplifies using ZooKeeper. By using a singleton CuratorFramework instance in the new ZooKeeperHiveLockManager implementation, we not only fixed the ZooKeeper connection issues, but also made the code easy to understand and maintain.
Forward secrecy and in-session key “ratcheting”
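For anyone unfamiliar with the term, a symmetric key "ratchet" can be sketched in a few lines. This is a generic one-way hash chain, not the linked protocol's actual construction; the `ratchet` function, the HMAC labels and the seed string are assumptions chosen for illustration.

```python
import hashlib
import hmac


def ratchet(chain_key: bytes):
    """Derive a one-off message key and the next chain key from the current one."""
    # The labels b"message" and b"chain" are arbitrary; they only need to differ.
    message_key = hmac.new(chain_key, b"message", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"chain", hashlib.sha256).digest()
    return message_key, next_chain_key


# Stand-in for a secret agreed during the initial handshake.
chain_key = hashlib.sha256(b"shared secret from the initial handshake").digest()

message_keys = []
for _ in range(3):
    # Each step replaces the chain key; the previous one is meant to be discarded.
    message_key, chain_key = ratchet(chain_key)
    message_keys.append(message_key)
```

Because HMAC-SHA256 is one-way, someone who steals the current `chain_key` cannot work backwards to earlier message keys, provided the old chain keys really were deleted.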
What a mess.
What’s faster: PV, HVM, HVM with PV drivers, PVHVM, or PVH? Cloud computing providers using Xen can offer different virtualization “modes”, based on paravirtualization (PV), hardware virtual machine (HVM), or a hybrid of them. As a customer, you may be required to choose one of these. So, which one?
Wow, this is excellent work. A formal verification of Tim Peters’ TimSort failed, resulting in a bugfix:
While attempting to verify TimSort, we failed to establish its instance invariant. Analysing the reason, we discovered a bug in TimSort’s implementation leading to an ArrayOutOfBoundsException for certain inputs. We suggested a proper fix for the culprit method (without losing measurable performance) and we have formally proven that the fix actually is correct and that this bug no longer persists.
“Cheap SSL certs from $4.99/yr” — apparently recommended for cheap, low-end SSL certs
Erasure codes, such as Reed-Solomon (RS) codes, are increasingly being deployed as an alternative to data-replication for fault tolerance in distributed storage systems. While RS codes provide significant savings in storage space, they can impose a huge burden on the I/O and network resources when reconstructing failed or otherwise unavailable data. A recent class of erasure codes, called minimum-storage-regeneration (MSR) codes, has emerged as a superior alternative to the popular RS codes, in that it minimizes network transfers during reconstruction while also being optimal with respect to storage and reliability. However, existing practical MSR codes do not address the increasingly important problem of I/O overhead incurred during reconstructions, and are, in general, inferior to RS codes in this regard. In this paper, we design erasure codes that are simultaneously optimal in terms of I/O, storage, and network bandwidth. Our design builds on top of a class of powerful practical codes, called the product-matrix-MSR codes. Evaluations show that our proposed design results in a significant reduction in the number of I/Os consumed during reconstructions (a 5x reduction for typical parameters), while retaining optimality with respect to storage, reliability, and network bandwidth.
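To make the reconstruction cost concrete, here is the simplest possible erasure code: a single XOR parity block. It is neither an RS nor an MSR code and is not taken from the paper; the block contents and the `xor_blocks` helper are invented for the example. It only shows why rebuilding one lost block means reading every surviving block.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block (toy data).
data_blocks = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(data_blocks)

# Simulate losing block 1: rebuilding it needs all other data blocks and the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
blocks_read = len(survivors)
```

One lost block costs three block reads here; that read amplification, multiplied across network and disk, is the overhead the abstract is talking about reducing.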
Two Spark experts from Databricks provide some good tips
Lest we forget, the sheer bullshitting ineptitude of Fianna Fail as they managed to shamble into destroying Ireland’s economy in 2008:
Once that nasty bit of business was done, the Cabinet departed en masse for six weeks on their summer holidays, despite the emerging economic and financial tsunami. Cowen and family famously took up residence in a caravan park in Connemara as opposed to his ‘official’ residence at the Mannin Bay Hotel nearby. When pressed by our reporter Niamh Horan as to why he was not at his station, he defensively replied: “I don’t understand it. First the media have a go at me because I’m taking a holiday with my family and then they come down to see if I’m having a good time!” he exclaimed.
Phone is stolen, shipped to China, and winds up being bought by “Brother Orange” — then the story becomes China’s biggest viral hit
40 minutes of multi-zone network outage for majority of instances. ‘The internal software system which programs GCE’s virtual network for VM egress traffic stopped issuing updated routing information. The cause of this interruption is still under active investigation. Cached route information provided a defense in depth against missing updates, but GCE VM egress traffic started to be dropped as the cached routes expired.’ I wonder if Google Pimms fired the alarms for this ;)
Ryan McGuire, a PhD student in Composition and Computer Technologies at the University of Virginia Center for Computer Music, has created the project The Ghost In The MP3 [....] For his first trick, McGuire took Suzanne Vega’s ‘Tom’s Diner’ and drained it into a vaporous piece titled ‘moDernisT.’ McGuire chose the track, he explains on his site, because it was famously used as one of the main controls in the listening tests used to develop the MP3 algorithm.
A gateway script, now included in PCP
System performance metrics framework, plugged by Netflix, open-source for ages
Superfish, founded and led by former Intel employee and ex-surveillance boffin Adi Pinhas, has been criticised by users the world over since its inception in 2006.
The cracked certificate exposes Lenovo users to man-in-the-middle attacks, similar to those opened up by Heartbleed. Armed with this password and the right software, a coffee shop owner could potentially spy on any Lenovo user on her network, collecting any passwords that were entered during the session. The evil barista could also insert malware into the data stream at will, disguised as a software update or a trusted site. Amazingly stupid.
“Like any responsible father, Hugh Morrison had installed cameras in every room in the flat,” is the opening line of Intrusion, a 2012 novel set in the near future. Originally installed so that Hugh and his wife can keep an eye on their kids, the Internet-connected cameras wind up being used later in the novel by police who tap into the feeds to monitor the couple chatting on their couch when they are suspected of anti-societal behavior. As with so many sci-fi scenarios, the novel’s vision was prophetic. People are increasingly putting small Internet-connected cameras into their homes. And law enforcement officials are using the cameras to collect evidence about them.
not exactly the most challenging reverse I’ve ever seen ;)
Holy shit. Gemalto totally rooted.
With [Gemalto's] stolen encryption keys, intelligence agencies can monitor mobile communications without seeking or receiving approval from telecom companies and foreign governments. Possessing the keys also sidesteps the need to get a warrant or a wiretap, while leaving no trace on the wireless provider’s network that the communications were intercepted. Bulk key theft additionally enables the intelligence agencies to unlock any previously encrypted communications they had already intercepted, but did not yet have the ability to decrypt. [...] According to one secret GCHQ slide, the British intelligence agency penetrated Gemalto’s internal networks, planting malware on several computers, giving GCHQ secret access. We “believe we have their entire network,” the slide’s author boasted about the operation against Gemalto.
half of the [Monitorama] attendees were employees and entrepreneurs at monitoring, metrics, DevOps, and server analytics companies. Most of them had a story about how their metrics API was their key intellectual property that took them years to develop. The other half of the attendees were developers at larger organizations that were rolling their own DevOps stack from a collection of open source tools. Almost all of them were creating a “time series database” with a bunch of web services code on top of some other database or just using Graphite. When everyone is repeating the same work, it’s not key intellectual property or a differentiator, it’s a barrier to entry. Not only that, it’s something that is hindering innovation in this space since everyone has to spend their first year or two getting to the point where they can start building something real. It’s like building a web company in 1998. You have to spend millions of dollars and a year building infrastructure, racking servers, and getting everything ready before you could run the application. Monitoring and analytics applications should not be like this.
Sysdig Cloud users have the ability to view and analyze Java Management Extensions (JMX) metrics out of the box with no additional configuration or setup required.
I think that at this point it is safe to assume that any SSL interception product sold by Komodia or based on the Komodia SDK is going to be using the same method. What does this mean? Well, this means that those dodgy certificates aren’t limited to Lenovo laptops sold over a specific date range. It means that anyone who has come into contact with a Komodia product, or who has had some sort of Parental Control software installed on their computer should probably check to see if they are affected.
Twitter’s mobile-device analytics service architecture, with Kafka and Storm in full Lambda-Architecture mode
The argument for the “monorepo”, i.e. lots of projects in a single Git repo. There’s lots more discussion pro/con on Twitter, e.g.: https://twitter.com/search?q=monorepo&src=typd , https://twitter.com/hivetheory/timelines/449385567982067713
Poor hardware imaging practices, basically:
It looks like all devices with the fingerprint are Dropbear SSH instances that have been deployed by Telefonica de Espana. It appears that some of their networking equipment comes setup with SSH by default, and the manufacturer decided to re-use the same operating system image across all devices.
If you are running a current kernel r273872 or later, please upgrade your kernel to r278907 or later immediately and regenerate keys. I discovered an issue where the new framework code was not calling randomdev_init_reader, which means that read_random(9) was not returning good random data. This means most/all keys generated may be predictable and must be regenerated.
Lots of good advice here for dealing with SSDs
hmmm, very interesting: the super-low-latency Zing JVM is available as a commercial EC2 instance type, at a cost lower than the EC2 instance price
‘A great compendium by @harper of techniques for handling trolls and griefers in online communities’, via kragen
Simon McGarr says: ‘80% of S.Korea’s population have had their ID number stolen, crimewave ongoing. >> Turns out a pot of honey is sweet’
Have you ever made a phone call, sent an email, or, you know, used the internet? Of course you have! Chances are, at some point over the past decade, your communications were swept up by the U.S. National Security Agency. The NSA then shares information with the UK Government’s intelligence agency GCHQ by default. A recent court ruling found that this sharing was unlawful. But no one could find out if their records were collected and then illegally shared between these two agencies… until now! Because of our recent victory against the UK intelligence agency in court, now anyone in the world — yes, ANYONE, including you — can find out if GCHQ illegally received information about you from the NSA. Join our campaign by entering your details below to find out if GCHQ illegally spied on you, and confirm via the email we send you. We’ll then go to court demanding that they finally come clean on unlawful surveillance.
‘“Equation Group” ran the most advanced hacking operation ever uncovered.’ Mad stuff. The security industry totally failed here
decent set of intro slides
Another Spark intro blog post
‘JOL (Java Object Layout) is the tiny toolbox to analyze object layout schemes in JVMs. These tools are using Unsafe, JVMTI, and Serviceability Agent (SA) heavily to decode the actual object layout, footprint, and references. This makes JOL much more accurate than other tools relying on heap dumps, specification assumptions, etc.’ Recommended by Nitsan Wakart, looks pretty useful for JVM devs
An excellent intro to HdrHistogram usage
Butterfield insists that Slack improves on the basic messaging functionality offered by its predecessors. The company plans to expand from 100 employees to 250 this year, open an office in Dublin, and launch a version that supports large companies with multiple teams.
A tool for managing Apache Kafka. It supports the following : Manage multiple clusters; Easy inspection of cluster state (topics, brokers, replica distribution, partition distribution); Run preferred replica election; Generate partition assignments (based on current state of cluster); Run reassignment of partition (based on generated assignments)
Vaurien is basically a Chaos Monkey for your TCP connections. Vaurien acts as a proxy between your application and any backend. You can use it in your functional tests or even on a real deployment through the command-line. Vaurien is a TCP proxy that simply reads data sent to it and passes it to a backend, and vice-versa. It has built-in protocols: TCP, HTTP, Redis & Memcache. The TCP protocol is the default one and just sucks data on both sides and passes it along. Having higher-level protocols is mandatory in some cases, when Vaurien needs to read a specific amount of data in the sockets, or when you need to be aware of the kind of response you’re waiting for, and so on. Vaurien also has behaviors. A behavior is a class that’s going to be invoked every time Vaurien proxies a request. That’s how you can impact the behavior of the proxy. For instance, adding a delay or degrading the response can be implemented in a behavior. Both protocols and behaviors are plugins, allowing you to extend Vaurien by adding new ones. Last (but not least), Vaurien provides a couple of APIs you can use to change the behavior of the proxy live. That’s handy when you are doing functional tests against your server: you can for instance start to add big delays and see how your web application reacts.
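The behavior-plugin idea is easy to picture with a toy in-memory proxy. To be clear, none of this is Vaurien's real API: the `Proxy`, `Delay` and `Truncate` classes, the `on_request` method and the `echo_backend` function are all hypothetical names, and there are no sockets involved.

```python
import time


class Delay:
    """Hypothetical behavior: wait before the data is passed along."""

    def __init__(self, seconds):
        self.seconds = seconds

    def on_request(self, data):
        time.sleep(self.seconds)
        return data


class Truncate:
    """Hypothetical behavior: degrade the payload by cutting it short."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes

    def on_request(self, data):
        return data[: self.max_bytes]


class Proxy:
    """Passes data to a backend callable, running the current behavior first."""

    def __init__(self, backend, behavior=None):
        self.backend = backend
        self.behavior = behavior  # can be swapped while the proxy is in use

    def handle(self, data):
        if self.behavior is not None:
            data = self.behavior.on_request(data)
        return self.backend(data)


def echo_backend(data):
    """Stand-in backend: upper-cases whatever it receives."""
    return data.upper()


proxy = Proxy(echo_backend)
normal = proxy.handle(b"hello world")

# Swap the behavior "live", as a test harness might do mid-run.
proxy.behavior = Truncate(max_bytes=5)
degraded = proxy.handle(b"hello world")
```

The point of the plugin split is that the test only swaps `proxy.behavior`; the application under test and the backend stay untouched.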
‘If it works, a copy of Burgertime for DOS is now in your browser, clickable from my entry. If it doesn’t… well, no Burgertime for you. (Unless you visit the page.) There’s a “share this” link in the new archive.org interface for sharing these in-browser emulations in web pages, weblogs and who knows what else.’
According to a report posted Thursday to the website of the state-run China Youth Daily, the Cyberspace Administration of China choral group this week unveiled a new song, “Cyberspace Spirit,” glorifying the cleanliness and clarity of China’s uniquely managed Internet. The song, an orchestral march built around a chorus that proclaims China’s ambition to become an “Internet power,” opens with lyrics describing celestial bodies keeping careful watch over the sky. From there, the lyrics conjure more vivid imagery, comparing the Internet to “a beam of incorruptible sunlight” that unites “the powers of life from all creation.”
Amazingly shitty. Never buying a Samsung TV if this is what they think is acceptable
The testers at [MAJOR PUBLISHER] had just finished wrapping up testing on a project we’ll call “Biolands.” And to congratulate them, the man in charge arranged a huge bowling/pizza party for the end of the week. Of course everyone is hyped for the event. So the day finally arrives and all the testers show up. They all start bowling and eating pizza. After a few hours of everyone enjoying themselves, the VP asks for everyone’s attention. When he does manage to get the team to listen, he begins to thank them for their hard work and has the leads hand them their termination papers. And many other horror stories from the worst software industry of all: games.
While my friends were getting sucked into “swiping” all day on their phones with Tinder, I eventually got fed up and designed a piece of software that automates everything on Tinder. This is awesome. (via waxy)
Our latest open source release from Swrve Labs: an Apache-licensed, SLF4J-compatible, simple, fluent API for rate-limited logging in Java: ‘A RateLimitedLog object tracks the rate of log message emission, imposes an internal rate limit, and will efficiently suppress logging if this is exceeded. When a log is suppressed, at the end of the limit period, another log message is output indicating how many log lines were suppressed. This style of rate limiting is the same as the one used by UNIX syslog; this means it should be comprehensible, easy to predict, and familiar to many users, unlike more complex adaptive rate limits.’ We’ve been using this in production for months — it’s pretty nifty ;) Never fear your logs again!
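The suppress-then-summarise style described above can be sketched in Python. This is not the library's Java API: the `RateLimitedLogSketch` class, its constructor parameters and the injected `clock` are assumptions for illustration. One simplification to note: the summary line is emitted lazily on the first call after the period has ended, not by a timer.

```python
class RateLimitedLogSketch:
    """Hypothetical sketch: allow `max_per_period` lines per period, summarise the rest."""

    def __init__(self, emit, max_per_period, period_seconds, clock):
        self.emit = emit                      # callable that actually writes a line
        self.max_per_period = max_per_period
        self.period_seconds = period_seconds
        self.clock = clock                    # callable returning the time in seconds
        self.period_start = clock()
        self.count = 0

    def _roll_period(self):
        """Start a new period if the old one expired, reporting what was dropped."""
        now = self.clock()
        if now - self.period_start >= self.period_seconds:
            suppressed = self.count - self.max_per_period
            if suppressed > 0:
                self.emit(f"(suppressed {suppressed} log lines)")
            self.period_start = now
            self.count = 0

    def log(self, message):
        self._roll_period()
        self.count += 1
        if self.count <= self.max_per_period:
            self.emit(message)


# Fake clock so the example is deterministic.
now = [0.0]
lines = []
log = RateLimitedLogSketch(lines.append, max_per_period=2, period_seconds=10,
                           clock=lambda: now[0])
for i in range(5):
    log.log(f"error {i}")
now[0] = 11.0
log.log("error 5")
```

Injecting the clock keeps the sketch testable without sleeping; a real logger would more likely read a monotonic system clock.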
Retro console emulation! Mario Kart and Ocarina of Time and Conker’s Bad Fur Day! Nobody actually builds stuff with the Raspberry Pi, it’s just an odd form of nostalgic consumerism wrapped up in a faddish ‘making’ trend! The original Raspberry Pi saw a lot of emulator use, but it was limited: the Pi 1 could handle the NES, SNES, Genesis/Mega Drive, and other earlier consoles with ease. Emulator performance for N64 and original Playstation games was just barely unplayable. Now, the Raspi 2 can easily handle N64 and PSX games. [HoZyVN] tried out N64’s Mario Kart and PSX’s Spyro the Dragon. They’re playable, and an entire generation rushed out to Microcenter to relive their glory days of sitting with their faces embedded in a console television drinking Sunny D all day.
“traditional ML techniques are accurate (95%–99%) in detection but can be highly vulnerable to adversarial attacks”. ain’t that the truth
Nice looking static code validation tool for Java, from Google. I recognise a few of these errors ;)
Nathan Barley was scarcely less prophetic when it came to TV itself. In one episode Nathan’s friend Claire makes a comically po-faced, self-righteous but secretly rather narcissistic documentary about a choir made up of drug addicts. Nine years later, Channel 4 made Addicts’ Symphony for real.
the whole story of GMaps
Good DynamoDB real-world experience post, via Mitch Garnaat. We should write up ours, although it’s pretty scary-stuff-free by comparison
Mad stuff. The South Korean National Intelligence Service directly interfering in a democratic election by posting fake comments and rigging online polls
Simon McGarr has a theory — the indefinite data retention of sensitive data on primary schoolchildren actually has a genesis in the Irish state wishing to protect itself against prosecution from future child abuse cases
The regime that governs the sharing between Britain and the US of electronic communications intercepted in bulk was unlawful until last year, a secretive UK tribunal has ruled. The Investigatory Powers Tribunal (IPT) declared on Friday that regulations covering access by Britain’s GCHQ to emails and phone records intercepted by the US National Security Agency (NSA) breached human rights law.
Digital Rights Europe, Wednesday, April 15th in Dublin. deadly!
‘We suck at dealing with abuse and trolls on the platform and we’ve sucked at it for years. It’s no secret and the rest of the world talks about it every day. We lose core user after core user by not addressing simple trolling issues that they face every day. I’m frankly ashamed of how poorly we’ve dealt with this issue during my tenure as CEO. It’s absurd. There’s no excuse for it. I take full responsibility for not being more aggressive on this front. It’s nobody else’s fault but mine, and it’s embarrassing. We’re going to start kicking these people off right and left and making sure that when they issue their ridiculous attacks, nobody hears them. Everybody on the leadership team knows this is vital.’ More like this!
nice deep-dive from Adrian Colyer
Excellent post — Delta sounds like a very well-designed product
Today sees the publication of a report I [Ross Anderson] helped to write for the Nuffield Bioethics Council on what happens to medical ethics in a world of cloud-based medical records and pervasive genomics. As the information we gave to our doctors in private to help them treat us is now collected and treated as an industrial raw material, there has been scandal after scandal. From failures of anonymisation through unethical sales to the care.data catastrophe, things just seem to get worse. Where is it all going, and what must a medical data user do to behave ethically? We put forward four principles. First, respect persons; do not treat their confidential data like it were coal or bauxite. Second, respect established human-rights and data-protection law, rather than trying to find ways round it. Third, consult people who’ll be affected or who have morally relevant interests. And fourth, tell them what you’ve done – including errors and security breaches.
A good overview — I like the summary table. tl;dr:
If you are light on DevOps and not latency sensitive use SQS for job management and Kinesis for event stream processing. If latency is an issue, use ELB or 2 RabbitMQs (or 2 beanstalkds) for job management and Redis for event stream processing.
Al Tobey does some trial runs of -XX:+AlwaysPreTouch and -XX:+UseHugePages
ahh, interesting! This looks much easier (via JBaruch)
Marc Brooker: ‘When it comes to building working software in the long term, the emotional pursuit of craft is not as important as the human pursuit of teamwork, or the intellectual pursuit of correctness. Patterns is one of the most powerful ideas we have. The critics may be right that it devalues the craft, but we would all do well to remember that the craft of software is a means, not an end.’
Via Walter, the best description of the appeal of Minecraft I’ve read:
Minecraft is exceptionally good at intrinsic narrative. It recognises, preserves and rewards everything you do. It presses you to play frontiersman. A Minecraft world ends up dotted with torchlit paths, menhirs, landmarks, emergency caches. Here’s the hole where you dug stone for your first house. Here’s the causeway you built from your spawn point to a handy woodland. Here’s the crater in the landscape where the exploding monster took out you and your wheatfield at once. And, of course, here’s your enormous castle above a waterfall. There’s no utility in building anything bigger than a hut, but the temptations of architecture are irresistible. Minecraft isn’t so much a world generator as a screenshot-generator and a war-story generator. This is what will get the game the bulk of its critical attention, and deservedly so. That’s why I want to call attention to the extrinsic narrative. It’s minimal, implicit, accidental and very powerful. It’s this: you wake alone beside an endless sea in a pristine, infinite wilderness. The world is yours. You can literally sculpt mountains, with time and effort. You’ll die and be reborn on the beach where you woke first. You’ll walk across the world forever and never see another face. You can build a whole empire of roads and palaces and beacon towers, and the population of that empire will only ever be you. When you leave, your towers will stand empty forever. I haven’t seen that surfaced in a game before. It’s strong wine.
whoa, this is pretty excellent. The major improvement over a graphite-based system would be the multi-dimensional tagging of metrics, which we currently have to do by simply expanding the graphite metric’s name to encompass all those dimensions and use searching at query time, inefficiently.
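The difference between the two models can be sketched like this. The metric names, tag keys and values are all invented for the example, and neither helper corresponds to a real Graphite or tagged-metrics API.

```python
import fnmatch

# Graphite-style: every dimension is baked into the dotted metric name.
graphite_metrics = {
    "web.eu-west.host1.requests": 120,
    "web.eu-west.host2.requests": 80,
    "web.us-east.host3.requests": 200,
}

# Tagged style: one metric name plus a dictionary of dimensions.
tagged_metrics = [
    ("requests", {"service": "web", "region": "eu-west", "host": "host1"}, 120),
    ("requests", {"service": "web", "region": "eu-west", "host": "host2"}, 80),
    ("requests", {"service": "web", "region": "us-east", "host": "host3"}, 200),
]


def graphite_sum(pattern):
    """Sum values whose full dotted name matches a wildcard pattern."""
    return sum(value for name, value in graphite_metrics.items()
               if fnmatch.fnmatchcase(name, pattern))


def tagged_sum(metric, **tags):
    """Sum values for a metric where every requested tag matches."""
    return sum(value for name, metric_tags, value in tagged_metrics
               if name == metric
               and all(metric_tags.get(k) == v for k, v in tags.items()))


eu_by_name = graphite_sum("web.eu-west.*.requests")
eu_by_tag = tagged_sum("requests", region="eu-west")
```

Both give the same answer, but the name-based query only works if the caller knows the position of each dimension in the dotted name, and adding a new dimension later means renaming every metric.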
Good example of a clean java OSS release, from Soundcloud. will be copying bits of this myself soon…
A good set of basic, controversy-free guidelines for clean java code style
from 1946 to present
According to a major new study in the journal ‘Pediatrics’, trying to [persuade anti-vaxxers to vaccinate] may actually make the problem worse. The paper tested the effectiveness of four separate pro-vaccine messages, three of which were based very closely on how the Centers for Disease Control and Prevention (CDC) itself talks about vaccines. The results can only be called grim: Not a single one of the messages was successful when it came to increasing parents’ professed intent to vaccinate their children. And in several cases the messages actually backfired, either increasing the ill-founded belief that vaccines cause autism or even, in one case, apparently reducing parents’ intent to vaccinate.
“dysaguria” is the perfect noun, and “dysagurian” is the perfect adjective, to describe the eponymous company in Dave Eggers’ The Circle. It’s not in the same league as Orwell, or Huxley, or Bradbury, or Burgess. But it does raise very important questions about what could possibly go wrong if one company controlled all the world’s information. In the novel, the company operates according to the motto “all that happens must be known”; and one of its bosses, Eamon Bailey, encourages everywoman employee Mae Holland to live an always-on (clear, transparent) life according to the maxims “secrets are lies”, “sharing is caring”, and “privacy is theft”. Eggers’s debts to dystopian fiction are apparent. But, whereas writers like Orwell, Huxley, Bradbury, and Burgess were concerned with totalitarian states, Eggers is concerned with a totalitarian company. However, the noun “dystopia” and the adjective “dystopian” – perfect though they are for the terror of military/security authoritarianism in 1984, or Brave New World, or Fahrenheit 451, or A Clockwork Orange – do not to my mind encapsulate the nightmare of industrial/corporate tyranny in The Circle. On the other hand, “dysaguria” as a noun and “dysagurian” as an adjective, in my view really do capture the essence of that “frightening company”.
Via negatendo: ‘I would like to share my excitement about the fact that after almost a year of development, an instance of my NetHack bot has finally managed to ascend a game for the first time without human interventions, wizard mode cheats or bones stuffing, and did so at the public server at acehack.de.’ The bot is written in Clojure. Apparently ‘pudding farming’ did the trick…
League of Legends has set up private network links to a variety of major US ISPs to avoid internet weather (via Nelson)
Because there exists no method known to man, more terribly suited to expose the cosmic meaninglessness of existence than pairing the words of H.P. Lovecraft with seemingly delightful and charming pictures of adorable kittens.
These are very good — bookmarking for the next time I’m using gdb, probably about 3 years from now
For years, we’ve been working on a strategy to end mass surveillance of digital communications of innocent people worldwide. Today we’re laying out the plan, so you can understand how all the pieces fit together—that is, how U.S. advocacy and policy efforts connect to the international fight and vice versa. Decide for yourself where you can get involved to make the biggest difference. This plan isn’t for the next two weeks or three months. It’s a multi-year battle that may need to be revised many times as we better understand the tools and authorities of entities engaged in mass surveillance and as more disclosures by whistleblowers help shine light on surveillance abuses.
This group aims to consolidate opposition, give clear information and support letter writing and information awareness against the Dept. of Education’s Primary Online Database.
Fraud in Apple Pay will, in time, come to be managed – but the fact that easily available PII can waylay best in class protection should give us all pause.
Fred Logue notes how this failed Mayo TD Michelle Mulherin:
From recent reports it now appears that the Department of Education is discussing anonymisation of the Primary Online Database with the Data Protection Commissioner. Well someone should ask Mayo TD Michelle Mulherin how anonymisation is working for her. The Sunday Times reports that Ms Mulherin was the only TD in the Irish parliament on the dates when expensive phone calls were made to a mobile number in Kenya. The details of the calls were released under the Freedom of Information Act in an “anonymised” database. While it must be said the fact that Ms Mulherin was the only TD present on those occasions does not prove she made the calls – the reporting in the press is now raising the possibility that it was her. From a data protection point of view this is a perfect example of the difficulty with anonymisation. Data protection rules apply to personal data which is defined as data relating to a living individual who is or can be identified from the data or from the data in conjunction with other information. Anonymisation is often cited as a means for processing data outside the scope of data protection law but as Ms Mulherin has discovered individuals can be identified using supposedly anonymised data when analysed in conjunction with other data. In the case of the mysterious calls to Kenya even though the released information was “anonymised” to protect the privacy of public representatives, the phone log used in combination with the attendance record of public representatives and information on social media was sufficient to identify individuals and at least raise evidence of association between individuals and certain phone calls. While this may be well and good in terms of accounting for abuses of the phone service it also has worrying implications for the ability of public representatives to conduct their business in private. The bottom line is that anonymisation is very difficult if not impossible as Ms Mulherin has learned to her cost.
It certainly is a lot more complex than simply removing names and other identifying features from a single dataset. The more data that there is and the more diverse the sources the greater the risk that individuals can be identified from supposedly anonymised datasets.
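The linkage described above (a name-free call log cross-referenced with an attendance record) can be sketched in a few lines. This is only an illustration: every name, date, and record below is invented, and the `candidates` helper is hypothetical rather than anything used in the reporting.

```python
# All names, dates, and records here are made up for illustration.

# The "anonymised" release: call dates and destinations, no caller names.
call_log = [
    {"date": "2014-03-01", "destination": "Kenya"},
    {"date": "2014-03-08", "destination": "Kenya"},
    {"date": "2014-03-15", "destination": "Kenya"},
]

# A second, separately published dataset: who was present on each date.
attendance = {
    "2014-03-01": {"TD A", "TD B", "TD C"},
    "2014-03-08": {"TD A", "TD D"},
    "2014-03-15": {"TD A", "TD B"},
}


def candidates(call_log, attendance):
    """Return the people present on every date on which a call was made."""
    present_sets = [attendance.get(call["date"], set()) for call in call_log]
    if not present_sets:
        return set()
    return set.intersection(*present_sets)


print(candidates(call_log, attendance))
```

Neither dataset names a caller on its own, but intersecting the attendance sets across the call dates leaves a single candidate. As the quote stresses, that is evidence of association, not proof of who made the calls.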
Nice wrapper for ‘tc’ and ‘netem’, for network latency/packet loss emulation
ohhhh this is very nice indeed. Great viz!
538 apply their numbercrunching skills to the BoardGameGeek ratings index
Pretty amazing specs for a 33 quid SBC.
Amlogic ARM® Cortex®-A5(ARMv7) 1.5GHz quad core CPUs * Mali™-450 MP2 GPU (OpenGL ES 2.0/1.1 enabled for Linux and Android) * 1Gbyte DDR3 SDRAM * Gigabit Ethernet * 40pin GPIOs * eMMC4.5 HS200 Flash Storage slot / UHS-1 SDR50 MicroSD Card slot * USB 2.0 Host x 4, USB OTG x 1 * Infrared(IR) Receiver * Uses Ubuntu 14.04 or Android KitKat operating systems. Includes HDMI out. (via Conor O’Neill)
good description of the process
A bot created by a group of artists spent the last few months selecting items at random from a Silk Road-style darknet marketplace, buying them with Bitcoin, and having them shipped to a gallery in Switzerland. After it bought some ecstasy pills and a counterfeit passport, we asked: How will authorities deal with the complex legal and moral issue of a piece of artificial intelligence breaking the law? It turns out, the answer was simple: just arrest the computer.
Java Concurrency Tools for the JVM. This project aims to offer some concurrent data structures currently missing from the JDK: bounded lock-free queues; SPSC/MPSC/SPMC/MPMC variations for concurrent queues; alternative interfaces for queues (experimental); offheap concurrent ring buffer for ITC/IPC purposes (experimental); Executor (planned).
Good, and very accessible even for FP noobs like myself ;)
Great slide deck from Elasticsearch on JVM/dist-sys performance optimization
Nice trick: wrap servers with a libc wrapper to intercept bind(2) and accept(2) calls, so that transparent restarts become possible
This is spot on —
By flooding the system with false positives, big-data approaches to counterterrorism might actually make it harder to identify real terrorists before they act. Two years before the Boston Marathon bombing, Tamerlan Tsarnaev, the older of the two brothers alleged to have committed the attack, was assessed by the city’s Joint Terrorism Task Force. They determined that he was not a threat. This was one of about a thousand assessments that the Boston J.T.T.F. conducted that year, a number that had nearly doubled in the previous two years, according to the Boston F.B.I. As of 2013, the Justice Department has trained nearly three hundred thousand law-enforcement officers in how to file “suspicious-activity reports.” In 2010, a central database held about three thousand of these reports; by 2012 it had grown to almost twenty-eight thousand. “The bigger haystack makes it harder to find the needle,” Sensenbrenner told me. Thomas Drake, a former N.S.A. executive and whistle-blower who has become one of the agency’s most vocal critics, told me, “If you target everything, there’s no target.”
‘All deleted tweets from politicians’. Great idea
The Youtube music service was introduced to me as a win win and they don’t understand why I don’t see it that way. “We are trying to create a new revenue stream on top of the platform that exists today.” A lot of people in the music industry talk about Google as evil. I don’t think they are evil. I think they, like other tech companies, are just idealistic in a way that works best for them. I think this because I used to be one of them. The people who work at Google, Facebook, etc can’t imagine how everything they make is not, like, totally awesome. If it’s not awesome for you it’s because you just don’t understand it yet and you’ll come around. They can’t imagine scenarios outside their reality and that is how they inadvertently unleash things like the algorithmic cruelty of Facebook’s yearly review (which showed me a picture I had posted after a doctor told me my husband had 6-8 weeks to live).
Jacobin Magazine on the revolutionary political allegory in “Snowpiercer”: ‘If Snowpiercer had merely told the tale of an oppressed working class rising up to seize power from an evil overlord, it would already have been an improvement over most of the political messages in mainstream cinema. There are all sorts of nice touches in its portrayal of a declining capitalism that can maintain its ideological legitimacy even when it literally has no more bullets in its guns. But the story Bong tells goes beyond that. It’s about the limitations of a revolution which merely takes over the existing social machinery rather than attempting to transcend it.’
A great resource bookmark from Falkvinge.
There are at least four good reasons to reject this argument solidly and uncompromisingly: The rules may change, it’s not you who determine if you’re guilty, laws must be broken for society to progress, and privacy is a basic human need.
‘Reasons abound for international entrepreneurs and top technical talent to stay away from Silicon Valley and build their startup somewhere else.’ Strongly agreed. This factoid is particularly nuts: ‘As Balaji Srinivasan of a16z has observed, roughly 50%+ of the capital allocated for early stage tech investments is actually flowing into Bay Area real estate, directly through office rentals and indirectly via home rentals as a primary driver of skyrocketing salaries.’
A much better carbon-relay, written in C rather than Python. Linking as we’ve been using it in production for quite a while with no problems.
The main reason to build a replacement is performance and configurability. Carbon is single threaded, and sending metrics to multiple consistent-hash clusters requires chaining of relays. This project provides a multithreaded relay which can address multiple targets and clusters for each and every metric based on pattern matches.
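The routing idea in that description (each metric is matched against patterns and sent to one node in every matching consistent-hash cluster) can be sketched roughly as follows. This is a toy model only: the cluster names, addresses, and rules are hypothetical, and neither the hash function nor the rule format is claimed to match what the relay actually uses.

```python
import bisect
import hashlib
import re


class HashRing:
    """Toy consistent-hash ring (illustrative; not the relay's real hashing)."""

    def __init__(self, nodes, replicas=100):
        # Place several virtual points per node on the ring to even out load.
        self._ring = sorted(
            (self._hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, metric):
        # First ring point clockwise from the metric's hash, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(metric)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical clusters and match rules, invented for this sketch.
CLUSTERS = {
    "main": HashRing(["10.0.0.1:2003", "10.0.0.2:2003", "10.0.0.3:2003"]),
    "archive": HashRing(["10.0.1.1:2003", "10.0.1.2:2003"]),
}
RULES = [
    (re.compile(r"^servers\."), "main"),
    (re.compile(r"\.latency\."), "archive"),
]


def route(metric):
    """Return one (cluster, node) destination per rule that matches the metric."""
    return [
        (name, CLUSTERS[name].node_for(metric))
        for pattern, name in RULES
        if pattern.search(metric)
    ]


print(route("servers.web1.latency.p99"))  # matches both rules
print(route("app.cpu.user"))              # matches none
```

A metric that matches several rules is fanned out to several clusters in one pass, which is the case that would otherwise need chained relays.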
Blanket surveillance of social media is not the solution to combating terrorism and the rights of the individual to privacy must be protected, Data Protection Minister Dara Murphy said on Monday. [He] said Ireland and the European Union must protect the privacy rights of individuals on social media. “Freedom of expression, freedom of movement, and the protection of privacy are core tenets of the European Union, which must be upheld.”
‘Here’s a story for you. I’m not a party to any of this. I’ve done nothing wrong, I’ve never been suspected of doing anything wrong, and I don’t know anyone who has done anything wrong. I don’t even mean that in the sense of “I pissed off the wrong people but technically haven’t been charged.” I mean that I am a vanilla, average, 9-5 working man of no interest to anybody. My geographical location is an accident of my birth. Even still, I wasn’t accidentally born in a high-conflict area, and my government is not at war. I’m a sysadmin at a legitimate ISP and my job is to keep the internet up and running smoothly. This agency has stalked me in my personal life, undermined my ability to trust my friends attempting to connect with me on LinkedIn, and infected my family’s computer. They did this because they wanted to bypass legal channels and spy on a customer who pays for services from my employer. Wait, no, they wanted the ability to potentially spy on future customers. Actually, that is still not accurate – they wanted to spy on everybody in case there was a potentially bad person interacting with a customer. After seeing their complete disregard for anybody else, their immense resources, and their extremely sophisticated exploits and backdoors – knowing they will stop at nothing, and knowing that I was personally targeted – I’ll be damned if I can ever trust any electronic device I own ever again. You all rationalize this by telling me that it “isn’t surprising”, and that I don’t live in the [USA,UK] and therefore I have no rights. I just have one question. Are you people even human?’
‘Broadly, they are satisfied with what we are doing’ versus: ‘We have deep concerns about the Eircode initiative… We want to state clearly that we are not at all ‘satisfied’ with the postcode that has been designed or the implementation proposals.’
The young women interns [in one story in this post] worked in a very different way. As I explored their notes, I noticed that ideas were expanded upon, not abandoned. Challenges were identified, but the male language so often heard in Silicon Valley conference rooms – “Well, let me tell you what the problem with that idea is….” – was not in the room. These young women, without men to define the “appropriate business behavior,” used different behaviors and came up with a startling and valuable solution. They showed many of the values that exist outside of dominance-based leadership: strategic thinking, intuition, nurturing and relationship building, values-based decision-making and acceptance of other’s input. Women need space to be themselves at work. Until people who have created their success by worshipping at the temple of male behavior, like Sheryl Sandberg, learn to value alternate behaviors, the working world will remain a foreign and hostile culture to women. And if we do not continuously work to build corporate cultures where there is room for other behaviors, women will be cast from or abandoned in a world not of our making, where we continuously “just do not fit in,” but where we still must go to earn our livings.
Heh, nice trolling.
Here are two helpful guidelines (for largely disjoint populations): If you are going to use a big data system for yourself, see if it is faster than your laptop. If you are going to build a big data system for others, see that it is faster than my laptop. [...] We think everyone should have to do this, because it leads to better systems and better research.
Give them the power, they’ll use that power. ‘A document obtained under Freedom of Information legislation confirms the BBC’s use of RIPA in Northern Ireland. It states: “The BBC may, in certain circumstances, authorise under the Regulation of Investigatory Powers Act 2000 and Regulation of Investigatory Powers (British Broadcasting Corporation) Order 2001 the lawful use of detection equipment to detect unlicensed use of television receivers… the BBC has used detection authorised under this legislation in Northern Ireland.”’
Researchers are warned off [discussing] 512-bits-plus key lengths, systems “designed or modified to perform cryptanalytic functions”, or “designed or modified to use ‘quantum cryptography’”. [....] “an email to a fellow academic could land you a 10 year prison sentence”. https://twitter.com/_miw/status/556023024009224192 notes ‘the DSGL 5A002 defines it as >512bit RSA, >512bit DH, >112 bit ECC and >56 bit symmetric ciphers; weak as fuck i say.’
I drive a Toyota, and this is scary stuff. Critical software systems need to be coded with care, and this isn’t it — they don’t even have a bug tracking system!
Investigations into potential causes of Unintended Acceleration (UA) for Toyota vehicles have made news several times in the past few years. Some blame has been placed on floor mats and sticky throttle pedals. But, a jury trial verdict was based on expert opinions that defects in Toyota’s Electronic Throttle Control System (ETCS) software and safety architecture caused a fatal mishap. This talk will outline key events in the still-ongoing Toyota UA litigation process, and pull together the technical issues that were discovered by NASA and other experts. The results paint a picture that should inform future designers of safety critical software in automobiles and other systems.
“We have spoken to the National Consumer Agency, logistics companies and Digital Rights Ireland, with which we have had an in-depth conversation to see if there is anything in the proposal that might be considered to have an impact on anyone’s privacy. Broadly, they are satisfied with what we are doing,” [Patricia Cronin, head of the Department of Communications’ postcodes division] told the committee. However in his letter, [DRI's] O’Lachtnain said the group “want to state clearly that we are not at all ‘satisfied’ with the postcode that has been designed or the implementation proposals”. Some nerve!
Today, 23andMe announced what Forbes reports is only the first of ten deals with big biotech companies: Genentech will pay up to $60 million for access to 23andMe’s data to study Parkinson’s. You think 23andMe was about selling fun DNA spit tests for $99 a pop? Nope, it’s been about selling your data all along.
Really nice time series dashboarding app. Might consider replacing graphitus with this…
This is pretty incredible.
Balzer downloaded a free software program called InVesalius, developed by a research center in Brazil to convert MRI and CT scan data to 3D images. He used it to create a 3D volume rendering from Scott’s DICOM images, which allowed him to look at the tumor from any angle. Then he uploaded the files to Sketchfab and shared them with neurosurgeons around the country in the hope of finding one who was willing to try a new type of procedure. Perhaps unsurprisingly, he found the doctor he was looking for at UPMC, where Scott had her thyroid removed. A neurosurgeon there agreed to consider a minimally invasive operation in which he would access the tumor through Scott’s left eyelid and remove it using a micro drill. Balzer had adapted the volume renderings for 3D printing and produced a few full-size models of the front section of Scott’s skull on his MakerBot. To help the surgeon vet his micro drilling idea and plan the procedure, Balzer packed up one of the models and shipped it off to Pittsburgh.
Some good advice and guidelines (although some are just silly).
The researchers started with 86,000 subjects who had filled out the 100-question personality profile – and this, of course, was done as another app on Facebook – and whose personality scores had been matched by algorithms with their Facebook likes. They then found 17,000 who were willing to have a friend or family member take the personality test on their behalf, trying to predict the answers they would give. The results, from most humans, were stunningly inaccurate. Friends, family and co-workers were all less able to predict how someone would fill out a personality test than the algorithms that had been primed with the subject’s Facebook likes. With only 10 likes to work on, the computer was more accurate than a work colleague would be. With 150 likes, it described the subject’s personality better than a parent or sibling could. And with 300 likes to work on, it was more accurate than a spouse.
One insider at a major US technology firm told the Guardian that “politicians are fond of asking why it is that tech companies don’t base themselves in the UK” … “I think if you’re saying that encryption is the problem, at a time when consumers and businesses see encryption as a very necessary part of trust online, that’s a very indicative point of view.”
ffs Apple. (Via Tony Finch)
A good reference URL to cut-and-paste when “scanning internet traffic for terrorist plots” rears its head:
This unrealistically accurate system will generate 1 billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999 percent and you’re still chasing 2,750 false alarms per day, but that will inevitably raise your false negatives, and you’re going to miss some of those 10 real plots. Also, Ben Goldacre saying the same thing: http://www.badscience.net/2009/02/datamining-would-be-lovely-if-it-worked/
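The base-rate arithmetic behind those figures is easy to check. The excerpt does not state its inputs, so this sketch assumes one trillion scanned events per year and a 1-in-100 false-positive rate for the first case; those two values are assumptions chosen because they reproduce the quoted numbers. The 10 real plots per year and the 99.9999 percent case come from the quote itself.

```python
# Assumption (not stated in the excerpt): events scanned per year.
EVENTS_PER_YEAR = 10**12
# From the quote: "those 10 real plots" per year, roughly one per month.
REAL_PLOTS_PER_YEAR = 10


def false_alarms_per_year(one_in_n):
    """False alarms per year if 1 in `one_in_n` innocent events is flagged."""
    return EVENTS_PER_YEAR // one_in_n


def false_alarms_per_day(one_in_n):
    return false_alarms_per_year(one_in_n) / 365


def false_alarms_per_real_plot(one_in_n):
    return false_alarms_per_year(one_in_n) // REAL_PLOTS_PER_YEAR


for label, one_in_n in [("99% accurate (assumed)", 100),
                        ("99.9999% accurate", 1_000_000)]:
    print(
        f"{label}: "
        f"{false_alarms_per_day(one_in_n):,.0f} false alarms per day, "
        f"{false_alarms_per_real_plot(one_in_n):,} per real plot"
    )
```

Under these assumptions the first case works out to a billion false alarms per real plot and a little over 27 million per day, and the second to a little under 2,750 per day, in line with the quote. The number of innocent events swamps the handful of real ones, however accurate the classifier is.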
The Prime Minister said today that he would stop the use of methods of communication that cannot be read by the security services even if they have a warrant. He said: “In our country, do we want to allow a means of communication between people which […] we cannot read?” He made the connection between encrypted communications tools and letters and phone conversations, both of which can be read by security services in extreme situations and with a warrant from the home secretary. Is this key escrow for the UK?
this is a great shopping list ;)
Lots and lots of good detail into the Spotify C* setup (via Bill de hOra)
The report’s revelations, based on a survey of nearly 800 writers worldwide, are alarming. Concern about surveillance is now nearly as high among writers living in democracies (75%) as among those living in non-democracies (80%). The levels of self-censorship reported by writers living in democratic countries are approaching the levels reported by writers living in authoritarian or semi-democratic countries.
the urgency of repealing the Irish blasphemy legislation cannot now be overstated. The same cartoons that saw their authors murdered for blasphemy recently, would see Irish authors hauled before our courts. The same nations that execute their citizens for blasphemy, wish to promote the wording of the Irish blasphemy legislation through the UN, in order to expand such provisions to more countries. Ireland is the only European country to recently introduce a new blasphemy law. Following the horrific recent events in Paris, let us be the next country to repeal our blasphemy laws.
If you haven’t heard about it, it is a compulsory database of the personal information of children, including PPS numbers, ethnicity, race and language skills, to be held for decades and shared across State agencies.
What if Silicon Valley had emerged from a racially integrated community? Would the technology industry be different? Would we? And what can the technology industry do now to avoid repeating the mistakes of the past? Amazing article: this is the best thing I’ve ever read on TechCrunch, the political history of race in Silicon Valley and East Palo Alto.
All of our assets loaded via the CDN [to our client in Australia] in just under 5 seconds. It only took ~2.7s to get those same assets to our friends down under with SPDY. The performance with no CDN blew the CDN performance out of the water. It is just no comparison. In our case, it really seems that the advantages of SPDY greatly outweigh that of a CDN when it comes to speed.