‘I built Time Warp to preserve important backup snapshots and prevent Time Machine from deleting them. I use it to modify Time Machine’s backup behavior using weighted reservoir sampling.’ via Aman. Nifty!
Amazing. Massive nanny-stateism of the ‘something must be done’ variety, with a 100% false-alarm hit rate, and it’s now policy.
‘Nominet have made a decision, based on a report by Lord Macdonald QC, that recommends that they check any domain registration that signals sex crime content or is in itself a sex crime. This is screening of domains within 48 hours of registration, and de-registration. The report says that such domains should be reported to the police.’ [....] ‘The report itself states [...] that in 2013 Nominet checked domains for key words used by the IWF, and as a result reported tens of thousands of domains to IWF for checking, all of which were false positives. Not one was, in fact, related to child sex abuse.’
from Ilya Grigorik. nginx version here: http://www.igvita.com/2013/12/16/optimizing-nginx-tls-time-to-first-byte/
A common error when using the Metrics library is to record Timer metrics on things like API calls, using the default settings, then to publish those to a time-series store like Graphite. Here’s why this is a problem.
By default, a Timer uses an Exponentially Decaying Reservoir. The docs say:
‘A histogram with an exponentially decaying reservoir produces quantiles which are representative of (roughly) the last five minutes of data. It does so by using a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike the uniform reservoir, an exponentially decaying reservoir represents recent data, allowing you to know very quickly if the distribution of the data has changed.’
This is more-or-less correct — but the key phrase is ‘roughly’. In reality, if the frequency of updates to such a timer drops off, it could take a lot longer, and if you stop updating a timer which uses this reservoir type, it’ll never decay at all. The GraphiteReporter will dutifully capture the percentiles, min, max, etc. from that timer’s reservoir every minute thereafter, and record those to Graphite using the current timestamp — even though the data it was derived from is becoming more and more ancient.
Here’s a demo. Note the long stretch of 800ms 99th-percentile latencies on the green line in the middle of this chart:
However, the blue line displays the number of events. As you can see, there were no calls to this API for that 8-hour period — this one was a test system, and the user population was safely at home, in bed. So while Graphite is claiming that there’s an 800ms latency at 7am, in reality the 800ms-latency event occurred 8 hours previously.
I observed the same thing in our production systems for various APIs with variable invocation rates; if rates dropped off during normal operation, the high-percentile latencies hung around for far longer than they should have. It’s quite misleading to be looking at a graph for 10pm and see a high 99th-percentile latency when the actual high-latency event occurred 12 hours earlier; this caused a lot of user confusion.
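To make the failure mode concrete, here’s a toy model of a forward-decaying priority reservoir. This is a simplified sketch of the idea, not the actual Metrics implementation; the size and alpha values are illustrative. The point is that a snapshot simply reads whatever samples are stored, so once updates stop, the old samples never go away:

```python
import math
import random

class ToyDecayingReservoir:
    """Minimal forward-decaying priority reservoir (illustration only)."""
    def __init__(self, size=128, alpha=0.015):
        self.size, self.alpha, self.landmark = size, alpha, 0.0
        self.samples = []  # (priority, value) pairs

    def update(self, value, now):
        # newer samples get exponentially larger weights vs. the landmark
        weight = math.exp(self.alpha * (now - self.landmark))
        priority = weight / (1.0 - random.random())  # random in (0, 1]
        self.samples.append((priority, value))
        self.samples.sort()
        del self.samples[:-self.size]  # keep only the highest priorities

    def snapshot_max(self):
        # like getSnapshot(): just reads what's stored -- no rescale,
        # no notion of how stale the stored samples are
        return max(v for _, v in self.samples)

r = ToyDecayingReservoir()
r.update(800.0, now=0.0)   # one slow 800ms call...
# ...8 idle hours later, every per-minute report still sees it:
print(r.snapshot_max())    # 800.0
```

Nothing in the snapshot path knows how old the stored samples are; decay only happens relative to *newer* updates, and with no new updates there is nothing to decay against.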
Here are some potential fixes.
- Modify ExponentiallyDecayingReservoir to also call rescaleIfNeeded() inside getSnapshot(). But based on this discussion, it appears the current behaviour is intended (at least for the mean measurement), so that may not be acceptable.
- Switch to sliding-time-window reservoirs. But those are unbounded in size, so a timer on an unexpectedly-popular API could create GC pressure and out-of-memory scenarios. It’s also the slowest reservoir type, according to the docs. That made it too risky for us to adopt in our production code as a general-purpose Timer implementation.
- What we eventually did in our code was to use this Reporter class instead of GraphiteReporter; it clears all Timer metrics’ reservoirs after each write to Graphite. This is dumb and dirty, reaching across logical class boundaries, but at the same time it’s simple and comprehensible behaviour: with this, we can guarantee that the percentile/min/max data recorded at timestamp T measures events in that timestamp’s 1-minute window, not any time before that. This is exactly what you want to see in a time-series graph like those in Graphite, so it’s a very valuable feature for our metrics, and one that others have noted to be important in comparable scenarios elsewhere.
Here’s an example of what a graph like the above should look like (captured from our current staging stack):
Note that when there are no invocations, the reported 99th-percentile latency is 0, and each measurement doesn’t stick around after its 1-minute slot.
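The clear-on-report behaviour can be sketched like this (in Python for brevity; the real class is a Java GraphiteReporter variant, and all names here are illustrative):

```python
def percentile(values, q):
    """Crude nearest-rank percentile, good enough for the demo."""
    s = sorted(values)
    return s[min(len(s) - 1, int(q * len(s)))]

class ToyTimer:
    def __init__(self): self.values = []
    def update(self, v): self.values.append(v)
    def drain(self):
        """Read-and-clear in one step."""
        vals, self.values = self.values, []
        return vals

class ClearingReporter:
    """Sketch of the clear-after-report idea: each report covers only
    events recorded since the previous report."""
    def __init__(self, timers, sink):
        self.timers, self.sink = timers, sink

    def report_once(self, timestamp):
        for name, timer in self.timers.items():
            values = timer.drain()
            if values:
                self.sink(name, timestamp, percentile(values, 0.99))
            else:
                self.sink(name, timestamp, 0.0)  # idle minute -> report 0

out = []
timers = {"api.call": ToyTimer()}
rep = ClearingReporter(timers, lambda n, t, v: out.append((t, v)))
timers["api.call"].update(800.0)
rep.report_once(1)   # minute 1: sees the 800ms event
rep.report_once(2)   # minute 2: reservoir was cleared, reports 0
print(out)           # [(1, 800.0), (2, 0.0)]
```

The slow event shows up in exactly one reporting window and nowhere else, which is the property the graphs above rely on.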
- Another potential fix, and the best of all IMO, would be to add support to Metrics so that it can use Gil Tene’s LatencyUtils package, and its HdrHistogram class, as a reservoir. This would also address some other bugs in the Exponentially Decaying Reservoir, as Gil describes:
‘In your example of a system logging 10K operations/sec with the histogram being sampled every second, you’ll be missing 9 out of each 10 actual outliers. You can have an outlier every second and think you have one roughly every 10. You can have a huge business affecting outlier happening every hour, and think that they are only occurring once a day.’
via @simonebordet, on the mechanical-sympathy list: ((c & 0x1F) + ((c >> 6) * 0x19) - 0x10)
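For context, this appears to be the branchless ASCII-hex-digit-to-value trick: the low 5 bits give the digit offset, and bit 6 distinguishes ‘0’–‘9’ from ‘A’–‘F’/‘a’–‘f’. A quick check, treating c as the character’s code point:

```python
def hexval(ch):
    # branchless ASCII hex digit -> value, as posted:
    # '0'-'9' (0x30-0x39): (c & 0x1F) is 0x10..0x19, bit 6 clear -> subtract 0x10
    # 'A'-'F'/'a'-'f':     (c & 0x1F) is 1..6, bit 6 set -> add 0x19, subtract 0x10
    c = ord(ch)
    return (c & 0x1F) + ((c >> 6) * 0x19) - 0x10

print([hexval(ch) for ch in "09AFaf"])   # [0, 9, 10, 15, 10, 15]
```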
via Ilya Grigorik: Chrome Canary now has a built-in, always-on, zero-overhead code profiler. I want this in my server-side JVMs!
from tonx. Good advice
‘The web’s only open collection of legal contracts and the best way to negotiate and sign documents online’. (via Kowalshki)
Suffice it to say that the first minute-and-a-half or so of this [speedrun] is merely an effort to spawn a specific set of sprites into the game’s Object Attribute Memory (OAM) buffer in a specific order. The TAS runner then uses a stun glitch to spawn an unused sprite into the game, which in turn causes the system to treat the sprites in that OAM buffer as raw executable code. In this case, that code has been arranged to jump to the memory location for controller data, in essence letting the user insert whatever executable program he or she wants into memory by converting the binary data for precisely ordered button presses into assembly code (interestingly, this data is entered more quickly by simulating the inputs of eight controllers plugged in through simulated multitaps on each controller port). oh. my. god. This is utterly bananas.
‘Use the mean absolute deviation [...] it corresponds to “real life” much better than the first—and to reality. In fact, whenever people make decisions after being supplied with the standard deviation number, they act as if it were the expected mean deviation.’ Graydon Hoare in turn recommends the median absolute deviation. I prefer percentiles, anyway ;)
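A quick illustration of how the three measures react to a single outlier (synthetic numbers, just to show the shape of the difference):

```python
import statistics

latencies = [10.0] * 99 + [800.0]   # 99 fast calls, one slow outlier

mean = statistics.fmean(latencies)
stdev = statistics.pstdev(latencies)
mad = statistics.fmean(abs(x - mean) for x in latencies)      # mean abs dev
median = statistics.median(latencies)
median_ad = statistics.median(abs(x - median) for x in latencies)

print(round(stdev, 1))   # 78.6: squaring lets the one outlier dominate
print(round(mad, 1))     # 15.6: a more "real life" typical deviation
print(median_ad)         # 0.0: the median version ignores the outlier entirely
```

Which is roughly the point: the standard deviation is neither what people intuit nor robust, while percentiles let you pick exactly how much of the tail you care about.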
Via Tony Finch. Funnily enough, the example describes Swrve: mobile game analytics, backed by a CRDT-based eventually consistent data store ;)
some good data (and graphs) on baby names (via Ruth)
Crowdsourcing transcription of some WWI artifacts: ‘The story of the British Army on the Western Front during the First World War is waiting to be discovered in 1.5 million pages of unit war diaries. We need your help to reveal the stories of those who fought in the global conflict that shaped the world we live in today.’ (via Luke)
massive image. very cool (via burritojustice)
Google Fonts recently switched to using the new Zopfli compression algorithm: the fonts are ~6% smaller on average, and in some cases up to 15% smaller! [...] What’s Zopfli? It’s an algorithm developed by the compression team at Google that delivers a ~3-8% bytesize improvement when compared to gzip with maximum compression. This byte saving comes at the cost of much higher encoding time, but the good news is, fonts are static files and decompression speed is exactly the same. Google Fonts pays the compression cost once and every client gets the benefit of a smaller download. If you’re curious to learn more about Zopfli: http://bit.ly/Y8DEL4
Horrific. SSDs (including “enterprise-class storage”) storing sync’d writes in volatile RAM while claiming they were synced; one device losing 72.6GB, 30% of its data, after 8 injected power faults; and all SSDs tested displayed serious errors including random bit errors, metadata corruption, serialization errors and shorn writes. Don’t trust lone unreplicated, unbacked-up SSDs!
‘Fine Gael TD for Limerick, Patrick O’Donovan has called for tougher controls on the use of open source internet browsers and payment systems which allow users to remain anonymous in the illegal trade of drugs, weapons and pornography.’ Amazing. Yes, this is real.
‘Apollo 10 had a little known incident in flight as evidenced by this transcript.’ http://pic.twitter.com/NCZy7OdxDU
As can be guessed, the higher the compression ratio, the more efficient FSE becomes compared to Huffman, since Huffman can’t break the “1 bit per symbol” limit. FSE speed is also very stable, under all probabilities. I’m quite pleased with the result, especially considering that, since the invention of arithmetic coding in the ’70s, nothing really new has been brought to this field. This is still beta stuff, so please consider this first release for testing purposes mostly. Looking forward to this making it into a production release of some form.
A bug in a scheduled OS upgrade script caused live production DB servers to be upgraded while live. Fixes include fixing that script by verifying non-liveness on the host itself, and a faster parallel MySQL binary-log recovery command.
‘Maximising Digital Creativity, Sharing and Innovation’, Event organised by Creative Commons Ireland and Faculty of Law, University College Cork, Lecture Theatre, National Gallery of Ireland, Clare Street entrance, Dublin 2, Friday 17 January 2014, 9.45 a.m. to 1 p.m. (via Darius Whelan)
I understand, to a point, where the anti-vaccine parents are coming from. Back in the ’90s, when I was a concerned, 19-year-old mother, frightened by the world I was bringing my child into, I was studying homeopathy, herbalism, and aromatherapy; I believed in angels, witchcraft, clairvoyants, crop circles, aliens at Nazca, giant ginger mariners spreading their knowledge to the Aztecs, the Incas, and the Egyptians, and that I was somehow personally blessed by the Holy Spirit with healing abilities. I was having my aura read at a hefty price and filtering the fluoride out of my water. I was choosing to have past life regressions instead of taking antidepressants. I was taking my daily advice from tarot cards. I grew all my own veg and made my own herbal remedies. I was so freaking crunchy that I literally crumbled. It was only when I took control of those paranoid thoughts and fears about the world around me and became an objective critical thinker that I got well. It was when I stopped taking sugar pills for everything and started seeing medical professionals that I began to thrive physically and mentally.
Last week, a private space exploration company called Mars One announced that it has shortlisted 1,058 people from 200,000 applicants who wanted to travel to Mars. Roche is the only Irishman on the list. The catch? If he goes, he can never come back. Mad stuff. He works at the Science Gallery, so he’s a co-worker of a friend, to boot.
Specifically, unanonymised, confidential, patient-identifying data, for purposes of “admin, healthcare planning, and research”, to be held indefinitely, via the HSCIC. Opt-outs may be requested, however
a John-Looney-recommended MoCA adapter, allowing legacy coax home wiring to be used to transmit ethernet
An important point:
As scarily impressive as [NSA's TAO] implant catalog is, it’s targeted. We can argue about how it should be targeted — who counts as a “bad guy” and who doesn’t — but it’s much better than the NSA’s collecting cell phone location data on everyone on the planet. The more we can deny the NSA the ability to do broad wholesale surveillance on everyone, and force them to do targeted surveillance on individuals and organizations, the safer we all are.
An excellent description of how the Dual_EC_DRBG backdoor works
The history behind the 419 advance-fee fraud scam.
According to Robert Whitaker, a historian at the University of Texas, an earlier version of the con, known as the Spanish Swindle or the Spanish Prisoner trick, plagued Britain throughout the 19th century.
good dataviz of a HTTP page load: ‘this is a visualization of a Facebook News Feed load from the perspective of the client, over a 3G wireless connection. Different packet types have different shapes and colors.’ (via John Harrington)
The EC is looking for feedback — but not much, and pretty sharpish.
Go to www.copywrongs.eu and answer the questions which are important to you. You do not have to answer all the questions, only the ones that matter to you. [...] The deadline is 5 February 2014. Until then, we should provide the European Commission with as many responses as possible!
In response to XKCD 1313. This is excellent. It’s reminiscent of my SpamAssassin SOUGHT-ruleset regexp-discovery algorithm, described in http://taint.org/2007/03/05/134447a.html , albeit without the BLAST step intended to maximise pattern length and minimise false positives
Dogs preferred to excrete with the body being aligned along the North-south axis under calm magnetic field conditions.
Under Graham’s influence, Mark [Zuckerberg], like many in Silicon Valley, subscribes to the Manic Pixie Dream Hacker ideal, making self-started teenage hackers Facebook’s most desired recruiting targets, not even so much for their coding ability as their ability to serve as the faces of hacking culture. “Culture fit”, in this sense, is one’s ability to conform to the Valley’s boyish hacker fantasy, which is easier, obviously, the closer you are to a teenage boy. Like the Manic Pixie Dream Girl’s role of existing to serve the male film protagonist’s personal growth, the Manic Pixie Dream Hacker’s job is to embody the dream hacker role while growing the VC’s portfolio. This is why the dream hacker never ages, never visibly develops interests beyond hardware and code, and doesn’t question why nearly all the other people receiving funding look like him. Like the actress playing the pixie dream girl, the pixie dream boy isn’t being paid to question the role for which he has been cast. In this way, for all his supposed “disruptiveness”, the hacker pixie actually does exactly what he is told: to embody, while he can, the ideal hacker, until he is no longer young, mono-focused, and boyish-seeming enough to qualify for the role (at that point, vested equity may allow him to retire). And like in Hollywood, VCs will have already recruited newer, younger ones to play him.
Flapjack aims to be a flexible notification system that handles: Alert routing (determining who should receive alerts based on interest, time of day, scheduled maintenance, etc); Alert summarisation (with per-user, per media summary thresholds); Your standard operational tasks (setting scheduled maintenance, acknowledgements, etc). Flapjack sits downstream of your check execution engine (like Nagios, Sensu, Icinga, or cron), processing events to determine if a problem has been detected, who should know about the problem, and how they should be told.
The next time you reach for ZooKeeper, ask yourself whether it provides the primitive you really need. If ZooKeeper’s filesystem and znode abstractions truly meet your needs, great. But the odds are, you’ll be better off writing your application as a replicated state machine.
An extensive catalogue of shitty routing. Poor…
It’s expected that any new mapping and routing systems will have errors which will need to be ironed out but the level of issues with the NTA Cycle Planner is far beyond what you’d expect in a light and quiet beta launch. It’s beyond acceptable for a public PR launch directing people to a route planner with no clear warnings. It looks like a rush job which allows junior minister Alan Kelly to get his name in another press release before the end of the year.
The pupil of the eye in a photograph of a face can be mined for hidden information, such as reflected faces of the photographer and bystanders, according to research led by Dr. Rob Jenkins, of the Department of Psychology at the University of York and published in PLOS ONE (open access). (via Waxy)
“It was an out-and-out hijacking,” LeFevre told me. “They counterfeited our product, they pirated our Web site, and they basically directed all of their customer service to us.” At the peak of Willms’s sales, LeFevre says, dazzlesmile was receiving 1,000 calls a day from customers trying to cancel orders for a product it didn’t even sell. When irate consumers made the name dazzlesmile synonymous with online scamming, LeFevre’s sales effectively dropped to zero. Dazzlesmile sued Willms in November 2009; he later paid a settlement.
An exhaustive list from the UK’s Open Rights Group
a fantastic bunch of low-level kernel tweaks and tunables which Netflix have found useful in production to maximise productivity of their fleet. Interesting use of SCHED_BATCH process scheduler class for batch processes, in particular. Also, great docs on their experience with perf and SystemTap. Perf really looks like a tool I need to get to grips with…
our use of networked computers is daily coloured by fear of infection and corruption, of predators and those who would assume our identity, of viruses and data-sucking catastrophes. What if something dark is able to breach that all-important final firewall, the gap between the central processing unit and the person sitting at the keyboard? What if it already has? That would be ‘a malign and particular suspension or defeat of those fixed laws of Nature which are our only safeguard’, without a doubt — but the unplumbed space haunted by demons and chaos is the network, not the cosmos. In using the internet to creep ourselves out recreationally, we begin to understand the real ways in which it haunts our fears. (via etienneshrdlu)
‘Coinbase uses MongoDB for their primary datastore for their web app, api requests, etc.’
Working in technology has an element of pioneering, and with new frontiers come those would prefer to leave civilization behind. But in a time of growing inequality, we need technology that preserves and renews the civilization we already have. The first step in this direction is for technologists to engage with the experiences and struggles of those outside their industry and community. There’s a big, wide, increasingly poor world out there, and it doesn’t need 99% of what Silicon Valley is selling. I’ve enjoyed the thought experiment of Bitcoin as much as the next nerd, but it’s time to dispense with the opportunism and adolescent fantasies of a crypto-powered stateless future and return to the work of building technology and social services that meaningfully and accountably improve our collective quality of life.
the bottom line appears to be “think of the children” — in other words, any degree of overblocking is acceptable as long as children cannot access porn:
The debate and letter confuse legal, illegal and potentially harmful content, all of which require very different tactics to deal with. Without a greater commitment to evidence and rational debate, poor policy outcomes will be the likely result. There’s a pattern, much the same as the Digital Economy Act, or the Snooper’s Charter. Start with moral panic; dismiss evidence; legislate; and finally, watch the policy unravel, either delivering unintended harms, even to children in this case, or simply failing altogether. See https://www.openrightsgroup.org/blog/2013/talktalk-wordpress for a well-written exploration of a case of overblocking and its fallout. TalkTalk, one UK ISP, has filters which incorrectly handled IWF data and blocked WordPress.com’s admin interface, resulting in all blogs there becoming unusable for their owners for over a week, with seemingly nobody able to diagnose and fix the problem competently.
some nice super-optimized Radix Sort code which handles floating point values. See also http://codercorner.com/RadixSortRevisited.htm for more info on the histogramming/counter concept
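The key trick for radix-sorting floats is a bijective bit-flip that makes IEEE-754 bit patterns sort in the same order as the floats themselves; here’s a sketch of the 32-bit version (in Python for clarity, though the linked code is C++):

```python
import struct

def float_flip(f):
    """Map a float's IEEE-754 bits to an unsigned int that sorts in the
    same order as the float. Negative floats (sign bit set) get ALL bits
    flipped, so bigger magnitudes come out smaller; non-negative floats
    just get the sign bit set, lifting them above every negative value."""
    i = struct.unpack("<I", struct.pack("<f", f))[0]
    return i ^ 0xFFFFFFFF if i & 0x80000000 else i | 0x80000000

vals = [3.5, -1.25, 0.0, -700.0, 42.0]
by_key = sorted(vals, key=float_flip)
print(by_key)   # [-700.0, -1.25, 0.0, 3.5, 42.0]
```

With that mapping in place, the histogramming/counting passes of an ordinary LSB radix sort work on float keys unchanged.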
ie. “i18n”, “a11y” etc.
According to Tex Texin, the first numeronym [..] was “S12n”, the electronic mail account name given to Digital Equipment Corporation (DEC) employee Jan Scherpenhuizen by a system administrator because his surname was too long to be an account name. By 1985, colleagues who found Jan’s name unpronounceable often referred to him verbally as “S12n”. The use of such numeronyms became part of DEC corporate culture.
this is excellent!
The British Library has uploaded one million public domain scans from 17th-19th century books to Flickr! They’re embarking on an ambitious programme to crowdsource novel uses and navigation tools for the huge corpus. Already, the manifest of image descriptions is available through Github. This is a remarkable, public spirited, archival project, and the British Library is to be loudly applauded for it!
Fantastic long-form blog post by Jay Kreps on this key concept. great stuff
The Economist reckons we’re finally seeing the light at the end of the tunnel where the patent troll shakedown is concerned:
If the use of state consumer-protection laws to ward off frivolous patent suits were to catch on, it could give the trolls serious pause for thought—especially if their mass mailings of threatening letters to businesses were met by dozens of law suits from attorneys general demanding their presence in state courts across the land. One way or another, things are beginning to look ominous for those who would exploit the inadequacies of America’s patent system.
If the full European Court of Justice (ECJ) accepts the opinion of its advocate general in a final ruling due early next year – and it almost always does – it will prove a huge vindication of Ireland’s small privacy advocacy group, Digital Rights Ireland (DRI). Its case against Irish retention laws, which began in 2006, forms the basis of this broader David v Goliath challenge and initial opinion. The advocate general’s advice largely upholds the key concerns put forward by DRI against Ireland’s laws. Withholding so much data about every citizen, including children, in case someone commits a future crime, is too intrusive into private life, and could allow authorities to create a “faithful and exhaustive map of a large portion of a person’s [private] conduct”. Retained data is so comprehensive that it could easily reveal private identities, which are supposed to remain anonymous. And the data, entrusted to third parties, is at too much risk of fraudulent or malicious use. Cruz Villalón argues that there must be far greater oversight to the retention process, and controls on access to data, and that citizens should have the right to be notified after the fact if their data has been scrutinised. The Irish Government had repeatedly waved off such concerns from Digital Rights Ireland in the past.
Florida’s spammers strike again – pushing the boundaries of intrusive direct sales and marketing
must give this a spin
Our children should be free to choose to study what really excites them, not subtly steered away from certain subjects because teachers believe in and propagate the stereotypes. Last year the IOP published a report “It’s Different for Girls” which demonstrated that essentially half of state coeducational schools did not see a single girl progress to A-level physics. By contrast, the likelihood of girls progressing from single-sex schools was two and a half times greater. Amen to this.
‘SBE is an OSI layer 6 representation for encoding and decoding application messages in binary format for low-latency applications.’ Licensed under ASL2, C++ and Java supported.
‘like inetd, but for WebSockets’ — ‘a small command line tool that will wrap an existing command line interface program, and allow it to be accessed via a WebSocket. It provides a quick mechanism for allowing web-applications to interact with existing command line tools.’ Awesome idea. BSD-licensed. (Via Mike Loukides)
a metric storage daemon, exposing both a carbon listener and a simple web service. Its aim is to become a simple, scalable and drop-in replacement for graphite’s backend. Pretty alpha for now, but definitely worth keeping an eye on to potentially replace our burgeoning Carbon fleet…
In this talk Kaushik Srenevasan describes a new, low overhead, full-stack tool (based on the Linux perf profiler and infrastructure built into the Hotspot JVM) we’ve built at Twitter to solve the problem of dynamically profiling and tracing the behavior of applications (including managed runtimes) in production. Looks very interesting; haven’t watched it yet, though.
[MMOGs], the [NSA] analyst wrote, “are an opportunity!”. According to the briefing notes, so many different US intelligence agents were conducting operations inside games that a “deconfliction” group was required to ensure they weren’t spying on, or interfering with, each other.
Fantastic wrap-up of the story so far on the pervasive global surveillance story.
The history of the intelligence community, though, reveals a willingness to violate the spirit and the letter of the law, even with oversight. What’s more, the benefits of the domestic-surveillance programs remain unclear. Wyden contends that the N.S.A. could find other ways to get the information it says it needs. Even Olsen, when pressed, suggested that the N.S.A. could make do without the bulk-collection program. “In some cases, it’s a bit of an insurance policy,” he told me. “It’s a way to do what we otherwise could do, but do it a little bit more quickly.” In recent years, Americans have become accustomed to the idea of advertisers gathering wide swaths of information about their private transactions. The N.S.A.’s collecting of data looks a lot like what Facebook does, but it is fundamentally different. It inverts the crucial legal principle of probable cause: the government may not seize or inspect private property or information without evidence of a crime. The N.S.A. contends that it needs haystacks in order to find the terrorist needle. Its definition of a haystack is expanding; there are indications that, under the auspices of the “business records” provision of the Patriot Act, the intelligence community is now trying to assemble databases of financial transactions and cell-phone location information. Feinstein maintains that data collection is not surveillance. But it is no longer clear if there is a distinction.
Sherlock’s record is spotty at best when it comes to engagement. Setting aside the 80,680 people who were ignored by the minister, he was hostile and counterproductive to debate from the beginning, going so far as to threaten to pull out of a public debate because a campaigner against the ['Irish SOPA'] SI would be in attendance. His habit of blocking people online who publicly ask him tough yet legitimate questions has earned him the nickname “Sherblock”.
Most utilities don’t want smart metering. In fact they seem to have used the wrong dictionary. It is difficult to find anything smart about the UK deployment, until you realise that the utilities use smart in the sense of “it hurts”. They consider they have a perfectly adequate business model which has no need for new technology. In many Government meetings, their reluctant support seems to be a veneer for the hope that it will all end in disaster, letting them go back to the world they know, of inflated bills and demands for money with menaces. [...] Even when smart meters are deployed, there is no evidence that any utility will use the resulting data to transform their business, rather than persecute the consumer. At a recent US conference a senior executive for a US utility which had deployed smart meters, stated that their main benefit was “to give them more evidence to blame the customer”. That’s a good description of the attitude displayed by our utilities.
Similar to ACID properties, if you partially provide properties it means the user has to _still_ consider in their application that the property doesn’t exist, because sometimes it doesn’t. In your fsync example, if fsync is relaxed and there are no replicas, you cannot consider the database durable, just like you can’t consider Redis a CP system. It can’t be counted on for guarantees to be delivered. This is why I say these systems are hard for users to reason about. Systems that partially offer guarantees require in-depth knowledge of the nuances to properly use the tool. Systems that explicitly make the trade-offs in the designs are easier to reason about because it is more obvious and _predictable_.
Good blog post about EVE’s algorithm to load-balance a 3D map of star systems
a nice pattern for unit tests which need deterministic time behaviour. Trying to think up a really nice API for this….
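One common shape for that API is plain clock injection: the code under test asks a clock object for the time, and tests swap in a fake whose time only moves when told to. A minimal sketch (all names here are mine, not from the linked post):

```python
import time

class SystemClock:
    """Production implementation: real wall-clock time."""
    def now(self):
        return time.time()

class FakeClock:
    """Test double: time only moves when the test says so."""
    def __init__(self, start=0.0):
        self.t = start
    def now(self):
        return self.t
    def advance(self, seconds):
        self.t += seconds

class RateLimiter:
    """Example consumer: allows one event per interval."""
    def __init__(self, interval, clock):
        self.interval, self.clock, self.last = interval, clock, None

    def allow(self):
        t = self.clock.now()
        if self.last is None or t - self.last >= self.interval:
            self.last = t
            return True
        return False

clock = FakeClock()
rl = RateLimiter(interval=60, clock=clock)
print(rl.allow())     # True: first event
print(rl.allow())     # False: still inside the window
clock.advance(61)     # no sleeping in tests
print(rl.allow())     # True: window has passed
```

The test never sleeps and never flakes, because the passage of time is entirely under its control.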
Simon McGarr on Ireland’s looming data-protection train-crash.
Last week, during the debate of his proposals to increase fees for making a Freedom of Information request, Brendan Howlin was asked how one of his amendments would affect citizens looking for data from the State’s electronic databases. His reply was to cheerfully admit he didn’t even understand the question. “I have no idea what an SQL code is. Does anyone know what an SQL code is?” Unlike the minister, it probably isn’t your job to know that SQL is the computer language that underpins the data industry. The amendment he had originally proposed would have effectively allowed civil servants to pretend that their computer files were made of paper when deciding whether a request was reasonable. His answer showed how the Government could have proposed such an absurd idea in the first place. Like it or not – fair or not – these are not the signals a country that wanted to build a long-term data industry would choose to send out. They are the sort of signals that Ireland used to send out about Financial Regulation. I think it’s agreed, that approach didn’t work out so well.
good blog post writing up the ‘flock -n -c’ trick to ensure single-concurrent-process locking for cron jobs
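For reference, the same primitive is available from Python’s fcntl module if you’d rather take the lock in-process than wrap the job in flock(1); a minimal sketch (the lock path is illustrative):

```python
import fcntl

LOCKFILE = "/tmp/cronjob.lock"   # illustrative path

def try_lock(path):
    """Non-blocking exclusive lock: the primitive `flock -n` wraps.
    Returns the open file on success (keep it open to hold the lock),
    or None if another holder already has it."""
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return f
    except BlockingIOError:
        f.close()
        return None

first = try_lock(LOCKFILE)    # this "cron run" gets the lock
second = try_lock(LOCKFILE)   # an overlapping run is refused immediately
print(first is not None, second is None)   # True True
```

Because the lock is tied to the open file and released automatically when the process exits, a crashed job can never leave a stale lock behind, which is the whole appeal of the flock trick over PID files.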
Good article on road safety and visual perception, for both cyclists and drivers.
a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue. An optional LuaJIT script can perform HTTP request generation, response processing, and custom reporting. Written in C, ASL2 licensed.
Based on a working paper from University of Toronto researcher Laurina Zhang
Comparing album sales of four major labels before and after the removal of DRM reveals that digital music revenue increases by 10% when restrictions are removed. The effect goes up to 30% for long tail content, while top-selling albums show no significant jump. The findings suggest that dropping technical restrictions can benefit both artists and the major labels. More details: http://inside.rotman.utoronto.ca/laurinazhang/files/2013/11/laurina_zhang_jmp_nov4.pdf , “Intellectual Property Strategy and the Long Tail: Evidence from the Recorded Music Industry”, Laurina Zhang, November 4, 2013
The English bulldog has come to symbolize all that is wrong with the dog fancy and not without good reason; they suffer from almost every possible disease. A 2004 survey by the Kennel Club found that they die at the median age of 6.25 years (n=180). There really is no such thing as a healthy bulldog. The bulldog’s monstrous proportions make them virtually incapable of mating or birthing without medical intervention. (via Bryan)
Samy Kamkar strikes again. ‘Using a Parrot AR.Drone 2, a Raspberry Pi, a USB battery, an Alfa AWUS036H wireless transmitter, aircrack-ng, node-ar-drone, node.js, and my SkyJack software, I developed a drone that flies around, seeks the wireless signal of any other drone in the area, forcefully disconnects the wireless connection of the true owner of the target drone, then authenticates with the target drone pretending to be its owner, then feeds commands to it and all other possessed zombie drones at my will.’
Good article about emergent behaviour from networked malware: ‘The metabot, therefore, is viral. You get followed because of who follows you. This tendency explains the strange geographical cluster among San Diego high school students. Perhaps one of those kids was being followed by a really popular account (like @Interscope records, perhaps, which follows hundreds of thousands of people), and through that link, the bot stumbled into this little circle of San Diego teens. All of this activity would have remained under the radar, of course, all part of the silent non-human web. Except something went awry. For some reason, Olivia got stuck in a weird loop, and the metabot kept spawning spambots that chose to follow her over and over, relentlessly. Maybe once the metabot reached the San Diego kids, a bug kicked in. Instead of negative feedback keeping her (and everyone else) from being followed too often, we got runaway positive feedback. The bots followed her because other bots followed her. And on and on. Which is, perhaps a kind of reasoning that we can understand: It’s the core logic of fame and celebrity itself. Attention flows to Snooki because attention flowed to Snooki. Attention flows to Olivia because attention flowed to Olivia. Olivia and her friends weren’t wrong when they thought she’d become suddenly famous. Her audience just wasn’t human.’
> reorg Ok, you reorganize all zero of your direct reports. Way to stay out of trouble, Hoss. Perhaps you’d like to coin an acronym?
y’know, for kids. now that would improve the slightly boring, functional helmet my middle kid wears…
Wow, I didn’t know about this. Great idea.
Need a flexible format to record, export, and analyze network performance data? Well, that’s exactly what the HTTP Archive format (HAR) is designed to do! Even better, did you know that Chrome DevTools supports it? In this episode we’ll take a deep dive into the format (as you’ll see, it’s very simple), and explore the many different ways it can help you capture and analyze your site’s performance. Join Ilya Grigorik and Peter Lubbers to find out how to capture HAR network traces in Chrome, visualize the data via an online tool, share the reports with your clients and coworkers, automate the logging and capture of HAR data for your build scripts, and even adapt it to server-side analysis use cases.
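Since HAR is plain JSON, post-processing a trace needs nothing beyond a JSON parser. A minimal sketch of pulling per-request timings out of a HAR document (the sample entries below are made up for illustration; real files come from DevTools’ “Save as HAR” export, but follow the same `log`/`entries` structure):

```python
import json

# A tiny hand-built HAR document; a real one would be loaded with
# json.load(open("trace.har")). Field names follow the HAR 1.2 spec.
har = {
    "log": {
        "version": "1.2",
        "entries": [
            {"request": {"method": "GET", "url": "https://example.com/"},
             "response": {"status": 200},
             "time": 112.5},
            {"request": {"method": "GET", "url": "https://example.com/app.js"},
             "response": {"status": 200},
             "time": 341.0},
        ],
    }
}

# Print each request's URL, status code, and total time in milliseconds.
for entry in har["log"]["entries"]:
    print(entry["request"]["url"],
          entry["response"]["status"],
          "%.1f ms" % entry["time"])

# Sum of per-request times (ignores parallelism, so this is an upper
# bound on wall-clock time, not the actual page-load time).
total_ms = sum(e["time"] for e in har["log"]["entries"])
```

The same few lines work server-side, which is what makes HAR handy for build-script automation: dump a trace per build, parse it, and fail the build if `total_ms` regresses.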
this is absolutely fantastic. Thanks flood.io!
it might seem that current efforts to identify and track potential terrorists would be approached with caution. Yet the federal government’s main terrorist watch list has grown to at least 700,000 people, with little scrutiny over how the determinations are made or the impact on those marked with the terrorist label. “If you’ve done the paperwork correctly, then you can effectively enter someone onto the watch list,” said Anya Bernstein, an associate professor at the SUNY Buffalo Law School and author of “The Hidden Costs of Terrorist Watch Lists,” published by the Buffalo Law Review in May. “There’s no indication that agencies undertake any kind of regular retrospective review to assess how good they are at predicting the conduct they’re targeting.”
a demo of Doug Lea’s latest concurrent data structure in Java 8
lulz. (via John Handelaar)
A nice worked-through Docker example
Really stupid — Facebook infers a “like” for a site when you send a reference to a URL on that site. Obviously broken behaviour. (via http://www.forbes.com/sites/anthonykosner/2013/01/21/facebook-is-recycling-your-likes-to-promote-stories-youve-never-seen-to-all-your-friends/ )
Newegg, an online retailer that has made a name for itself fighting the non-practicing patent holders sometimes called “patent trolls,” sits on the losing end of a lawsuit tonight. An eight-person jury came back shortly after 7:00pm and found that the company infringed all four asserted claims of a patent owned by TQP Development, a company owned by patent enforcement expert Erich Spangenberg. “Patent enforcement expert”. That’s one way to put it. This is insanity.
pretty strong argument. However, I think shlibs still have an advantage in that their pages are easier to share…
“We’ve heard a good bit in this courtroom about public key encryption,” said Albright. “Are you familiar with that?” “Yes, I am,” said Diffie, in what surely qualified as the biggest understatement of the trial. “And how is it that you’re familiar with public key encryption?” “I invented it.” (via burritojustice)
Yahoo!’s streaming machine learning platform, built on Storm, implementing:
As a library, SAMOA contains state-of-the-art implementations of algorithms for distributed machine learning on streams. The first alpha release allows classification and clustering. For classification, we implemented a Vertical Hoeffding Tree (VHT), a distributed streaming version of decision trees tailored for sparse data (e.g., text). For clustering, we included a distributed algorithm based on CluStream. The library also includes meta-algorithms such as bagging.
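The VHT is a streaming decision tree in the VFDT family, where the core trick is the Hoeffding bound: a leaf commits to a split as soon as enough examples have arrived that the best attribute is statistically better than the runner-up. A minimal sketch of that split test (not SAMOA’s actual API; the function names and the `value_range`/`delta` parameters are illustrative):

```python
import math

def hoeffding_bound(value_range, delta, n):
    """With probability 1 - delta, the observed mean of n samples lies
    within epsilon of the true mean, for a statistic whose values span
    value_range. This is the standard Hoeffding inequality bound."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the gain advantage of the best attribute over the
    runner-up exceeds epsilon -- i.e. when the ranking is unlikely to
    change with more data. Epsilon shrinks as more examples arrive."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps
```

The “vertical” part of VHT is about parallelism rather than the statistics: attributes are partitioned across workers so each maintains sufficient statistics for its own slice, which is what makes the approach suit high-dimensional sparse data like text.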
yay (via Tony Finch)
The jury found that Agence France-Presse and Getty Images willfully violated the Copyright Act when they used photos Daniel Morel took in his native Haiti after the 2010 earthquake that killed more than 250,000 people, Morel’s lawyer, Joseph Baio, said.