Justin Mason's Weblog Posts
Interesting thread on the current state of low-cost/low-power server hardware; I didn’t realise thin client boxes were so viable for this use case, these days. (I’ve just replaced my current home server with an ODROID HC4, and I’m absolutely delighted with it, though…)
GoMo, the Irish mobile phone operator, is offering roaming eSIMs with 10GB of data roaming in the US for EUR19.99 per month
While not demonstrating a causal link, the correlations are pretty striking — good argument for greatly increasing vaccination rates for many viral diseases.
Around 80 percent of the viruses implicated in brain diseases were considered ‘neurotrophic’, which means they could cross the blood-brain barrier. “Strikingly, vaccines are currently available for some of these viruses, including influenza, shingles (varicella-zoster), and pneumonia,” the researchers write. “Although vaccines do not prevent all cases of illness, they are known to dramatically reduce hospitalization rates. This evidence suggests that vaccination may mitigate some risk of developing neurodegenerative disease.” The impact of viral infections on the brain persisted for up to 15 years in some cases. And there were no instances where exposure to viruses was protective.
‘DynamoDB Shell (ddbsh) is an interactive CLI for Amazon DynamoDB’, emulating an SQL-like command syntax, from AWS Labs
A human freelancer might have a typo here or there, or maybe a misconception about APR versus APY. But an article by an AI can be total, authoritative-sounding gibberish. The poor editor in charge of fact-checking whatever the Machine produces isn’t looking for a needle in a haystack; they’re faced with a stack of needles, many of which look remarkably like hay.
CNET used an AI to generate automated content for their site, and are definitely in the “finding out” stage from the looks of things:
All told, a pattern quickly emerges. Essentially, CNET’s AI seems to approach a topic by examining similar articles that have already been published and ripping sentences out of them. As it goes, it makes adjustments — sometimes minor, sometimes major — to the original sentence’s syntax, word choice, and structure. Sometimes it mashes two sentences together, or breaks one apart, or assembles chunks into new Frankensentences. Then it seems to repeat the process until it’s cooked up an entire article. […] The question of exactly how CNET’s disastrous AI was trained may end up taking center stage as the drama continues to unfold. At a CNET company meeting late last week […] the outlet’s executive vice president of content and audience refused to tell staff — many of them acclaimed tech journalists who have written extensively about the rise of machine learning — what data had been used to train the AI. The legality of using data to train an AI without the consent of the people who created that data is currently being tested by several lawsuits against the makers of prominent image generators, and could become a flashpoint in the commercialization of the tech.
A Python module to abstract usage of several different types of EPD (electronic paper displays), including Inky and Waveshare hardware.
“a picture frame to show you random AI art every day” — nice little epd/pi hack
looks like Amazon are now exposing a bunch of error metrics for their EC2 instance network drivers in Linux
Solid data now up for the bivalent BA.5 SARS-CoV-2 vaccine, says Eric Topol: “we now have extensive data that is quite encouraging — better and broader than expected — that I’m going to briefly review here”
Current state of research into Long COVID, courtesy of Nature Reviews Microbiology.
Long COVID is an often debilitating illness that occurs in at least 10% of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections. More than 200 symptoms have been identified with impacts on multiple organ systems. At least 65 million individuals worldwide are estimated to have long COVID, with cases increasing daily. Biomedical research has made substantial progress in identifying various pathophysiological changes and risk factors and in characterizing the illness; further, similarities with other viral-onset illnesses such as myalgic encephalomyelitis/chronic fatigue syndrome and postural orthostatic tachycardia syndrome have laid the groundwork for research in the field. In this Review, we explore the current literature and highlight key findings, the overlap with other conditions, the variable onset of symptoms, long COVID in children and the impact of vaccinations. Although these key findings are critical to understanding long COVID, current diagnostic and treatment options are insufficient, and clinical trials must be prioritized that address leading hypotheses.
When a 25-year-old activist from Minsk who goes by Pavlo was detained by Belarusian KGB security forces last summer, he knew they would search his phone, looking for evidence of his involvement in anti-government protests. The police officer asked for Pavlo’s password to Telegram, the most popular messenger app among Belarusian activists, which he gave him. The officer entered it and… found nothing. All secret chats and news channels had disappeared, and after a few minutes of questioning Pavlo was released. Pavlo’s secret? A secure version of Telegram, developed by a hacktivist group from Belarus called the Cyber Partisans. Partisan Telegram, or P-Telegram, automatically deletes pre-selected chats when someone enters the so-called SOS password.
… after entering a fake [SOS] password, P-Telegram can automatically log out of the account, delete selected chats and channels, and even send a notification about the arrest of the account owners to their friends or families. P-Telegram also allows other activists to remotely activate the SOS password on the detainee’s phone. For this, they need to send a code word to any of the shared Telegram chats. Another feature on P-Telegram automatically takes photos of law enforcement officers on the front camera when they enter a fake password. “We warn users that this can be dangerous, as this photo will be stored on the phone, revealing that a person may use Partisan Telegram,” Shemetovets said. Cyber Partisans are constantly updating their app, fixing bugs, and adding new features. They also regularly conduct independent audits to ensure that P-Telegram complies with all security measures. A recent audit by Open Technology Fund’s Red Team Lab proved that it is almost impossible for “casual observers without technical knowledge and specialized equipment” to identify the existence of P-Telegram on a device.
“Command line tool for inspecting Parquet files”, replacement for parquet-tools, written in Rust. Now do Orc!
For once, an honest architecture diagram (featuring “VPN of sadness”, “cool databases” vs “real database”, “blame radius” and the “one tiny cron job that keeps everything from falling apart”)
This is an absurd hellscape:
Legal Aid filed a federal lawsuit in 2016, arguing that the state had instituted a new [healthcare] policy without properly notifying the people affected about the change. There was also no way to effectively challenge the system, as they couldn’t understand what information factored into the changes, De Liban argued. No one seemed able to answer basic questions about the process. “The nurses said, ‘It’s not me; it’s the computer,’” De Liban says. When they dug into the system, they discovered more about how it works. Out of the lengthy list of items that assessors asked about, only about 60 factored into the home care algorithm. The algorithm scores the answers to those questions, and then sorts people into categories through a flowchart-like system. It turned out that a small number of variables could matter enormously: for some people, a difference between a score of a three instead of a four on any of a handful of items meant a cut of dozens of care hours a month. (Fries didn’t say this was wrong, but said, when dealing with these systems, “there are always people at the margin who are going to be problematic.”) […] From the state’s perspective, the most embarrassing moment in the dispute happened during questioning in court. Fries was called in to answer questions about the algorithm and patiently explained to De Liban how the system works. After some back-and-forth, De Liban offered a suggestion: “Would you be able to take somebody’s assessment report and then sort them into a category?” […] Fries said he could, although it would take a little time. He looked over the numbers for Ethel Jacobs. After a break, a lawyer for the state came back and sheepishly admitted to the court: there was a mistake. Somehow, the wrong calculation was being used. They said they would restore Jacobs’ hours. “Of course we’re gratified that DHS has reported the error and certainly happy that it’s been found, but that almost proves the point of the case,” De Liban said in court. 
“There’s this immensely complex system around which no standards have been published, so that no one in their agency caught it until we initiated federal litigation and spent hundreds of hours and thousands of dollars to get here today. That’s the problem.”
Amazing collection of Java async-profiler commands and examples, each one representing a specific common (or not-so-common) use case we are liable to run into with production services: includes continuous profiling, wall-clock vs CPU, allocations, locks, cache misses, page faults, and thread-startup overhead
This is some very impressive work on reverse engineering a fairly advanced IoT device (the Google Home Mini), discovering and exploiting its security holes.
I was recently rewarded a total of $107,500 by Google for responsibly disclosing security issues in the Google Home smart speaker that allowed an attacker within wireless proximity to install a “backdoor” account on the device, enabling them to send commands to it remotely over the Internet, access its microphone feed, and make arbitrary HTTP requests within the victim’s LAN (which could potentially expose the Wi-Fi password or provide the attacker direct access to the victim’s other devices). These issues have since been fixed.
This was an open question from earlier in the pandemic — does vaccination reduce transmission and infectiousness: ‘In our main analysis, we found that any COVID-19 vaccine reduced infectiousness by 22% (6–36%) and prior infection reduced infectiousness by 23% (3–39%). Hybrid immunity reduced infectiousness by 40% (20–55%).’
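As a rough sanity check on those figures (not a reconstruction of the paper's method): if the vaccine and prior-infection effects were independent and combined multiplicatively, which is purely an assumption here, the combined reduction lands very close to the reported hybrid-immunity number. The function name below is made up for illustration.

```python
def combined_reduction(*reductions: float) -> float:
    """Combine fractional reductions, ASSUMING independent multiplicative effects."""
    remaining = 1.0
    for r in reductions:
        # Each effect leaves a fraction (1 - r) of the infectiousness behind.
        remaining *= 1.0 - r
    return 1.0 - remaining


vaccine = 0.22          # point estimate quoted above
prior_infection = 0.23  # point estimate quoted above

print(f"{combined_reduction(vaccine, prior_infection):.1%}")
```

That works out to roughly 39.9%, in line with the quoted 40% for hybrid immunity, though the wide confidence intervals mean this agreement should not be over-interpreted.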
lhl likes Caddy:
Caddy https://caddyserver.com/ came up in conversation earlier today. It’s been my favorite reverse proxy/web server for the past few years because of how simple it is to set up and for its automagic LetsEncrypt setup. (This post is actually being pushed through Caddy on my fediverse server, and was basically the easiest part of the setup). For those interested, it performs pretty competitively with nginx: https://blog.tjll.net/reverse-proxy-hot-dog-eating-contest-caddy-vs-nginx/ but IMO the main selling point (why I first installed it) was the automagic HTTPS setup: https://caddyserver.com/docs/automatic-https
A gateway bot from Twitter to Mastodon —
One of the things I would miss here on Mastodon was all of the alerts from my local infrastructure and government twitter accounts. These will likely take a very long time to make the migration. With https://bird.makeup, you can create bot accounts that put those tweets in your Mastodon timeline.
‘In this blog, we will cover how Panther deployed phishless FIDO2 (WebAuthn) security keys, including details on the hardware, software and steps taken. The aim of this blog is to help other organizations understand, prioritize and deploy this effective security control.’ A lot of good detail into the practical aspects of switching to YubiKeys.
Yikes this is bad. A robot vacuum recorded video, uploaded it to iRobot, then that video was sent to teams of data-labelling gig workers in Venezuela, where they picked out some “highlights” and shared it on Facebook
Quite a complicated process — extracting the eMMC chip is way beyond my abilities! — but using FCCID.io is a neat trick
Digging into what Github Copilot sends back to its servers; particularly of interest is the wealth of accompanying tokens/snippets that get included as context (“prompt suffix/prefix”)
“a hypothetical scenario in which a machine learning system trained on its own output becomes unable to function properly or make meaningful predictions”
Via ted byfield: “If you’ve wondered what AI-bots are ~thinking while they generate an image, here you go.” Reverse-engineering the training samples which Stable Diffusion et al are combining for a given text query, in the laion5B or laion_400m datasets
This image contains a “cursed color”:
There is a cursed color in the Kodak ProPhoto RGB color space which, when converted to sRGB using pre-August-2020-Security-Update Android’s image conversion routines, causes an integer overflow and a crash due to a rounding error. Some dude accidentally created an image ( https://www.flickr.com/photos/gaurav_agrawal/48746079687/ ) which contains the cursed color on a single pixel. In 2020 if you set this image as your desktop on a Google or Samsung device, the device would brick & lose all onboard data.
This is vital context for discussions of revitalised nuclear power.
An older [nuclear waste] reprocessing plant on site earned £9bn over its lifetime, half of it from customers overseas. But the pursuit of commercial reprocessing turned Sellafield and a similar French site into “de facto waste dumps”, the journalist Stephanie Cooke found in her book In Mortal Hands. Sellafield now requires £2bn a year to maintain. What looked like a smart line of business back in the 1950s has now turned out to be anything but. With every passing year, maintaining the world’s costliest rubbish dump becomes more and more commercially calamitous.
Using Auth0 to provide user accounts in a small-scale web side project without requiring lots of extra work
Some updated numbers on Long COVID risk from epidemiologist Katelyn Jetelina
High-performance inference of OpenAI’s Whisper automatic speech recognition (ASR) model: Plain C/C++ implementation without dependencies; Apple silicon first-class citizen – optimized via Arm Neon and Accelerate framework; AVX intrinsics support for x86 architectures; Mixed F16 / F32 precision; Low memory usage (Flash Attention + Flash Forward); Zero memory allocations at runtime; Runs on the CPU; C-style API
The earliest sunset happens today in Ireland — several days prior to the winter solstice (which is the day with the least amount of sunlight). This page explains it all
‘This is your brain on capitalism’. A shitty cyberpunk future:
What about when the [bricked] device is inside your body? Earlier this year, many people with Argus optical implants – which allow blind people to see – lost their vision when the manufacturer, Second Sight, went bust. Nano Precision Medical, the company’s new owners, aren’t interested in maintaining the implants, so that’s the end of the road for everyone with one of Argus’s “bionic” eyes. The $150,000 per eye that those people paid is gone, and they have failing hardware permanently wired into their nervous systems. Having a bricked eye implant doesn’t just rob you of your sight – many Argus users experience crippling vertigo and other side effects of nonfunctional implants. The company has promised to “do our best to provide virtual support” to people whose Argus implants fail – but no more parts and no more patches.
Looks like I didn’t bookmark this one? Marc Brooker on how DynamoDB uses redundant, additional requests to its MemDS caching service in order to avoid surprising variability in service performance which could affect service availability. Good example of the “constant work” pattern described by colmmacc at https://aws.amazon.com/builders-library/reliability-and-constant-work/ .
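A minimal sketch of the “constant work” idea as I understand it from the description above: the router answers from a local cache when it can, but still sends a request to the backing service on every lookup, so the backend's load does not swing with the cache hit ratio. The class and method names are hypothetical and this is not DynamoDB's actual code; a real system would make the redundant call asynchronously.

```python
class CountingBackend:
    """Stand-in for a metadata service such as MemDS (hypothetical interface)."""

    def __init__(self, data):
        self.data = dict(data)
        self.calls = 0

    def get(self, key):
        self.calls += 1
        return self.data[key]


class ConstantWorkRouter:
    """Serves from a local cache but still calls the backend on every request."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def lookup(self, key):
        cached = self.cache.get(key)
        # Always issue the backend call, hit or miss, so backend traffic is
        # independent of the hit ratio. Done synchronously here for simplicity.
        fresh = self.backend.get(key)
        self.cache[key] = fresh
        return cached if cached is not None else fresh


backend = CountingBackend({"table-a": "node-1", "table-b": "node-2"})
router = ConstantWorkRouter(backend)
for key in ["table-a", "table-a", "table-b", "table-a"]:
    router.lookup(key)

print(backend.calls)  # 4: one backend call per request, regardless of cache hits
```

The trade-off is deliberate: the backend does “wasted” work in the good case, so that a sudden cold cache does not present it with a load it has never seen before.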
A 40-page Bachelor’s degree thesis on the legendary bit-hacking Quake III Q_rsqrt() implementation (via redacted):
This function, commonly called InvSqrt, approximates the inverse (or reciprocal) square root of a 32-bit floating point number very quickly. It can be found in many open source libraries and games on the Internet, such as the C source code for Quake III: Arena. This raises many questions. Why is it needed? Who wrote it? How does it work? How well does it work? Is it still useful with modern processors today? And finally, can it be improved to work better? This thesis will examine those questions and give a unique interpretation and optimization of the function itself.
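For reference, here is a Python transliteration of the well-known routine, as a sketch rather than anything taken from the thesis. The magic constant 0x5F3759DF and the single Newton-Raphson step are from the Quake III source; the use of `struct` to reinterpret the float's bits is just how the pointer cast has to be expressed in Python, and the refinement step runs in double precision here, so results differ very slightly from the C version.

```python
import struct


def q_rsqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for a positive, normal 32-bit float value."""
    # Reinterpret the float's bits as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # The bit-level trick: halve the exponent, negate it, and correct the
    # mantissa with the magic constant.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits back as a float: the initial guess.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson iteration to refine the guess.
    return y * (1.5 - 0.5 * x * y * y)


for value in (1.0, 4.0, 100.0):
    approx = q_rsqrt(value)
    exact = value ** -0.5
    print(value, approx, exact, abs(approx - exact) / exact)
```

With one refinement step the relative error stays below about 0.2% for these inputs.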
an OSS clone of a Pinboard-style bookmark service. ‘designed to be minimal, fast, and easy to set up using Docker.’ Bookmarking for emergency use only; if anything happens to Pinboard.in, I’ll have this to fall back to. (via dahamsta)
good write-up on the process to get data out of the SolisCloud backend and into Home Assistant
My Pinboard links feed is now on the Fediverse at botsin.space; I’ll blog up the process shortly
what the hell? “Unless you’re on an operator that sells Pixel phones directly, who basically comprise the “Google list” for these features, [wifi calling] won’t work for any [directly-purchased] Pixel phone [in Ireland]. Same all over Europe. VoLTE won’t work either when on a mobile network (data speeds will drop to 3G when on a voice call) […] Your only option would be to root the phone to get it to work. There seem to have been some recent changes on this but seems like Eir still no go.” I’ve been wondering why VoLTE and VoWifi have been unavailable on my phone for several months now, assuming it was an operator issue. Finally I was sent this link by a poster on another forum — it’s not an issue with the operator, it’s a built-in limitation on the phone. All I can presume is that Google have done exclusivity deals with some providers in some regions, but are keeping this secret for some reason. If I’d known this in advance, I’d probably have bought a different phone; absolutely terrible decision. Reportedly it can be reversed via rooting the phone, at least.
Ranked user-generated content sites like Stack Overflow are really going to have a problem with the incoming plausible-sounding bullshit flood:
“The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce,” wrote the mods (emphasis theirs). “As such, we need the volume of these posts to reduce […] So, for now, the use of ChatGPT to create posts here on Stack Overflow is not permitted. If a user is believed to have used ChatGPT after this temporary policy is posted, sanctions will be imposed to prevent users from continuing to post such content, even if the posts would otherwise be acceptable.”
“Do you think that the concern over A.I.’s expanding capabilities is misplaced? I do. I think that the problems of A.I. are not its ability to do things well but its ability to do things badly, and our reliance on it nevertheless. So the problem isn’t that A.I. is going to displace all of our truck drivers. The fact that we’re using A.I. decision-making at scale to do things like lending, and deciding who is picked for child-protective services, and deciding where police patrols go, and deciding whether or not to use a drone strike to kill someone, because we think they’re a probable terrorist based on a machine-learning algorithm—the fact that A.I. algorithms don’t work doesn’t make that not dangerous. In fact, it arguably makes it more dangerous. The reason we stick A.I. in there is not just to lower our wage bill so that, rather than having child-protective-services workers go out and check on all the children who are thought to be in danger, you lay them all off and replace them with an algorithm.”
Worrying thread — I didn’t realise Pinboard was at risk of atrophy. This blog is built on it!
yikes. “U.S. Govt. Apps Bundled Russian Code With Ties to Mobile Malware Developer”:
A recent scoop by Reuters revealed that mobile apps for the U.S. Army and the Centers for Disease Control and Prevention (CDC) were integrating software that sends visitor data to a Russian company called Pushwoosh, which claims to be based in the United States. But that story omitted an important historical detail about Pushwoosh: In 2013, one of its developers admitted to authoring the Pincer Trojan, malware designed to surreptitiously intercept and forward text messages from Android mobile devices.
At a meta level, something I find mildly interesting is how many people [jm: ex-Twitter staff specifically] are writing stuff on Mastodon about how it’s impossible for Mastodon to scale up without using an ad supported model (b/c server costs), it’s better to have ranked feeds because most people want them, etc. The thing I think is interesting is that the people writing this stuff, implicitly, seemingly cannot conceive of a model where the organization is not growth and profit maximizing.
Really valuable info if you’re building resilient services atop AWS; Amazon revealing where their services have cross-region or single-region-of-failure dependencies
ESB Networks is (finally) offering end-users access to their smart electricity meter data with 30-minute granularity
‘Rosetta 2 is remarkably fast when compared to other x86-on-ARM emulators. I’ve spent a little time looking at how it works, out of idle curiosity, and found it to be quite unusual, so I figured I’d put together my notes.’
A little-known detail of the EU Consumer Rights Directive: you have a right to repair or replacement of faulty goods if they fail within 2 years of purchase. The nice thing about this is that so much hardware has built-in obsolescence after only 1 year… you may have to invoke the magic words “EU Consumer Rights Directive” to get this to happen, though. Worth noting that according to one account “the rights only apply in the country of purchase. I’ve had Apple refuse to replace a Magic trackpad that died after 14 months and they would not repair an Airpods case that died after 18 months. I had purchased both in the UK.”
Even better than the EU consumer rights directive!
Under Irish consumer law, consumers are entitled to a free of charge repair or (depending on the circumstances) may be entitled to a replacement, discount or refund by the seller, of defective goods or goods which do not conform with the contract of sale. These rights expire six years from delivery of the goods.
Thought-provoking Mastodon thread about full-scale disaster recovery for large-scale modern software platforms. Here’s a gem:
When I was in Azure, I asked around about what the plan was if “the really big one” hit since deep expertise was nearly totally concentrated in Redmond and, at the time, Azure was guaranteed to have a global outage if a major earthquake incapacitated Redmond. Of course the plan was that there was no real plan and people expected that Azure would have a very extended global outage and an org that was on its way to becoming a $1T business unit would have its value basically wiped out.
Generative AI has had a very good year. Corporations like Microsoft, Adobe, and GitHub are integrating the tech into their products; startups are raising hundreds of millions to compete with them; and the software even has cultural clout, with text-to-image AI models spawning countless memes. But listen in on any industry discussion about generative AI, and you’ll hear, in the background, a question whispered by advocates and critics alike in increasingly concerned tones: is any of this actually legal?
Etsy: “Estimating kWh in the Cloud”:
We thought about how we might be able to estimate our energy consumption in Google Cloud using the data we do have: Google provides us with usage data that shows us how many virtual CPU (Central Processing Unit) seconds we used, how much memory we requested for our servers, how many terabytes of data we have stored for how long, and how much networking traffic we were responsible for. Our supposition was that if we could come up with general estimates for how many watt-hours (Wh) compute, storage and networking draw in a cloud environment, particularly based on public information, then we could apply those coefficients to our usage data to get at least a rough estimate of our cloud computing energy impact. We are calling this set of estimated conversion factors Cloud Jewels. Other cloud computing consumers can look at this and see how it might work with their own energy usage across providers and usage data. The goal is to help cloud users across the industry to help refine our estimates, and ultimately help us encourage cloud providers to empower their customers with more accurate cloud energy consumption data. This is a good interim step, but it’s disappointing how inaccurate the CO2 data exposed by cloud providers is. IMO this needs to be fixed.
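The approach Etsy describes boils down to multiplying usage figures by per-unit energy coefficients and summing. A minimal sketch of that shape follows; the coefficient values are placeholders invented for illustration, NOT Etsy's published Cloud Jewels numbers, and the field names are hypothetical. Memory is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    vcpu_seconds: float      # virtual CPU seconds used
    storage_tb_hours: float  # terabytes stored multiplied by hours stored
    network_gb: float        # network traffic in gigabytes


@dataclass
class Coefficients:
    wh_per_vcpu_hour: float
    wh_per_tb_hour: float
    wh_per_network_gb: float


def estimate_kwh(usage: Usage, coeff: Coefficients) -> float:
    """Apply per-unit watt-hour coefficients to usage data; returns kWh."""
    vcpu_hours = usage.vcpu_seconds / 3600.0
    watt_hours = (
        vcpu_hours * coeff.wh_per_vcpu_hour
        + usage.storage_tb_hours * coeff.wh_per_tb_hour
        + usage.network_gb * coeff.wh_per_network_gb
    )
    return watt_hours / 1000.0


# Placeholder coefficients, chosen only to make the arithmetic easy to follow.
placeholder = Coefficients(wh_per_vcpu_hour=2.0, wh_per_tb_hour=1.0, wh_per_network_gb=0.5)
month = Usage(vcpu_seconds=36_000_000, storage_tb_hours=5_000, network_gb=2_000)

print(estimate_kwh(month, placeholder))  # 26.0
```

Everything interesting is in choosing defensible coefficients, which is exactly the part that needs better data from the cloud providers.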
Interesting — I didn’t realise it was possible to connect to the Mastodon fediverse with such a low-impact service —
A single-user instance with about 100 followers/followees uses somewhere between 50 to 100MB of RAM. CPU usage is only intensive when handling media or processing lots of federation requests.
A new form of COVID-19 misinformation has cropped up in Canada:
The term “immunity debt” is circulating widely online as an explanation for a significant surge in respiratory illness in Canada [… This] hypothesis suggests people’s immune systems are weaker now, due to a lack of exposure to viruses while observing COVID-19 public health measures over the last two-and-a-half years. But this notion […] is simply not true, says Colin Furness, an infection control epidemiologist and assistant professor in the faculty of information at the University of Toronto. “That is, in my estimation, and any immunologist will tell you this, nonsense,” he said. Dr. Samira Jeimy, an allergist and clinical immunologist at St Joseph’s Health Care London, agrees, saying the idea that one’s immune system can be weakened due to lack of exposure to illness “shows a basic lack of understanding of how the immune system works.” “There’s almost like an old wives tale, that you need to get sick to develop a healthy immune system. That’s actually not true.”
“Will AI image generators kill the stock image industry? It’s a question asked by many following the rise of text-to-image AI models in recent years. The answer from the industry’s incumbents, though, is “no” — not if we can start selling AI-generated content first. Given that Shutterstock licensed data to OpenAI to train DALL-E in 2021, it means that the model’s output will soon be competing with the same individuals whose content it relies on. At the same time, Shutterstock is also officially banning users from selling AI generated content on its platform. The company’s rationale is that it can’t validate the copyright of this output. However, it also means contributors won’t be able to compete with its own AI art services.” Great, I am looking forward to some really shitty AI output cropping up in stock images in the near future….
Cheery stuff from the Bulletin of Atomic Scientists, based on updated modelling
They would, naturally….
“There are online services that, purportedly using artificial intelligence (AI), extract, or rather, copy, the vocals, instrumentals, or some portion of the instrumentals from a sound recording, and/or generate, master or remix a recording to be very similar to or almost as good as reference tracks by selected, well known sound recording artists […] To the extent these services, or their partners, are training their AI models using our members’ music, that use is unauthorized and infringes our members’ rights by making unauthorized copies of our members works. In any event, the files these services disseminate are either unauthorized copies or unauthorized derivative works of our members’ music”
“Okay, the time has come, it’s been an entire decade, let’s talk about loadbalancing techniques and how they evolved at Google in response to various practical failure modes, from 2008 to 2012.” This thread is great. A solid history of Google’s use of various load balancing techniques, starting with N+1 service duplication with implicit failover rules, moving through modern-service-mesh-style proxying and client-side built-in load balancing libs, and ending with local sidecars which downloaded routing assignment configs periodically and operated mainly offline.
tl;dr: 6.2% average rate, more women than men, 15% continued to suffer after 12 months.
A total of 1.2 million individuals who had symptomatic SARS-CoV-2 infection were included (mean age, 4-66 years; males, 26%-88%). In the modeled estimates, 6.2% (95% uncertainty interval [UI], 2.4%-13.3%) of individuals who had symptomatic SARS-CoV-2 infection experienced at least 1 of the 3 Long COVID symptom clusters in 2020 and 2021, including 3.2% (95% UI, 0.6%-10.0%) for persistent fatigue with bodily pain or mood swings, 3.7% (95% UI, 0.9%-9.6%) for ongoing respiratory problems, and 2.2% (95% UI, 0.3%-7.6%) for cognitive problems after adjusting for health status before COVID-19, comprising an estimated 51.0% (95% UI, 16.9%-92.4%), 60.4% (95% UI, 18.9%-89.1%), and 35.4% (95% UI, 9.4%-75.1%), respectively, of Long COVID cases. The Long COVID symptom clusters were more common in women aged 20 years or older (10.6% [95% UI, 4.3%-22.2%]) 3 months after symptomatic SARS-CoV-2 infection than in men aged 20 years or older (5.4% [95% UI, 2.2%-11.7%]). Both sexes younger than 20 years of age were estimated to be affected in 2.8% (95% UI, 0.9%-7.0%) of symptomatic SARS-CoV-2 infections. The estimated mean Long COVID symptom cluster duration was 9.0 months (95% UI, 7.0-12.0 months) among hospitalized individuals and 4.0 months (95% UI, 3.6-4.6 months) among nonhospitalized individuals. Among individuals with Long COVID symptoms 3 months after symptomatic SARS-CoV-2 infection, an estimated 15.1% (95% UI, 10.3%-21.1%) continued to experience symptoms at 12 months.