Justin's Linklog Posts

_Surveilling the Masses with Wi-Fi-Based Positioning Systems_

  • _Surveilling the Masses with Wi-Fi-Based Positioning Systems_

    This is pretty crazy stuff; I had no idea the WPSes were fully queryable:

    Wi-Fi-based Positioning Systems (WPSes) are used by modern mobile devices to learn their position using nearby Wi-Fi access points as landmarks. In this work, we show that Apple’s WPS can be abused to create a privacy threat on a global scale. We present an attack that allows an unprivileged attacker to amass a worldwide snapshot of Wi-Fi BSSID geolocations in only a matter of days. Our attack makes few assumptions, merely exploiting the fact that there are relatively few dense regions of allocated MAC address space. Applying this technique over the course of a year, we learned the precise locations of over 2 billion BSSIDs around the world. The privacy implications of such massive datasets become more stark when taken longitudinally, allowing the attacker to track devices’ movements. While most Wi-Fi access points do not move for long periods of time, many devices — like compact travel routers — are specifically designed to be mobile. We present several case studies that demonstrate the types of attacks on privacy that Apple’s WPS enables: We track devices moving in and out of war zones (specifically Ukraine and Gaza), the effects of natural disasters (specifically the fires in Maui), and the possibility of targeted individual tracking by proxy — all by remotely geolocating wireless access points. We provide recommendations to WPS operators and Wi-Fi access point manufacturers to enhance the privacy of hundreds of millions of users worldwide. Finally, we detail our efforts at responsibly disclosing this privacy vulnerability, and outline some mitigations that Apple and Wi-Fi access point manufacturers have implemented both independently and as a result of our work.

    (tags: geolocation location wifi wps apple google infosec privacy)

Faking William Morris, Generative Forgery, and the Erosion of Art History

Technical post-mortem on the Google/UniSuper account deletion

  • Technical post-mortem on the Google/UniSuper account deletion

    “Google operators followed internal control protocols. However, one input parameter was left blank when using an internal tool to provision the customer’s Private Cloud. As a result of the blank parameter, the system assigned a then unknown default fixed 1 year term value for this parameter. After the end of the system-assigned 1 year period, the customer’s GCVE Private Cloud was deleted. No customer notification was sent because the deletion was triggered as a result of a parameter being left blank by Google operators using the internal tool, and not due a customer deletion request. Any customer-initiated deletion would have been preceded by a notification to the customer.” Ouch.

    (tags: cloud ops google tools ux via:scott-piper fail infrastructure gcp unisuper)

Innards of MS’ new Recall app

  • Innards of MS’ new Recall app

    Some technical details on the implementation of this new built-in key- and screen-logger, bundled with current versions of Windows, via Kevin Beaumont: “Microsoft have decided to bake essentially an infostealer into base Windows OS and enable by default. From the Microsoft FAQ: “Note that Recall does not perform content moderation. It will not hide information such as passwords or financial account numbers.” Info is stored locally – but rather than something like Redline stealing your local browser password vault, now they can just steal the last 3 months of everything you’ve typed and viewed in one database.” It requires ARM based hardware with a dedicated NPU (“neural processor”). “Recall uses a bunch of services themed CAP – Core AI Platform. Enabled by default. It spits constant screenshots … into the current user’s AppData as part of image storage. The NPU processes them and extracts text, into a database file. The database is SQLite, and you can access it as the user including programmatically. It 100% does not need physical access and can be stolen.” “[The screenshots are] written into an ImageStorage folder and there’s a separate process and SqLite database for them too, it categorises what’s in them. There’s a GUI that lets you view any of them.” Data is not stored with any additional crypto, beyond disk-level encryption via BitLocker. On the upside: for non-corporate users, “there’s a tray icon and you can disable it in Settings.” But for corps: “Recall has been enabled by default globally in Microsoft Intune managed users, for businesses.”

    (tags: microsoft recall security infosec keyloggers via:kevin-beaumont sqlite)

Meredith Whittaker’s speech on winning the Helmut Schmidt Future Prize

  • Meredith Whittaker’s speech on winning the Helmut Schmidt Future Prize

    This is a superb speech, and a great summing up of where we are with surveillance capitalism and AI in 2024. It explains where surveillance-driven advertising came from, in the 1990s:

    First, even though they were warned by advocates and agencies within their own government about the privacy and civil liberties concerns that rampant data collection across insecure networks would produce, [the Clinton administration] put NO restrictions on commercial surveillance. None. Private companies were unleashed to collect and create as much intimate information about us and our lives as they wanted – far more than was permissible for governments. (Governments, of course, found ways to access this goldmine of corporate surveillance, as the Snowden documents exposed.) And in the US, we still lack a federal privacy law in 2024. Second, they explicitly endorsed advertising as the business model of the commercial internet – fulfilling the wishes of advertisers who already dominated print and TV media. 
    How that drove the current wave of AI:
    In 2012, right as the surveillance platforms were cementing their dominance, researchers published a very important paper on AI image classification, which kicked off the current AI goldrush. The paper showed that a combination of powerful computers and huge amounts of data could significantly improve the performance of AI techniques – techniques that themselves were created in the late 1980s. In other words, what was new in 2012 were not the approaches to AI – the methods and procedures. What “changed everything” over the last decade was the staggering computational and data resources newly available, and thus newly able to animate old approaches. Put another way, the current AI craze is a result of this toxic surveillance business model. It is not due to novel scientific approaches that – like the printing press – fundamentally shifted a paradigm. And while new frameworks and architectures have emerged in the intervening decade, this paradigm still holds: it’s the data and the compute that determine who “wins” and who loses.
    And how that is driving a new form of war crimes, pattern-recognition-driven kill lists like Lavender:
    The Israeli Army … is currently using an AI system named Lavender in Gaza, alongside a number of others. Lavender applies the logic of the pattern recognition-driven signature strikes popularized by the United States, combined with the mass surveillance infrastructures and techniques of AI targeting. Instead of serving ads, Lavender automatically puts people on a kill list based on the likeness of their surveillance data patterns to the data patterns of purported militants – a process that we know, as experts, is hugely inaccurate. Here we have the AI-driven logic of ad targeting, but for killing. According to 972’s reporting, once a person is on the Lavender kill list, it’s not just them who’s targeted, but the building they (and their family, neighbours, pets, whoever else) live is subsequently marked for bombing, generally at night when they (and those who live there) are sure to be home. This is something that should alarm us all. While a system like Lavender could be deployed in other places, by other militaries, there are conditions that limit the number of others who could practically follow suit. To implement such a system you first need fine-grained population-level surveillance data, of the kind that the Israeli government collects and creates about Palestinian people. This mass surveillance is a precondition for creating ‘data profiles’, and comparing millions of individual’s data patterns against such profiles in service of automatically determining whether or not these people are added to a kill list. Implementing such a system ultimately requires powerful infrastructures and technical prowess – of the kind that technically capable governments like the US and Israel have access to, as do the massive surveillance companies. Few others also have such access. 
    This is why, based on what we know about the scope and application of the Lavender AI system, we can conclude that it is almost certainly reliant on infrastructure provided by large US cloud companies for surveillance, data processing, and possibly AI model tuning and creation. Because collecting, creating, storing, and processing this kind and quantity of data all but requires Big Tech cloud infrastructures – they’re “how it’s done” these days. This subtle but important detail also points to a dynamic in which the whims of Big Tech companies, alongside those of a given US regime, determines who can and cannot access such weaponry. The use of probabilistic techniques to determine who is worthy of death – wherever they’re used – is, to me, the most chilling example of the serious dangers of the current centralized AI industry ecosystem, and of the very material risks of believing the bombastic claims of intelligence and accuracy that are used to market these inaccurate systems. And to justify carnage under the banner of computational sophistication. As UN Secretary General António Guterres put it, “machines that have the power and the discretion to take human lives are politically unacceptable, are morally repugnant, and should be banned by international law.”

    (tags: pattern-recognition kill-lists 972 lavender gaza war-crimes ai surveillance meredith-whittaker)

The CVM algorithm

  • The CVM algorithm

    A new count-distinct algorithm: “We present a simple, intuitive, sampling-based space-efficient algorithm whose description and the proof are accessible to undergraduates with the knowledge of basic probability theory.” Knuth likes it! “Their algorithm is not only interesting, it is extremely simple. Furthermore, it’s wonderfully suited to teaching students who are learning the basics of computer science. (Indeed, ever since I saw it, a few days ago, I’ve been unable to resist trying to explain the ideas to just about everybody I meet.) Therefore I’m pretty sure that something like this will eventually become a standard textbook topic.” — https://cs.stanford.edu/~knuth/papers/cvm-note.pdf (via mhoye)
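    For the curious, here is a minimal Python sketch of the idea, assuming the basic form of the algorithm: keep a bounded random sample of the stream, and halve the sampling probability whenever the sample fills up. The names `cvm_estimate` and `max_size`, and the choice to raise an exception on the (very unlikely) failure case, are illustrative assumptions rather than anything taken from the paper or from Knuth's note.

```python
import random


def cvm_estimate(stream, max_size, rng=None):
    """Estimate the number of distinct items in `stream` using a bounded sample."""
    if rng is None:
        rng = random.Random()
    p = 1.0          # current probability that a seen item is kept in the sample
    sample = set()
    for item in stream:
        # Forget any earlier decision about this item, then re-decide with probability p.
        sample.discard(item)
        if rng.random() < p:
            sample.add(item)
        if len(sample) >= max_size:
            # Sample is full: drop each element with probability 1/2 and halve p.
            sample = {x for x in sample if rng.random() < 0.5}
            p /= 2
            if len(sample) >= max_size:
                # Nothing was dropped; this happens with probability 2**-max_size.
                raise RuntimeError("sample still full after halving; retry")
    # Each surviving item stands in for 1/p distinct items.
    return len(sample) / p
```

    If the stream has fewer distinct items than `max_size`, the sample never fills, `p` stays at 1, and the result is the exact count; otherwise it is a randomised estimate whose accuracy improves as `max_size` grows.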

    (tags: algorithms approximation cardinality streaming estimation cs papers count-distinct distinct-elements)

Scaleway now offering DC sustainability metrics in real time

  • Scaleway now offering DC sustainability metrics in real time

    Via Lauri on the ClimateAction.tech slack: “Huge respect to Scaleway for offering its data centres power, water (yes, even WUE!) and utilisation stats in real-time on its website. Are you listening AWS, Azure and GCP?” Specifically, Scaleway are reporting real-time Power Usage Effectiveness (iPUE), real-time Water Usage Effectiveness (WUE), total IT kW consumed, freechilling net capacity (depending on DC), outdoor humidity and outdoor temperature for each of their datacenters on the https://www.scaleway.com/en/environmental-leadership/ page. They use a slightly confusing circular 24-hour graph format which I’ve never seen before; although I’m coming around to it, I still think I’d prefer a traditional X:Y chart format. Great to see this level of data granularity being exposed. Hopefully there’ll be a public API soon.

    (tags: scaleway sustainability hosting datacenters cloud pue wue climate via:climateaction)

“Unprecedented” Google Cloud event wipes out customer account and its backups

Linux maintainers were infected for 2 years by SSH-dwelling backdoor with huge reach | Ars Technica

American Headache Society recommend CGRP therapies for “first-line” migraine treatment

  • American Headache Society recommend CGRP therapies for “first-line” migraine treatment

    This is big news for migraine treatment, and a good indicator of how reliable and safe these new treatments are, compared to the previous generation: “All migraine preventive therapies previously considered to be first-line treatments were developed for other indications and adopted later for migraine. Adherence to these therapies is often poor due to issues with efficacy and tolerability. Multiple new migraine-specific therapies have been developed based on a broad foundation of pre-clinical and clinical evidence showing that CGRP plays a key role in the pathogenesis of migraine. These CGRP-targeting therapies have had a transformational impact on the management of migraine but are still not widely considered to be first-line approaches.” [….] “The CGRP-targeting therapies should be considered as a first-line approach for migraine prevention […] without a requirement for prior failure of other classes of migraine preventive treatment.” I hope to see this elsewhere soon, too — and I’m also hoping to be prescribed my first CGRP treatments soon so I can reap the benefits myself; migraines have been no fun.

    (tags: migraine health medicine cgrp ahs headaches)

Should people with Long Covid be donating blood?

  • Should people with Long Covid be donating blood?

    Leading Long Covid and ME researchers and patient-advocates who spoke with The Sick Times largely agreed that blood donation could worsen a patient’s symptoms. However, they also cited concerns about a growing body of research that shows a variety of potential issues in the blood of people with Long Covid which could make their blood unsafe for recipients. “Based on the levels of inflammatory markers and microclots we have seen in blood samples from both Long Covid and ME/CFS, I do not think the blood is safe to be used for transfusion,” said Resia Pretorius, a leading Long Covid researcher and distinguished professor from the physiological sciences department at Stellenbosch University in South Africa.

    (tags: me-cfs long-covid covid-19 blood-transfusion medicine)

UN expert attacks ‘exploitative’ world economy in fight to save planet

  • UN expert attacks ‘exploitative’ world economy in fight to save planet

    David Boyd, the outgoing UN special rapporteur on human rights and the environment (2018 to 2024), says ‘there’s something wrong with our brains that we can’t understand how grave this is’:

    “I started out six years ago talking about the right to a healthy environment having the capacity to bring about systemic and transformative changes. But this powerful human right is up against an even more powerful force in the global economy, a system that is absolutely based on the exploitation of people and nature. And unless we change that fundamental system, then we’re just re-shuffling deck chairs on the Titanic.” “The failure to take a human rights based approach to the climate crisis – and the biodiversity crisis and the air pollution crisis – has absolutely been the achilles heel of [anti-climate-change] efforts for decades. “I expect in the next three or four years, we will see court cases being brought challenging fossil fuel subsidies in some petro-states … These countries have said time and time again at the G7, at the G20, that they’re phasing out fossil-fuel subsidies. It’s time to hold them to their commitment. And I believe that human rights law is the vehicle that can do that. In a world beset by a climate emergency, fossil-fuel subsidies violate states’ fundamental, legally binding human rights obligations.” […] Boyd said: “There’s no place in the climate negotiations for fossil-fuel companies. There is no place in the plastic negotiations for plastic manufacturers. It just absolutely boggles my mind that anybody thinks they have a legitimate seat at the table. “It has driven me crazy in the past six years that governments are just oblivious to history. We know that the tobacco industry lied through their teeth for decades. The lead industry did the same. The asbestos industry did the same. The plastics industry has done the same. The pesticide industry has done the same.”

    (tags: human-rights law david-boyd un climate-change fossil-fuels)

UniSuper members go a week with no account access after Google Cloud misconfig | Hacker News

Bridgy Fed

  • Bridgy Fed

    Bridgy Fed connects web sites, the fediverse, and Bluesky. You can use it to make your profile on one visible in another, follow people, see their posts, and reply and like and repost them. Interactions work in both directions as much as possible.

    (tags: blog fediverse mastodon social bluesky)

My (Current) Solar PV Dashboard

About a year ago, I installed a solar PV system at my home. I wound up with a set of 14 panels on my roof, which can produce a maximum output of 5.6 kilowatts, and a 4.8 kWh Dyness battery to store any excess power.

Since my car is an EV, I already had a home car charger installed, but chose to upgrade this to a MyEnergi Zappi at the same time, as the Zappi has some good features to charge from solar power only — and part of that feature set involved adding a Harvi power monitor.

With HomeAssistant, I’ve been able to extract metrics from both the MyEnergi components and the Solis inverter for the solar PV system, and can publish those from HomeAssistant to my Graphite store, where my home Grafana can access them — and I can thoroughly nerd out on building an optimal dashboard.

I’ve gone through a couple of iterations, and here’s the current top-line dashboard graph which I’m quite happy with…

Let’s go through the components to explain it. First off, the grid power:

Grid Import sans Charging

This is power drawn from the grid, instead of from the solar PV system. Ideally, this is minimised, but generally after about 8pm the battery is exhausted, and the inverter switches to running the house’s power needs from the grid.

In this case, there are notable spikes just after midnight, where the EV charge is topped up by a scheduled charge on the Zappi, and then a couple of short duration load spikes of 2kW from some appliance or another over the course of the night.

(What isn’t visible on this graph is a longer spike of 2kW charging from 07:00 until about 08:40, when a scheduled charge on the Solis inverter charges the house batteries to 100%, in order to load shift — I’m on the Energia Smart Data contract, which gives cheap power between 23:00 and 08:00. Since this is just a scheduled load shift, I’ve found it clearer to leave it off, hence “sans charging”.)


Solar Generation

This is the power generated by the panels; on this day, it peaked at 4kW (which isn’t bad for a slightly sunny Irish day in April).


To Battery From Solar

Power charged from the panels to the Dyness battery. As can be seen here, during the period from 06:50 to 09:10, the battery charged using virtually all of the panels’ power output. From then on, it periodically applied short spikes of up to 1kW, presumably to maintain optimal battery operation.


From Battery

Pretty much any time the batteries are not charging, they are discharging at a low rate. So even during the day time with high solar output, there’s a little bit of battery drain going on — until 20:00 when the solar output has tailed off and the battery starts getting used up.

Grid Export

This covers excess power, beyond what can be used directly by the house, or charged to the battery; the excess is exported back to the power grid, at the (currently) quite generous rate of 24 cents per kilowatt-hour.
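To put a number on that rate, a rough sketch in Python: sum the exported power samples, convert them to kilowatt-hours, and multiply by the tariff. The function name, the fixed sampling interval and the sample values are all hypothetical; only the 24 cents per kilowatt-hour figure comes from the text above.

```python
def export_earnings_cents(samples_kw, interval_minutes, rate_cents_per_kwh=24.0):
    """Approximate export earnings from evenly spaced power readings (in kW)."""
    hours_per_sample = interval_minutes / 60.0
    # Energy (kWh) is power (kW) multiplied by time (h), summed over all samples.
    exported_kwh = sum(samples_kw) * hours_per_sample
    return exported_kwh * rate_cents_per_kwh


# Hypothetical example: a steady 2 kW of export for one hour, sampled every 15 minutes.
samples = [2.0, 2.0, 2.0, 2.0]
print(export_earnings_cents(samples, interval_minutes=15))  # 48.0 (cents, for 2 kWh)
```

This treats each reading as constant over its interval, which is good enough for a dashboard-level estimate but will drift from the figure on the bill.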

Rendering

All usages of solar power (either from battery or directly from PV) are rendered as positive values, above the 0 axis line; usage of (expensive) grid power is represented as negative, below the line.

For clarity, a number of lines are stacked:

From Battery (orange) and Solar Generation (green) are stacked together, since those are two separate complementary power sources in the PV system.

Grid Export (blue) and To Battery From Solar (yellow) are also stacked together, since those are subsets of the (green) Solar Generation block.

The Grafana dashboard JSON export is available here, if you’re curious.

  • Via arclight on Mastodon ( https://oldbytes.space/@arclight/112367348253414752 ): spreadsheet authors/developers have an accuracy rate of 96%-99% when writing new formulas (and, of course, there are no unit tests in the world of spreadsheets). As they put it: “the uncomfortable truth is that any but the most trivial spreadsheets contain errors. It’s not a question of if there are errors, it’s a question of how many and how severe.”

    In the spreadsheet error community, both academics and practitioners generally have ignored the rich findings produced by a century of human error research. These findings can suggest ways to reduce errors; we can then test these suggestions empirically. In addition, research on human error seems to suggest that several common prescriptions and expectations for reducing errors are likely to be incorrect. Among the key conclusions from human error research are that thinking is bad, that spreadsheets are not the cause of spreadsheet errors, and that reducing errors is extremely difficult. In past EuSpRIG conferences, many papers have shown that most spreadsheets contain errors, even after careful development. Most spreadsheets, in fact, have material errors that are unacceptable in the growing realm of compliance laws. Given harsh penalties for non-compliance, we are under considerable pressure to develop good practice recommendations for spreadsheet developers and testers. If we are to reduce errors, we need to understand errors. Fortunately, human error has been studied for over a century across a number of human cognitive domains, including linguistics, writing, software development and testing, industrial processes, automobile accidents, aircraft accidents, nuclear accidents, and algebra, to name just a few. The research that does exist is disturbing because it shows that humans are unaware of most of their errors. This “error blindness” leads people to many incorrect beliefs about error rates and about the difficulty of detecting errors. In general, they are overconfident, substantially underestimating their own error rates and overestimating their ability to reduce and detect errors. This “illusion of control” also leads them to hold incorrect beliefs about spreadsheet errors, such as a belief that most errors are due to spreadsheet technology or to sloppiness rather than being due primarily to inherent human error.
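    The 96%-99% per-formula accuracy figure compounds quickly. As a back-of-the-envelope sketch in Python, assuming (my simplification, not the paper's) that each formula is right or wrong independently of the others, the chance that a sheet contains at least one error is one minus the accuracy raised to the number of formulas. The function name is made up for illustration.

```python
def chance_of_any_error(per_formula_accuracy, formula_count):
    """Probability that at least one of `formula_count` formulas is wrong,
    assuming errors are independent (a simplifying assumption)."""
    return 1.0 - per_formula_accuracy ** formula_count


for accuracy in (0.99, 0.96):
    for count in (10, 100):
        chance = chance_of_any_error(accuracy, count)
        print(f"accuracy {accuracy:.0%}, {count} formulas: {chance:.1%} chance of an error")
```

    Even at 99% accuracy, a sheet with 100 formulas has roughly a 63% chance of containing an error; at 96% it is around 98%.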

    (tags: spreadsheets errors programming coding bugs research papers via:arclight)

The Immich core team goes full-time

  • The Immich core team goes full-time

    Interesting — the Immich photo hosting open source project is switching IP ownership, and core team employment, to a private company:

    Since the beginning of this adventure, my goal has always been to create a better world for my children. Memories are priceless, and privacy should not be a luxury. However, building quality open source has its challenges. Over the past two years, it has taken significant dedication, time, and effort. Recently, a company in Austin, Texas, called FUTO contacted the team. FUTO strives to develop quality and sustainable open software. They build software alternatives that focus on giving control to users. From their mission statement: “Computers should belong to you, the people. We develop and fund technology to give them back.” FUTO loved Immich and wanted to see if we’d consider working with them to take the project to the next level. In short, FUTO offered to: Pay the core team to work on Immich full-time Let us keep full autonomy about the project’s direction and leadership Continue to license Immich under AGPL Keep Immich’s development direction with no paywalled features Keep Immich “built for the people” (no ads, data mining/selling, or alternative motives) Provide us with financial, technical, legal, and administrative support
    Here are FUTO’s “three pledges”:
    We will never sell out. All FUTO companies and FUTO-funded projects are expected to remain fiercely independent. They will never exacerbate the monopoly problem by selling out to a monopolist. We will never abuse our customers. All FUTO companies and FUTO-funded projects are expected to maintain an honest relationship with their customers. Revenue, if it exists, comes from customers paying directly for software and services. “The users are our product” revenue models are strictly prohibited. We will always be transparently devoted to making delightful software. All FUTO-funded projects are expected to be open-source or develop a plan to eventually become so. No effort will ever be taken to hide from the people what their computers are doing, to limit how they use them, or to modify their behavior through their software.
    I’m not 100% clear on how FUTO will make money, but this is a very interesting move.

    (tags: futo immich open-source photos agpl ip ownership work how-we-work)

How did Ethernet get its 1500-byte MTU?

  • How did Ethernet get its 1500-byte MTU?

    Now this is a great bit of networking trivia!

    1500 bytes is a bit out there as numbers go, or at least it seems that way if you touch computers for a living. It’s not a power of two or anywhere close, it’s suspiciously base-ten-round, and computers don’t care all that much about base ten, so how did we get here? Well, today I learned that if you add the Ethernet header – 36 bytes – then an MTU of 1500 plus that header is 1536 bytes, which is 12288 bits, which takes 2^12 microseconds to transmit at 3Mb/second, and because the Xerox Alto computer for which Ethernet was invented had an internal data path that ran at 3Mhz, then you could just write the bits into the Alto’s memory at the precise speed at which they arrived, saving the very-expensive-then cost of extra silicon for an interface or any buffering hardware. Now, “we need to pick just the right magic number here so we can take data straight off the wire and blow it directly into the memory of this specific machine over there” is, to any modern sensibilities, lunacy. It’s obviously, dangerously insane, there are far too many computers and bad people with computers in the world for that. But back when the idea of network security didn’t exist because computers barely existed, networks mostly didn’t exist and unvetted and unsanctioned access to those networks definitely didn’t exist, I bet it seemed like a very reasonable tradeoff. It really is amazing how many of the things we sort of ambiently accept as standards today, if we even realize we’re making that decision at all, are what they are only because some now-esoteric property of the now-esoteric hardware on which the tech was first invented let the inventors save a few bucks.
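    Taking the quoted figures at face value, the arithmetic is easy to check in a few lines of Python. The constant names are made up for illustration; the numbers are the ones in the quote.

```python
MTU_BYTES = 1500
HEADER_BYTES = 36                 # header size as given in the quote
LINK_BITS_PER_SECOND = 3_000_000  # 3Mb/second

frame_bytes = MTU_BYTES + HEADER_BYTES
frame_bits = frame_bytes * 8
# Time on the wire, in microseconds, at the quoted link speed.
transmit_microseconds = frame_bits * 1_000_000 // LINK_BITS_PER_SECOND

print(frame_bytes)                       # 1536
print(frame_bits)                        # 12288
print(transmit_microseconds)             # 4096
print(transmit_microseconds == 2 ** 12)  # True
```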

    (tags: ethernet networking magic-numbers via:itc hardware history xerox alto)

American flag sort

  • American flag sort

    An efficient, in-place variant of radix sort that distributes items into hundreds of buckets. The first step counts the number of items in each bucket, and the second step computes where each bucket will start in the array. The last step cyclically permutes items to their proper bucket. Since the buckets are in order in the array, there is no collection step. The name comes by analogy with the Dutch national flag problem in the last step: efficiently partition the array into many “stripes”. Using some efficiency techniques, it is twice as fast as quicksort for large sets of strings. See also histogram sort. Note: This works especially well when sorting a byte at a time, using 256 buckets.
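    The three steps described above can be sketched in Python for a list of byte strings, sorting one byte position at a time. This is an illustrative, unoptimised version under my own assumptions: the function name, the extra bucket 0 for strings that have run out of bytes, and the recursive structure are choices made for clarity, not taken from the original description.

```python
def american_flag_sort(items, lo=0, hi=None, depth=0):
    """Sort a list of bytes objects in place, one byte position at a time."""
    if hi is None:
        hi = len(items)
    if hi - lo < 2:
        return

    def bucket(s):
        # Bucket 0: string ends before this position; buckets 1..256: byte values 0..255.
        return s[depth] + 1 if depth < len(s) else 0

    # Step 1: count the number of items in each bucket.
    counts = [0] * 257
    for i in range(lo, hi):
        counts[bucket(items[i])] += 1

    # Step 2: compute where each bucket starts (and ends) in the array.
    starts = [0] * 257
    pos = lo
    for b in range(257):
        starts[b] = pos
        pos += counts[b]
    ends = [starts[b] + counts[b] for b in range(257)]

    # Step 3: permute items into their buckets by swapping each misplaced
    # item into the next free slot of the bucket it belongs to.
    next_free = starts[:]
    for b in range(257):
        while next_free[b] < ends[b]:
            src = next_free[b]
            target = bucket(items[src])
            if target == b:
                next_free[b] += 1
            else:
                dest = next_free[target]
                items[src], items[dest] = items[dest], items[src]
                next_free[target] += 1

    # Recurse into each bucket on the next byte; bucket 0 holds identical strings.
    for b in range(1, 257):
        if counts[b] > 1:
            american_flag_sort(items, starts[b], ends[b], depth + 1)


words = [b"banana", b"apple", b"cherry", b"app", b"band"]
american_flag_sort(words)
print(words)
```

    Because the buckets already sit in order inside the array, nothing has to be gathered back together afterwards, which is the point of the "no collection step" remark.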

    (tags: algorithms sorting sort radix-sort performance quicksort via:hn)

How web bloat impacts users with slow devices

  • How web bloat impacts users with slow devices

    CPU performance for web apps hasn’t scaled nearly as quickly as bandwidth so, while more of the web is becoming accessible to people with low-end connections, more of the web is becoming inaccessible to people with low-end devices even if they have high-end connections. For example, if I try browsing a “modern” Discourse-powered forum on a Tecno Spark 8C, it sometimes crashes the browser. Between crashes, on measuring the performance, the responsiveness is significantly worse than browsing a BBS with an 8 MHz 286 and a 1200 baud modem.

    (tags: dan-luu performance web bloat cpu hardware internet profiling)

Ex-Amazon AI exec claims she was asked to ignore IP law

  • Ex-Amazon AI exec claims she was asked to ignore IP law

    This is really appalling stuff, on two counts: (a) how does it not surprise me that maternity leave was considered “weak” and grounds for firing. (b) check this shit out:

    According to Ghaderi’s account in the complaint, she returned to work after giving birth in January 2023, inheriting a large language model project. Part of her role was flagging violations of Amazon’s internal copyright policies and escalating these concerns to the in-house legal team. In March 2023, the filing claims, her team director, Andrey Styskin, challenged Ghaderi to understand why Amazon was not meeting its goals on Alexa search quality. The filing alleges she met with a representative from the legal department to explain her concerns and the tension they posed with the “direction she had received from upper management, which advised her to violate the direction from legal.” According to the complaint, Styskin rejected Ghaderi’s concerns, allegedly telling her to ignore copyright policies to improve the results. Referring to rival AI companies, the filing alleges he said: “Everyone else is doing it.”
    Move fast and break laws!

    (tags: aws amazon llms alexa maternity-leave parenting parental-leave work dont-be-evil copyright ip ai)

“Randar” exploit for Minecraft

  • “Randar” exploit for Minecraft

    This is great — I love a good pRNG state-leakage exploit:

    Every time a block is broken in Minecraft versions Beta 1.8 through 1.12.2, the precise coordinates of the dropped item can reveal another player’s location. “Randar” is an exploit for Minecraft which uses LLL lattice reduction to crack the internal state of an incorrectly reused java.util.Random in the Minecraft server, then works backwards from that to locate other players currently loaded into the world.
    Don’t reuse those java.util.Randoms! (via Dan Hon)
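    To see why a leaked java.util.Random state is so damaging: the generator is a 48-bit linear congruential generator, so anyone holding its internal state can run it backwards as easily as forwards. The Python sketch below uses the documented java.util.Random multiplier and addend; the function names are mine, and this shows only the "works backwards" part, not the lattice-reduction step that Randar uses to recover the state in the first place.

```python
MULTIPLIER = 0x5DEECE66D   # java.util.Random's LCG multiplier
ADDEND = 0xB               # java.util.Random's LCG addend
MASK = (1 << 48) - 1       # the state is 48 bits wide

# The multiplier is odd, so it has a modular inverse modulo 2**48 (Python 3.8+).
INVERSE = pow(MULTIPLIER, -1, 1 << 48)


def step_forward(state):
    """Advance the internal state by one step, as each random draw does."""
    return (state * MULTIPLIER + ADDEND) & MASK


def step_backward(state):
    """Undo one step: recover the state that preceded `state`."""
    return ((state - ADDEND) * INVERSE) & MASK


state = 123456789
later = step_forward(step_forward(state))
print(step_backward(step_backward(later)) == state)  # True
```

    Once the state is known, every earlier and later output of that Random instance is known too, which is why sharing one instance between unrelated game events leaks so much.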

    (tags: exploits security infosec minecraft prngs rngs random coding via:danhon)

NHS and OpenSAFELY

  • NHS and OpenSAFELY

    It seems the UK have created a “Trusted Research Environment” for working with the extremely privacy-sensitive datasets around NHS users’ health data, using OpenSAFELY; it is basically a hosting environment allowing the execution of user-submitted Python query code, which must be open source, hosted on GitHub, designed with care to avoid releasing user-identifying sensitive data, and of course fully auditable. This looks like a decent advance in privacy-sensitive technology! Example code, from the OpenSAFELY tutorial docs:

```python
from ehrql import create_dataset
from ehrql.tables.core import patients, medications

dataset = create_dataset()

# Restrict the population to patients born on or before the given date.
dataset.define_population(patients.date_of_birth.is_on_or_before("1999-12-31"))

asthma_codes = ["39113311000001107", "39113611000001102"]

# For each patient, pick the most recent medication record matching those codes.
latest_asthma_med = (
    medications.where(medications.dmd_code.is_in(asthma_codes))
    .sort_by(medications.date)
    .last_for_patient()
)

dataset.asthma_med_date = latest_asthma_med.date
dataset.asthma_med_code = latest_asthma_med.dmd_code
```

    (tags: privacy data-protection nhs medical-records medicine research python sql opensafely uk)

Recommending Toxicity: How TikTok and YouTube Shorts are bombarding boys and men with misogynist content

  • Recommending Toxicity: How TikTok and YouTube Shorts are bombarding boys and men with misogynist content

    This is, frankly, disgusting.

    A new study from Dublin City University’s Anti-Bullying Centre shows that the recommender algorithms used by social media platforms are rapidly amplifying misogynistic and male supremacist content. The study, conducted by Professor Debbie Ging, Dr Catherine Baker and Dr Maja Andreasen, tracked, recorded and coded the content recommended to 10 experimental or ‘sockpuppet’ accounts on 10 blank smartphones – five on YouTube Shorts and five on TikTok. The researchers found that all of the male-identified accounts were fed masculinist, anti-feminist and other extremist content, irrespective of whether they sought out general or male supremacist-related content, and that they all received this content within the first 23 minutes of the experiment. Once the account showed interest by watching this sort of content, the amount rapidly increased. By the last round of the experiment (after 400 videos or two to three hours viewing), the vast majority of the content being recommended to the phones was toxic (TikTok 76% and YouTube Shorts 78%), primarily falling into the manosphere (alpha male and anti-feminist) category.

    (tags: tiktok youtube hate misogyny dcu research social-media)

How many bathrooms have Neanderthals in the tile?

  • How many bathrooms have Neanderthals in the tile?

    The [Reddit] poster is a dentist and visited his parents’ house to see the new travertine they installed. It’s no surprise that he recognized something right away: […] A section cut at a slight angle through a very humanlike jaw! […] The Reddit user who posted the story (Kidipadeli75) has followed up with some updates over the course of the day. The travertine was sourced in Turkey, and a close search of some of the other installed panels revealed some other interesting possible fossils, although none are as strikingly identifiable as the mandible. A number of professionals have reached out to offer assistance and I have no doubt that they will be able to learn a lot about the ancient person whose jaw ended up in this rock. This naturally raises a broader question: How many other people have installed travertine with hominin fossils inside?

    (tags: reddit mandibles bones archaeology history neanderthals travertine turkey)

AI and Israel’s Dystopian Promise of War without Responsibility

  • AI and Israel’s Dystopian Promise of War without Responsibility

    From the Center for International Policy:

    In Gaza we see an “indiscriminate” and “over the top” bombing campaign being actively rebranded by Israel as a technological step up, when in actuality there is currently no evidence that their so-called Gospel has produced results qualitatively better than those made by minds of flesh and blood. Instead, Israel’s AI has produced an endless list of targets with a decidedly lower threshold for civilian casualties. Human eyes and intelligence are demoted to rubber stamping a conveyor belt of targets as fast they can be bombed. It’s a path that the US military and policy makers should not only be wary of treading, but should reject loudly and clearly. In the future we may develop technology worthy of the name Artificial Intelligence, but we are not there yet. Currently the only promise a system such as Gospel AI holds is the power to occlude responsibility, to allow blame to fall on the machine picking the victims instead of the mortals providing the data.

    (tags: ai war grim-meathook-future israel gaza automation war-crimes lavender gospel)

Quick plug for Cronitor.IO

Quick plug for a good tool for self-hosting — Cronitor.io. I have been using this for the past year or so as I migrate more of my personal stuff off cloud and back onto self-hosted setups, and it’s been a really nice way to monitor simple cron-driven home workloads, and (together with graphite/grafana alerts) has saved my bacon many times. Integrates nicely with Slack, or even PagerDuty (although that would be overkill for my setup for sure).

90-GWh thermal energy storage facility could heat a city for a year

  • 90-GWh thermal energy storage facility could heat a city for a year

    Some cool green engineering:

    The project has a total volume of 1.1 million cubic meters (38.85 million cubic feet), including processing facilities, and will be built into [Vantaa]’s bedrock at around 100 m (330 ft) below ground – though the deepest parts of the setup could go down as far as 140 m. Three caverns will be created, each measuring 300 m (984.25 ft) in length, 40 m (131.2 ft) in height and 20 m (65.6 ft) in width. These will be filled with hot water by a pair of 60-MW electric boilers, powered by renewables when it’s cheap to do so. Pressure within the space allows for temperatures to get as high as 140 °C (284 °F) without the water boiling over or steaming away. Waste heat from industry will also feed the setup, with a smart control system balancing energy sources. The Varanto facility is reported to have a total thermal capacity of 90 GWh when “fully charged” – enough to meet the year-round domestic heating needs of a “medium-sized Finnish city.”

    (tags: engineering finland district-heating energy energy-storage caves cool)
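
A quick back-of-envelope check on the Varanto figures quoted above, as a rough sketch rather than anything from the article. The cavern dimensions, boiler ratings and 90 GWh capacity are from the quote. The water density and specific heat are my own round-number assumptions (hot pressurised water is somewhat less dense), and I am assuming the caverns are completely filled and ignoring losses and the waste-heat input:

```python
# Figures taken from the quoted article.
CAVERN_COUNT = 3
CAVERN_LENGTH_M = 300
CAVERN_HEIGHT_M = 40
CAVERN_WIDTH_M = 20
CAPACITY_GWH = 90
BOILER_COUNT = 2
BOILER_POWER_MW = 60

# Assumptions (not from the article): round-number properties of water.
WATER_DENSITY_KG_PER_M3 = 1000.0
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0


def cavern_volume_m3():
    """Combined volume of the three caverns, excluding other facilities."""
    return CAVERN_COUNT * CAVERN_LENGTH_M * CAVERN_HEIGHT_M * CAVERN_WIDTH_M


def charge_time_hours():
    """Hours for the electric boilers alone to deliver the full capacity."""
    capacity_mwh = CAPACITY_GWH * 1000
    return capacity_mwh / (BOILER_COUNT * BOILER_POWER_MW)


def implied_temperature_swing_k():
    """Temperature swing needed to store the capacity as sensible heat."""
    capacity_joules = CAPACITY_GWH * 1e9 * 3600  # 1 Wh = 3600 J
    water_mass_kg = cavern_volume_m3() * WATER_DENSITY_KG_PER_M3
    return capacity_joules / (water_mass_kg * WATER_SPECIFIC_HEAT_J_PER_KG_K)


if __name__ == "__main__":
    print(cavern_volume_m3())             # 720000 cubic metres
    print(charge_time_hours())            # 750.0 hours, about a month
    print(implied_temperature_swing_k())  # a little over 107 K
```

Under those assumptions the numbers hang together: the caverns account for about 720,000 of the 1.1 million cubic metres, and a swing of a bit over 100 K is in the same ballpark as cycling between 140 °C and a typical district-heating return temperature.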

AWS told to pay $525M in cloud storage patent suit – The Register

leaked Kremlin documents detailing current Russian troll tactics

  • leaked Kremlin documents detailing current Russian troll tactics

    A rare view into Russia’s current propaganda tactics, really useful to spot it in action:

    In an ongoing campaign that seeks to influence congressional and other political debates to stoke anti-Ukraine sentiment, Kremlin-linked political strategists and trolls have written thousands of fabricated news articles, social media posts and comments that promote American isolationism, stir fear over the United States’ border security and attempt to amplify U.S. economic and racial tensions, according to a trove of internal Kremlin documents obtained by a European intelligence service […] One of the political strategists … instructed a troll farm employee working for his firm to write a comment of “no more than 200 characters in the name of a resident of a suburb of a major city.” The strategist suggested that this fictitious American “doesn’t support the military aid that the U.S. is giving Ukraine and considers that the money should be spent defending America’s borders and not Ukraine’s. He sees that Biden’s policies are leading the U.S. toward collapse.” … The files are part of a series of leaks that have allowed a rare glimpse into Moscow’s parallel efforts to weaken support for Ukraine in France and Germany, as well as destabilize Ukraine itself … [via] the creation of websites designed to impersonate legitimate media outlets in Europe, part of a campaign that Western officials have called “Doppelganger”. Plans by Gambashidze’s team refer to using “short-lived” social media accounts aimed at avoiding detection. Social media manipulators have established a technique of using accounts to send out links to material and then deleting their posts or accounts once others have reshared the content. The idea is to obscure the true origin of misleading information and keep the channel open for future influence operations, disinformation researchers said. 
    Propaganda operatives have used another technique to spread just a web address, rather than the words in a post, to frustrate searches for that material, according to the social media research company Alethea, which called the tactic “writing with invisible ink.” Other obfuscation tricks include redirecting viewers through a series of seemingly random websites until they arrive at a deceptive article. One of the documents reviewed by The Post called for the use of Trump’s Truth Social platform as the only way to disseminate posts “without censorship,” while “short-lived” accounts would be created for Facebook, Twitter (now known as X) and YouTube. “You just have to push content every single day … someone will stumble over it, a politician or celebrity will find it over time just based on the availability of content.”
    “Flooding the zone with shit”, as Steve Bannon put it.

    (tags: propaganda russia tactics spam trolls troll-farms destabilization social-media)

How Tech Giants Cut Corners to Harvest Data for A.I. – The New York Times

  • How Tech Giants Cut Corners to Harvest Data for A.I. – The New York Times

    Can’t wait for all the lawsuits around this stuff.

    Meta could not match ChatGPT unless it got more data, Mr. Al-Dahle told colleagues. In March and April 2023, some of the company’s business development leaders, engineers and lawyers met nearly daily to tackle the problem. [….] They also talked about how they had summarized books, essays and other works from the internet without permission and discussed sucking up more, even if that meant facing lawsuits. One lawyer warned of “ethical” concerns around taking intellectual property from artists but was met with silence, according to the recordings.

    (tags: ai copyright data training openai meta google privacy surveillance data-protection ip)

Python Mutable Defaults Are The Source of All Evil