

Links for 2023-04-14

  • “Why Banker Bob (still) Can’t Get TLS Right: A Security Analysis of TLS in Leading UK Banking Apps”

    Jaysus this is a litany of failure.

    Abstract. This paper presents a security review of the mobile apps provided by the UK’s leading banks; we focus on the connections the apps make, and the way in which TLS is used. We apply existing TLS testing methods to the apps which only find errors in legacy apps. We then go on to look at extensions of these methods and find five of the apps have serious vulnerabilities. In particular, we find an app that pins a TLS root CA certificate, but does not verify the hostname. In this case, the use of certificate pinning means that all existing test methods would miss detecting the hostname verification flaw. We also find one app that doesn’t check the certificate hostname, but bypasses proxy settings, resulting in failed detection by pentesting tools. We find that three apps load adverts over insecure connections, which could be exploited for in-app phishing attacks. Some of the apps used the users’ PIN as authentication, for which PCI guidelines require extra security, so these apps use an additional cryptographic protocol; we study the underlying protocol of one banking app in detail and show that it provides little additional protection, meaning that an active man-in-the-middle attacker can retrieve the user’s credentials, log in to the bank and perform every operation the legitimate user could.
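
    The "pins a root CA but skips hostname verification" flaw from the abstract can be illustrated with a minimal Python sketch. This is not the paper's test method or any bank's code; the helper names make_client_context and lacks_hostname_check are hypothetical, and only the standard library ssl module is used. The point is that restricting the trust anchors and checking the hostname are two separate settings, so a pinned context can still accept a certificate issued by the same CA for a different host.

```python
import ssl


def make_client_context(pinned_ca_file=None, verify_hostname=True):
    """Build a TLS client context (hypothetical helper for illustration).

    pinned_ca_file: optional path to a PEM file holding the only CA to trust.
    verify_hostname: whether the certificate must match the server name.
    """
    # PROTOCOL_TLS_CLIENT starts with CERT_REQUIRED and check_hostname=True.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if pinned_ca_file is not None:
        # "Pinning": trust only the CA certificate(s) in this file.
        ctx.load_verify_locations(cafile=pinned_ca_file)
    # The hostname check is independent of which CAs are trusted.
    ctx.check_hostname = verify_hostname
    return ctx


def lacks_hostname_check(ctx):
    """True if the chain is validated but the hostname is never checked."""
    return ctx.verify_mode == ssl.CERT_REQUIRED and not ctx.check_hostname


if __name__ == "__main__":
    safe = make_client_context()
    flawed = make_client_context(verify_hostname=False)
    print("safe context lacks hostname check:", lacks_hostname_check(safe))
    print("flawed context lacks hostname check:", lacks_hostname_check(flawed))
```

    With the flawed configuration, any certificate chaining to the pinned CA is accepted regardless of the name it was issued for, which is consistent with the abstract's remark that pinning can hide the hostname flaw from test tools that present certificates from a different CA.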

    (tags: ssl tls certificates certificate-pinning security infosec banking apps uk pci mobile)

  • Using DuckDB to repartition parquet data in S3

    Wow, DuckDB is very impressive — I had no idea it could handle SELECTs against Parquet data in S3:

    A common pattern to ingest streaming data and store it in S3 is to use Kinesis Data Firehose Delivery Streams, which can write the incoming stream data as batched parquet files to S3. You can use custom S3 prefixes with it when using Lambda processing functions, but by default, you can only partition the data by the timestamp (the timestamp the event reached the Kinesis Data Stream, not the event timestamp!). So, a few common use cases for data repartitioning could include: repartitioning the written data for the real event timestamp if it’s included in the incoming data; repartitioning the data for other query patterns, e.g. to support query filter pushdown and optimize query speeds and costs; and aggregation of raw or preprocessed data, and storing them in an optimized manner to support analytical queries.
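
    As a rough sketch of the first use case (repartitioning by the real event timestamp), the following Python builds a DuckDB statement that reads Parquet files from one S3 prefix and writes them back Hive-partitioned by year, month and day. It is not the linked article's code: the bucket paths, the event_ts column name and both helper functions are assumptions for illustration. The run_repartition step needs the third-party duckdb package and S3 credentials, whose configuration is omitted here.

```python
def build_repartition_sql(source_glob, target_prefix, event_ts_column="event_ts"):
    """Return a DuckDB COPY statement (hypothetical helper).

    source_glob: S3 glob of the Firehose-written files (assumed path).
    target_prefix: S3 prefix for the repartitioned output (assumed path).
    event_ts_column: column holding the real event time (assumed name).
    """
    ts = f"CAST({event_ts_column} AS TIMESTAMP)"
    return (
        "COPY (\n"
        "    SELECT *,\n"
        f"           year({ts}) AS year,\n"
        f"           month({ts}) AS month,\n"
        f"           day({ts}) AS day\n"
        f"    FROM read_parquet('{source_glob}')\n"
        f") TO '{target_prefix}' "
        "(FORMAT PARQUET, PARTITION_BY (year, month, day));"
    )


def run_repartition(sql):
    """Execute the statement with DuckDB; requires `pip install duckdb`."""
    import duckdb  # third-party; imported lazily so the builder works without it

    con = duckdb.connect()
    # The httpfs extension provides s3:// access; credentials are assumed
    # to be configured separately.
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    con.execute(sql)


if __name__ == "__main__":
    # Hypothetical bucket and prefixes, for illustration only.
    print(build_repartition_sql(
        "s3://example-bucket/raw/*/*/*/*/*.parquet",
        "s3://example-bucket/by-event-date",
    ))
```

    Keeping the SQL builder separate from execution lets the statement be inspected before anything is written to S3. Note that the derived year, month and day columns would clash with existing columns of the same names in the source data.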

    (tags: duckdb repartitioning s3 parquet orc hive kinesis firehose)

  • Timnit Gebru’s anti-‘AI pause’

    Couldn’t agree more with Timnit Gebru’s comments here:

    What is your appeal to policymakers? What would you want Congress and regulators to do now to address the concerns you outline in the open letter?

    Congress needs to focus on regulating corporations and their practices, rather than playing into their hype of “powerful digital minds.” This, by design, ascribes agency to the products rather than the organizations building them. This language obfuscates the amount of data that is being collected — and the amount of worker exploitation involved with those who are labeling and supplying the datasets, and moderating model outputs. Congress needs to ensure corporations are not using people’s data without their consent, and hold them responsible for the synthetic media they produce — whether it is text or media spewing disinformation, hate speech or other types of harmful content. Regulations need to put the onus on corporations, rather than understaffed agencies. There are probably existing regulations these organizations are breaking. There are mundane “AI” systems being used daily; we just heard about another Black man being wrongfully arrested because of the use of automated facial analysis systems. But that’s not what we’re talking about, because of the hype.

    (tags: data privacy ai ml openai monopoly)