Links for 2023-10-03

  • Vector Embeddings

    Interesting technique from the LLM community to search, cluster and classify text strings:

    Text [vector] embeddings measure the relatedness of text strings. Embeddings are commonly used for:

      • Search (where results are ranked by relevance to a query string)
      • Clustering (where text strings are grouped by similarity)
      • Recommendations (where items with related text strings are recommended)
      • Anomaly detection (where outliers with little relatedness are identified)
      • Diversity measurement (where similarity distributions are analyzed)
      • Classification (where text strings are classified by their most similar label)

    An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.

    Commonly used as a storage format in vector databases (cf. […]). Search using text embeddings is therefore implemented using cosine similarity or k-nearest-neighbour search to find similar vectors. Looks like […] is the current open source vector DB of choice, at the moment. (via Simon Willison)
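
    As a rough illustration of the search idea above, here is a minimal brute-force sketch in plain Python. The three-dimensional vectors, the `index` dict and the function names are made up for the example; real embeddings come from a model and have hundreds or thousands of dimensions, and a vector database would normally use an approximate index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbours(query, index, k=2):
    """Return the k stored texts whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), text) for text, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]


# Toy "embeddings", invented purely for illustration.
index = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
}

query = [1.0, 0.0, 0.0]  # stands in for the embedding of a query string
print(nearest_neighbours(query, index, k=2))
```

    The same ranking step covers the "classification" use too: embed each candidate label, then pick the label whose vector is nearest to the text's vector.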

    (tags: ai openai via:simonw vector-embeddings text-embeddings text storage databases search similarity clustering recommendations anomaly-detection classification vector-databases)

  • Covid inquiry: UK’s top pandemic scientist gives damning verdict on Boris Johnson and Rishi Sunak

    None of this is remotely surprising, unfortunately:

    The inquiry also heard that in October 2020, Mr Johnson wrote “bollocks” in capital letters across a Department of Health guidance document on Long Covid, from which it is estimated more than a million people are suffering. Anthony Metzer KC, representing Long Covid sufferers, said the former PM has admitted in his own witness statement that he did not believe the condition “truly existed”

    (tags: long-covid boris-johnson politics uk covid-19 patrick-vallance)
