Three major LLM releases in 24 hours
Google Gemini Pro 1.5 is free, GPT-4 Turbo has Vision, Mixtral 8x22B released in a tweet
In this newsletter:
Three major LLM releases in 24 hours
Plus 7 links
Three major LLM releases in 24 hours - 2024-04-10
I'm a bit behind on my weeknotes, so there's a lot to cover here. But first... a review of the last 24 hours of Large Language Model news. All times are in US Pacific.
11:01am: Google Gemini Pro 1.5 hits general availability, here's the blog post - their 1 million token context GPT-4 class model now has no waitlist, is available to anyone in 180 countries (not including Europe or the UK as far as I can tell) and, most impressively of all, the API has a free tier that allows up to 50 requests a day, though rate limited to 2 per minute. Beyond that you can pay $7/million input tokens and $21/million output tokens, which is slightly less than GPT-4 Turbo and a little more than Claude 3 Sonnet. Gemini Pro also now supports audio inputs and system prompts.
11:44am: OpenAI finally released the non-preview version of GPT-4 Turbo, integrating GPT-4 Vision directly into the model (previously it was separate). Vision mode now supports both functions and JSON output, previously unavailable for image inputs - there's a sketch of that combination below the timeline. OpenAI also claim that the new model is "Majorly improved", but no-one knows what they mean by that.
6:20pm (3:20am in their home country of France): Mistral tweet a magnet link for a 281GB BitTorrent download of Mixtral 8x22B - their latest openly licensed model release, significantly larger than their previous best open model Mixtral 8x7B. I've not seen anyone get this running yet but it's likely to perform extremely well, given how good the original Mixtral was.
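The image inputs plus JSON mode combination from the GPT-4 Turbo release is the part most relevant to my own tools. Here's a minimal sketch of that kind of call using the openai Python library - treat the gpt-4-turbo model ID and the image URL as placeholders:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",  # assumed ID for the new non-preview model
    response_format={"type": "json_object"},  # JSON mode, now usable with image inputs
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this photo as a JSON object"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)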
And while it wasn't released today (it came out last week), this morning Cohere's Command R+ (an excellent openly licensed model) reached position 6 on the LMSYS Chatbot Arena Leaderboard - the highest ever ranking for an open weights model.
Since I have a lot of software that builds on these models, I spent a bunch of time today publishing new releases of things.
Datasette Extract with GPT-4 Turbo Vision
I've been working on Datasette Extract for a while now: it's a plugin for Datasette that adds structured data extraction from unstructured text, powered by GPT-4 Turbo.
I updated it for the new model releases this morning, and decided to celebrate by making a video showing what it can do:
I want to start publishing videos like this more often, so this felt like a great opportunity to put that into practice.
The Datasette Cloud blog hasn't had an entry in a while, so I published screenshots and notes there to accompany the video.
Gemini Pro 1.5 system prompts
I really like system prompts - extra prompts you can pass to an LLM that give it instructions about how to process the main input. They're sadly not a guaranteed solution for prompt injection - even with instructions separated from data by a system prompt you can still override them in the main prompt if you try hard enough - but they're still useful for non-adversarial situations.
llm-gemini 0.1a2 adds support for them, so now you can do things like this:
llm -m p15 'say hi three times three different ways' \
--system 'in spanish'
And get back output like this:
¡Hola! 👋 ¡Buenos días! ☀️ ¡Buenas tardes! 😊
Interestingly "in german" doesn't include emoji, but "in spanish" does.
I had to reverse-engineer the REST format for sending a system prompt from the Python library as the REST documentation hasn't been updated yet - notes on that in my issue.
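For anyone else experimenting before the documentation catches up, here's a rough sketch of what such a request can look like. The endpoint, model ID and systemInstruction field name are assumptions on my part rather than anything from the official REST docs:

import os, requests

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-pro-latest:generateContent"
)
payload = {
    # Field names here are assumptions - the REST docs hadn't been updated yet
    "systemInstruction": {"parts": [{"text": "in spanish"}]},
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "say hi three times three different ways"}],
        }
    ],
}
response = requests.post(
    url,
    params={"key": os.environ["GEMINI_API_KEY"]},  # wherever you keep your key
    json=payload,
)
print(response.json())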
datasette-enrichments-gpt using GPT-4 Turbo
Another small release: the datasette-enrichments-gpt plugin can enrich data in a table by running prompts through GPT-3.5, GPT-4 Turbo or GPT-4 Vision. I released version 0.4 switching to the new GPT-4 Turbo model.
Everything else
That covers today... but my last weeknotes were nearly four weeks ago! Here's everything else, with a few extra annotations:
Blog entries
All five of my most recent posts are about ways that I use LLM tools in my own work - see also my How I use LLMs and ChatGPT series.
Running OCR against PDFs and images directly in your browser
Building and testing C extensions for SQLite with ChatGPT Code Interpreter
Releases
Many of these releases relate to ongoing work on Datasette Cloud. In particular there's a flurry of minor releases to add descriptions to the action menu items added by various plugins, best illustrated by this screenshot:
datasette-enrichments-gpt 0.4 - 2024-04-10
Datasette enrichment for analyzing row data using OpenAI's GPT models
llm-gemini 0.1a2 - 2024-04-10
LLM plugin to access Google's Gemini family of models
datasette-public 0.2.3 - 2024-04-09
Make specific Datasette tables visible to the public
datasette-enrichments 0.3.2 - 2024-04-09
Tools for running enrichments against data stored in Datasette
datasette-extract 0.1a4 - 2024-04-09
Import unstructured data (text and images) into structured tables
datasette-cors 1.0 - 2024-04-08
Datasette plugin for configuring CORS headers
asgi-cors 1.0 - 2024-04-08
ASGI middleware for applying CORS headers to an ASGI application
files-to-prompt 0.2.1 - 2024-04-08
Concatenate a directory full of files into a single prompt for use with LLMs
datasette-embeddings 0.1a3 - 2024-04-08
Store and query embedding vectors in Datasette tables
datasette-studio 0.1a3 - 2024-04-06
Datasette pre-configured with useful plugins. Experimental alpha.
datasette-paste 0.1a5 - 2024-04-06
Paste data to create tables in Datasette
datasette-import 0.1a4 - 2024-04-06
Tools for importing data into Datasette
datasette-enrichments-quickjs 0.1a2 - 2024-04-05
Enrich data with a custom JavaScript function
s3-credentials 0.16.1 - 2024-04-05
A tool for creating credentials for accessing S3 buckets
llm-command-r 0.2 - 2024-04-04
Access the Cohere Command R family of models
llm-nomic-api-embed 0.1 - 2024-03-30
Create embeddings for LLM using the Nomic API
textract-cli 0.1 - 2024-03-29
CLI for running files through AWS Textract
llm-cmd 0.1a0 - 2024-03-26
Use LLM to generate and execute commands in your shell
datasette-write 0.3.2 - 2024-03-18
Datasette plugin providing a UI for executing SQL writes against the database
TILs
impaste: pasting images to piped commands on macOS - 2024-04-04
Installing tools written in Go - 2024-03-26
Google Chrome --headless mode - 2024-03-24
Reviewing your history of public GitHub repositories using ClickHouse - 2024-03-20
Running self-hosted QuickJS in a browser - 2024-03-20
Programmatically comparing Python version strings - 2024-03-17
Link 2024-04-09 Hello World:
Lennon McLean dives deep down the rabbit hole of what happens when you execute the binary compiled from "Hello world" in C on a Linux system, digging into the details of ELF executables, objdump disassembly, the C standard library, stack frames, null-terminated strings and taking a detour through musl because it's easier to read than Glibc.
Link 2024-04-09 llm.c:
Andrej Karpathy implements LLM training - initially for GPT-2, other architectures to follow - in just over 1,000 lines of C on top of CUDA. Includes a tutorial about implementing LayerNorm by porting an implementation from Python.
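As a refresher on what that layer computes, here's the LayerNorm forward pass as a few lines of numpy - a generic sketch, not code taken from llm.c or its tutorial:

import numpy as np

def layernorm_forward(x, gamma, beta, eps=1e-5):
    # Normalize along the last axis to zero mean and unit variance,
    # then apply the learned scale (gamma) and shift (beta).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta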
Link 2024-04-09 Command R+ now ranked 6th on the LMSYS Chatbot Arena:
The LMSYS Chatbot Arena Leaderboard is one of the most interesting approaches to evaluating LLMs because it captures their ever-elusive "vibes" - it works by having users vote on the best responses to prompts from two initially hidden models.
Big news today is that Command R+ - the brand new open weights model (Creative Commons non-commercial) by Cohere - is now the highest ranked non-proprietary model, in at position six and beating one of the GPT-4s.
(Linking to my screenshot on Mastodon.)
Link 2024-04-09 A solid pattern to build LLM Applications (feat. Claude):
Hrishi Olickel is one of my favourite prompt whisperers. In this YouTube video he walks through his process for building quick interactive applications with the assistance of Claude 3, spinning up an app that analyzes his meeting transcripts to extract participants and mentioned organisations, then presents a UI for exploring the results built with Next.js and shadcn/ui.
An interesting tip I got from this: use the weakest, not the strongest models to iterate on your prompts. If you figure out patterns that work well with Claude 3 Haiku they will have a significantly lower error rate with Sonnet or Opus. The speed of the weaker models also means you can iterate much faster, and worry less about the cost of your experiments.
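One way to make that workflow easy is to treat the model ID as a parameter from the start, so swapping Haiku for Sonnet or Opus is a one-word change. A quick sketch using the anthropic Python library - the model IDs here are placeholders, check the current ones before using this:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_prompt(prompt, model="claude-3-haiku-20240307"):
    # Iterate against the cheapest, fastest model by default...
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text

# ...then re-run the same prompt against a stronger model once it works:
# run_prompt("Extract the participants from this transcript: ...", model="claude-3-opus-20240229")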
Link 2024-04-09 Extracting data from unstructured text and images with Datasette and GPT-4 Turbo:
Datasette Extract is a new Datasette plugin that uses GPT-4 Turbo (released to general availability today) and GPT-4 Vision to extract structured data from unstructured text and images.
I put together a video demo of the plugin in action today, and posted it to the Datasette Cloud blog along with screenshots and a tutorial describing how to use it.
Link 2024-04-10 Mistral tweet a magnet link for mixtral-8x22b:
Another open model release from Mistral using their now standard operating procedure of tweeting out a raw torrent link.
This one is an 8x22B Mixture of Experts model. Their previous most powerful openly licensed release was Mixtral 8x7B, so this one is a whole lot bigger (a 281GB download) - and apparently has a 65,536 token context length, at least according to initial rumors on Twitter.
Link 2024-04-10 Gemini 1.5 Pro public preview:
Huge release from Google: Gemini 1.5 Pro - the GPT-4 competitive model with the incredible 1 million token context length - is now available without a waitlist in 180+ countries (including the USA but not Europe or the UK as far as I can tell)... and the API is free for 50 requests/day (rate limited to 2/minute).
Beyond that you'll need to pay - $7/million input tokens and $21/million output tokens, which is slightly less than GPT-4 Turbo and a little more than Claude 3 Sonnet.
They also announced audio input (up to 9.5 hours in a single prompt), system instruction support and a new JSON mode.