Three major LLM releases in 24 hours
Google Gemini Pro 1.5 is free, GPT-4 Turbo has Vision, Mixtral 8x22B released in a tweet
In this newsletter:
Three major LLM releases in 24 hours
Plus 7 links
Three major LLM releases in 24 hours - 2024-04-10
I'm a bit behind on my weeknotes, so there's a lot to cover here. But first... a review of the last 24 hours of Large Language Model news. All times are in US Pacific.
11:01am: Google Gemini Pro 1.5 hits general availability, here's the blog post - their 1 million token context GPT-4 class model now has no waitlist, is available to anyone in 180 countries (not including Europe or the UK as far as I can tell) and, most impressively of all, the API has a free tier that allows up to 50 requests a day, though rate limited to 2 per minute. Beyond that you can pay $7/million input tokens and $21/million output tokens, which is slightly less than GPT-4 Turbo and a little more than Claude 3 Sonnet. Gemini Pro also now supports audio inputs and system prompts.
11:44am: OpenAI finally released the non-preview version of GPT-4 Turbo, integrating GPT-4 Vision directly into the model (previously it was separate). Vision mode now supports both functions and JSON output, previously unavailable for image inputs - there's a sketch of that combination below the timeline. OpenAI also claim that the new model is "Majorly improved", but no-one knows what they mean by that.
6:20pm (3:20am in their home country of France): Mistral tweet a magnet link for a 281GB BitTorrent download of Mixtral 8x22B - their latest openly licensed model release, significantly larger than their previous best open model Mixtral 8x7B. I've not seen anyone get this running yet but it's likely to perform extremely well, given how good the original Mixtral was.
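The image inputs plus JSON mode combination from the GPT-4 Turbo release is the part most relevant to my own tools. Here's a minimal sketch of that kind of call using the openai Python library - treat the gpt-4-turbo model ID and the image URL as placeholders:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",  # assumed ID for the new non-preview model
    response_format={"type": "json_object"},  # JSON mode, now usable with image inputs
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this photo as a JSON object"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)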
And while it wasn't released today (it came out last week), this morning Cohere's Command R+ (an excellent openly licensed model) reached position 6 on the LMSYS Chatbot Arena Leaderboard - the highest ever ranking for an open weights model.
Since I have a lot of software that builds on these models, I spent a bunch of time today publishing new releases of things.
Datasette Extract with GPT-4 Turbo Vision
I've been working on Datasette Extract for a while now: it's a plugin for Datasette that adds structured data extraction from unstructured text, powered by GPT-4 Turbo.
I updated it for the new model releases this morning, and decided to celebrate by making a video showing what it can do:
I want to start publishing videos like this more often, so this felt like a great opportunity to put that into practice.
The Datasette Cloud blog hasn't had an entry in a while, so I published screenshots and notes there to accompany the video.
Gemini Pro 1.5 system prompts
I really like system prompts - extra prompts you can pass to an LLM that give it instructions about how to process the main input. They're sadly not a guaranteed solution for prompt injection - even with instructions separated from data by a system prompt you can still override them in the main prompt if you try hard enough - but they're still useful for non-adversarial situations.
llm-gemini 0.1a2 adds support for them, so now you can do things like this:
llm -m p15 'say hi three times three different ways' \
--system 'in spanish'
And get back output like this:
¡Hola! 👋 ¡Buenos días! ☀️ ¡Buenas tardes! 😊
Interestingly "in german" doesn't include emoji, but "in spanish" does.
I had to reverse-engineer the REST format for sending a system prompt from the Python library as the REST documentation hasn't been updated yet - notes on that in my issue.
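For anyone else experimenting before the documentation catches up, here's a rough sketch of what such a request can look like. The endpoint, model ID and systemInstruction field name are assumptions on my part rather than anything from the official REST docs:

import os, requests

url = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-1.5-pro-latest:generateContent"
)
payload = {
    # Field names here are assumptions - the REST docs hadn't been updated yet
    "systemInstruction": {"parts": [{"text": "in spanish"}]},
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "say hi three times three different ways"}],
        }
    ],
}
response = requests.post(
    url,
    params={"key": os.environ["GEMINI_API_KEY"]},  # wherever you keep your key
    json=payload,
)
print(response.json())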
datasette-enrichments-gpt using GPT-4 Turbo
Another small release: the datasette-enrichments-gpt plugin can enrich data in a table by running prompts through GPT-3.5, GPT-4 Turbo or GPT-4 Vision. I released version 0.4 switching to the new GPT-4 Turbo model.
Everything else
That covers today... but my last weeknotes were nearly four weeks ago! Here's everything else, with a few extra annotations:
Blog entries
All five of my most recent posts are about ways that I use LLM tools in my own work - see also my How I use LLMs and ChatGPT series.
Running OCR against PDFs and images directly in your browser
Building and testing C extensions for SQLite with ChatGPT Code Interpreter
Releases
Many of these releases relate to ongoing work on Datasette Cloud. In particular there's a flurry of minor releases to add descriptions to the action menu items added by various plugins, best illustrated by this screenshot:
datasette-enrichments-gpt 0.4 - 2024-04-10
Datasette enrichment for analyzing row data using OpenAI's GPT models
llm-gemini 0.1a2 - 2024-04-10
LLM plugin to access Google's Gemini family of models
datasette-public 0.2.3 - 2024-04-09
Make specific Datasette tables visible to the public
datasette-enrichments 0.3.2 - 2024-04-09
Tools for running enrichments against data stored in Datasette
datasette-extract 0.1a4 - 2024-04-09
Import unstructured data (text and images) into structured tables
datasette-cors 1.0 - 2024-04-08
Datasette plugin for configuring CORS headers
asgi-cors 1.0 - 2024-04-08
ASGI middleware for applying CORS headers to an ASGI application
files-to-prompt 0.2.1 - 2024-04-08
Concatenate a directory full of files into a single prompt for use with LLMs
datasette-embeddings 0.1a3 - 2024-04-08
Store and query embedding vectors in Datasette tables
datasette-studio 0.1a3 - 2024-04-06
Datasette pre-configured with useful plugins. Experimental alpha.
datasette-paste 0.1a5 - 2024-04-06
Paste data to create tables in Datasette
datasette-import 0.1a4 - 2024-04-06
Tools for importing data into Datasette
datasette-enrichments-quickjs 0.1a2 - 2024-04-05
Enrich data with a custom JavaScript function
s3-credentials 0.16.1 - 2024-04-05
A tool for creating credentials for accessing S3 buckets
llm-command-r 0.2 - 2024-04-04
Access the Cohere Command R family of models
llm-nomic-api-embed 0.1 - 2024-03-30
Create embeddings for LLM using the Nomic API
textract-cli 0.1 - 2024-03-29
CLI for running files through AWS Textract
llm-cmd 0.1a0 - 2024-03-26
Use LLM to generate and execute commands in your shell
datasette-write 0.3.2 - 2024-03-18
Datasette plugin providing a UI for executing SQL writes against the database
TILs
impaste: pasting images to piped commands on macOS - 2024-04-04
Installing tools written in Go - 2024-03-26
Google Chrome --headless mode - 2024-03-24
Reviewing your history of public GitHub repositories using ClickHouse - 2024-03-20
Running self-hosted QuickJS in a browser - 2024-03-20
Programmatically comparing Python version strings - 2024-03-17
Link 2024-04-09 Hello World:
Lennon McLean dives deep down the rabbit hole of what happens when you execute the binary compiled from "Hello world" in C on a Linux system, digging into the details of ELF executables, objdump disassembly, the C standard library, stack frames, null-terminated strings and taking a detour through musl because it's easier to read than Glibc.
Link 2024-04-09 llm.c:
Andrej Karpathy implements LLM training - initially for GPT-2, other architectures to follow - in just over 1,000 lines of C on top of CUDA. Includes a tutorial about implementing LayerNorm by porting an implementation from Python.
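As a refresher on what that layer computes, here's the LayerNorm forward pass as a few lines of numpy - a generic sketch, not code taken from llm.c or its tutorial:

import numpy as np

def layernorm_forward(x, gamma, beta, eps=1e-5):
    # Normalize along the last axis to zero mean and unit variance,
    # then apply the learned scale (gamma) and shift (beta).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta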
Link 2024-04-09 Command R+ now ranked 6th on the LMSYS Chatbot Arena:
The LMSYS Chatbot Arena Leaderboard is one of the most interesting approaches to evaluating LLMs because it captures their ever-elusive "vibes" - it works by having users vote on the best responses to prompts from two initially hidden models.
Big news today is that Command R+ - the brand new open weights model (Creative Commons non-commercial) by Cohere - is now the highest ranked non-proprietary model, in at position six and beating one of the GPT-4s.
(Linking to my screenshot on Mastodon.)
Link 2024-04-09 A solid pattern to build LLM Applications (feat. Claude):
Hrishi Olickel is one of my favourite prompt whisperers. In this YouTube video he walks through his process for building quick interactive applications with the assistance of Claude 3, spinning up an app that analyzes his meeting transcripts to extract participants and mentioned organisations, then presents a UI for exploring the results built with Next.js and shadcn/ui.
An interesting tip I got from this: use the weakest, not the strongest models to iterate on your prompts. If you figure out patterns that work well with Claude 3 Haiku they will have a significantly lower error rate with Sonnet or Opus. The speed of the weaker models also means you can iterate much faster, and worry less about the cost of your experiments.
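One way to make that workflow easy is to treat the model ID as a parameter from the start, so swapping Haiku for Sonnet or Opus is a one-word change. A quick sketch using the anthropic Python library - the model IDs here are placeholders, check the current ones before using this:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def run_prompt(prompt, model="claude-3-haiku-20240307"):
    # Iterate against the cheapest, fastest model by default...
    message = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return message.content[0].text

# ...then re-run the same prompt against a stronger model once it works:
# run_prompt("Extract the participants from this transcript: ...", model="claude-3-opus-20240229")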
Link 2024-04-09 Extracting data from unstructured text and images with Datasette and GPT-4 Turbo:
Datasette Extract is a new Datasette plugin that uses GPT-4 Turbo (released to general availability today) and GPT-4 Vision to extract structured data from unstructured text and images.
I put together a video demo of the plugin in action today, and posted it to the Datasette Cloud blog along with screenshots and a tutorial describing how to use it.
Link 2024-04-10 Mistral tweet a magnet link for mixtral-8x22b:
Another open model release from Mistral using their now standard operating procedure of tweeting out a raw torrent link.
This one is an 8x22B Mixture of Experts model. Their previous most powerful openly licensed release was Mixtral 8x7B, so this one is a whole lot bigger (a 281GB download) - and apparently has a 65,536 token context length, at least according to initial rumors on Twitter.
Link 2024-04-10 Gemini 1.5 Pro public preview:
Huge release from Google: Gemini 1.5 Pro - the GPT-4 competitive model with the incredible 1 million token context length - is now available without a waitlist in 180+ countries (including the USA but not Europe or the UK as far as I can tell)... and the API is free for 50 requests/day (rate limited to 2/minute).
Beyond that you'll need to pay - $7/million input tokens and $21/million output tokens, which is slightly less than GPT-4 Turbo and a little more than Claude 3 Sonnet.
They also announced audio input (up to 9.5 hours in a single prompt), system instruction support and a new JSON mode.