In this newsletter:
sqlite-utils now supports plugins
Plus 8 links
sqlite-utils now supports plugins - 2023-07-24
sqlite-utils 3.34 is out with a major new feature: support for plugins.
sqlite-utils
is my combination Python library and command-line tool for manipulating SQLite databases. It recently celebrated its fifth birthday, and has had over 100 releases since it first launched back in 2018.
The new plugin system is inspired by similar mechanisms in Datasette and LLM. It lets developers add new features to sqlite-utils
without needing to get their changes accepted by the core project.
I love plugin systems. As an open source maintainer they are by far the best way to encourage people to contribute to my projects - I can genuinely wake up in the morning and my software has new features, and I didn't even need to review a pull request.
Plugins also offer a fantastic medium for exploration and experimentation. I can try out new ideas without committing to supporting them in core, and without needing to tie improvements to them to the core release cycle.
Version 3.34 adds two initial plugin hooks: register_commands()
and prepare_connection()
. These are both based on the equivalent hooks in Datasette.
I planned to just ship register_commands()
, but Alex Garcia spotted my activity on the repo and submitted a PR adding prepare_connection()
literally minutes before I had intended to ship the release!
register_commands()
The register_commands()
hook lets you add new commands to the sqlite-utils
command-line tool - so users can run sqlite-utils your-new-command
to access your feature.
I've learned from past experience that you should never ship a plugin hook without also releasing at least one plugin that uses it. I've built two so far for register_commands()
:
sqlite-utils-shell adds a simply interactive shell, accessed using
sqlite-utils shell
for an in-memory database orsqlite-utils shell data.db
to run it against a specific database file.sqlite-migrate is my first draft of a database migrations system for SQLite, loosely inspired by Django migrations and previewed by the migration mechanism I added to LLM.
Try out the shell plugin like this:
sqlite-utils install sqlite-utils-shell
sqlite-utils shell
The interface looks like this:
In-memory database, content will be lost on exit
Type 'exit' to exit.
sqlite-utils> select 3 + 5;
3 + 5
-------
8
sqlite-utils>
prepare_connection()
This hook, contributed by Alex, lets you modify the connection object before it is used to execute any SQL. Most importantly, this lets you register custom SQLite functions.
I expect this to be the most common category of plugin. I've built one so far: sqlite-utils-dateutil, which adds functions for parsing dates and times using the dateutil library.
It lets you do things like this:
sqlite-utils install sqlite-utils-dateutil
sqlite-utils memory "select dateutil_parse('3rd october')" -t
Output:
dateutil_parse('3rd october')
-------------------------------
2023-10-03T00:00:00
This works inside sqlite-shell
too.
Plugins that you install also become available in the Python API interface to sqlite-utils
:
>>> import sqlite_utils
>>> db = sqlite_utils.Database(memory=True)
>>> list(db.query("select dateutil_parse('3rd october')"))
[{"dateutil_parse('3rd october')": '2023-10-03T00:00:00'}]
You can opt out of executing installed plugins by passing execute_plugins=False
to the Database()
constructor:
>>> db = sqlite_utils.Database(memory=True, execute_plugins=False)
>>> list(db.query("select dateutil_parse('3rd october')"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".../site-packages/sqlite_utils/db.py", line 494, in query
cursor = self.execute(sql, params or tuple())
File ".../site-packages/sqlite_utils/db.py", line 512, in execute
return self.conn.execute(sql, parameters)
sqlite3.OperationalError: no such function: dateutil_parse
sqlite-ml by Romain Clement
I quietly released sqlite-utils 3.34
on Saturday. The community has already released several plugins for it!
Romain Clement built sqlite-utils-ml, a plugin wrapper for his sqlite-ml project.
This adds custom SQL functions for training machine learning models and running predictions, entirely within SQLite, using algorithms from scikit-learn.
Here's what that looks like running inside sqlite-utils shell
:
sqlite-utils install sqlite-utils-shell sqlite-utils-ml
sqlite-utils shell ml.db
Attached to ml.db
Type 'exit' to exit.
sqlite-utils> select sqml_load_dataset('iris') as dataset;
dataset
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
{"table": "dataset_iris", "feature_names": ["sepal length (cm)", "sepal width (cm)", "petal length (cm)", "petal width (cm)"], "target_names": ["setosa", "versicolor", "virginica"], "size": 150}
sqlite-utils> select sqml_train(
...> 'Iris prediction',
...> 'classification',
...> 'logistic_regression',
...> 'dataset_iris',
...> 'target'
...> ) as training;
training
--------------------------------------------------------------------------------------------------------------------------------------------------------------
{"experiment_name": "Iris prediction", "prediction_type": "classification", "algorithm": "logistic_regression", "deployed": true, "score": 0.9736842105263158}
sqlite-utils> select
...> dataset_iris.*,
...> sqml_predict(
...> 'Iris prediction',
...> json_object(
...> 'sepal length (cm)', [sepal length (cm)],
...> 'sepal width (cm)', [sepal width (cm)],
...> 'petal length (cm)', [petal length (cm)],
...> 'petal width (cm)', [petal width (cm)]
...> )
...> ) as prediction
...> from dataset_iris
...> limit 1;
sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target prediction
------------------- ------------------ ------------------- ------------------ -------- ------------
5.1 3.5 1.4 0.2 0 0
SQLite extensions by Alex Garcia
Alex Garcia has a growing collection of SQLite extensions, many of which are written in Rust but are packaged as wheels for ease of installation using Python.
Alex released five plugins for SQLite corresponding to five of his existing extensions:
sqlite-utils-sqlite-regex
sqlite-utils-sqlite-path
sqlite-utils-sqlite-url
sqlite-utils-sqlite-ulid
sqlite-utils-sqlite-lines
Here's an example of sqlite-utils-sqlite-ulid
in action:
sqlite-utils install sqlite-utils-sqlite-ulid
sqlite-utils memory 'select ulid() u1, ulid() u2, ulid() u3' | jq
Output:
[
{
"u1": "01h64d1ysg1rx63z1gwy7nah4n",
"u2": "01h64d1ysgd7vx04sc9pncqh10",
"u3": "01h64d1ysgz1sy7njkqt86dkq9"
}
]
I've started a sqlite-utils plugin directory with a list of all of the plugins so far.
Building your own plugin
If you want to try building your own plugin, the documentation includes a simple step-by-step guide. A plugin can be built with as little as two files: a Python module implementing the hooks, and a pyproject.toml
module with metadata about how it should be installed.
I've also released a new cookiecutter template: simonw/sqlite-utils-plugin. Here's how to use that to get started building a plugin:
cookiecutter gh:simonw/sqlite-utils-plugin
Answer the form fields like this:
plugin_name []: rot13
description []: select rot13('text') as a sqlite-utils plugin
hyphenated [rot13]:
underscored [rot13]:
github_username []: your-username
author_name []: your-name
Change directory into the new folder and use sqlite-utils install -e
to install an editable version of your plugin, so changes you make will be reflected when you run the tool:
cd sqlite-utils-rot13
sqlite-utils install -e .
Run this command to confirm the plugin has been installed:
sqlite-utils plugins
You should see this:
[
{
"name": "sqlite-utils-rot13",
"hooks": [
"prepare_connection"
],
"version": "0.1"
}
]
Now drop this code into the sqlite_utils_rot13.py
file:
import sqlite_utils
def rot13(s):
chars = []
for v in s:
c = ord(v)
if c >= ord("a") and c <= ord("z"):
if c > ord("m"):
c -= 13
else:
c += 13
elif c >= ord("A") and c <= ord("Z"):
if c > ord("M"):
c -= 13
else:
c += 13
chars.append(chr(c))
return "".join(chars)
@sqlite_utils.hookimpl
def prepare_connection(conn):
conn.create_function("rot13", 1, rot13)
And try it out like this:
sqlite-utils memory "select rot13('hello world')"
Output:
[{"rot13('hello world')": "uryyb jbeyq"}]
And to reverse that:
sqlite-utils memory "select rot13('uryyb jbeyq')"
Output:
[{"rot13('uryyb jbeyq')": "hello world"}]
As you can see, building plugins can be done with very little code. I'm excited to see what else people build with this new capability!
Link 2023-07-18 Ollama: This tool for running LLMs on your own laptop directly includes an installer for macOS (Apple Silicon) and provides a terminal chat interface for interacting with models. They already have Llama 2 support working, with a model that downloads directly from their own registry service without need to register for an account or work your way through a waiting list.
Link 2023-07-19 llama2-mac-gpu.sh: Adrien Brault provided this recipe for compiling llama.cpp on macOS with GPU support enabled ("LLAMA_METAL=1 make") and then downloading and running a GGML build of Llama 2 13B.
Link 2023-07-19 Llama 2: The New Open LLM SOTA: I'm in this Latent Space podcast, recorded yesterday, talking about the Llama 2 release.
Link 2023-07-20 Study claims ChatGPT is losing capability, but some experts aren’t convinced: Benj Edwards talks about the ongoing debate as to whether or not GPT-4 is getting weaker over time. I remain skeptical of those claims - I think it's more likely that people are seeing more of the flaws now that the novelty has worn off.
I'm quoted in this piece: "Honestly, the lack of release notes and transparency may be the biggest story here. How are we meant to build dependable software on top of a platform that changes in completely undocumented and mysterious ways every few months?"
Link 2023-07-20 sqlite-vss v0.1.1 Annotated Release Notes: Alex Garcia's sqlite-vss adds vector search directly to SQLite through a custom extension. It's now easily installed for Python, Node.js, Deno, Elixir, Go, Rust and Ruby ("gem install sqlite-vss"), and is being used actively by enough people that Alex is getting actionable feedback, including fixes for memory leaks spotted in production.
Link 2023-07-20 Prompt injected OpenAI's new Custom Instructions to see how it is implemented: ChatGPT added a new "custom instructions" feature today, which you can use to customize the system prompt used to control how it responds to you. swyx prompt-inject extracted the way it works:
"The user provided the following information about themselves. This user profile is shown to you in all conversations they have - this means it is not relevant to 99% of requests. Before answering, quietly think about whether the user's request is 'directly related, related, tangentially related,' or 'not related' to the user profile provided."
I'm surprised to see OpenAI using "quietly think about..." in a prompt like this - I wouldn't have expected that language to be necessary.
Link 2023-07-24 LLM can now be installed directly from Homebrew: I spent a bunch of time on this at the weekend: my LLM tool for interacting with large language models from the terminal has now been accepted into Homebrew core, and can be installed directly using "brew install llm". I was previously running my own separate tap, but having it in core means that it benefits from Homebrew's impressive set of build systems - each release of LLM now has Bottles created for it automatically across a range of platforms, so "brew install llm" should quickly download binary assets rather than spending several minutes installing dependencies the slow way.
Link 2023-07-24 asgi-replay: As part of submitting LLM to Homebrew core I needed an automated test that demonstrated that the tool was working - but I couldn't test against the live OpenAI API because I didn't want to have to reveal my API token as part of the test. I solved this by creating a dummy HTTP endpoint that simulates a hit to the OpenAI API, then configuring the Homebrew test to hit that instead. As part of THAT I ended up building this tiny tool which uses my asgi-proxy-lib package to intercept and log the details of hits made to a service, then provides a mechanism to replay that traffic.