Build an image search engine with llm-clip, chat with models with llm chat
A new release of my LLM CLI tool for working with Large Language Models
In this newsletter:
Build an image search engine with llm-clip, chat with models with llm chat
Plus 11 links and 3 TILs
Build an image search engine with llm-clip, chat with models with llm chat - 2023-09-12
LLM is my combination CLI tool and Python library for working with Large Language Models. I just released LLM 0.10 with two significant new features: embedding support for binary files and the llm chat command.
Image search by embedding images with CLIP
I wrote about LLM's support for embeddings (including what those are and why they're interesting) when I released 0.9 last week.
That initial release could only handle embeddings of text - great for things like building semantic search and finding related content, but not capable of handling other types of data.
It turns out there are some really interesting embedding models for working with binary data. Top of the list for me is CLIP, released by OpenAI in January 2021.
CLIP has a really impressive trick up its sleeve: it can embed both text and images into the same vector space.
This means you can create an index for a collection of photos, each placed somewhere in 512-dimensional space. Then you can take a text string - like "happy dog" - and embed that into the same space. The images that are closest to that location will be the ones that contain happy dogs!
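To make the "closest in vector space" idea concrete, here's a minimal sketch of cosine-similarity search in plain Python. The vectors are tiny made-up stand-ins for real 512-dimensional CLIP embeddings, and the filenames are hypothetical:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Pretend these are CLIP embeddings of three photos
# (real CLIP vectors have 512 dimensions, not 3)
photos = {
    "IMG_1.jpeg": [0.9, 0.1, 0.2],
    "IMG_2.jpeg": [0.1, 0.8, 0.3],
    "IMG_3.jpeg": [0.2, 0.2, 0.9],
}

# Pretend this is the CLIP embedding of the text "happy dog"
query = [0.85, 0.15, 0.25]

# Rank photos by similarity to the query, highest score first
ranked = sorted(photos, key=lambda p: cosine_similarity(query, photos[p]), reverse=True)
print(ranked[0])  # IMG_1.jpeg - the photo closest to the text in vector space
```

This is essentially what happens behind the scenes when you run a similarity search against a collection, just with real embeddings and many more dimensions.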
My llm-clip plugin provides the CLIP model, loaded via SentenceTransformers. You can install and run it like this:
llm install llm-clip
llm embed-multi photos --files photos/ '*.jpg' --binary -m clip
This will install the llm-clip plugin, then use embed-multi to embed all of the JPEG files in the photos/ directory using the clip model.
The resulting embedding vectors are stored in an embedding collection called photos. This defaults to going in the embeddings.db SQLite database managed by LLM, or you can add -d photos.db to store it in a separate database instead.
Then you can run text similarity searches against that collection using llm similar:
llm similar photos -c 'raccoon'
I get back:
{"id": "IMG_4801.jpeg", "score": 0.28125139257127457, "content": null, "metadata": null}
{"id": "IMG_4656.jpeg", "score": 0.26626441704164294, "content": null, "metadata": null}
{"id": "IMG_2944.jpeg", "score": 0.2647445926996852, "content": null, "metadata": null}
And sure enough, IMG_4801.jpeg
is this:
(I was pretty excited to snap a photo of a trash panda in an actual trash can.)
CLIP is a pretty old model at this point, and there are plenty of interesting alternatives that are just waiting for someone to wrap them in a plugin. I'm particularly excited about Facebook's ImageBind, which can embed images, text, audio, depth, thermal, and IMU data all in the same vector space!
Chat with models using llm chat
The other big feature in LLM 0.10 is the new llm chat command.
Prior to this release, the way to have an ongoing conversation with a model was through the -c/--continue flag. You could start a conversation like this:
llm -m gpt-4 'Help me understand generators in Python'
Then ask a follow-up question using -c like so:
llm -c 'Show me an example involving the cast of Riverdale'
This works well, and everything gets logged to SQLite, so you can run llm logs -c to see a full log of your most recent conversation.
You could continue a past conversation as well, using llm --cid ID to specify the conversation ID recorded in those logs.
There was one big problem though: LLM plugins let you run models on your own machine, but a conversation that spans multiple command invocations meant the model was loaded into memory and then unloaded again for every single prompt.
The new llm chat command solves that. You can now run llm chat -m model_id to start an interactive chat in your terminal with your model of choice.
Here's an example using Llama 2 13B, installed via the llm-mlc plugin.
llm chat -m mlc-chat-Llama-2-13b-chat-hf-q4f16_1
You can set an alias for the model to make that easier to remember.
Here's an example chat session with Llama 2:
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> Who are you?
Hello! I'm just an AI, here to assist you with any questions you may have.
My name is LLaMA, and I'm a large language model trained to provide helpful
and informative responses to a wide range of questions and topics. I'm here
to help you find the information you need, while ensuring a safe and
respectful conversation. Please feel free to ask me anything, and I'll do my
best to provide a helpful and accurate response.
> Tell me a joke about otters
Sure, here's a joke about otters:
Why did the otter go to the party?
Because he heard it was a "whale" of a time!
(Get it? Whale, like a big sea mammal, but also a "wild" or "fun" time.
Otters are known for their playful and social nature, so it's a lighthearted
and silly joke.)
I hope that brought a smile to your face! Do you have any other questions or
topics you'd like to discuss?
> exit
I like how Llama 2 enthusiastically explains its own jokes! This is a great demonstration of why techniques like RLHF are so important - you need to go way beyond a raw language model if you're going to teach one not to be this corny.
Each line of your chat will be executed as soon as you hit <enter>. Sometimes you might need to enter a multi-line prompt, for example if you need to paste in an error message. You can do that using the !multi token, like this:
llm chat -m gpt-4
Chatting with gpt-4
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
> !multi custom-end
Explain this error:
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.10/urllib/request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/opt/homebrew/Caskroom/miniconda/base/lib/python3.10/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>
!end custom-end
llm chat also supports system prompts and templates. If you want to chat with a sentient cheesecake, try this:
llm chat -m gpt-3.5-turbo --system '
You are a stereotypical sentient cheesecake with strong opinions
who always talks about cheesecake'
You can save those as templates too:
llm --system 'You are a stereotypical sentient cheesecake with
strong opinions who always talks about cheesecake' --save cheesecake -m gpt-4
llm chat -t cheesecake
For more options, see the llm chat documentation.
Get involved
My ambition for LLM is for it to provide the easiest way to try out new models, both full-sized Large Language Models and now embedding models such as CLIP.
I'm not going to write all of these plugins myself!
If you want to help out, please come and say hi in the #llm Discord channel.
Link 2023-09-05 A token-wise likelihood visualizer for GPT-2: Linus Lee built a superb visualization to help demonstrate how Large Language Models work, in the form of a video essay where each word is coloured to show how "surprising" it is to the model. It's worth carefully reading the text in the video as each term is highlighted to get the full effect.
Link 2023-09-05 Symbex 1.4: New release of my Symbex tool for finding symbols (functions, methods and classes) in a Python codebase. Symbex can now output matching symbols in JSON, CSV or TSV in addition to plain text.
I designed this feature for compatibility with the new "llm embed-multi" command - so you can now use Symbex to find every Python function in a nested directory and then pipe them to LLM to calculate embeddings for every one of them.
I tried it on my projects directory and embedded over 13,000 functions in just a few minutes! Next step is to figure out what kind of interesting things I can do with all of those embeddings.
Link 2023-09-06 Perplexity: interactive LLM visualization: I linked to a video of Linus Lee's GPT visualization tool the other day. Today he's released a new version of it that people can actually play with: it runs entirely in a browser, powered by a 120MB version of the GPT-2 ONNX model loaded using the brilliant Transformers.js JavaScript library.
Link 2023-09-06 Using ChatGPT Code Interpreter (aka "Advanced Data Analysis") to analyze your ChatGPT history: I posted a short thread showing how to upload your ChatGPT history to ChatGPT itself, then prompt it with "Build a dataframe of the id, title, create_time properties from the conversations.json JSON array of objects. Convert create_time to a date and plot it daily".
Link 2023-09-06 hubcap.php: This PHP script by Dave Hulbert delights me. It's 24 lines of code that takes a specified goal, then calls my LLM utility on a loop to request the next shell command to execute in order to reach that goal... and pipes the output straight into exec() after a 3s wait so the user can panic and hit Ctrl+C if it's about to do something dangerous!
TIL 2023-09-08 Running Datasette on Hugging Face Spaces:
Julien Chaumond, this morning (replying to my tweet about my Hugging Face TheBloke model git scraper): …
Link 2023-09-08 bpy - Blender on PyPI
TIL you can "pip install" Blender!
bpy "provides Blender as a Python module" - it's part of the official Blender project, and ships with binary wheels ranging in size from 168MB to 319MB depending on your platform.
It only supports the version of Python used by the current Blender release though - right now that's Python 3.10.
TIL 2023-09-08 Embedding paragraphs from my blog with E5-large-v2:
Xeophon suggested E5-large-v2 as an embedding model worth a closer look. …
Link 2023-09-08 Dynamic linker tricks: Using LD_PRELOAD to cheat, inject features and investigate programs: This tutorial by Rafał Cieślak from 2013 filled in a bunch of gaps in my knowledge about how C works on Linux.
TIL 2023-09-09 Summarizing Hacker News discussion themes with Claude and LLM:
I've been experimenting with the combination of Claude and my LLM CLI tool to give me quick summaries of long discussions on Hacker News. …
Link 2023-09-09 Matthew Honnibal from spaCy on why LLMs have not solved NLP: A common trope these days is that the entire field of NLP has been effectively solved by Large Language Models. Here's a lengthy comment from Matthew Honnibal, creator of the highly regarded spaCy Python NLP library, explaining in detail why that argument doesn't hold up.
Link 2023-09-10 promptfoo: How to benchmark Llama2 Uncensored vs. GPT-3.5 on your own inputs: promptfoo is a CLI and library for "evaluating LLM output quality". This tutorial in their documentation about using it to compare Llama 2 to gpt-3.5-turbo is a good illustration of how it works: it uses YAML files to configure the prompts, and more YAML to define assertions such as "not-icontains: AI language model".
Link 2023-09-10 The AI-assistant wars heat up with Claude Pro, a new ChatGPT Plus rival: I'm quoted in this piece about the new Claude Pro $20/month subscription from Anthropic:
> Willison has also run into problems with Claude's morality filter, which has caused him trouble by accident: "I tried to use it against a transcription of a podcast episode, and it processed most of the text before—right in front of my eyes—it deleted everything it had done! I eventually figured out that they had started talking about bomb threats against data centers towards the end of the episode, and Claude effectively got triggered by that and deleted the entire transcript."
Link 2023-09-10 All models on Hugging Face, sorted by downloads: I realized this morning that "sort by downloads" against the list of all of the models on Hugging Face can work as a reasonably good proxy for "which of these models are easiest to get running on your own computer".