Deciphering clues in a news article to understand how it was reported
Plus notes on the chaos at OpenAI and the new Claude 2.1 from Anthropic
In this newsletter:
Deciphering clues in a news article to understand how it was reported
Weeknotes: DevDay, GitHub Universe, OpenAI chaos
Plus 11 links and 6 quotations and 1 TIL
Thanks for reading Simon Willison’s Newsletter! Subscribe for free to receive new posts and support my work.
Written journalism is full of conventions that hint at the underlying reporting process, many of which are not entirely obvious. Learning how to read and interpret these can help you get a lot more out of the news.
I'm going to use a recent article about the ongoing OpenAI calamity to illustrate some of these conventions.
I've personally been bewildered by the story that's been unfolding since Sam Altman was fired by the board of directors of the OpenAI non-profit last Friday. The single biggest question for me has been why - why did the board make this decision?
Before Altman’s Ouster, OpenAI’s Board Was Divided and Feuding by Cade Metz, Tripp Mickle and Mike Isaac for the New York Times is one of the first articles I've seen that felt like it gave me a glimmer of understanding.
It's full of details that I hadn't heard before, almost all of which came from anonymous sources.
But how trustworthy are these details? If you don't know the names of the sources, how can you trust the information that they provide?
This is where it's helpful to understand the language that journalists use to hint at how they gathered the information for the story.
The story starts with this lede:
Before Sam Altman was ousted from OpenAI last week, he and the company’s board of directors had been bickering for more than a year. The tension got worse as OpenAI became a mainstream name thanks to its popular ChatGPT chatbot.
The job of the rest of the story is to back that up.
Sources in these kinds of stories are either named or anonymous. Anonymous sources have a good reason to stay anonymous. Note that they are not anonymous to the journalist, and probably not to their editor either (except in rare cases).
There needs to be a legitimate reason for them to stay anonymous, or the journalist won't use them as a source.
This raises a number of challenges for the journalist:
How can you trust the information that the source is providing, if they're not willing to attach their name and reputation to it?
How can you confirm that information?
How can you convince your editors and readers that the information is trustworthy?
Anything coming from an anonymous source needs to be confirmed. A common way to confirm it is to get that same information from multiple sources, ideally from sources that don't know each other.
This is fundamental to the craft of journalism: how do you determine the likely truth, in a way that's robust enough to publish?
Hints to look out for
The language of a story like this will include crucial hints about how the information was gathered.
Try scanning for words like according to or email or familiar.
Let's review some examples (emphasis mine):
Mr. Altman complained that the research paper seemed to criticize OpenAI’s efforts to keep its A.I. technologies safe while praising the approach taken by Anthropic, according to an email that Mr. Altman wrote to colleagues and that was viewed by The New York Times.
"according to an email [...] that was viewed by The New York Times" means a source showed them an email. In that case they likely treated the email as a primary source document, without finding additional sources.
Senior OpenAI leaders, including Mr. Sutskever, who is deeply concerned that A.I. could one day destroy humanity, later discussed whether Ms. Toner should be removed, a person involved in the conversations said.
Here we only have a single source, "a person involved in the conversations". This speaks to the journalist's own judgement: this person was presumably deemed credible enough to be acceptable as the sole data point.
But shortly after those discussions, Mr. Sutskever did the unexpected: He sided with board members to oust Mr. Altman, according to two people familiar with the board’s deliberations.
Now we have two people "familiar with the board’s deliberations" - which is better, because this is a key point that the entire story rests upon.
Familiar with comes up a lot in this story:
Mr. Sutskever's frustration with Mr. Altman echoed what had happened in 2021 when another senior A.I. scientist left OpenAI to form the company Anthropic. That scientist and other researchers went to the board to try to push Mr. Altman out. After they failed, they gave up and departed, according to three people familiar with the attempt to push Mr. Altman out.
This is one of my favorite points in the whole article. I know that Anthropic was formed by a splinter-group from OpenAI who had disagreements about OpenAI's approach to AI safety, but I had no idea that they had first tried to push Sam Altman out of OpenAI itself.
“After a series of reasonably amicable negotiations, the co-founders of Anthropic were able to negotiate their exit on mutually agreeable terms,” an Anthropic spokeswoman, Sally Aldous, said.
Here we have one of the few named sources in the article - a spokesperson for Anthropic. This named source at least partially confirms those details from anonymous sources. Highlighting their affiliation helps explain their motivation for speaking to the journalist.
After vetting four candidates for one position, the remaining directors couldn’t agree on who should fill it, said the two people familiar with the board’s deliberations.
Another revelation (for me): the reason OpenAI's board was so small, just six people, is that the board had been disagreeing on who to add to it.
Note that we have repeat anonymous characters here: "the two people familiar with..." were introduced earlier on.
Hours after Mr. Altman was ousted, OpenAI executives confronted the remaining board members during a video call, according to three people who were on the call.
That's pretty clear. Three people who were on that call talked to the journalist, and their accounts matched.
Let's finish with two more "familiar with" examples:
There were indications that the board was still open to his return, as it and Mr. Altman held discussions that extended into Tuesday, two people familiar with the talks said.
On Sunday, Mr. Sutskever was urged at OpenAI’s office to reverse course by Mr. Brockman’s wife, Anna, according to two people familiar with the exchange.
The phrase "familiar with the exchange" means the journalist has good reason to believe that the sources are credible regarding what happened - they are in a position where they would likely have heard about it from people who were directly involved.
Relationships and reputation
Carefully reading this story reveals a great deal of detail about how the journalists gathered the information.
It also helps explain why this single article is credited to three reporters: talking to all of those different sources, and verifying and cross-checking the information, is a lot of work.
Even more work is developing those sources in the first place. For a story this sensitive and high profile the right sources won't talk to just anyone: journalists will have a lot more luck if they've already built relationships, and have a reputation for being trustworthy.
As news consumers, we also need to weigh the credibility of the publication itself. We need to know which news sources have high editorial standards, such that they are unlikely to publish rumors that have not been verified using the techniques described above.
I don't have a shortcut for this. I trust publications like the New York Times, the Washington Post, the Guardian (my former employer) and the Atlantic.
One sign that helps is retractions. If a publication writes detailed retractions when they get something wrong, it's a good indication of their editorial standards.
There's a great deal more to learn about this topic, and the field of media literacy in general. I have a pretty basic understanding of this myself - I know enough to know that there's a lot more to it.
I'd love to see more material on this from other experienced journalists. I think journalists may underestimate how much the public wants (and needs) to understand how they do their work.
Marshall Kirkpatrick posted an excellent thread a few weeks ago about "How can you trust journalists when they report that something's likely to happen?"
In 2017 FiveThirtyEight published a two-parter: When To Trust A Story That Uses Unnamed Sources and Which Anonymous Sources Are Worth Paying Attention To? with useful practical tips.
Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 2023-11-22
Three weeks of conferences and Datasette Cloud work, four days of chaos for OpenAI.
The second week of November was chaotically busy for me. On the Monday I attended the OpenAI DevDay conference, which saw a bewildering array of announcements. I shipped LLM 0.12 that day with support for the brand new GPT-4 Turbo model (2-3x cheaper than GPT-4, faster, and with an increased 128,000 token limit), and built ospeak that evening as a CLI tool for working with their excellent new text-to-speech API.
On Tuesday I recorded a podcast episode with the Latent Space crew talking about what was released at DevDay, and attended a GitHub Universe pre-summit for open source maintainers.
Then on Wednesday I spoke at GitHub Universe itself. I published a full annotated version of my talk here: Financial sustainability for open source projects at GitHub Universe. It was only ten minutes long but it took a lot of work to put together - ten minutes requires a lot of editing and planning to get right.
With all of my conferences for the year out of the way, I spent the next week working with Alex Garcia on Datasette Cloud. Alex has been building out datasette-comments, an excellent new plugin which will allow Datasette users to collaborate on data by leaving comments on individual rows - ideal for collaborative investigative reporting.
Meanwhile I've been putting together the first working version of enrichments - a feature I've been threatening to build for a couple of years now. The key idea here is to make it easy to apply enrichment operations - geocoding, language model prompt evaluation, OCR etc - to rows stored in Datasette. I'll have a lot more to share about this soon.
The biggest announcement at OpenAI DevDay was GPTs - the ability to create and share customized GPT configurations. It took me another week to fully understand those, and I wrote about my explorations in Exploring GPTs: ChatGPT in a trench coat?.
And then last Friday everything went completely wild, when the board of directors of the non-profit that controls OpenAI fired Sam Altman over a vague accusation that he was "not consistently candid in his communications with the board".
It's four days later now and the situation is still shaking itself out. It inspired me to write about a topic I've wanted to publish for a while though: Deciphering clues in a news article to understand how it was reported.
sqlite-utils 3.35.2 and shot-scraper 1.3
I'll duplicate the full release notes for two of my projects here, because I want to highlight the contributions from external developers.
sqlite-utils 3.35.2: the test suite is now also run against Python 3.12.

shot-scraper 1.3: screenshots taken using shot-scraper --interactive $URL - which allows you to interact with the page in a browser window and then hit <enter> to take the screenshot - no longer reload the page before taking the shot (which previously ignored your activity). #125
Releases these weeks
datasette-sentry 0.4 - 2023-11-21
Datasette plugin for configuring Sentry
datasette-enrichments 0.1a4 - 2023-11-20
Tools for running enrichments against data stored in Datasette
ospeak 0.2 - 2023-11-07
CLI tool for running text through OpenAI Text to speech
llm 0.12 - 2023-11-06
Access large language models from the command-line
datasette-edit-schema 0.7.1 - 2023-11-04
Datasette plugin for modifying table schemas
sqlite-utils 3.35.2 - 2023-11-04
Python CLI utility and library for manipulating SQLite databases
llm-anyscale-endpoints 0.3 - 2023-11-03
LLM plugin for models hosted by Anyscale Endpoints
shot-scraper 1.3 - 2023-11-01
A command-line utility for taking automated screenshots of websites
TIL these weeks
Cloning my voice with ElevenLabs - 2023-11-16
Summing columns in remote Parquet files using DuckDB - 2023-11-14
I’ve resigned from my role leading the Audio team at Stability AI, because I don’t agree with the company’s opinion that training generative AI models on copyrighted works is ‘fair use’.
[...] I disagree because one of the factors affecting whether the act of copying is fair use, according to Congress, is “the effect of the use upon the potential market for or value of the copyrighted work”. Today’s generative AI models can clearly be used to create works that compete with the copyrighted works they are trained on. So I don’t see how using copyrighted works to train generative AI models of this nature can be considered fair use.
But setting aside the fair use argument for a moment — since ‘fair use’ wasn’t designed with generative AI in mind — training generative AI models in this way is, to me, wrong. Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works.
TIL 2023-11-16 Cloning my voice with ElevenLabs:
Charlie Holtz published an astonishing demo today, where he hooked together GPT-Vision and a text-to-speech model trained on his own voice to produce a video of Sir David Attenborough narrating his life as observed through his webcam. …
Link 2023-11-16 "Learn from your chats" ChatGPT feature preview:
Seven days ago a Reddit user posted a screenshot of what's presumably a trial feature of ChatGPT: a "Learn from your chats" toggle in the settings.
The UI says: "Your primary GPT will continually improve as you chat, picking up on details and preferences to tailor its responses to you."
It provides the following examples: "I move to SF in two weeks", "Always code in Python", "Forget everything about my last project" - plus an option to reset it.
No official announcement yet.
The EU AI Act now proposes to regulate “foundational models”, i.e. the engine behind some AI applications. We cannot regulate an engine devoid of usage. We don’t regulate the C language because one can use it to develop malware. Instead, we ban malware and strengthen network systems (we regulate usage). Foundational language models provide a higher level of abstraction than the C language for programming computer systems; nothing in their behaviour justifies a change in the regulatory framework.
Link 2023-11-16 tldraw/draw-a-ui:
This is a demo that lets you sketch a user interface mockup in tldraw and click "Make Real" to have a model generate working HTML for it. You can then make changes to your mockup, select it along with the previous mockup and click "Make Real" again to ask for an updated version that takes your new changes into account.
This is such a great example of innovation at the UI layer, and everything is open source. Check app/lib/getHtmlFromOpenAI.ts for the system prompt that makes it work.
Link 2023-11-17 HTML Web Components: An Example:
Link 2023-11-18 It's Time For A Change: datetime.utcnow() Is Now Deprecated:
Miguel Grinberg explains the deprecation of datetime.utcnow() and utcfromtimestamp() in Python 3.12, since they return naive datetime objects which cause all sorts of follow-on problems.
The replacement idiom is datetime.datetime.now(datetime.timezone.utc)
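The difference is easy to demonstrate. A quick sketch of the old and new idioms (the variable names are mine):

```python
from datetime import datetime, timezone

# Deprecated in Python 3.12: datetime.utcnow() returns a *naive* object -
# it represents UTC but carries no tzinfo, so comparisons and timezone
# conversions can silently go wrong.

# The replacement returns an *aware* datetime pinned to UTC:
now = datetime.now(timezone.utc)
print(now.tzinfo)  # prints: UTC
```

The aware object knows its own timezone, so arithmetic and formatting against other aware datetimes behaves correctly.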
The board of the non-profit in control of OpenAI fired CEO Sam Altman yesterday, which is sending seismic waves around the AI technology industry. This overview by Benj Edwards is the best condensed summary I've seen yet of everything that's known so far.
Link 2023-11-20 Inside the Chaos at OpenAI:
Outstanding reporting on the current situation at OpenAI from Karen Hao and Charlie Warzel, informed by Karen's research for a book she is currently writing. There are all sorts of fascinating details in here that I haven't seen reported anywhere, and it strongly supports the theory that this entire situation (Sam Altman being fired by the board of the OpenAI non-profit) resulted from deep disagreements within OpenAI concerning speed to market and commercialization of their technology vs. safety research and cautious progress towards AGI.
The company pressed forward and launched ChatGPT on November 30. It was such a low-key event that many employees who weren’t directly involved, including those in safety functions, didn’t even realize it had happened. Some of those who were aware, according to one employee, had started a betting pool, wagering how many people might use the tool during its first week. The highest guess was 100,000 users. OpenAI’s president tweeted that the tool hit 1 million within the first five days. The phrase low-key research preview became an instant meme within OpenAI; employees turned it into laptop stickers.
Link 2023-11-20 Cloudflare does not consider vary values in caching decisions:
Here's the spot in Cloudflare's documentation where they hide a crucially important detail:
"Cloudflare does not consider vary values in caching decisions. Nevertheless, vary values are respected when Vary for images is configured and when the vary header is vary: accept-encoding."
This means you can't deploy an application that uses content negotiation via the Accept header behind the Cloudflare CDN - for example serving JSON or HTML for the same URL depending on the incoming Accept header. If you do, Cloudflare may serve cached JSON to an HTML client or vice versa.
There's an exception for image files, which Cloudflare added support for in September 2021 (for Pro accounts only) in order to support formats such as WebP which may not have full support across all browsers.
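Content negotiation itself is trivial to implement server-side - the problem here is entirely at the CDN layer. A minimal sketch (a hypothetical helper, not from Cloudflare's docs) of what an origin doing Accept-based negotiation looks like:

```python
def negotiate(accept_header: str) -> str:
    """Pick a response content type based on the incoming Accept header.

    An origin doing this must also send "Vary: Accept" so that caches key
    responses on that header - which is exactly the signal Cloudflare's
    CDN ignores for non-image files.
    """
    # Naive check: prefer JSON when the client explicitly asks for it
    if "application/json" in accept_header:
        return "application/json"
    return "text/html"


print(negotiate("application/json"))                  # JSON client
print(negotiate("text/html,application/xhtml+xml"))   # browser
```

Behind a cache that ignores Vary, whichever of those two representations gets cached first is the one every subsequent client receives for that URL.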
And the investors wailed and gnashed their teeth but it’s true, that is what they agreed to, and they had no legal recourse. And OpenAI’s new CEO, and its nonprofit board, cut them a check for their capped return and said “bye” and went back to running OpenAI for the benefit of humanity. It turned out that a benign, carefully governed artificial superintelligence is really good for humanity, and OpenAI quickly solved all of humanity’s problems and ushered in an age of peace and abundance in which nobody wanted for anything or needed any Microsoft products. And capitalism came to an end.
Link 2023-11-21 An Interactive Guide to CSS Grid:
Josh Comeau's extremely clear guide to CSS grid, with interactive examples for all of the core properties.
The way I think about the AI of the future is not as someone as smart as you or as smart as me, but as an automated organization that does science and engineering and development and manufacturing.
Link 2023-11-22 Before Altman’s Ouster, OpenAI’s Board Was Divided and Feuding:
This is the first piece of reporting I've seen on the OpenAI situation which has offered a glimmer of an explanation as to what happened.
It sounds like the board had been fighting about things for over a year - notably including who should replace departed members, which is how they'd shrunk down to just six people.
There's also an interesting detail in here about the formation of Anthropic:
"Mr. Sutskever’s frustration with Mr. Altman echoed what had happened in 2021 when another senior A.I. scientist left OpenAI to form the company Anthropic. That scientist and other researchers went to the board to try to push Mr. Altman out. After they failed, they gave up and departed, according to three people familiar with the attempt to push Mr. Altman out."
Sam Altman expelling Toner with the pretext of an inoffensive page in a paper no one read would have given him a temporary majority with which to appoint a replacement director, and then further replacement directors. These directors would, naturally, agree with Sam Altman, and he would have a full, perpetual board majority - the board, which is the only oversight on the OA CEO. Obviously, as an extremely experienced VC and CEO, he knew all this and how many votes he (thought he) had on the board, and the board members knew this as well - which is why they had been unable to agree on replacement board members all this time.
Link 2023-11-22 Introducing Claude 2.1:
Anthropic's Claude used to have the longest token context of any of the major models: 100,000 tokens, which is about 300 pages. Then GPT-4 Turbo came out with 128,000 tokens and Claude lost one of its key differentiators.
Claude is back! Version 2.1, announced today, bumps the token limit up to 200,000 - and also adds support for OpenAI-style system prompts, a feature I've been really missing.
They also announced tool use, but that's only available for a very limited set of partners to preview at the moment.
Link 2023-11-22 Claude: How to use system prompts:
Documentation for the new system prompt support added in Claude 2.1. The design surprises me a little: the system prompt is just the text that comes before the first instance of the text "Human: ..." - but Anthropic promise that instructions in that section of the prompt will be treated differently and followed more closely than any instructions that follow.
This whole page of documentation is giving me some pretty serious prompt injection red flags to be honest. Anthropic's recommended way of using their models is entirely based around concatenating together strings of text using special delimiter phrases.
I'll give it points for honesty though. OpenAI use JSON to separate out different parts of the prompt, but under the hood they're all concatenated together with special tokens into a single token stream.
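Based on that documentation, assembling a Claude 2.1 prompt is plain string concatenation: the "system prompt" is simply whatever text precedes the first Human turn. A rough illustrative sketch (the exact whitespace conventions are Anthropic's, so treat this as a simplification):

```python
def build_prompt(system: str, user_message: str) -> str:
    # The system prompt is just text placed before the first "Human:" turn;
    # the trailing "Assistant:" cues the model to start responding.
    return f"{system}\n\nHuman: {user_message}\n\nAssistant:"


prompt = build_prompt(
    "You are a helpful assistant. Answer concisely.",
    "What is the capital of France?",
)
print(prompt)
```

That delimiter-based design is exactly why this page sets off prompt injection alarm bells: nothing structurally prevents user-supplied text from containing its own "Human:" or "Assistant:" markers.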