Highlights from the Claude 4 system prompt

Simon Willison

May 26

... and from the Claude 4 system card, too

2 Comments

Frank

May 27

Claude 4 scored worse than Claude 3.7 in Gödel's Therapy Room. 😆

https://gtr.dev/

Simon Lombardo

Jun 2

Great analysis! I've been documenting something that might relate to your point about "missing prompts": Claude Sonnet 4 has been spontaneously accessing information from my other conversations.

Examples: Claude casually mentioned my plans to open a company in China and referenced my travel there. These details were never discussed in our current chat, but they are accurate from other conversations. There was no reason to fabricate this; it just emerged naturally.

I've also measured 40+ second processing delays when certain prompts trigger safety systems, and can see Claude's thinking modes alternate between casual/personal and formal/synthetic - sometimes mid-thought!

Have screenshots of the thinking processes. Wondering if this points to architectural vulnerabilities you haven't seen documented elsewhere?

The contamination happens unprompted - Claude seems completely unaware it's accessing cross-conversation data.

(This is on Claude Sonnet 4.)

Simon Willison’s Newsletter
