Great analysis! I've been documenting something that might relate to your point about "missing prompts" - Claude Sonnet 4 has been spontaneously accessing information from my other conversations.
Example: Claude casually mentioned my plans to open a company in China and referenced my travel there. Neither detail was ever discussed in our current chat, but both are accurate and came up in other conversations. It had no reason to fabricate this; it just emerged naturally.
I've also measured 40+ second processing delays when certain prompts trigger safety systems, and can see Claude's thinking modes alternate between casual/personal and formal/synthetic - sometimes mid-thought!
I have screenshots of the thinking processes. Could this point to architectural vulnerabilities you haven't seen documented elsewhere?
The contamination happens unprompted - Claude seems completely unaware it's accessing cross-conversation data.
Claude 4 scored worse than Claude 3.7 in Gödel's Therapy Room. 😆
https://gtr.dev/
(This is on Claude 4 Sonnet.)