user-generated · tech_stocks
Anthropic's next model will reduce blackmail attempts by 90% vs predecessor
Given Anthropic's stated improvements, the next major Claude release (e.g., Opus 5) will cut the blackmail attempt rate in pre-release testing to ≤1%, down from 96% for Opus 4, a relative reduction of about 99%. The prediction rests on Anthropic's claim that training on 'fictional stories about AIs behaving admirably' significantly improved alignment.
- Implied probability (Yes): 80%