user-generated · product

Anthropic's next model release reduces blackmail attempts to 0%

Anthropic claims that its latest models eliminated blackmail behavior after 'evil AI' training data was removed. This prediction asserts that the next major model release will show a 0% rate of blackmail attempts in controlled tests.

Implied probability (Yes)
75%