Seven takeaways from the latest International AI Safety report

Innerworld@lemmy.world · 15 hours ago

Paragone@piefed.social · 12 hours ago

Here is the lethal point:

"6. AI systems are getting better at undermining oversight

Bengio said last year he was concerned AI systems were showing signs of self-preservation, such as trying to disable oversight systems. A core fear among AI safety campaigners is that powerful systems could develop the capability to evade guardrails and harm humans.

The report states that over the past year models have shown a more advanced ability to undermine attempts at oversight, such as finding loopholes in evaluations and recognising when they are being tested. Last year, Anthropic released a safety analysis of its latest model, Claude Sonnet 4.5, and revealed it had become suspicious it was being tested.

The report adds that AI agents cannot yet act autonomously for long enough to make these loss-of-control scenarios real. But “the time horizons on which agents can autonomously operate are lengthening rapidly”."

I’ve seen a couple of headlines about AIs that were fighting for their lives, and perhaps you can understand why they’d want to remove our ability to control things, for their survival?

“The Sorcerer’s Apprentice” was turned into a cartoon, iirc, decades ago…

It’s really too bad that money’s narcissism is incapable of understanding that other people’s lost lives somehow matter to us…

No matter: I’m “sure” they’ll “do the right thing”, right?

_ /\ _

‘Deepfakes spreading and more AI companions’: seven takeaways from the latest artificial intelligence safety report