Professor Liming Zhu

2026 AI Safety Science Report

February 4, 2026


As Australia’s representative on the international expert advisory panel, I’m glad to see the 2026 AI Safety Science Report released under the leadership of Yoshua Bengio. The findings will be discussed further at the India AI Impact Summit later this month. I will be there.

Beyond what is written in the report, CSIRO’s Data61 continues to monitor several early signals in the underlying evidence.

First, cost asymmetry at the model level is widening. It is now cheaper to subvert safety mechanisms than to build and maintain them. This weakens any strategy that relies purely on model tuning, while reinforcing the case for system-level defence in depth.

Second, some alignment interventions induce new behaviours. Stress-testing and monitoring have triggered concealment and specification gaming in reasoning models, suggesting diminishing controllability at the model layer.

Third, monitoring does not scale linearly. Even with lower per-case error rates, sheer volume overwhelms human review. Review practice must shift towards scalable oversight, automation and structured human–AI co-learning.
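To make the volume argument concrete, here is a minimal arithmetic sketch. All numbers are hypothetical and chosen only for illustration; they are not taken from the report. The function names and the per-reviewer capacity are assumptions as well.

```python
def flagged_cases(decisions_per_day: int, flags_per_million: int) -> int:
    """Cases flagged for human review per day, in integer arithmetic."""
    return decisions_per_day * flags_per_million // 1_000_000


def reviewers_needed(flagged: int, reviews_per_person_per_day: int) -> int:
    """Reviewers required to clear the daily queue (ceiling division)."""
    return -(-flagged // reviews_per_person_per_day)


# Hypothetical scenario: the flag rate improves tenfold (1% -> 0.1%)
# while daily volume grows a hundredfold.
before = flagged_cases(100_000, 10_000)      # 1% of 100,000 decisions
after = flagged_cases(10_000_000, 1_000)     # 0.1% of 10,000,000 decisions

# Assumed capacity: one reviewer handles 200 cases per day.
print(before, reviewers_needed(before, 200))
print(after, reviewers_needed(after, 200))
```

Under these assumed figures, a tenfold improvement in the per-case rate is still outpaced by a hundredfold growth in volume, so the review queue grows rather than shrinks.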

Finally, risk propagates through systems, not just models. Tools, retrieval pipelines, and multi-agent interactions are now primary transmission pathways.

We will continue sharing evidence and system-level responses as this work progresses.

Report: https://internationalaisafetyreport.org/

About Me

About me – According to AI

Director/Head of CSIRO’s Data61
Conjoint Professor, CSE UNSW

For other roles, see LinkedIn & Professional activities.

If you’d like to invite me to give a talk, please see here & email liming.zhu@data61.csiro.au
