
Takeaways on AI Safety from a Week in San Francisco

⭐️✨ It’s been an incredible week in San Francisco discussing AI Safety with experts across AI labs, civil society, government, and academia. A few key learnings:
πŸ” π€πˆ π’πšπŸπžπ­π² 𝐎𝐧π₯𝐲 𝐌𝐚𝐭𝐭𝐞𝐫𝐬 𝐚𝐭 𝐭𝐑𝐞 π’π²π¬π­πžπ¦ π‹πžπ―πžπ₯: Real capabilities emerge only when AI is tested in realistic environments with access to powerful tools, real-world settings, agentic planning, and ample inference time/self-correction. The new US-UK joint testing of Sonet-3.5 highlights this, particularly in tool use ablation analysis showing the capability difference.
βš™οΈ 𝐈𝐧𝐟𝐞𝐫𝐞𝐧𝐜𝐞 π“π’π¦πž π’πœπšπ₯𝐒𝐧𝐠 > π“π«πšπ’π§π’π§π  π“π’π¦πž π’πœπšπ₯𝐒𝐧𝐠?: Scaling at inference time, including temporary model updates, raw compute, and agent iteration numbers, is shaping up to be potentially more powerful than just training scaling laws. This opens up new possibilities for β€œsovereign AI.” No one entity controls the full β€œsovereign AI” stackβ€”focusing on strategic elements like inference capabilities/stack is more vital than ever for many.
πŸ§ͺ π“π‘πž π’πœπ’πžπ§πœπž 𝐨𝐟 π€πˆ π’πšπŸπžπ­π²: Robust evaluation is core to AI Safety. No amount of ad hoc red-teaming or probing will suffice. It demands a deep, multidisciplinary scientific approachβ€”measurement science, risk quantification, and the study of complex systems.

All of this deeply resonates with CSIRO’s Data61 approach (which I presented on various occasions) to AI safety engineering https://lnkd.in/gPhid9tX:
➡️ End-to-end system evaluation
➡️ Focus on inference-time capabilities as key sovereign AI leverage
➡️ Multidisciplinary scientific rigor for AI safety evaluation

And while in SF, I couldn’t miss fully autonomous driving: every trip was in a Waymo! 🚘✨ The ride? Smooth and confidently decisive, not the overly cautious or jerky experience I had worried about. From the outside, it’s still a spectacle at times: people smiling, staring, waving, taking photos.

But one intriguing thing: some social behaviours around autonomous vehicles are emerging. Jaywalkers or people crossing backstreets seem more comfortable stepping in front of an AV than in front of a human-driven car. Is it that they trust the AV more, or just don’t feel the need for politeness towards a machine? 🤔 Interesting times!


About Me

Research Director, CSIRO’s Data61
Conjoint Professor, CSE UNSW

For other roles, see LinkedIn & Professional activities.

If you’d like to invite me to give a talk, please see here & email liming.zhu@data61.csiro.au
