Runtime Guardrails for Foundation Models

The word “guardrail” has recently become very popular as a major approach for AI developers, deployers, and regulators to achieve responsible and safe AI. But what does it mean exactly? 🤔

For some, it means anything that helps safeguard and achieve responsible and safe AI: governance practices, stakeholder engagement, design, testing/assessment/evaluation and transparency mechanisms before deployment, and post-deployment runtime controls. However, this might be stretching the word a bit far. 🛤️

Taken literally, guardrails, i.e., protective, derailment-preventing rails, are less about making the train itself safer and more about stopping the train from derailing at runtime through outside control.

This is especially relevant for advanced, accelerating trains we do not fully understand and find hard to steer, a.k.a. AI. Once you deploy an AI model or AI system, whether developed by others or by yourself, you need to control, steer, and safeguard it via runtime guardrails built for your specific organisational context, risk appetite, and risk profiles. This is particularly crucial for the many organisations using third-party AI models, especially the less controllable/steerable foundation models. 🔧
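To make this concrete, here is a minimal, hypothetical sketch in Python of one kind of runtime guardrail: a wrapper that intercepts a deployed model's inputs and outputs and enforces an organisation-specific policy at inference time. The names (GuardrailPolicy, with_runtime_guardrails) and the toy rules are illustrative assumptions, not the framework from our paper.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical organisational policy: topics the deployed model must not handle,
# and output patterns (here, email addresses as a stand-in for PII) that must
# never leave the system. Tune these to your own context and risk appetite.
@dataclass
class GuardrailPolicy:
    blocked_topics: List[str] = field(default_factory=lambda: ["medical advice", "legal advice"])
    blocked_output_patterns: List[str] = field(default_factory=lambda: [r"[\w.+-]+@[\w-]+\.[\w.]+"])
    refusal_message: str = "Sorry, I can't help with that request."

def with_runtime_guardrails(model: Callable[[str], str], policy: GuardrailPolicy) -> Callable[[str], str]:
    """Wrap any text-in/text-out model (local or third-party API) with
    input and output checks that run at inference time."""
    def guarded(prompt: str) -> str:
        # Input guardrail: refuse prompts touching disallowed topics.
        lowered = prompt.lower()
        if any(topic in lowered for topic in policy.blocked_topics):
            return policy.refusal_message
        response = model(prompt)
        # Output guardrail: redact patterns the policy forbids from leaving.
        for pattern in policy.blocked_output_patterns:
            response = re.sub(pattern, "[REDACTED]", response)
        return response
    return guarded

if __name__ == "__main__":
    # Stand-in for a foundation model call (e.g., a hosted API).
    def toy_model(prompt: str) -> str:
        return f"Echoing your request: {prompt}. Contact us at support@example.com."

    guarded_model = with_runtime_guardrails(toy_model, GuardrailPolicy())
    print(guarded_model("Summarise this report"))   # allowed; email redacted
    print(guarded_model("Give me medical advice"))  # blocked at input
```

The wrapper pattern is the point: the checks sit outside the model, so an organisation can update them to match its own risk profile without retraining, or even having access to, the underlying third-party model.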

While CSIRO’s Data61 works on some very specific guardrails for AI and agentic AI, we have just released a general paper on runtime guardrails and all the associated concepts. 📄✨ https://lnkd.in/guSScDcg


