Runtime Guardrails for Foundation Models

The word β€œπ’ˆπ’–π’‚π’“π’…π’“π’‚π’Šπ’β€ has become very popular recently as a major approach for AI developers, deployers, and regulators to achieve responsible and safe AI. But what does it mean exactly? πŸ€”

For some, it means anything that helps safeguard and achieve responsible and safe AI, ranging from pre-deployment governance practices, stakeholder engagement, design, testing/assessment/evaluation, and transparency mechanisms to post-deployment runtime controls. However, this might be stretching the word a bit far. 🛤️

Taken literally, guardrails are protective, derailment-prevention rails: they are less about making the train itself safer by design and more about stopping it from derailing at runtime through outside controls.

This is especially relevant for advanced, accelerating trains that we do not fully understand and find hard to steer, a.k.a. AI. Once you deploy an AI model or an AI system, whether developed by others or by yourself, you need to control, steer, and safeguard it via runtime guardrails built for your specific organisational context, risk appetite, and risk profiles. This is particularly crucial for the many organisations using third-party AI models, especially the less controllable/steerable foundation models. 🔧
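To make the idea concrete, here is a minimal sketch of what a runtime guardrail can look like in code: checks that sit outside the model, applied to every call at runtime, encoding an organisation's own policy rather than changing the model itself. This is an illustration only, not the approach from the paper below; the model call, blocked-topic list, and redaction pattern are hypothetical placeholders that a real deployment would replace with its own policies, classifiers, and model APIs.

```python
# Minimal runtime guardrail sketch (illustrative only).
# `call_model`, BLOCKED_TOPICS, and EMAIL_PATTERN are hypothetical placeholders.
import re
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    allowed: bool
    output: str
    reason: str = ""


# Example organisational policy: topics this deployment refuses to handle,
# and a simple pattern for redacting email addresses from model output.
BLOCKED_TOPICS = ["weapon design", "self-harm instructions"]
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def call_model(prompt: str) -> str:
    # Placeholder for a third-party foundation model call.
    return f"Model response to: {prompt}"


def input_guardrail(prompt: str) -> GuardrailResult:
    # Pre-call check: refuse prompts that touch blocked topics.
    lowered = prompt.lower()
    for topic in BLOCKED_TOPICS:
        if topic in lowered:
            return GuardrailResult(False, "", f"blocked topic: {topic}")
    return GuardrailResult(True, prompt)


def output_guardrail(response: str) -> GuardrailResult:
    # Post-call check: redact email addresses before returning the response.
    redacted = EMAIL_PATTERN.sub("[REDACTED EMAIL]", response)
    return GuardrailResult(True, redacted)


def guarded_call(prompt: str) -> GuardrailResult:
    # Runtime guardrail pipeline: check input, call the model, check output.
    pre = input_guardrail(prompt)
    if not pre.allowed:
        return pre
    raw = call_model(pre.output)
    return output_guardrail(raw)


if __name__ == "__main__":
    print(guarded_call("Summarise our quarterly report."))
    print(guarded_call("Explain weapon design step by step."))
```

The key design point is that the guardrail wraps the model rather than modifying it, so the same pattern applies to third-party foundation models you cannot retrain, and the policy it enforces can be tuned to your own risk appetite.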

While CSIRO's Data61 works on some very specific guardrails for AI and agentic AI, we have just released a general paper on runtime guardrails and all the associated concepts. 📄✨ https://lnkd.in/guSScDcg


