I’m excited to share our new paper on AI safety evaluation!
One of the biggest challenges I’ve encountered in my work is the confusion caused when different communities use the same terms to mean very different things. For example, “AI testing” means one thing to policymakers, another to AI system/software testers, and yet another to AI model testers. Add terms like evaluation, validation, verification, and assessment into the mix, and the misunderstandings multiply. Even authoritative sources like the OECD, AI safety summits, and government agencies often make glaring mistakes by approaching these terms from a single perspective.
Simply defining these terms won’t solve the problem, as each community has its own authoritative definitions and practical meanings. This confusion can hinder efforts to assure AI model safety and AI system safety, leading to gaps, wasteful overlapping work, and misunderstandings among AI model developers, AI system developers, and AI system deployers.
Our new paper, “An AI System Evaluation Framework for Advancing AI Safety: Terminology, Taxonomy, Lifecycle Mapping” (with Boming Xia, Qinghua Lu, and Zhenchang Xing), aims to bridge these gaps by positioning each community’s distinct understanding of these terms in relation to the others. We also cover concepts such as capability evaluation, benchmarks, red teaming, test data, test cases, and test suites to enhance AI safety from both the model-level and system-level perspectives.
With the upcoming AI Safety Summit in South Korea and the imminent release of the International Scientific Report on Advanced AI Safety: INTERIM REPORT, led by Yoshua Bengio (which I had the privilege of reviewing, and in which CSIRO’s Data61 work is cited multiple times), this field is poised for significant growth. We hope our work clarifies directions, fosters better communication across communities, and contributes to Australia’s AI safety journey (National AI Centre).