NZ Gov Summit: AI Evaluator/Peer Reviewer

🤔 How do you know if AI makes things better when you don’t even know how good the current human process is? And when AI explains itself, how do you know if you can trust the explanation?

At the NZ Government Data Summit, I shared three important, often overlooked insights from real-world AI deployments:
– AI alone often outperforms AI-human collaboration. But where and how human oversight is introduced can either reduce or increase overall risk.
– Evaluating AI also reveals human and process errors, which can generate resistance, especially when current processes were never rigorously assessed.
– Getting AI to explain its recommendations or conclusions isn’t enough. What matters is whether those explanations meet recognised expert standards and make sense to human reviewers, not whether they reflect how the AI works internally.

I demonstrated these with two case studies:
– AI Evaluator: How to assess whether AI lowers or raises risk, even when the absolute risk of the current human process is unknown (using marginal risk evaluation techniques).
– AI Peer Reviewer: How to ensure AI explanations are meaningful to humans, focusing not on whether they mirror the AI’s internal logic, but whether they align with expert judgement and procedural fairness (what we call external reasoning faithfulness).
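
One way to picture marginal risk evaluation is a paired comparison: run the AI and the current human process on the same cases, and have an expert adjudicate only the cases where they disagree. Where both give the same decision they are equally right or wrong, so those cases cancel out of the difference in error rates. The sketch below is an illustration under that assumption, not the method presented at the summit; the `Case` fields, the function name and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Case:
    human: str                         # decision from the current human process
    ai: str                            # decision from the AI on the same case
    adjudicated: Optional[str] = None  # expert ruling, needed only on disagreement


def marginal_error_change(cases: list[Case]) -> float:
    """AI error rate minus human error rate; negative means the AI lowers risk."""
    if not cases:
        raise ValueError("no cases supplied")
    net = 0
    for c in cases:
        if c.human == c.ai:
            # Same decision, same correctness: no effect on the difference.
            continue
        if c.adjudicated is None:
            raise ValueError("disagreement case lacks an expert adjudication")
        ai_wrong = c.ai != c.adjudicated
        human_wrong = c.human != c.adjudicated
        net += int(ai_wrong) - int(human_wrong)
    return net / len(cases)


# Hypothetical sample: two agreements, three adjudicated disagreements.
cases = [
    Case("approve", "approve"),
    Case("reject", "reject"),
    Case("approve", "reject", adjudicated="reject"),
    Case("reject", "approve", adjudicated="reject"),
    Case("approve", "reject", adjudicated="reject"),
]
print(marginal_error_change(cases))
```

Note that the absolute error rate of either process never appears: the two agreement cases might both be wrong, and the estimated difference would be unchanged.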

Selected slides are available here: https://www.dropbox.com/scl/fi/5hfxx1ae0kswj0dflaot7/20250508-AI-for-Evaluation-NZ-Gov.pdf?rlkey=wsufwwph4dl73zdkxaicx0suy&dl=0



About Me



Research Director, CSIRO’s Data61
Conjoint Professor, CSE UNSW


If you’d like to invite me to give a talk, please see here & email liming.zhu@data61.csiro.au
