🧠 “Meaningful human oversight” is everyone’s magic bullet, but nobody seems to know the magic sauce.
I opened Day 2 of the AI in Government Conference with a keynote on some of the hardest questions in AI-enabled decision-making:
• Do humans really have agency, or just the illusion of it, after the AI has already shaped most of the decision?
• Are we mistaking the presence of a human for effective oversight?
• Can we trust AI rationales if we can’t trace or test their reasoning?
We need to shift the focus from symbolic and procedural fixes to measurable effectiveness. I introduced some of the recent innovations from CSIRO’s Data61:
1. Metrics, methods and evaluation infrastructure to measure how different forms of oversight actually influence outcomes
2. Marginal risk assessment techniques to evaluate AI performance and risk without requiring ground truth
3. Methods to evaluate the faithfulness of AI rationales against external criteria, not just the AI’s internal workings
What’s your view? How should we measure and evaluate whether human oversight is more than just theatre?
Slides: https://www.linkedin.com/feed/update/urn:li:activity:7343742258036281344/

