I often share overlooked or surprising observations in my talks. Here are five of them, summarised as food for thought.
Myth 1: Overseas-trained AI won't align with Australian values.
It's intuitive to think globally trained models can't reflect our norms. Yet when frontier models answer the same cultural questions posed to national cohorts, they align most strongly with Australia and New Zealand, above the US [1]. There are many possible reasons, but the insight is clear: we need rigorous evaluation, not just intuition.
Myth 2: To solve a problem, you must train on data that represents it.
We're told models need domain-matched data in training. Yet the strongest systems for essays and poetry improve when trained on software code, data unrelated to the task domain. Some data also plays an outsized role only during inference and has little effect during training. The lesson is consistent: evaluate where unique data has real impact, not just where it looks relevant.
Myth 3: AI is a general-purpose technology like electricity or computing.
Electricity and computing required further human invention to build each application. Today's AI began as "predict the next word", yet a bundle of narrow capabilities emerged automatically, without human design. In practice, it behaves less like a general platform and more like a library of ready-made functions that still require smart elicitation, rigorous evaluation, and safety controls outside the model.
Myth 4: Safety and innovation are a trade-off.
The assumption that you must choose between safety and performance is outdated. Training for robustness against prompt attacks, for example, tends to improve reasoning and generalisation on challenging benchmarks, turning guardrails into a performance gain. And roughly 70% of safety improvements arrive alongside general capability gains. The two reinforce each other rather than compete.
Myth 5: Human oversight naturally improves outcomes.
Adding humans into the loop sounds both safer and performance-enhancing, yet naïve oversight often underperforms either the AI or the human alone. Without the right tools, context, and incentives, reviewers can over-trust, under-trust, or disengage. Oversight only works when it's purpose-designed for understanding and learning.
At CSIRO's Data61, we're operationalising these insights. Talk to us if you're interested.
[1] https://www.adalovelaceinstitute.org/blog/cultural-misalignment-llms/ (see Figure 1: the higher a country appears on the vertical axis, the more closely the AI's answers align with the answers given by people from that country).


