ICSE25 Distinguished Paper – How GUI Agents Are Changing the AI Game

🚀 What’s the hottest trend in AI agents right now?
It’s not the traditional workflow-based or API-calling agents. The most interesting agents today are GUI agents: ones that see what you see on the screen. They don’t just execute calls behind the scenes; they observe, interpret, and interact with visual interfaces just like human users.

This opens up a whole new class of capability: learning directly from real user behaviour, as it happens on-screen, rather than relying on disconnected logs or backend traces that miss subtle but critical interactions.

Proud to share that we’re making serious strides in this space. Join me in congratulating the CSIRO’s Data61 research team on winning the Distinguished Paper Award at the 2025 International Conference on Software Engineering! 👏

The paper introduces SeeAction, a deep learning-based computer vision tool that automatically recognises and describes user actions from screen recordings.
SeeAction enables powerful new applications:
– Automating UI testing and repetitive digital tasks
– Reproducing software bugs by retracing a user’s exact steps
– Helping AI tools learn from real human behaviour: not just text or screenshots, but nuanced interactions in context

[2503.12873] SeeAction: Towards Reverse Engineering How-What-Where of HCI Actions from Screencasts for UI Automation


About Me



Research Director, CSIRO’s Data61
Conjoint Professor, CSE UNSW

For other roles, see LinkedIn & Professional activities.

If you’d like to invite me to give a talk, please see here & email liming.zhu@data61.csiro.au
