Poster Sat, Jun 6, 2026 • 3:45 PM – 5:45 PM PDT ExHall A & F

See, Think, Act: Teaching Multimodal Agents to Effectively Interact with GUI by Identifying Toggles

Zongru Wu ⋅ Rui Mao ⋅ Zhiyuan Tian ⋅ Pengzhou Cheng ⋅ Tianjie Ju ⋅ Zheng Wu ⋅ Lingzhong Dong ⋅ Haiyue Sheng ⋅ Zhuosheng Zhang ⋅ Gongshen Liu

Abstract
