Tom Builds, Tom Breaks: Hands-On Attacks and Defenses for Vision-Language Systems
Abstract
Vision-language models are increasingly deployed in real-world systems where images can directly influence decisions and actions, creating new security risks beyond traditional text-based attacks. This tutorial provides a hands-on introduction to attacks and defenses for vision-language systems, using a practical, end-to-end workflow that mirrors real deployment scenarios. It covers a range of vulnerabilities, including visual jailbreaks, preprocessing-induced attacks, adversarial perturbations, backdoored models, and data poisoning, along with corresponding mitigation strategies. Through interactive examples and reproducible notebooks, the tutorial emphasizes how these threats manifest in practice and how to build robust, auditable systems for multimodal AI.