

Oral

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng · Inbum Park · Andrew Owens

Summit Flex Hall AB Oral #3
Session: Orals 6B Image & Video Synthesis
Fri 21 Jun 1:36 p.m. — 1:54 p.m. PDT

Abstract:

We consider the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip. We present a conceptually simple, zero-shot method to do so based on diffusion. At every diffusion step, we estimate the noise from different views of a noisy image, combine the noise estimates, and perform a step of the reverse diffusion process. A theoretical analysis shows that this method works precisely for views that can be written as orthogonal transformations, of which permutations are a subset. This leads to the idea of a visual anagram, which includes images that change appearance upon a rotation or a flip, but also upon more exotic pixel permutations such as a jigsaw rearrangement. We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method.
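The per-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `toy_predict_noise` is a hypothetical stand-in for a text-conditioned diffusion model, averaging is assumed as one way to "combine" the estimates, and a deterministic DDIM-style update with a made-up linear schedule is assumed for the reverse step. Each view is a pixel permutation paired with its inverse, so the noise estimate made in a view can be mapped back to the original orientation before combining.

```python
import numpy as np

# Each view is an (apply, invert) pair. Flips are pixel permutations,
# hence orthogonal transformations of the flattened image.
def identity(x):
    return x

def flip_ud(x):
    # Vertical flip of an H x W x C array; it is its own inverse.
    return np.flipud(x)

VIEWS = [(identity, identity), (flip_ud, flip_ud)]

def toy_predict_noise(x, t, prompt):
    # Hypothetical placeholder for a real noise-prediction network.
    return 0.1 * x

def combined_noise_estimate(x_t, t, prompts, predict_noise, views):
    # Estimate noise in every view, undo the view, then average (assumed).
    estimates = []
    for (apply, invert), prompt in zip(views, prompts):
        eps = predict_noise(apply(x_t), t, prompt)
        estimates.append(invert(eps))
    return np.mean(estimates, axis=0)

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    # Deterministic DDIM-style update (an assumption of this sketch).
    x0 = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps) / np.sqrt(alpha_bar_t)
    return np.sqrt(alpha_bar_prev) * x0 + np.sqrt(1.0 - alpha_bar_prev) * eps

def sample(prompts, views, predict_noise, shape=(8, 8, 3), steps=10, seed=0):
    rng = np.random.default_rng(seed)
    # Illustrative schedule: index 0 is nearly clean, index `steps` is noisy.
    alpha_bars = np.linspace(0.999, 0.01, steps + 1)
    x = rng.standard_normal(shape)
    for t in range(steps, 0, -1):
        eps = combined_noise_estimate(x, t, prompts, predict_noise, views)
        x = ddim_step(x, eps, alpha_bars[t], alpha_bars[t - 1])
    return x

image = sample(["prompt for upright view", "prompt for flipped view"],
               VIEWS, toy_predict_noise)
```

Because a flip only rearranges pixels, it preserves the norm of any array it is applied to, which is the orthogonality property the abstract's analysis relies on. Swapping in a rotation or a jigsaw rearrangement would only require another (apply, invert) pair.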
