

Timezone: America/Denver
Registration
7:00 AM - 5:00 PM
Tutorial

All You Need To Know About Self-Driving

Raquel Urtasun · Abbas Sadat · Sivabalan Manivasagam · Jingkang Wang · Ioan Andrei Barsan
8:00 AM - 6:00 PM

Autonomous driving has evolved into a complex, full-stack problem that integrates perception, prediction, planning, simulation, and safety within a unified system. This tutorial provides a comprehensive overview of modern self-driving pipelines, covering both traditional modular approaches and emerging end-to-end paradigms. It reviews key components including sensor systems, multi-modal perception, motion forecasting, planning and control, large-scale simulation, and data-centric development. The tutorial further highlights recent advances such as foundation models, generative simulation, and real-world deployment at scale, offering a unified perspective on current challenges and future directions in self-driving systems.

Tutorial
8:00 AM - 12:00 PM

Artificial intelligence has driven significant advances in medical imaging, improving tasks such as image reconstruction, diagnosis, and clinical decision support across modalities including MRI, CT, X-ray, and pathology. This tutorial provides an up-to-date overview of key paradigms in the field, including physics-informed learning, medical foundation models, and collaborative approaches such as federated and multi-agent systems. It examines major challenges such as generalization, interpretability, data heterogeneity, and privacy constraints, while highlighting emerging solutions and open research directions. The tutorial aims to offer a comprehensive perspective on the development and deployment of reliable, clinically relevant AI systems for medical imaging.

Tutorial

Millimeter-wave (mmWave) sensing is emerging as a new modality for computer vision, enabling perception of objects and scenes that are occluded or invisible to traditional cameras. This tutorial introduces the fundamentals of mmWave imaging, highlighting how its physical properties enable through-occlusion sensing and all-weather perception. It covers both classical signal-processing approaches and recent learning-based methods for 3D reconstruction, segmentation, and scene understanding. The tutorial further provides practical guidance on datasets, tools, and open challenges, offering a comprehensive entry point for researchers interested in extending vision systems beyond visible light.

Tutorial

Analytic understanding of diffusion models

Artem Lukoianov · Chenyang Yuan · Christopher Scarvelis · Mason Kamb
8:00 AM - 6:00 PM

Diffusion models achieve state-of-the-art performance in generative modeling, yet their theoretical foundations and generalization behavior remain poorly understood. This tutorial focuses on the analytical understanding of diffusion models, addressing the apparent paradox between closed-form optimal denoisers and the empirical success of deep diffusion networks. It introduces recent theoretical advances that explain how mechanisms such as score smoothing, training dynamics, neural network inductive biases, and data structure contribute to generalization. By combining mathematical insights with hands-on experiments, the tutorial provides a principled framework for understanding the inner workings of diffusion models and for interpreting recent developments in the field.

Tutorial
8:00 AM - 12:00 PM

Physical AI systems, including robotics and autonomous platforms, require tightly integrated pipelines spanning data collection, model training, and real-time deployment. This tutorial presents a full-stack perspective on building such systems, covering simulation-based data generation, foundation models for robot control, and deployment on edge hardware. It introduces practical workflows using modern tools for human-in-the-loop data collection, multimodal robot foundation models, and hardware-aware optimization for low-latency inference. The tutorial further highlights challenges in scaling and deploying physical AI systems, providing attendees with actionable guidance and open-source resources for end-to-end robotics development.

Tutorial
8:00 AM - 12:00 PM

Large-scale multi-camera systems are central to real-world computer vision applications, yet their design is shaped as much by infrastructure constraints as by algorithmic advances. This tutorial presents a unified perspective on multi-camera vision through the lens of checkout-free retail, focusing on three core components: automatic camera calibration, real-time multi-object tracking, and structured event detection. It examines how challenges such as asynchrony, partial observability, hardware failures, and edge deployment constraints influence system design and performance. The tutorial further highlights generalizable principles for building reliable, scalable vision systems, bridging the gap between academic methods and real-world deployment.

Workshop

3D Geometry Generation for Scientific Computing (2nd Edition)

Wuyang Chen ⋅ Marissa Ramirez de Chanlatte ⋅ Peter Yichen Chen ⋅ Chuhang Zou ⋅ Zhiwen Fan ⋅ Daniel Martin ⋅ Michael Mahoney
9:00 AM - 1:00 PM
Workshop

Third Workshop for Learning 3D with Multi-View Supervision

Abdullah J Hamdi ⋅ Silvio Giancola
9:00 AM - 1:00 PM
Workshop
Workshop

Workshop on Any-to-any Multimodal Learning

Shengqiong Wu ⋅ Wei Dai
9:00 AM - 1:00 PM
Workshop
Workshop
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop
Workshop

Humans of Generative AI

Jaron Mink ⋅ David Forsyth
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop

Multimodal Algorithmic Reasoning Workshop

Anoop Cherian ⋅ Suhas Lohit
9:00 AM - 1:00 PM
Workshop

Exploring the Next Generation of Data

Nadine Chang ⋅ Maying Shen
9:00 AM - 1:00 PM
Workshop

6th Omnidirectional Computer Vision Workshop

Pierre Moulon ⋅ Guillaume Caron
9:00 AM - 1:00 PM
Workshop

Open-World Vision

Shu Kong ⋅ Neehar Peri
9:00 AM - 1:00 PM

Open-World Vision (OWV) emphasizes the realistic opportunities and challenges of developing and deploying computer vision systems in the dynamic, vast, and unpredictable real open world, which offers abundant data that can benefit training and pose challenges at test time. It contrasts with the traditional "closed-world" paradigm of visual learning and inference, which assumes fixed, known data distributions and categorical labels. Models developed under such closed-world assumptions tend to be brittle when they encounter ever-changing and novel scenarios in the real open world. Modern visual learning has shifted towards an open-world paradigm, for example by pretraining foundation models on massive data sourced from the open world (e.g., web-sourced data). While these models show unprecedented performance and strong adaptability to downstream tasks, they inherit biases from their open-world pretraining data and can still fail in truly novel or underrepresented scenarios during deployment. This workshop aims not only to uncover current limitations, potential risks, emerging opportunities, and unresolved challenges of open-world vision, but also to solicit solutions that advance the field toward more robust, fair, and adaptable visual systems.

Workshop

Personalization in Generative AI Workshop

Pinar Yanardag ⋅ Nupur Kumari
9:00 AM - 1:00 PM
Workshop

PhysHuman: Physically Grounded Human Perception and Modeling

Feng Liu ⋅ Youngjoong Kwon ⋅ Cheng Zhang
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop

Safe Artificial Intelligence for All Domains

Oliver Wasenmüller ⋅ Markus Enzweiler
9:00 AM - 1:00 PM
Workshop
Workshop
Workshop
9:00 AM - 1:00 PM
Workshop
9:00 AM - 1:00 PM
Workshop

4th Workshop on Maritime Computer Vision

Benjamin Kiefer ⋅ Jon Muhovic
9:00 AM - 1:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop

4th Workshop on Generative Models for Computer Vision

Adam Kortylewski ⋅ Fangneng Zhan
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop

How Do Vision Models Work?

Tamar Rott Shaham ⋅ Amil Dravid
9:00 AM - 6:00 PM
Workshop
9:00 AM - 6:00 PM
Workshop

9th Multimodal Learning and Applications Workshop

Paolo Rota ⋅ Michael Ying Yang
9:00 AM - 6:00 PM
Workshop
Workshop
9:00 AM - 6:00 PM
Workshop

2nd Workshop on Video Large Language Models

Rohit Gupta ⋅ Sirnam Swetha
9:00 AM - 6:00 PM
Workshop

Workshop on Visual Concepts

Joy Hsu ⋅ R. Kenny Jones
9:00 AM - 6:00 PM
Workshop

Sight and Sound

Andrew Owens ⋅ Jiajun Wu
9:00 AM - 6:00 PM
Workshop

1st Workshop on Generative 3D Reconstruction

Daniel Barath ⋅ Fabian Manhardt
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

Artificial Intelligence for Space

Daniele Gammelli ⋅ Gabriele Meoni
1:00 PM - 6:00 PM
Workshop

2nd Workshop on GenAI for Storytelling

Andrew Shin ⋅ Yusuke Mori
1:00 PM - 6:00 PM
Workshop

Appearance Understanding and Generation

Elena Garces ⋅ Giuseppe Vecchio
1:00 PM - 6:00 PM
Workshop

Big Model Adaptation In Computer Vision

Yuki Asano ⋅ Anna Kukleva
1:00 PM - 6:00 PM
Workshop

CVPR 2026 Biometrics Workshop

Bir Bhanu ⋅ Ajay Kumar
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

The 7th International Workshop on Eye and Gaze in Computer Vision

Yihua Cheng ⋅ Seonwook Park ⋅ Hyung Jin Chang
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

Imagine a world where computer vision-based systems can analyze a video of an athlete, a surgeon, a patient, or a factory worker and instantly provide expert-level, actionable feedback: correcting techniques, identifying inefficiencies, and helping people refine their skills in real time. Thanks to rapid progress in video understanding, this vision is becoming reality. AI-powered systems can now analyze complex human activities, assess performance, and generate intelligent feedback, unlocking new possibilities in sports, healthcare, manufacturing, education, rehabilitation, and beyond. Through expert keynotes and invited contributions, this CVPR 2026 workshop will explore the cutting edge of skilled activity understanding, assessment, and feedback generation, bridging research and real-world applications.

As AI systems become more capable of understanding human expertise, the implications are profound: individuals gain personalized coaching, democratized skill development, and scalable training solutions. We invite researchers, industry leaders, and practitioners to join us in shaping the future of AI-powered skill understanding. Whether you work on foundational research, applied solutions, or real-world deployment, this workshop is a forum to learn about and push the boundaries of how AI perceives, evaluates, and enhances human ability.

Workshop
1:00 PM - 6:00 PM
Workshop
1:00 PM - 6:00 PM
Workshop

Visual Anomaly and Novelty Detection - 4th Edition

Philipp Seeböck ⋅ Latha Pemula
1:00 PM - 6:00 PM
Tutorial

From Perception to Action: Building Efficient and Deployable Robot Intelligence Pipelines with Open-Source Edge AI Toolkits

Samet Akcay · Zhuo Wu · Michael Paulitsch · Ashutosh Kumar · Tao Xiong · Adrian Boguszewski · Sameer Sheorey · Benjamin Ummenhofer
1:30 PM - 6:00 PM

Robotic manipulation has become a key application of embodied AI, but many research pipelines remain difficult to reproduce and deploy in real-world systems. This tutorial presents an end-to-end, open-source workflow for building efficient robot intelligence pipelines, covering data collection, visuomotor policy training, simulation, and deployment on edge hardware. It introduces practical techniques such as teleoperated data acquisition, diffusion- and transformer-based policies, and neural object cloning for simulation-ready assets. The tutorial further emphasizes model optimization and real-time deployment, culminating in a live demonstration of a complete perception-to-action pipeline on an affordable robotic platform.

Tutorial

Foundations and Frontiers of Watermarking: Algorithms, Multimodal Extensions, Benchmarks, and Authenticity Frameworks

Vishal Asnani · Shruti Agarwal · Benedetta Tondi · Pierre Fernandez · Furong Huang
1:30 PM - 6:00 PM

Watermarking has re-emerged as a critical component of trustworthy AI, driven by the rapid growth of generative models and the need for content attribution and authenticity. This tutorial provides a unified overview of watermarking, spanning classical signal-processing foundations and modern deep-learning–based approaches across images, video, audio, and multimodal data. It examines key challenges such as robustness, capacity, and adversarial resilience, along with recent benchmarking efforts and evaluation frameworks. The tutorial further connects these methods to real-world deployment through applications in content provenance, media forensics, and emerging standards such as C2PA, offering a comprehensive perspective on building reliable and transparent media systems.

Tutorial

The Road to Convergence: Evolution of Unified Multimodal Models

Jindong Wang · Hao Chen · Jiakui Hu · Zhaolong Su · Sharon Li
1:30 PM - 6:00 PM

Unified multimodal models are emerging as a new paradigm that integrates understanding and generation across modalities within a single foundation model. This tutorial provides a comprehensive overview of these models, addressing the currently fragmented landscape of architectures, representations, and training strategies. It introduces a unified perspective on key design choices, including modeling paradigms, multimodal tokenization, and alignment methods, while reviewing benchmarks and real-world applications. The tutorial further highlights open challenges such as scalable representation learning and unified world modeling, offering a structured roadmap for future research in multimodal AI.
