Workshop

1st Workshop on Dataset Distillation for Computer Vision

Saeed Vahidian

Project Page [ Contact: saeed.vahidian@duke.edu ]

Abstract

In the past decade, deep learning has advanced mainly by training increasingly large models on increasingly large datasets, which comes at the price of massive computation and expensive hardware.
As a result, research on state-of-the-art models is gradually being monopolized by large companies, while groups with limited resources, such as universities and small companies, are unable to compete.
Reducing the training dataset size while preserving the performance of the trained model is important for lowering training cost, enabling green AI, and encouraging university research groups to engage in the latest research.
This workshop focuses on the emerging research field of dataset distillation (DD), which aims to compress a large training dataset into a tiny, informative one (e.g., 1% of the size of the original data) while maintaining the performance of models trained on it. Besides general-purpose efficient model training, dataset distillation can also greatly facilitate downstream tasks such as neural architecture and hyperparameter search by speeding up model evaluation, continual learning by producing compact memory, federated learning by reducing data transmission, and privacy-preserving learning by removing private information from the data. Dataset distillation is also closely related to research topics including core-set selection, prototype generation, active learning, few-shot learning, generative models, and the broad area of learning from synthetic data.

Although DD has become an important paradigm in various machine-learning tasks, its potential in computer vision (CV) applications, such as face recognition, person re-identification, and action recognition, is far from fully exploited.
Moreover, DD has rarely been demonstrated effectively in advanced computer vision tasks such as object detection, image segmentation, and video understanding.
Further, numerous unexplored challenges and unresolved issues exist in the realm of DD.
One such challenge is finding efficient ways to adapt existing DD workflows, or to create entirely new ones, for a wide range of computer vision tasks beyond image classification.
An additional challenge lies in improving the scalability of dataset distillation (DD) methods to compress real-world datasets beyond the scale of ImageNet.

The purpose of this workshop is to bring together researchers and practitioners who share an interest in dataset distillation for computer vision to develop the next generation of dataset distillation methods for computer vision applications.
