Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection

Jia Zeng · Li Chen · Hanming Deng · Lewei Lu · Junchi Yan · Yu Qiao · Hongyang Li

West Building Exhibit Halls ABC 094
[ Abstract ]
Tue 20 Jun 10:30 a.m. PDT — noon PDT


Multi-camera 3D object detection blossoms in recent years and most of state-of-the-art methods are built up on the bird’s-eye-view (BEV) representations. Albeit remarkable performance, these works suffer from low efficiency. Typically, knowledge distillation can be used for model compression. However, due to unclear 3D geometry reasoning, expert features usually contain some noisy and confusing areas. In this work, we investigate on how to distill the knowledge from an imperfect expert. We propose FD3D, a Focal Distiller for 3D object detection. Specifically, a set of queries are leveraged to locate the instance-level areas for masked feature generation, to intensify feature representation ability in these areas. Moreover, these queries search out the representative fine-grained positions for refined distillation. We verify the effectiveness of our method by applying it to two popular detection models, BEVFormer and DETR3D. The results demonstrate that our method achieves improvements of 4.07 and 3.17 points respectively in terms of NDS metric on nuScenes benchmark. Code is hosted at

Chat is not available.