

Insights into Top Paper Nominee, “MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures”

A Q&A with the Authors

Zhiqin Chen, Thomas Funkhouser, Peter Hedman, Andrea Tagliasacchi

Paper Presentation: Tuesday, 20 June, 3:30 p.m. PDT, East Exhibit Halls A-B

In the CVPR 2023 paper, “MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures,” the authors present a rendering pipeline that is 10 times faster than SNeRG with the same output quality. In addition, it can run on mobile phones and other devices previously unable to support NeRF visualization. The CVPR PR team conducted a Q&A interview with the research team to find out more.

CVPR: Will you please share a little more about your work and results? How is it different from the standard approaches to date? How did your model outperform other options? What was the key factor in these results?

The key idea of our approach is to leverage the existing highly optimized polygon rasterization pipeline in modern GPUs, instead of performing volumetric rendering, to make viewing and interacting with NeRF scenes possible on a wide range of mobile devices.

Classic NeRF methods need to evaluate a large MLP at hundreds of sample points along the ray for each pixel, resulting in slow rendering speed. Recent work such as SNeRG [1] addresses this issue by baking NeRF into a sparse 3D voxel grid, thus eliminating the need for MLP evaluations. However, SNeRG does not fully exploit the parallelism available in GPUs as it still relies on ray marching. Furthermore, it requires a significant amount of GPU memory to store volumetric textures, making it infeasible to run on mobile phones.
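
To make the cost of volumetric rendering concrete, here is a minimal NumPy sketch of the per-pixel compositing that classic NeRF performs; the `mlp` callable, the sample count, and the near/far bounds are illustrative placeholders rather than the settings used in any specific paper.

```python
import numpy as np

def composite_ray(ray_origin, ray_dir, mlp, num_samples=192, near=2.0, far=6.0):
    """Illustrative volume rendering for one pixel: the MLP is queried at every
    sample point along the ray, which is what makes classic NeRF slow."""
    t = np.linspace(near, far, num_samples)                # sample depths along the ray
    points = ray_origin + t[:, None] * ray_dir             # (num_samples, 3) positions
    sigma, rgb = mlp(points, ray_dir)                      # density (N,) and color (N, 3) per sample
    delta = np.append(np.diff(t), 1e10)                    # distance between adjacent samples
    alpha = 1.0 - np.exp(-sigma * delta)                   # opacity of each segment
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))   # transmittance reaching each sample
    weights = alpha * trans                                # compositing weights
    return (weights[:, None] * rgb).sum(axis=0)            # final pixel color
```

Every pixel of every frame repeats this loop over samples, each of which requires an MLP evaluation, which is why real-time rates are hard to reach without baking or other acceleration.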

In our method, MobileNeRF, we represent NeRF models as textured polygon meshes. The polygons in the mesh provide a coarse approximation of the scene's surface, and we rely on the opacity and feature vectors stored in the texture images to represent geometric details and view-dependent colors, respectively. During rendering, we utilize the classic polygon rasterization pipeline with Z-buffering to produce a feature vector for each pixel, and then pass it to a lightweight MLP running in a GLSL fragment shader to produce the view-dependent output color. Our rendering pipeline takes full advantage of the parallelism provided by modern GPUs, and thus is 10 times faster than SNeRG with the same output quality. It can run on mobile phones and other devices previously unable to support NeRF visualization, as virtually all GPUs support rendering polygons. The explicit mesh representation of our method also enables real-time editing and manipulation of NeRF scenes.
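
As a rough illustration of the deferred-shading step described above, the sketch below applies a small MLP to the per-pixel features produced by rasterization. In MobileNeRF this runs inside a GLSL fragment shader; the NumPy version here, along with the feature dimension and layer widths, is assumed for illustration and is not the authors' implementation.

```python
import numpy as np

def shade_pixels(features, view_dir, w1, b1, w2, b2, w3, b3):
    """Deferred shading sketch: rasterization with Z-buffering yields one feature
    vector per pixel (H, W, F); a tiny MLP then maps feature + view direction to
    a view-dependent RGB color for each pixel."""
    h, w, f = features.shape
    x = np.concatenate([features.reshape(-1, f),
                        np.broadcast_to(view_dir, (h * w, 3))], axis=1)
    x = np.maximum(x @ w1 + b1, 0.0)               # hidden layer 1 (ReLU)
    x = np.maximum(x @ w2 + b2, 0.0)               # hidden layer 2 (ReLU)
    rgb = 1.0 / (1.0 + np.exp(-(x @ w3 + b3)))     # sigmoid to keep colors in [0, 1]
    return rgb.reshape(h, w, 3)
```

Because the expensive per-sample MLP queries of volumetric rendering are replaced by one small MLP evaluation per visible pixel, the workload maps directly onto the rasterization hardware that every GPU, including mobile ones, already provides.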

CVPR: So, what’s next? What do you see as the future of your research?  

An important next step is to improve the training speed of our method to enhance its efficiency. Our current implementation is slow to train since we adopt NeRF’s MLP backbone to produce the opacity and feature vector for every point on the mesh. Therefore, incorporating fast-training architectures such as Instant-NGP [2] into our work could substantially accelerate the training process.

Furthermore, it is worthwhile to explore techniques that represent the scene as a combination of manifold meshes and polygon soups depending on the characteristics of the reconstructed objects. It would also be valuable to investigate approaches for separating the geometry and materials, to enable seamless integration of the reconstructed meshes into real-world applications.

Lastly, it is important to note that MobileNeRF uses binary opacity in mesh textures to avoid sorting polygons. Consequently, it is not capable of modeling scenes with semi-transparencies. We look forward to finding an elegant solution for handling semi-transparent objects in the future.
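
To illustrate why binary opacity removes the need for sorting, here is a hypothetical alpha-test helper in Python; the actual method performs the equivalent test in the fragment shader.

```python
def alpha_test(texture_alpha, u, v, threshold=0.5):
    """Binary-opacity sketch: with alpha constrained to {0, 1}, a fragment is
    either fully kept or discarded (akin to GLSL's `discard`), so the Z-buffer
    alone resolves visibility and no back-to-front polygon sorting is needed.
    Semi-transparent surfaces (0 < alpha < 1) cannot be expressed this way."""
    return texture_alpha[v, u] >= threshold   # keep the fragment only if opaque
```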

CVPR: What more would you like to add?

The advancement of rendering NeRF on common devices is progressing rapidly, and there have already been numerous follow-ups to MobileNeRF. For instance, BakedSDF [3] and NeRF2Mesh [4] have emerged as methods capable of reconstructing higher-quality manifold meshes. Other researchers have also explored optimizing NeRF's volumetric rendering for mobile devices [5], as well as real-time neural light fields [6].

Moreover, our work has received considerable attention on Twitter, attracting enthusiasts who have successfully integrated our approach into various applications and platforms, including AR, VR, Unity, and Unreal. These exciting developments hold great promise, and we are thrilled to see the future advancements in this field.

Annually, CVPR recognizes top research in the field through its prestigious “Best Paper Awards.” This year, from more than 9,000 paper submissions, the CVPR 2023 Paper Awards Committee selected 12 candidates for the coveted honor of Best Paper. Join us for the Award Session on Wednesday, 21 June at 8:30 a.m. to find out which nominees take home the distinction of “Best Paper” at CVPR 2023.


[1] Hedman et al., Baking Neural Radiance Fields for Real-Time View Synthesis. In ICCV 2021.

[2] Müller et al., Instant Neural Graphics Primitives with a Multiresolution Hash Encoding. In SIGGRAPH 2022.

[3] Yariv et al., BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis. In SIGGRAPH 2023.

[4] Tang et al., Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement. arXiv preprint, 2023.

[5] Reiser et al., MERF: Memory-Efficient Radiance Fields for Real-time View Synthesis in Unbounded Scenes. In SIGGRAPH 2023.

[6] Cao et al., Real-Time Neural Light Field on Mobile Devices. In CVPR 2023.