LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis
2026-03-20 • Computer Vision and Pattern Recognition
AI summary
The authors developed a new neural network called LagerNVS that helps computers create new views of 3D scenes from images, without needing to build a full 3D model first. Their approach starts with a network trained on true 3D data to give it a good sense of 3D structure, then fine-tunes it for the task of generating new views. This method achieves very accurate and fast results, works well even with random camera positions, and can handle real-world images. They also show it can be combined with other techniques to generate new images beyond the original views.
Novel View Synthesis · Neural Networks · 3D Reconstruction · Encoder-Decoder Architecture · Photometric Loss · PSNR · Latent Features · Generative Models · Diffusion Decoder · Feed-forward Rendering
Authors
Stanislaw Szymanowicz, Minghao Chen, Jianyuan Wang, Christian Rupprecht, Andrea Vedaldi
Abstract
Recent work has shown that neural networks can perform 3D tasks such as Novel View Synthesis (NVS) without explicit 3D reconstruction. Even so, we argue that strong 3D inductive biases are still helpful in the design of such networks. We demonstrate this point by introducing LagerNVS, an encoder-decoder neural network for NVS that builds on '3D-aware' latent features. The encoder is initialized from a 3D reconstruction network pre-trained with explicit 3D supervision. It is paired with a lightweight decoder and trained end-to-end with photometric losses. LagerNVS achieves state-of-the-art deterministic feed-forward Novel View Synthesis (including 31.4 PSNR on Re10k), with and without known cameras, renders in real time, generalizes to in-the-wild data, and can be paired with a diffusion decoder for generative extrapolation.
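The abstract's training setup can be sketched at a very high level: a 3D-aware encoder maps input views to latent features, a lightweight decoder renders pixels for a target camera, and the whole pipeline is supervised with a photometric loss against the ground-truth target view. The sketch below is purely illustrative; all names, shapes, and the linear "networks" are stand-in assumptions, not LagerNVS's actual architecture or code.

```python
import numpy as np

# Hypothetical toy stand-ins for the paper's components (assumed names/shapes).
rng = np.random.default_rng(0)
H, W, C = 8, 8, 16
W_enc = rng.normal(scale=0.1, size=(3, C))   # stand-in encoder weights
W_dec = rng.normal(scale=0.1, size=(C, 3))   # stand-in decoder weights

def encoder(images):
    # Stand-in for the 3D-aware encoder (in the paper, initialized from a
    # 3D reconstruction network pre-trained with explicit 3D supervision).
    # Pools the input views and maps pixels to latent features.
    return images.mean(axis=0) @ W_enc       # (H*W, C) latent feature map

def decoder(latents, target_pose):
    # Stand-in for the lightweight decoder: renders target-view pixels
    # from the latent features, conditioned on a camera embedding.
    return np.tanh(latents @ W_dec + target_pose)

views = rng.uniform(size=(2, H * W, 3))      # two posed input views (flattened)
target = rng.uniform(size=(H * W, 3))        # ground-truth target view
pose = rng.normal(scale=0.1, size=(3,))      # toy target-camera embedding

pred = decoder(encoder(views), pose)
# Photometric (pixel-wise) loss: the end-to-end training signal.
photometric_loss = float(np.mean((pred - target) ** 2))
print(pred.shape, photometric_loss >= 0.0)
```

In the real system the encoder and decoder are deep networks and the loss is backpropagated through both; the point here is only the data flow: views → latents → rendered pixels → photometric loss.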