BRDFusion: Physics Meets Generation for Urban Scene Inverse Rendering

2026-06-15 • Computer Vision and Pattern Recognition

AI summary

The authors developed BRDFusion, a system that combines physical models and generative models to understand and recreate urban scenes from video. Their approach recovers detailed scene properties accurately while reducing errors during reconstruction. This lets them create realistic videos that can be easily controlled, such as changing lighting or adding new objects. Their method works better than previous ones on both real and computer-generated scenes.

inverse rendering, urban scenes, physically-based rendering, generative models, BRDF, scene reconstruction, novel-view rendering, relighting, video synthesis, optimisation ambiguity

Authors

Yi-Ruei Liu, Jie-Ying Lee, Zheng-Hui Huang, Yu-Lun Liu, Chih-Hao Lin

Abstract

Inverse rendering of urban scenes from captured videos enables numerous applications, including content creation and autonomous driving simulation. Physically-based rendering methods follow and control lighting physics, but suffer from reconstruction and rendering artifacts. While generative models produce realistic videos, they offer limited consistency and controllability. We present BRDFusion, a unified framework that combines two complementary models for inverse and forward rendering. Specifically, BRDFusion recovers explicit, consistent scene properties with physical modeling and alleviates optimization ambiguity with generative priors. During forward rendering, the physical model provides controllable rendering from the scene configuration, and the generative model denoises and fixes artifacts. Therefore, our method produces high-quality videos while allowing precise control, outperforming baselines in real and synthetic scenes. Moreover, BRDFusion supports novel-view relighting, night simulation, and dynamic object insertion/editing. Project page: https://shigon255.github.io/brdfusion-page/
