Online3R: Online Learning for Consistent Sequential Reconstruction Based on Geometry Foundation Model

2026-04-10 | Computer Vision and Pattern Recognition

AI summary

The authors introduce Online3R, a method that improves its 3D scene reconstructions as it processes new environments. They add small, trainable visual prompts to a frozen, pretrained model so it can learn scene-specific details without forgetting what it already knows. To train these prompts efficiently and without ground-truth data, they use a strategy that enforces consistency in both local and global predictions over time. Their experiments show that this approach outperforms previous methods on several benchmarks.

3D reconstruction, online learning, visual prompts, foundation model, geometry prediction, self-supervised learning, local consistency, global consistency, pseudo groundtruth, keyframes
Authors
Shunkai Zhou, Zike Yan, Fei Xue, Dong Wu, Yuchen Deng, Hongbin Zha
Abstract
We present Online3R, a new sequential reconstruction framework that adapts to new scenes through online learning, effectively resolving inconsistency issues. Specifically, we introduce a set of lightweight learnable visual prompts into a pretrained, frozen geometry foundation model to capture knowledge of new environments while preserving the foundation model's fundamental capability for geometry prediction. To address the lack of groundtruth and the need for high efficiency when updating these visual prompts at test time, we introduce a local-global self-supervised learning strategy that enforces local and global consistency constraints on the predictions. The local consistency constraints are applied to intermediate and previously fused local results, enabling the model to be trained with high-quality pseudo groundtruth signals; the global consistency constraints operate on sparse keyframes spanning long distances rather than on every frame, allowing the model to learn efficiently from consistent predictions over a long trajectory. Our experiments demonstrate that Online3R outperforms previous state-of-the-art methods on various benchmarks. Project page: https://shunkaizhou.github.io/online3r-1.0/
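
The update pattern described in the abstract (frozen weights, a small learnable prompt, per-frame local updates, and sparser keyframe updates) can be illustrated with a toy sketch. This is not the authors' implementation: the linear "backbone" `W`, the additive prompt, the synthetic pseudo targets, the scene shift `delta`, and all function names are assumptions made for illustration. In the paper the pseudo targets would come from fused predictions, whereas here they are generated from a known shift so the behavior is easy to check.

```python
import numpy as np

def predict(W, x, prompt):
    """Toy stand-in for a frozen backbone: the prompt shifts the input features."""
    return W @ (x + prompt)

def consistency_loss(pred, target):
    """Half the mean squared disagreement between a prediction and its pseudo target."""
    return 0.5 * float(np.mean((pred - target) ** 2))

def update_prompt(W, x, prompt, target, lr):
    """One gradient step on the prompt only; W is read but never written."""
    residual = predict(W, x, prompt) - target
    grad = W.T @ residual / residual.size  # gradient of consistency_loss w.r.t. prompt
    return prompt - lr * grad

def adapt_online(W, frames, local_targets, global_targets, keyframe_stride, lr):
    """Local update on every frame, plus a global update on every keyframe_stride-th frame."""
    prompt = np.zeros(W.shape[1])
    for t, x in enumerate(frames):
        prompt = update_prompt(W, x, prompt, local_targets[t], lr)
        if t % keyframe_stride == 0:  # sparse keyframes only
            prompt = update_prompt(W, x, prompt, global_targets[t], lr)
    return prompt

# Hypothetical data: a scene-specific shift `delta` that the prompt should absorb.
W = np.array([[2.0, 0.0], [0.0, 1.0]])  # frozen weights (assumed, for illustration)
delta = np.array([0.5, -1.0])
frames = [np.array([1.0, 2.0]), np.array([0.0, 1.0]),
          np.array([3.0, -1.0]), np.array([2.0, 2.0])]
# Synthetic stand-ins for pseudo targets; reused for both local and global terms.
targets = [W @ (x + delta) for x in frames]

prompt = adapt_online(W, frames, targets, targets, keyframe_stride=2, lr=0.5)
before = consistency_loss(predict(W, frames[-1], np.zeros(2)), targets[-1])
after = consistency_loss(predict(W, frames[-1], prompt), targets[-1])
print(prompt, before, after)
```

In this toy setting the prompt moves toward `delta` while `W` stays untouched, and the loss on the last frame drops relative to the zero-prompt prediction. Reusing the same targets for the local and global terms is a simplification made only to keep the example short.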