MegaFlow: Zero-Shot Large Displacement Optical Flow

2026-03-26 | Computer Vision and Pattern Recognition

AI summary

The authors present MegaFlow, a model for estimating optical flow, the per-pixel motion between video frames, with a focus on large displacements. Instead of iterative local search and task-specific architectures, it treats flow estimation as global matching over features from a pre-trained Vision Transformer, then applies a few lightweight iterative refinement steps to improve sub-pixel accuracy. Experiments show state-of-the-art zero-shot performance on multiple optical flow benchmarks and competitive zero-shot results on long-range point tracking, suggesting the approach could serve as a general tool for motion estimation.

optical flow, large displacement, Vision Transformer, zero-shot learning, global matching, motion estimation, pre-trained model, point tracking, iterative refinement, transferability
Authors
Dingxi Zhang, Fangjinhua Wang, Marc Pollefeys, Haofei Xu
Abstract
Accurate estimation of large displacement optical flow remains a critical challenge. Existing methods typically rely on iterative local search and/or domain-specific fine-tuning, which severely limits their performance in large displacement and zero-shot generalization scenarios. To overcome this, we introduce MegaFlow, a simple yet powerful model for zero-shot large displacement optical flow. Rather than relying on highly complex, task-specific architectural designs, MegaFlow adapts powerful pre-trained vision priors to produce temporally consistent motion fields. In particular, we formulate flow estimation as a global matching problem by leveraging pre-trained global Vision Transformer features, which naturally capture large displacements. This is followed by a few lightweight iterative refinements to further improve sub-pixel accuracy. Extensive experiments demonstrate that MegaFlow achieves state-of-the-art zero-shot performance across multiple optical flow benchmarks. Moreover, our model also delivers highly competitive zero-shot performance on long-range point tracking benchmarks, demonstrating its robust transferability and suggesting a unified paradigm for generalizable motion estimation. Our project page is at: https://kristen-z.github.io/projects/megaflow.
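
To make the idea of flow as global matching concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how matches are converted into flow, so the sketch assumes a common choice: an all-pairs feature correlation followed by a softmax-weighted (soft-argmax) readout of the matching coordinate. The function name `global_match_flow` and the `temperature` parameter are hypothetical, and the nested-list feature grids stand in for Vision Transformer features. The iterative refinement stage is not shown.

```python
import math


def global_match_flow(feat1, feat2, temperature=0.1):
    """Toy dense flow from frame 1 to frame 2 via global feature matching.

    feat1, feat2: H x W grids (nested lists) of equal-length feature vectors.
    Returns an H x W grid of (dx, dy) tuples.
    Assumption: softmax over all-pairs dot products, then soft-argmax.
    """
    h, w = len(feat1), len(feat1[0])
    # Every pixel of frame 2 is a candidate match, wherever it is located.
    targets = [(x, y, feat2[y][x]) for y in range(h) for x in range(w)]
    flow = []
    for y in range(h):
        row = []
        for x in range(w):
            f = feat1[y][x]
            # Correlate this source pixel with all frame-2 pixels (global search).
            scores = [
                sum(a * b for a, b in zip(f, g)) / temperature
                for _, _, g in targets
            ]
            # Numerically stable softmax over all candidate locations.
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            # Soft-argmax: expected target coordinate under the match distribution.
            mx = sum(wt * t[0] for wt, t in zip(weights, targets)) / total
            my = sum(wt * t[1] for wt, t in zip(weights, targets)) / total
            # Flow is the matched position minus the source position.
            row.append((mx - x, my - y))
        flow.append(row)
    return flow
```

Because every source pixel is compared against every target pixel, the size of the displacement does not limit the search, which is the property the abstract attributes to global matching. The cost is quadratic in the number of pixels, so a practical system would typically run this on downsampled feature maps.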