Who Handles Orientation? Investigating Invariance in Feature Matching

2026-04-13
Computer Vision and Pattern Recognition

AI summary

The authors study how to make computer programs better at matching points between images when those images are rotated. They test whether it is best to teach the program to ignore rotation early (in the descriptor) or later (in the matcher) and find that teaching rotation invariance early works just as well but makes the matching faster. They also show that training with lots of data helps the program handle rotations without losing accuracy on normal images. The authors provide two improved matchers that work well on different types of challenging image pairs, such as satellite or multi-modal images.

keypoint matching, 3D computer vision, rotation invariance, descriptor, matcher, data augmentation, image matching benchmarks, multi-modal images, training data, generalization
Authors
David Nordström, Johan Edstedt, Fredrik Kahl, Georg Bökman
Abstract
Finding matching keypoints between images is a core problem in 3D computer vision. However, modern matchers struggle with large in-plane rotations. A straightforward mitigation is to learn rotation invariance via data augmentation, yet it remains unclear at which stage of the pipeline rotation invariance should be incorporated. In this paper, we study this question in the context of a modern sparse matching pipeline. We perform extensive experiments by training on a large collection of 3D vision datasets and evaluating on popular image matching benchmarks. Surprisingly, we find that incorporating rotation invariance already in the descriptor yields similar performance to handling it in the matcher. However, rotation invariance is achieved earlier in the matcher when it is learned in the descriptor, allowing for a faster rotation-invariant matcher. Further, we find that enforcing rotation invariance does not hurt upright performance when trained at scale. Finally, we study the emergence of rotation invariance through scale and find that increasing the training data size substantially improves generalization to rotated images. We release two matchers robust to in-plane rotations that achieve state-of-the-art performance on e.g. multi-modal (WxBS), extreme (HardMatch), and satellite image matching (SatAst) benchmarks. Code is available at https://github.com/davnords/loma.
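The rotation augmentation the abstract mentions requires transforming keypoint coordinates consistently with the rotated image, so that descriptor and matcher supervision stays valid. The sketch below is a minimal illustration of this coordinate transform, not code from the paper's repository; the function name, the rotation about the image center, and the sign convention for the image's downward y-axis are all assumptions for illustration.

```python
import numpy as np

def rotate_keypoints(kpts, angle_deg, img_size):
    """Rotate (x, y) keypoints about the image center by angle_deg.

    Intended to mirror the coordinate change induced by rotating the
    image itself during rotation augmentation. `kpts` is an (N, 2)
    array of pixel coordinates; `img_size` is (width, height).
    """
    w, h = img_size
    # Center of the pixel grid (pixel centers run from 0 to w-1 / h-1).
    c = np.array([(w - 1) / 2.0, (h - 1) / 2.0])
    t = np.deg2rad(angle_deg)
    # Image y-axis points down, hence this sign convention for a
    # positive rotation angle (one common choice, assumed here).
    R = np.array([[np.cos(t), np.sin(t)],
                  [-np.sin(t), np.cos(t)]])
    # Rotate each keypoint about the center: (p - c) R^T + c.
    return (kpts - c) @ R.T + c
```

In a training loop, one would rotate the image with the same angle and convention, recompute ground-truth correspondences from the transformed keypoints, and train the descriptor (or matcher) on the rotated pair.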