RS-OVC: Open-Vocabulary Counting for Remote-Sensing Data
2026-04-09 • Computer Vision and Pattern Recognition
AI summary
The authors address the problem of counting objects in remote-sensing images, which is usually limited to known object types. They introduce RS-OVC, the first model that can count new, unseen object classes without needing extra training, using just text or example images as hints. This approach helps count objects in changing real-world settings more flexibly. Their model works by understanding descriptions or visuals to identify and count these novel objects accurately.
remote sensing, object counting, open vocabulary, aerial imagery, machine learning, model generalization, zero-shot learning, textual conditioning, visual conditioning, computer vision
Authors
Tamir Shor, George Leifman, Genady Beryozkin
Abstract
Object counting in remote-sensing (RS) imagery is attracting increasing research interest due to its crucial role in a wide range of applications. While several promising methods for RS object counting have been proposed, existing methods focus on a closed, pre-defined set of object classes. This limitation necessitates costly re-annotation and model re-training to adapt current approaches to counting novel objects that were not seen during training, and severely inhibits their application in dynamic, real-world monitoring scenarios. To address this gap, we propose RS-OVC, the first Open Vocabulary Counting (OVC) model for remote-sensing and aerial imagery. We show that our model can accurately count novel object classes that were unseen during training, based solely on textual and/or visual conditioning.