FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation

2026-05-12 · Computer Vision and Pattern Recognition

AI summary

The authors study a problem where a computer vision model needs to recognize new object categories over time without forgetting old ones, a task called Continual Panoptic Segmentation. They note that current methods treat all unknown objects as one background class, which confuses the model when new categories appear. To fix this, the authors propose a new method called FuTCR that finds regions likely to be future categories and groups similar pixels together while keeping these separate from known objects. This helps the model prepare in advance for new categories, leading to better recognition of new objects without losing accuracy on known ones.

Continual Learning, Panoptic Segmentation, Contrastive Learning, Representation Learning, Background Class, Prototype Learning, Dense Prediction, Class-Incremental Learning
Authors
Nicholas Ikechukwu, Keanu Nichols, Deepti Ghadiyaram, Bryan A. Plummer
Abstract
Continual Panoptic Segmentation (CPS) requires methods that can quickly adapt to new categories over time. Because this is a dense prediction task, training images may contain a mix of labeled and unlabeled objects. Since nothing is known about these unlabeled objects a priori, existing methods often simply group every unlabeled pixel into a single "background" class during training. In effect, they repeatedly tell the model that all the different background categories are the same (even when they are not). This makes it challenging to learn new background categories as they are added, since these new categories may require information the model was previously told was unimportant and could be ignored. Thus, we propose a Future-Targeted Contrastive and Repulsive (FuTCR) framework that addresses this limitation by restructuring representations before new classes are introduced. FuTCR first discovers confident future-like regions by grouping model-predicted masks whose pixels are consistently classified as background but exhibit non-background logits. Next, FuTCR applies pixel-to-region contrast to build coherent prototypes from these unlabeled regions, while simultaneously repelling background features away from known-class prototypes, explicitly reserving representational space for future categories. Experiments across six CPS settings and a range of dataset sizes show that FuTCR improves relative new-class panoptic quality over the state of the art by up to 28%, while preserving or improving base-class performance with gains of up to 4%.
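The two components described in the abstract can be illustrated with a small sketch. This is not the authors' implementation; the selection rule, the threshold `tau`, the cosine-distance attraction term, and the softplus repulsion penalty are all illustrative assumptions. A pixel is treated as "future-like" when the model predicts background for it, yet its best non-background logit remains high; those pixels are pooled into a region prototype, pulled toward it, and pushed away from known-class prototypes.

```python
import numpy as np

def l2norm(x, axis=-1):
    """Normalize vectors to unit length (small epsilon avoids divide-by-zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-8)

def discover_future_regions(logits, bg_index, tau=0.5):
    """Flag pixels predicted as background whose strongest non-background
    logit still exceeds tau -- a hypothetical stand-in for FuTCR's
    'consistently background but non-background logits' criterion.

    logits: (num_pixels, num_classes) array of per-pixel class logits.
    Returns a boolean mask over pixels.
    """
    pred = logits.argmax(axis=-1)
    fg_logits = np.delete(logits, bg_index, axis=-1)  # drop background column
    return (pred == bg_index) & (fg_logits.max(axis=-1) > tau)

def futcr_losses(feats, future_mask, known_protos, temp=0.1):
    """Sketch of the two loss terms (illustrative forms, not the paper's).

    feats:        (num_pixels, dim) pixel features.
    future_mask:  boolean mask of discovered future-like pixels.
    known_protos: (num_known, dim) prototypes of known classes.
    Returns (attraction, repulsion) scalars.
    """
    f = l2norm(feats)
    protos = l2norm(known_protos)
    # Region prototype: mean feature over the discovered future-like pixels.
    region = l2norm(f[future_mask].mean(axis=0))
    # Pixel-to-region contrast: pull future-like pixels toward their prototype.
    attract = (1.0 - f[future_mask] @ region).mean()  # mean cosine distance
    # Repulsion: penalize similarity to known-class prototypes, reserving
    # representational space for future categories (softplus penalty).
    sims = f[future_mask] @ protos.T / temp
    repel = np.log1p(np.exp(sims)).mean()
    return attract, repel
```

In a real training loop both terms would be weighted and added to the standard panoptic segmentation loss; here they simply return scalars so the selection rule and the attract/repel structure are easy to inspect.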