A Turbo-Inference Strategy for Object Detection and Instance Segmentation

2026-06-10 | Computer Vision and Pattern Recognition

AI summary

The authors study how object detection (locating objects with bounding boxes) and instance segmentation (outlining each object pixel by pixel) are connected. In top-down methods, detection runs first and its quality limits segmentation; the authors explore how segmentation can in turn improve detection. They propose a turbo-inference strategy with two modules, a turbo-detection head and a turbo-segmentation head, that let the two tasks exchange information iteratively, improving both without retraining the model. Experiments on the COCO, iFLYTEK, and Cityscapes datasets show higher accuracy at the cost of some additional computation.

object detection, instance segmentation, top-down methods, bounding box, segmentation mask, COCO dataset, Cityscapes dataset, inference, computer vision
Authors
Zhen Zhao, Gang Zhang, Xiaolin Hu, Liang Tang
Abstract
Object detection and instance segmentation are closely related tasks. Existing top-down instance segmentation methods usually follow a detect-then-segment paradigm: a detector first recognizes and localizes objects with bounding boxes, and an instance mask is then segmented within each bounding box. In such methods, detection accuracy directly influences the subsequent segmentation performance. However, previous research has seldom explored the impact of instance segmentation on object detection. In this paper, we present a turbo-inference strategy for top-down methods that iteratively leverages the complementary information between the detection and segmentation tasks. Specifically, we design two modules, a turbo-detection head and a turbo-segmentation head, which facilitate communication between the tasks. The two modules form a closed loop that interlaces the detection and segmentation results without retraining the model. Comprehensive experiments on the COCO, iFLYTEK, and Cityscapes datasets demonstrate that our method substantially enhances both detection and segmentation accuracy, at the price of an increase in computational cost. The proposed method thus represents a tradeoff between prediction accuracy and inference speed. Code is available at https://github.com/zhaozhen2333/Turbo-Learning.git.
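The closed loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how the turbo-detection and turbo-segmentation heads work, so the functions below (`segment_in_box`, `box_from_mask`, `turbo_loop`) are hypothetical stand-ins on a binary grid. The assumption is only that a mask predicted inside a box can be used to re-fit the box, and that the re-fitted box can in turn be segmented again, with no learned parameters changed.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive pixel coordinates
Grid = List[List[int]]           # binary image or mask, indexed as grid[y][x]


def segment_in_box(image: Grid, box: Box) -> Grid:
    """Toy stand-in for a segmentation head (assumption): keep the
    foreground pixels of `image` that fall inside `box`."""
    x0, y0, x1, y1 = box
    return [
        [image[y][x] if (x0 <= x <= x1 and y0 <= y <= y1) else 0
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


def box_from_mask(mask: Grid, margin: int = 1) -> Box:
    """Toy stand-in for segmentation-to-detection feedback (assumption):
    the tight box around the mask, grown by `margin` pixels and clipped
    to the image bounds."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not xs:
        raise ValueError("empty mask: the box does not overlap any foreground")
    h, w = len(mask), len(mask[0])
    return (max(min(xs) - margin, 0), max(min(ys) - margin, 0),
            min(max(xs) + margin, w - 1), min(max(ys) + margin, h - 1))


def turbo_loop(image: Grid, box: Box, max_iters: int = 10) -> Tuple[Box, Grid]:
    """Alternate box -> mask -> box until the box stops changing."""
    mask = segment_in_box(image, box)
    for _ in range(max_iters):
        new_box = box_from_mask(mask)
        if new_box == box:      # fixed point: neither task changes the other
            break
        box = new_box
        mask = segment_in_box(image, box)
    return box, mask


if __name__ == "__main__":
    # 8x8 image with a 4x4 object; the initial box covers only its centre.
    image = [[1 if 2 <= x <= 5 and 2 <= y <= 5 else 0 for x in range(8)]
             for y in range(8)]
    final_box, final_mask = turbo_loop(image, (3, 3, 4, 4))
    print(final_box, sum(map(sum, final_mask)))
```

In this toy setting, an initial box that covers only the centre of the object grows over a few iterations until the mask contains the whole object. Each extra iteration costs another segmentation pass, which mirrors the accuracy versus inference-speed tradeoff the abstract mentions.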