DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
Zhixiong Nan
Neural Information Processing Systems
This paper is motivated by an interesting phenomenon: in the intermediate results from the beginning transformer decoder layer of MaskDINO (the SOTA model for joint detection and segmentation), the performance of object detection lags behind that of instance segmentation, i.e., there is a performance imbalance. This phenomenon raises a question: does the performance imbalance at the beginning layer of the transformer decoder constrain the upper bound of the final performance?
May-25-2025, 04:20:15 GMT
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Information Technology (0.93)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning > Neural Networks (0.93)
- Natural Language (1.00)
- Vision (1.00)