panoptic segmentation
K-Net: Towards Unified Image Segmentation
Semantic, instance, and panoptic segmentation have been addressed using different, specialized frameworks despite their underlying connections. This paper presents a unified, simple, and effective framework for these essentially similar tasks. The framework, named K-Net, segments both instances and semantic categories consistently with a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class. To remedy the difficulty of distinguishing various instances, we propose a kernel update strategy that makes each kernel dynamic and conditional on its meaningful group in the input image. K-Net can be trained in an end-to-end manner with bipartite matching, and its training and inference are naturally NMS-free and box-free. Without bells and whistles, K-Net surpasses all previously published state-of-the-art single-model results for panoptic segmentation on the MSCOCO test-dev split and semantic segmentation on the ADE20K val split, with 55.2% PQ and 54.3% mIoU, respectively. Its instance segmentation performance is also on par with Cascade Mask R-CNN on MSCOCO, with 60%-90% faster inference. Code and models will be released at https://github.com/ZwwWayne/K-Net/.
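The idea that each kernel generates one mask can be illustrated with a minimal sketch. It assumes the mask logits are a per-pixel dot product between a kernel vector and the feature vector at that pixel (equivalent to a 1x1 convolution). The names `kernel_masks` and `assign_pixels` and the toy data are hypothetical, and the kernel update strategy and bipartite matching described in the abstract are omitted entirely.

```python
from typing import List

Features = List[List[List[float]]]  # indexed as [channel][row][col]
Kernels = List[List[float]]         # indexed as [kernel][channel]


def kernel_masks(kernels: Kernels, features: Features) -> List[List[List[float]]]:
    """Return one mask-logit map per kernel (assumed 1x1-convolution form)."""
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    logits = []
    for kernel in kernels:
        if len(kernel) != channels:
            raise ValueError("kernel length must equal the number of feature channels")
        # Dot product between the kernel and the feature vector at each pixel.
        plane = [
            [sum(kernel[c] * features[c][y][x] for c in range(channels))
             for x in range(width)]
            for y in range(height)
        ]
        logits.append(plane)
    return logits


def assign_pixels(logits: List[List[List[float]]]) -> List[List[int]]:
    """Assign each pixel to the kernel with the highest logit (illustrative only)."""
    num_kernels = len(logits)
    height = len(logits[0])
    width = len(logits[0][0])
    return [
        [max(range(num_kernels), key=lambda k: logits[k][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy input: channel 0 is active in the left column, channel 1 in the right column.
features = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
kernels = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical kernels, one per region
segmentation = assign_pixels(kernel_masks(kernels, features))
```

In this toy case each kernel responds to exactly one channel, so the left column is assigned to kernel 0 and the right column to kernel 1. Because every kernel produces a mask directly, no boxes or NMS appear anywhere in the computation.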
ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation
This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment. We observe that, due to the high complexity of the training objective of panoptic segmentation, training inevitably penalizes false positives much more heavily.
Supplementary Material 1 Additional Implementation Details
We printed a checkerboard with a 9x10 grid of blocks, each measuring 87 mm x 87 mm.

Table 3: Parameters for Panoptic Segmentation model

Parameter               Value
Model Architecture      Panoptic-PolarNet
Test Batch Size         2
Val Batch Size          2
Test Batch size         1
post proc threshold     0.1
post proc nms kernel    5
post proc top k         100
center loss             MSE
offset loss             L1
center loss weight      100
offset loss weight      10
enable SAP              True
SAP start epoch         30
SAP rate                0.01

Table 4: Parameters for the 4D Panoptic Segmentation model

Parameter               Value(s)
Model Architecture      4D-StOP
Learning Rate           0.0005
Momentum                0.98
Stride                  1
Max in points           5000
Sampling                importance
Decay Sampling          None
Input Threads           16
Checkpoint Gap          100

The results reveal a significant variance in performance across different categories. Notably, 'Structure' and 'Ground' both achieved high mIoU at [...]

Result

The results are shown in Table 8. [...] presents the mean intersection-over-union (mIoU) percentages [...] Notably, 'Structure' achieved the highest mIoU at [...] 'General Objects' category have the lowest mIoU, [...] The dataset is divided into 17 and 6 categories, respectively. [...] 'Ground' and 'Roads', as opposed to grouping anything related to ground as a single category. Overall, the performance across these tasks underscores the challenges posed by our dataset's [...] With our dataset, future work can focus on improving the model's capacity to handle such diverse [...]

The raw data, processed data, and framework code can be found on our website.
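The per-category mIoU figures discussed above rest on a standard computation. The following is a minimal sketch of it, not the evaluation code used for the reported results. The helper names, the integer label encoding, and the toy labels are assumptions made for illustration, and categories that appear in neither prediction nor ground truth are simply skipped in the mean.

```python
from typing import List, Optional, Sequence


def per_class_iou(pred: Sequence[int], target: Sequence[int],
                  num_classes: int) -> List[Optional[float]]:
    """Intersection-over-union per class for flat label sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Correct point: counts once in both intersection and union.
            intersection[p] += 1
            union[p] += 1
        else:
            # Mismatch: the point enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # None marks a class absent from both prediction and ground truth.
    return [intersection[c] / union[c] if union[c] else None
            for c in range(num_classes)]


def mean_iou(ious: Sequence[Optional[float]]) -> float:
    """Average the IoU over classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else float("nan")


# Hypothetical labels for six points and three categories.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(pred, target, num_classes=3)
miou = mean_iou(ious)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, giving an mIoU of 0.5. Reporting the per-class values alongside the mean is what exposes the kind of variance across categories described in the results.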