Appendix Implementation Details

Apr-25-2026, 05:11:15 GMT–Neural Information Processing Systems

A.1 Network Architectures

We adopt DAFormer [17] with a Swin-B or MiT-B5 backbone as the base semantic segmentation architecture. For the segmentation head, we use the same head as DAFormer [17]. The stem module contains one fully convolutional layer with a 3 × 3 kernel and a stride of 2, two fully convolutional layers with a 3 × 3 kernel and a stride of 1, two fully convolutional layers with a 3 × 3 kernel and a stride of 2, and another three fully convolutional layers with a 1 × 1 kernel and a stride of 1 that adjust the channels of the different feature maps. The level embedding module is defined as a matrix of shape 3 × dims. The prompt interactor module contains three fully convolutional layers with a 3 × 3 kernel and a stride of 2 that adjust the feature dimensions.
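To make the downsampling in the stem module concrete, the following minimal sketch traces the spatial size of a feature map through the five 3 × 3 layers listed above, using the standard convolution output-size formula. The padding of 1 and the 512-pixel input side are assumptions for illustration only; the text states neither, and the helper names `conv_out` and `stem_sizes` are hypothetical.

```python
# Sketch only: padding=1 and the 512-pixel input are assumptions, not from the text.

def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel, stride) for the five 3x3 stem layers, in the order the text lists them:
# one stride-2 layer, two stride-1 layers, two stride-2 layers.
STEM_LAYERS = [(3, 2), (3, 1), (3, 1), (3, 2), (3, 2)]

def stem_sizes(size, padding=1):
    """Return the spatial size after each 3x3 stem layer."""
    sizes = []
    for kernel, stride in STEM_LAYERS:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

print(stem_sizes(512))  # [256, 256, 256, 128, 64]
```

Under these assumptions the three stride-2 layers reduce a 512-pixel side to 64 pixels, an overall factor of 8. The 1 × 1, stride-1 layers change only the channel count, so they leave the spatial size unchanged (`conv_out(64, 1, 1, 0)` is 64).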
