Supplementary Material for HUMUS-Net: Hybrid Unrolled Multi-Scale Network Architecture for Accelerated MRI Reconstruction
HUMUS-Net baseline details
Neural Information Processing Systems
Our default model has 3 RSTB-D downsampling blocks, 2 RSTB-B bottleneck blocks, and 3 RSTB-U upsampling blocks, with 3, 6, and 12 attention heads in the D/U blocks and 24 attention heads in the bottleneck block. For the Swin Transformer layers (STLs), the window size is 8 for all methods and an MLP ratio (hidden_dim/input_dim) of 2 is used. Each RSTB block consists of 2 STLs with an embedding dimension of 66. For HUMUS-Net-L, we increase the embedding dimension to 96. We use 8 cascades of unrolling, with a U-Net with 16 channels as the sensitivity map estimator (same as in E2E-VarNet).
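For reference, the hyperparameters above can be collected in a small configuration object. The following is a minimal sketch, not the released implementation: the class name `HumusNetConfig`, all field names, and the helper `mlp_hidden_dim` are hypothetical and chosen only for illustration. The values are the ones stated in the text.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class HumusNetConfig:
    """Hypothetical container for the baseline hyperparameters listed above."""

    num_down_blocks: int = 3                       # RSTB-D downsampling blocks
    num_bottleneck_blocks: int = 2                 # RSTB-B bottleneck blocks
    num_up_blocks: int = 3                         # RSTB-U upsampling blocks
    down_up_heads: Tuple[int, ...] = (3, 6, 12)    # attention heads in D/U blocks
    bottleneck_heads: int = 24                     # attention heads in the bottleneck
    window_size: int = 8                           # Swin Transformer window size
    mlp_ratio: float = 2.0                         # hidden_dim / input_dim
    stls_per_rstb: int = 2                         # STLs in each RSTB block
    embed_dim: int = 66                            # embedding dimension
    num_cascades: int = 8                          # unrolled cascades
    sens_chans: int = 16                           # channels of the sensitivity-map U-Net

    def mlp_hidden_dim(self, input_dim: int) -> int:
        # Follows the definition in the text: mlp_ratio = hidden_dim / input_dim.
        return int(self.mlp_ratio * input_dim)


# Default model, and the larger variant that differs only in embedding dimension.
default_cfg = HumusNetConfig()
large_cfg = replace(default_cfg, embed_dim=96)

print(default_cfg.mlp_hidden_dim(default_cfg.embed_dim))
print(large_cfg.mlp_hidden_dim(large_cfg.embed_dim))
```

Deriving HUMUS-Net-L from the default with `replace` keeps the only stated difference between the two models, the embedding dimension, explicit. Passing `embed_dim` to `mlp_hidden_dim` assumes the MLP input width equals the embedding dimension, which the text does not state.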