Supplementary Material
Neural Information Processing Systems
The tradeoff weight λ is not the one in (10). Flipping h to [ 1, 0] produces the same issue. The activated areas are shaded. So case ii is always preferred. We will use mini-batches with size b.
Nov-15-2025, 19:56:50 GMT