Review for NeurIPS paper: ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding
I could not see a strong motivation for explicitly enforcing sparsity on the architecture parameters. There are already many works that try to decouple the evaluation of sub-networks from the training of the supernet (i.e., to make the ranking correlation with standalone training higher), so the network evaluation can be decoupled from supernet training without adding a sparsity regularization. As far as I know, weight-sharing methods require the BN statistics to be re-calculated [1] to properly measure the Kendall correlation. Other works that can reduce the gap between the supernet and sub-networks (e.g.
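To make the BN re-calibration and Kendall-correlation point concrete, here is a minimal sketch (not the paper's method): it assumes a PyTorch sub-network extracted from a weight-sharing supernet, with `subnet`, `loader`, and the accuracy lists being hypothetical placeholders.

```python
# Sketch: re-estimate BN statistics of a sampled sub-network, then measure
# the Kendall correlation between supernet-based and standalone accuracies.
import torch
from scipy.stats import kendalltau

@torch.no_grad()
def recalibrate_bn(subnet, loader, num_batches=20, device="cuda"):
    """Reset BN running stats and re-estimate them on a few training batches."""
    for m in subnet.modules():
        if isinstance(m, torch.nn.modules.batchnorm._BatchNorm):
            m.reset_running_stats()
            m.momentum = None  # cumulative moving average over the calibration batches
    subnet.train()  # BN layers update running stats in train mode
    for i, (x, _) in enumerate(loader):
        if i >= num_batches:
            break
        subnet(x.to(device))
    subnet.eval()

def ranking_correlation(supernet_accs, standalone_accs):
    """Kendall's tau between supernet-proxy and standalone evaluation rankings."""
    tau, _ = kendalltau(supernet_accs, standalone_accs)
    return tau
```

A higher tau after recalibration would indicate that the supernet ranking is more consistent with standalone training, which is the correlation the review refers to.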