Review for NeurIPS paper: Multiscale Deep Equilibrium Models

Neural Information Processing Systems 

Strengths: The approach is a significant departure from the norm for deep vision models. As surveyed in the related work, and as represented by the explicit deep learning baselines, deep vision models commonly arrange features of many depths and resolutions in sequence, each with its own parameters, to encode a hierarchical, multi-scale representation of images. When scales are processed in parallel, as in a pyramid, the interaction between them is limited to simple aggregations such as a sum or max. This work instead solves simultaneously for an equilibrium across multiple scales of a shallow transformation with only a single "stage" of parameters. This makes it an informative interrogation of the need for hierarchy, for stages of distinct parameters, and of depth vs. width in deep learning for vision.
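To make the contrast with stacked stages concrete, the following is a minimal toy sketch of what "solving for an equilibrium across scales with a single stage of parameters" can look like. It is not the paper's implementation: the 1D signals, the scalar weight `w`, the `tanh` nonlinearity, the pooling and repeat resampling, and the function names are all assumptions made for illustration, and plain fixed-point iteration stands in for whatever root solver the paper actually uses.

```python
import numpy as np

def downsample(x):
    # Average adjacent pairs: length n -> n // 2 (n assumed even).
    return x.reshape(-1, 2).mean(axis=1)

def upsample(x):
    # Nearest-neighbour repeat: length n -> 2 * n.
    return np.repeat(x, 2)

def f(z_hi, z_lo, x, w=0.3):
    # One shared "stage": each scale is updated from itself, the other
    # scale, and the input. w is a hypothetical scalar weight, kept small
    # so that the joint update is a contraction and iteration converges.
    new_hi = np.tanh(w * z_hi + w * upsample(z_lo) + x)
    new_lo = np.tanh(w * z_lo + w * downsample(z_hi) + downsample(x))
    return new_hi, new_lo

def solve_equilibrium(x, tol=1e-8, max_iter=500):
    # Iterate the same transformation until both scales stop changing,
    # i.e. until (z_hi, z_lo) is approximately a fixed point of f.
    z_hi = np.zeros_like(x)
    z_lo = np.zeros(x.size // 2)
    for i in range(max_iter):
        n_hi, n_lo = f(z_hi, z_lo, x)
        residual = max(np.abs(n_hi - z_hi).max(), np.abs(n_lo - z_lo).max())
        z_hi, z_lo = n_hi, n_lo
        if residual < tol:
            break
    return z_hi, z_lo, i + 1

x = np.linspace(-1.0, 1.0, 8)
z_hi, z_lo, n_iter = solve_equilibrium(x)
```

In this sketch both resolutions are updated jointly by the same parameters at every iteration, so no scale is "earlier" or "deeper" than another, which is the property the review highlights against sequential, per-stage designs.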