Hierarchical Neural Architecture Search for Deep Stereo Matching

Neural Information Processing Systems

To reduce the human effort in neural network design, Neural Architecture Search (NAS) has been applied with remarkable success to various high-level vision tasks such as classification and semantic segmentation. The underlying idea of NAS is straightforward: by allowing the network to choose among a set of operations (e.g., convolutions with different filter sizes), one can find an optimal architecture that is better adapted to the problem at hand. However, the success of NAS has so far not extended to low-level geometric vision tasks such as stereo matching. This is partly because state-of-the-art deep stereo matching networks, designed by humans, are already very large. Directly applying NAS to such massive structures is computationally prohibitive given currently available mainstream computing resources. In this paper, we propose the first end-to-end hierarchical NAS framework for deep stereo matching by incorporating task-specific human knowledge into the neural architecture search framework.
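The idea of letting a network choose among a set of operations can be illustrated with a small sketch. This is not the paper's implementation: it assumes a DARTS-style continuous relaxation (a softmax over per-operation weights), which the abstract does not confirm, and the names `CANDIDATE_OPS`, `mixed_op`, `select_op`, and the box-filter candidates are all hypothetical, chosen only to mirror "convolution with different filter sizes".

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def make_box_filter(size):
    """Hypothetical candidate operation: 1-D averaging convolution of a given size."""
    kernel = np.ones(size) / size
    return lambda signal: np.convolve(signal, kernel, mode="same")


# Candidate set: convolutions with different filter sizes (sizes are illustrative).
CANDIDATE_OPS = {
    "conv3": make_box_filter(3),
    "conv5": make_box_filter(5),
    "conv7": make_box_filter(7),
}


def mixed_op(signal, alpha):
    """Blend every candidate's output, weighted by softmax(alpha).

    Assumption: during search, alpha would be learned together with the
    network weights; here it is simply passed in.
    """
    weights = softmax(alpha)
    outputs = [op(signal) for op in CANDIDATE_OPS.values()]
    return sum(w * out for w, out in zip(weights, outputs))


def select_op(alpha):
    """After search, keep only the candidate with the largest weight."""
    names = list(CANDIDATE_OPS)
    return names[int(np.argmax(alpha))]


signal = np.arange(10, dtype=float)
alpha = np.array([0.1, 2.0, -1.0])  # made-up architecture parameters
mixed = mixed_op(signal, alpha)
chosen = select_op(alpha)
```

In this sketch the search output is just the name of the highest-weighted operation; a hierarchical search, as the title suggests, would presumably make such choices at more than one level of the network.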


Hierarchical Neural Architecture Search for Deep Stereo Matching - Supplementary Materials

Neural Information Processing Systems

KITTI 2012 contains 194 training image pairs and 195 test image pairs. We use a maximum disparity level of 192 in this dataset. Most of the stereo pairs are indoor scenes with handcrafted layouts. This dataset contains many thin objects and large disparity ranges. We provide more qualitative results on the SceneFlow, KITTI 2012, KITTI 2015, and Middlebury datasets in Figures 1, 2, 3, and 4, respectively.


Review for NeurIPS paper: Hierarchical Neural Architecture Search for Deep Stereo Matching

Neural Information Processing Systems

Weaknesses: - The paper is not particularly novel or exciting, since it takes algorithms already applied to semantic segmentation and applies them to the stereo depth estimation problem. The idea of using AutoML for stereo is not particularly novel either, as the authors themselves state, even if the proposed algorithm outperforms the previous proposal. Unfortunately, the authors did not spend much time commenting on these aspects. For example, what might be the biggest takeaways from the found architecture? The main differences with respect to the previously published work are that the search is also performed at the network level and that two separate networks, one for features and one for matching, are used.


Review for NeurIPS paper: Hierarchical Neural Architecture Search for Deep Stereo Matching

Neural Information Processing Systems

This paper initially received scores of 6, 5, 7, and 7. After the rebuttal, R4 revised their score up from a 5 to a 6. The consensus among the reviewers was that, while the technical novelty of the paper is not extremely high, the results are important because neural architecture search for dense correspondence problems is underexplored. Reviewers commented on the strong empirical performance of the same model across multiple datasets, which is an important selling point for the paper. The authors are strongly encouraged to update the final paper to clarify the questions raised in the rebuttal, specifically the responses to R2's questions and the additional comparisons to AANet.

