Ayaz, Ulas
Invertibility of Convolutional Generative Networks from Partial Measurements
Ma, Fangchang; Ayaz, Ulas; Karaman, Sertac
The problem of inverting generative neural networks (i.e., recovering the input latent code from partial network output), motivated by image inpainting, has recently been studied. Prior work focused on fully-connected networks for mathematical simplicity. In this work, we present new results on convolutional networks, which are more widely used. The network inversion problem is highly non-convex, and hence is typically computationally intractable and comes without optimality guarantees. However, we rigorously prove that, for a two-layer convolutional generative network with ReLU activations and Gaussian-distributed random weights, the input latent code can be recovered from the network output efficiently using simple gradient descent. This theoretical finding implies that, under our assumptions, the mapping from the low-dimensional latent space to the high-dimensional image space is one-to-one. In addition, the same conclusion holds even when the network output is only partially observed (i.e., has missing pixels). We further demonstrate empirically that the conclusion extends to networks with multiple layers, other activation functions (leaky ReLU, sigmoid, and tanh), and weights trained on real datasets.
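In the setting above, recovery amounts to running plain gradient descent on the squared reconstruction error restricted to the observed pixels. The following PyTorch snippet is a minimal sketch of that setup, not the authors' code: a two-layer transposed-convolution generator with ReLU, no biases, i.i.d. Gaussian weights, and a random mask that hides roughly half of the output pixels. The layer sizes, weight scaling, step size, and iteration count are all illustrative choices and may need tuning.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Two-layer transposed-convolution generator G: latent (16, 8, 8) -> image (1, 32, 32),
# ReLU activation, no biases, i.i.d. Gaussian weights scaled by 1/sqrt(fan-in) for stability.
G = nn.Sequential(
    nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1, bias=False),  # 8x8 -> 16x16
    nn.ReLU(),
    nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1, bias=False),   # 16x16 -> 32x32
)
for layer in G:
    if isinstance(layer, nn.ConvTranspose2d):
        fan_in = layer.in_channels * layer.kernel_size[0] * layer.kernel_size[1]
        nn.init.normal_(layer.weight, std=fan_in ** -0.5)
        layer.weight.requires_grad_(False)   # the weights are fixed; only the latent code is optimized

z_true = torch.randn(1, 16, 8, 8)                   # ground-truth latent code
mask = (torch.rand(1, 1, 32, 32) < 0.5).float()     # observe roughly half of the output pixels
y = mask * G(z_true)                                # partially observed network output

z = torch.randn(1, 16, 8, 8, requires_grad=True)    # random initialization of the latent code
optimizer = torch.optim.SGD([z], lr=1e-2)           # plain gradient descent
for _ in range(10000):
    optimizer.zero_grad()
    loss = ((mask * G(z) - y) ** 2).sum()           # non-convex least-squares objective on observed pixels
    loss.backward()
    optimizer.step()

print("relative latent error:",
      (torch.norm(z.detach() - z_true) / torch.norm(z_true)).item())

Under the paper's assumptions, gradient descent on this non-convex objective recovers the true latent code; the printout reports the relative error between the recovered and the ground-truth code.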
Sparse Depth Sensing for Resource-Constrained Robots
Ma, Fangchang; Carlone, Luca; Ayaz, Ulas; Karaman, Sertac
We consider the case in which a robot has to navigate in an unknown environment but does not have enough on-board power or payload to carry a traditional depth sensor (e.g., a 3D lidar), and thus can only acquire a few (point-wise) depth measurements. We address the following question: is it possible to reconstruct the geometry of an unknown environment from sparse and incomplete depth measurements? Reconstruction from incomplete data is not possible in general, but when the robot operates in man-made environments the depth exhibits some regularity (e.g., many planar surfaces with only a few edges); we leverage this regularity to infer depth from a small number of measurements. Our first contribution is a formulation of the depth reconstruction problem that bridges robot perception with the compressive sensing literature in signal processing. The second contribution is a set of formal results that ascertain the exactness and stability of the depth reconstruction in 2D and 3D problems, and completely characterize the geometry of the profiles that we can reconstruct. Our third contribution is a set of practical algorithms for depth reconstruction: our formulation translates directly into algorithms for depth estimation based on convex programming. In real-world problems these convex programs are very large, and general-purpose solvers are relatively slow; for this reason, we discuss ad hoc solvers that enable fast depth reconstruction in real problems. The last contribution is an extensive experimental evaluation in 2D and 3D problems, including Monte Carlo runs on simulated instances and testing on multiple real datasets. Empirical results confirm that the proposed approach ensures accurate depth reconstruction, outperforms interpolation-based strategies, and performs well even when the assumption of a structured environment is violated.
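The regularity assumption (mostly planar surfaces separated by a few edges) means that, along a scan line, the depth profile is piecewise linear and therefore has a sparse second derivative, so a natural convex formulation is to minimize the l1 norm of second-order differences subject to agreement with the sparse point-wise samples. The snippet below is an illustrative toy instance of that idea for a single 2D scan line, written with cvxpy in the spirit of the paper's compressive-sensing formulation; it is not the authors' exact formulation or solver, and the profile shape, sample count, and solver choice are made up for the example.

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)

# Ground-truth depth profile along one scan line: piecewise linear with two "corners"
n = 200
x = np.linspace(0.0, 1.0, n)
depth_true = np.piecewise(
    x,
    [x < 0.3, (x >= 0.3) & (x < 0.7), x >= 0.7],
    [lambda t: 2.0 + 1.0 * t,
     lambda t: 2.3 - 0.5 * (t - 0.3),
     lambda t: 2.1 + 2.0 * (t - 0.7)],
)

# Sparse point-wise measurements (e.g., a handful of rangefinder readings)
m = 15
sample_idx = np.sort(rng.choice(n, size=m, replace=False))
y = depth_true[sample_idx]

# Convex program: minimize the l1 norm of second-order differences
# subject to consistency with the sparse samples
z = cp.Variable(n)
objective = cp.Minimize(cp.norm1(cp.diff(z, 2)))
constraints = [z[sample_idx] == y]
cp.Problem(objective, constraints).solve()

# Error is small (and zero under the sampling conditions characterized in the paper)
print("max reconstruction error:", np.max(np.abs(z.value - depth_true)))

A general-purpose solver is enough at this toy scale; for the large convex programs arising in real 2D and 3D problems, the paper discusses specialized solvers instead.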