Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information, which generally reduces the data needs of deep nets. However, current deep structured models are restricted to an often very local neighborhood structure, which cannot be enlarged without prohibitive computational cost, and by the fact that the output configuration, or a representation thereof, cannot be transformed further. Very recent approaches address these issues by including graphical model inference inside deep nets, so as to permit subsequent non-linear output space transformations. However, optimization of those formulations is challenging and not well understood. Here, we develop a novel model which generalizes existing approaches, such as structured prediction energy networks, and discuss a formulation which maintains applicability of existing inference techniques.
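The core idea of running inference inside the network can be sketched as follows. This is a minimal, hypothetical illustration (the energy, chain structure, and update rule are assumptions for exposition, not the paper's formulation): a relaxed per-variable label distribution y is optimized by descent on a learned energy combining deep-net unary scores with pairwise terms, and the resulting y could then be passed to further non-linear layers.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vars, n_labels = 4, 3
unary = rng.normal(size=(n_vars, n_labels))            # deep-net scores (assumed)
pairwise = 0.1 * rng.normal(size=(n_labels, n_labels)) # learned pairwise term (assumed)

def energy(y):
    # E(y) = -sum of unary terms - sum of pairwise terms over a chain
    e = -np.sum(unary * y)
    for i in range(n_vars - 1):
        e -= y[i] @ pairwise @ y[i + 1]
    return e

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Descend the energy in logit space so y stays a valid distribution per variable.
logits = np.zeros((n_vars, n_labels))
for _ in range(100):
    y = softmax(logits)
    grad_y = -unary.copy()                    # analytic gradient of E w.r.t. y
    for i in range(n_vars - 1):
        grad_y[i] -= y[i + 1] @ pairwise.T
        grad_y[i + 1] -= y[i] @ pairwise
    logits -= 0.5 * grad_y

y = softmax(logits)  # relaxed output; could feed further transformations
```

Because the softmax Jacobian is positive semi-definite, small steps on the logits decrease the energy, which is the property that makes such inner inference loops amenable to end-to-end training.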
Recent object detection systems rely on two critical steps: (1) a set of object proposals is predicted as efficiently as possible, and (2) this set of candidate proposals is then passed to an object classifier. Such approaches have been shown to be fast while achieving state-of-the-art detection performance. In this paper, we propose a new way to generate object proposals, introducing an approach based on a discriminative convolutional network. Our model is trained jointly with two objectives: given an image patch, the first part of the system outputs a class-agnostic segmentation mask, while the second part outputs the likelihood of the patch being centered on a full object. At test time, the model is efficiently applied to the whole test image and generates a set of segmentation masks, each assigned a corresponding object likelihood score.
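The two-objective design can be sketched as a shared trunk feeding two heads, one producing a class-agnostic mask and one an objectness score. This is an illustrative toy with random weights and made-up shapes standing in for a trained network, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical weights standing in for a trained trunk and two heads.
W_trunk = rng.normal(scale=0.1, size=(16, 3 * 8 * 8))  # shared feature extractor
W_mask = rng.normal(scale=0.1, size=(8 * 8, 16))       # mask head
w_score = rng.normal(scale=0.1, size=16)               # objectness head

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propose(patch):
    """patch: (3, 8, 8) image patch -> (segmentation mask, objectness score)."""
    feat = np.tanh(W_trunk @ patch.ravel())        # shared representation
    mask = sigmoid(W_mask @ feat).reshape(8, 8)    # class-agnostic mask in [0, 1]
    score = sigmoid(w_score @ feat)                # likelihood patch is centered on an object
    return mask, score

mask, score = propose(rng.normal(size=(3, 8, 8)))
```

Sharing the trunk between the two heads is what makes joint training cheap: both objectives shape one feature representation, and at test time a single forward pass per location yields both the mask and its score.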
Our Panoptic-DeepLab is conceptually simple and delivers state-of-the-art results. In particular, we adopt dual-ASPP and dual-decoder structures specific to semantic segmentation and instance segmentation, respectively. The semantic segmentation branch follows the typical design of a semantic segmentation model (e.g., DeepLab), while the instance segmentation branch is class-agnostic, involving a simple instance center regression [1, 5], where the model learns to predict instance centers as well as the offset from each pixel to its corresponding center. Our single Panoptic-DeepLab sets the new state of the art on all three Cityscapes benchmarks, reaching 84.2% mIoU, 39.0% AP, and 65.5% PQ on the test set, and advances results on the other challenging Mapillary Vistas benchmark.
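The center-regression grouping step can be illustrated on a toy grid. In this sketch (a hypothetical example with made-up centers and idealized offset predictions, not the actual network outputs), each pixel's predicted offset is added to its coordinates and the shifted point is snapped to the closest predicted instance center to obtain an instance id:

```python
import numpy as np

H = W = 6
centers = np.array([[1.0, 1.0], [4.0, 4.0]])       # predicted instance centers (assumed)
ys, xs = np.mgrid[0:H, 0:W]
coords = np.stack([ys, xs], axis=-1).astype(float)  # (H, W, 2) pixel coordinates

# Suppose the offset head predicts, for every pixel, the exact vector to its
# nearest center (a perfect prediction, for illustration only).
d = np.linalg.norm(coords[:, :, None, :] - centers[None, None], axis=-1)  # (H, W, 2)
offsets = centers[d.argmin(axis=-1)] - coords                             # (H, W, 2)

# Grouping: move each pixel by its predicted offset, then assign it to the
# closest center to form class-agnostic instance masks.
moved = coords + offsets
ids = np.linalg.norm(moved[:, :, None, :] - centers, axis=-1).argmin(axis=-1)
```

Combining these instance ids with the semantic branch's per-pixel class labels is what yields the final panoptic output in bottom-up pipelines of this kind.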
Statistically independent features can be extracted by finding a factorial representation of a signal distribution. Principal Component Analysis (PCA) accomplishes this for linearly correlated, Gaussian-distributed signals. Independent Component Analysis (ICA), formalized by Comon (1994), extracts features in the case of linearly dependent but not necessarily Gaussian-distributed signals. Nonlinear Component Analysis, finally, should find a factorial representation for nonlinearly dependent signals. For this task, this paper proposes a novel feed-forward, information-conserving, nonlinear map: the explicit symplectic transformation. It also solves the problem of non-Gaussian output distributions by considering single-coordinate higher-order statistics. In previous papers, Deco and Brauer (1994) and Parra, Deco, and Miesbach (1995) suggested volume-conserving transformations and factorization as the key elements of a nonlinear version of Independent Component Analysis. As a general class of volume-conserving transformations, Parra et al. (1995) proposed the symplectic transformation. It was defined by an implicit nonlinear equation, which leads to a complex relaxation procedure for function recall. In this paper, an explicit form of the symplectic map is proposed, thus overcoming these computational problems.
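The appeal of an explicit volume-conserving map can be seen in a small sketch. The coupling below is a generic illustration of the idea, not the paper's specific construction: the coordinates are split into two halves and each half is shifted by a nonlinear function of the other, so the Jacobian is a product of unit-triangular blocks with determinant exactly 1, and the forward pass needs no relaxation procedure.

```python
import numpy as np

def coupling(z, f, g):
    """Explicit volume-conserving map on z = (q, p): each half is shifted by a
    nonlinear function of the other half, giving |det Jacobian| = 1."""
    q, p = np.split(z, 2)
    q = q + f(p)   # unit-lower-triangular Jacobian block
    p = p + g(q)   # unit-upper-triangular Jacobian block
    return np.concatenate([q, p])

# Hypothetical nonlinearities standing in for learned functions.
f = lambda p: np.tanh(p)
g = lambda q: 0.5 * np.sin(q)

z = np.array([0.3, -1.2, 0.7, 2.0])
out = coupling(z, f, g)

# Numerically check volume conservation via a finite-difference Jacobian.
eps = 1e-6
J = np.stack([(coupling(z + eps * e, f, g) - out) / eps
              for e in np.eye(4)], axis=1)
```

Because the map is explicit, evaluation is a single forward pass; volume conservation guarantees the transformation is information-conserving in the sense of preserving differential entropy up to the input distribution.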