 r-fcn


R-FCN: Object Detection via Region-based Fully Convolutional Networks

Neural Information Processing Systems

We present region-based, fully convolutional networks for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, we propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection. Our method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets), for object detection. We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20 times faster than the Faster R-CNN counterpart.
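The position-sensitive score maps mentioned in the abstract can be illustrated with a small NumPy sketch. It assumes a k x k grid of bins per region, one group of (C + 1) score maps per bin, and average voting over the bins; the function names, the channel layout, and the bin-rounding rule are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np


def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling with average voting, for one RoI.

    score_maps: array of shape (k * k * (num_classes + 1), H, W).
                Assumed layout: channels are grouped by bin, so bin (i, j)
                owns channels [(i * k + j) * (C + 1), (i * k + j + 1) * (C + 1)).
    roi:        (x0, y0, x1, y1) in feature-map coordinates, x1/y1 exclusive.
    Returns a vector of num_classes + 1 voted scores.
    """
    c1 = num_classes + 1
    height, width = score_maps.shape[1:]
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / k
    bin_h = (y1 - y0) / k

    pooled = np.zeros((c1, k, k))
    for i in range(k):          # bin row
        for j in range(k):      # bin column
            # Spatial extent of bin (i, j), clipped so it is never empty.
            ys = min(max(int(np.floor(y0 + i * bin_h)), 0), height - 1)
            ye = min(max(int(np.ceil(y0 + (i + 1) * bin_h)), ys + 1), height)
            xs = min(max(int(np.floor(x0 + j * bin_w)), 0), width - 1)
            xe = min(max(int(np.ceil(x0 + (j + 1) * bin_w)), xs + 1), width)

            # Each bin reads only from its own group of score maps.
            group = (i * k + j) * c1
            region = score_maps[group:group + c1, ys:ye, xs:xe]
            pooled[:, i, j] = region.mean(axis=(1, 2))

    # Vote: average the k * k bin responses for each class.
    return pooled.mean(axis=(1, 2))


def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    shifted = scores - scores.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k, num_classes = 3, 20
    maps = rng.standard_normal((k * k * (num_classes + 1), 40, 60))
    probs = softmax(ps_roi_pool(maps, (10, 5, 34, 29), k, num_classes))
    print(probs.shape, probs.sum())
```

Because the pooling and voting steps have no learned weights, the only per-region cost in this sketch is a handful of averages over maps that were computed once for the whole image.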



Reviews: R-FCN: Object Detection via Region-based Fully Convolutional Networks

Neural Information Processing Systems

For example, it would be interesting to explain why the position-sensitive score maps have such an empirical edge over standard RoI pooling methods performed on top of convolutional layers (i.e., the naïve Faster R-CNN variant evaluated in the experiments).


R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection

Li, Zeming (Tsinghua University) | Chen, Yilun (Megvii Inc) | Yu, Gang (Megvii Inc) | Deng, Yangdong (Tsinghua University)

AAAI Conferences

Region-based detectors such as Faster R-CNN and R-FCN have achieved leading performance on object detection benchmarks. However, Faster R-CNN uses RoI pooling to extract the features of each region, which might harm classification because RoI pooling loses spatial resolution. It also becomes slow when a large number of proposals is used. R-FCN is a fully convolutional structure that uses a position-sensitive pooling layer to extract the prediction score of each region, which speeds up the network by sharing computation across RoIs and prevents the feature map from losing information in RoI pooling. But R-FCN cannot benefit from a fully connected layer (or global average pooling), which enables Faster R-CNN to utilize global context information. In this paper, we propose R-FCN++ to address this issue in two ways: first, we introduce a Global Context Module that improves the classification score maps by adopting large, separable convolutional kernels. Second, we introduce a new pooling method that better extracts scores from the score maps by using row-wise or column-wise max pooling. Our approach achieves state-of-the-art single-model results on both the Pascal VOC and MS COCO object detection benchmarks: 87.3% on the Pascal VOC 2012 test set and 42.3% on the COCO 2015 test-dev set. Code will be made publicly available.
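To see why large kernels are usually made separable, the following sketch compares the weight count of a dense k x k convolution with that of a k x 1 convolution followed by a 1 x k convolution. The abstract does not give the exact layout of the Global Context Module, so the kernel size, channel widths, and function names here are hypothetical, and biases are ignored.

```python
def dense_conv_params(k, c_in, c_out):
    """Weights in a single dense k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_pair_params(k, c_in, c_mid, c_out):
    """Weights in a k x 1 convolution followed by a 1 x k convolution."""
    return k * c_in * c_mid + k * c_mid * c_out


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    k, channels = 15, 256
    dense = dense_conv_params(k, channels, channels)
    separable = separable_pair_params(k, channels, channels, channels)
    print(f"dense {k}x{k}: {dense} weights")
    print(f"separable pair: {separable} weights")
    print(f"ratio: {dense / separable:.1f}x")
```

With equal channel widths the ratio is k / 2, so the saving grows linearly with the kernel size while the receptive field stays k x k.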


Deep Learning for Object Detection: A Comprehensive Review

@machinelearnbot

With the rise of autonomous vehicles, smart video surveillance, facial detection and various people-counting applications, demand for fast and accurate object detection systems is growing. These systems involve not only recognizing and classifying every object in an image, but also localizing each one by drawing the appropriate bounding box around it. This makes object detection a significantly harder task than its traditional computer vision predecessor, image classification. Fortunately, however, the most successful approaches to object detection are currently extensions of image classification models. A few months ago, Google released a new object detection API for TensorFlow.


R-FCN: Object Detection via Region-based Fully Convolutional Networks

Dai, Jifeng, Li, Yi, He, Kaiming, Sun, Jian

Neural Information Processing Systems

We present region-based, fully convolutional networks for accurate and efficient object detection. In contrast to previous region-based detectors such as Fast/Faster R-CNN that apply a costly per-region subnetwork hundreds of times, our region-based detector is fully convolutional with almost all computation shared on the entire image. To achieve this goal, we propose position-sensitive score maps to address a dilemma between translation-invariance in image classification and translation-variance in object detection. Our method can thus naturally adopt fully convolutional image classifier backbones, such as the latest Residual Networks (ResNets), for object detection. We show competitive results on the PASCAL VOC datasets (e.g., 83.6% mAP on the 2007 set) with the 101-layer ResNet. Meanwhile, our result is achieved at a test-time speed of 170ms per image, 2.5-20 times faster than the Faster R-CNN counterpart. Code is made publicly available at: https://github.com/daijifeng001/r-fcn.