r/MachineLearning - [R] EfficientDet: Scalable and Efficient Object Detection


Abstract: Model efficiency has become increasingly important in computer vision. In this paper, we systematically study various neural network architecture design choices for object detection and propose several key optimizations to improve efficiency. First, we propose a weighted bi-directional feature pyramid network (BiFPN), which allows easy and fast multi-scale feature fusion; second, we propose a compound scaling method that uniformly scales the resolution, depth, and width for all backbone, feature network, and box/class prediction networks at the same time. Based on these optimizations, we have developed a new family of object detectors, called EfficientDet, which consistently achieve an order-of-magnitude better efficiency than prior art across a wide spectrum of resource constraints. In particular, without bells and whistles, our EfficientDet-D7 achieves state-of-the-art 51.0 mAP on the COCO dataset with 52M parameters and 326B FLOPS, being 4x smaller and using 9.3x fewer FLOPS yet still more accurate (+0.3% mAP) than the best previous detector.
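
For anyone wondering what "weighted" fusion means in practice: the abstract doesn't spell out the rule, so here is a minimal sketch of the "fast normalized fusion" idea as I understand it from the paper body. Each input feature map gets a learnable scalar weight, the weights are clamped to be non-negative, and they are normalized by their sum. The function name `fast_normalized_fusion` and the NumPy setup are my own illustration, not the authors' code, and the sketch assumes the inputs are already resized to the same shape and leaves out the convolution applied after fusion.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps using non-negative, sum-normalized weights.

    Hypothetical helper for illustration; in a real BiFPN the weights would be
    learned parameters and the inputs would first be resized to a common resolution.
    """
    # Clamp weights at zero (ReLU) so every input contributes non-negatively.
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    # Normalize by the sum; eps avoids division by zero when all weights are 0.
    w = w / (eps + w.sum())
    # Weighted sum of the input feature maps.
    return sum(wi * f for wi, f in zip(w, features))

# Two toy "feature maps" with equal weights: the result is close to their mean.
p_low = np.ones((2, 2))
p_high = 3.0 * np.ones((2, 2))
fused = fast_normalized_fusion([p_low, p_high], weights=[1.0, 1.0])
```

The appeal over a softmax over the weights is that this needs only a clamp and a division, which is cheaper while still keeping each normalized weight between 0 and 1.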