Reviews: Sequential Context Encoding for Duplicate Removal

Neural Information Processing Systems 

This paper proposes a new duplicate removal method based on an RNN. For each candidate region, informative features are extracted from appearance features, position, and ranking information in addition to the score. The candidates are then treated as sequential data and fed into the RNN-based model, which improves the final accuracy by capturing global information. Because the number of candidate regions is enormous compared with the number of objects that should remain, the paper proposes to reduce the set of boxes gradually in two stages, both of which use an RNN model of the same structure. In stage I, the model is trained to remove easy boxes, using NMS results as the teaching signal. In stage II, the model is trained to remove difficult boxes, using the ground-truth boxes. Experiments show that the proposed method increases mAP for SOTA object detection methods (FPN, Mask R-CNN, PANet with DCN).
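To make the stage I teaching signal concrete, the following is a minimal sketch of how NMS output could be turned into per-box keep/remove labels. It is one plausible reading of the summary above, not the authors' implementation: the function names, the greedy NMS variant, and the IoU threshold of 0.5 are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms_teacher_labels(boxes, scores, iou_thresh=0.5):
    """Label each box 1 if greedy NMS keeps it, 0 if NMS suppresses it.

    Hypothetical helper: the 0.5 threshold is an assumed value.
    """
    # Visit boxes from highest to lowest score, as standard greedy NMS does.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    labels = [0] * len(boxes)
    for i in order:
        # Keep a box only if it does not overlap heavily with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in kept):
            kept.append(i)
            labels[i] = 1
    return labels


# Two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
labels = nms_teacher_labels(boxes, scores)
print(labels)
```

In this toy input the second box overlaps the first with an IoU of about 0.68, so it is labeled as a duplicate, while the other two are labeled as kept. Under this reading, a stage I model trained on such labels would learn to imitate NMS on easy cases, leaving the harder decisions to stage II, which is supervised by ground-truth boxes.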