Review for NeurIPS paper: RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder

Jan-27-2025, 00:28:35 GMT – Neural Information Processing Systems

Good work analyzing the pros and cons of various object representations, along with a neat way to combine them into a single framework that yields good gains on the COCO benchmark. The proposed solution of using a self-attention module to bridge the representations is original, simple, and widely applicable. I think the method and the work reveal intriguing differences between the various representations, and this will be useful to the community. The authors should adapt the camera-ready version in accordance with the post-rebuttal comments from the reviewers (esp.

bridging visual representation, object detection, transformer decoder, (1 more...)

Technology:
- Information Technology
  - Data Science (0.40)
  - Artificial Intelligence > Vision (0.40)