Appendix A Object Query Generation

Neural Information Processing Systems 

For the text-guided object detection network described in Section 3.1.1, we present here the details of the process for generating object queries. Detecting objects in a 3D scene with a fixed set of 3D anchor boxes or parameterized representations is challenging because of the large search space. Since the network focuses only on detecting objects that are semantically related to the text, we filter the point cloud features based on their semantic correlation with the textual features. As mentioned in Section 3.1.2,