We thank all the reviewers for their helpful comments and for recognizing the novelty of our approach (R2-4) and its
–Neural Information Processing Systems
We are glad that the reviewers found our experimental setup exhaustive (R1-4). This is not feasible with prior work, e . We will clarify and highlight these challenges in the final version. Random samples in Tab. 2, 11, and 12 show that the captions from COS-CVAE are coherent (ll. COS-CVAE has a score of 0.742 while Seq-CVAE (attn) has 0.714.