they pointed out our contributions of a) clustering into non-uniform bins (R2, R3), b) the modality alignment (R2) and c)

Neural Information Processing Systems 

We thank all reviewers for their thorough feedback, which further strengthened our paper. We first reply to the main comments and then provide minor clarifications. First, as shown in Tab. The motivation (l.16-24) is to learn to assign data to discrete buckets, basically naming On the other hand, the clustering task also leads to strong feature representations (see Tab. A.5). We have extended Tab. 2 from the paper: see Tab. 1 (c-f) below.