Differential Properties of Sinkhorn Approximation for Learning with Wasserstein Distance

Giulia Luise, Alessandro Rudi, Massimiliano Pontil, Carlo Ciliberto

Neural Information Processing Systems

Applications of optimal transport have recently gained remarkable attention as a result of the computational advantages of entropic regularization. However, in most situations the Sinkhorn approximation to the Wasserstein distance is replaced by a regularized version that is less accurate but easier to differentiate. In this work we characterize the differential properties of the original Sinkhorn approximation, proving that it enjoys the same smoothness as its regularized version, and we explicitly provide an efficient algorithm to compute its gradient. We show that this result benefits both theory and applications: on one hand, high-order smoothness confers statistical guarantees to learning with Wasserstein approximations. On the other hand, the gradient formula can be used to efficiently solve learning and optimization problems in practice. Promising preliminary experiments complement our analysis.
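The paper's gradient formula builds on the standard Sinkhorn-Knopp iterations for the entropic transport plan. As a hedged illustration (a minimal NumPy sketch; the function name, toy marginals, and iteration count are ours, not from the paper), one can compute the plan and then evaluate the "sharp" Sinkhorn cost ⟨C, P⟩, i.e., the transport cost of the entropic plan without the entropy term:

```python
import numpy as np

def sinkhorn_plan(a, b, C, eps=0.1, n_iters=200):
    """Sinkhorn-Knopp iterations for the entropic optimal transport plan.

    a, b: source/target marginals (1-D arrays summing to 1); C: cost matrix.
    """
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)                # scale to match column marginals
        u = a / (K @ v)                  # scale to match row marginals
    return u[:, None] * K * v[None, :]   # transport plan P

# Toy example on a 1-D grid with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2
a = np.full(5, 0.2)
b = np.array([0.1, 0.1, 0.2, 0.3, 0.3])
P = sinkhorn_plan(a, b, C)
sharp_cost = np.sum(P * C)   # <C, P>: Sinkhorn approximation without the entropy term
```

The paper's point is that this sharp cost, despite lacking the explicit entropy term, is still smooth in its inputs and admits an efficiently computable gradient.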



Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems

While conceptually simple and easy to train, Imagen yields surprisingly strong results. Imagen outperforms other methods on COCO [38] with a zero-shot FID-30K of 7.27, significantly outperforming prior work such as GLIDE [43] (at 12.4) and the concurrent work of DALL-E 2 [56] (at 10.4).



ec51d1fe4bbb754577da5e18eb54e6d1-Paper-Conference.pdf

Neural Information Processing Systems

Frequently, transformations occurring in data can be better represented by a subset of a group than by the group as a whole, e.g., rotations in [−90°, 90°]. In such cases, a model that respects equivariance partially is better suited to represent the data.
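One reason such a subset must be handled differently from a full group is that it is generally not closed under composition. As a quick illustrative check (all names here are ours, not from the paper): two rotations drawn from the example interval [−90°, 90°] can compose to a rotation outside it.

```python
import numpy as np

def rot(theta_deg):
    """2-D rotation matrix for an angle given in degrees."""
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

# Two rotations inside [-90, 90] compose to one outside it: 60 + 60 = 120.
R = rot(60) @ rot(60)
angle = np.rad2deg(np.arctan2(R[1, 0], R[0, 0]))  # recover the composed angle
```

So a model equivariant only to this subset cannot rely on group structure, which is why partial equivariance requires its own treatment.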