Learning symmetries via weight-sharing with doubly stochastic tensors

Putri A. van der Linden

Neural Information Processing Systems 

This yields learnable kernel transformations that are jointly optimized with downstream tasks.
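
To make the idea of a learnable kernel transformation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a doubly stochastic matrix is obtained from unconstrained parameters by Sinkhorn normalization (an assumption; the text above does not name the procedure), and that the matrix acts on a flattened base kernel as a soft permutation of its weights. The names `sinkhorn`, `transform_kernel`, `logits`, and `base_kernel` are hypothetical and chosen for illustration only.

```python
import math


def sinkhorn(logits, n_iters=100):
    """Approximately map a square matrix of logits to a doubly stochastic matrix.

    Assumption: Sinkhorn normalization (alternating row and column
    normalization) stands in for whatever procedure the paper uses.
    """
    n = len(logits)
    # Exponentiate so every entry is strictly positive.
    m = [[math.exp(x) for x in row] for row in logits]
    for _ in range(n_iters):
        # Rescale each row to sum to 1.
        m = [[x / sum(row) for x in row] for row in m]
        # Rescale each column to sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def transform_kernel(ds_matrix, base_kernel):
    """Apply the matrix to a flattened kernel: a soft permutation of its weights."""
    return [sum(w * k for w, k in zip(row, base_kernel)) for row in ds_matrix]


# Hypothetical values; in training, the logits would be learned parameters.
logits = [[0.0, 2.0, -1.0], [1.0, 0.0, 0.5], [-0.5, 1.0, 3.0]]
base_kernel = [1.0, 2.0, 3.0]

ds = sinkhorn(logits)
shared_kernel = transform_kernel(ds, base_kernel)
```

In this sketch, every entry of `shared_kernel` is a convex combination of the base weights, and the total weight is preserved because each column of `ds` sums to 1. In a trainable setting the logits would receive gradients from the downstream loss, which is one way to read "jointly optimized" above.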