Geodesic Mode Connectivity
Tan, Charlie, Long, Theodore, Zhao, Sarah, Laine, Rudolf
arXiv.org Artificial Intelligence
Mode connectivity is a phenomenon where trained models are connected by a path of low loss. We reframe this in the context of Information Geometry, where neural networks are studied as spaces of parameterized distributions with curved geometry. We hypothesize that shortest paths in these spaces, known as geodesics, correspond to mode-connecting paths in the loss landscape. We propose an algorithm to approximate geodesics and demonstrate that they achieve mode connectivity.
Figure 1: Geodesics are shortest paths in the space of parameterized distributions M. For narrow architectures, linear interpolation (dashed) fails to achieve mode connectivity, passing through a region of high loss, despite using a permutation π to 'shift' θ. If we instead follow the geodesic (shortest) path (solid) in the curved distribution space, this does achieve mode connectivity, appearing as a curved path in the loss landscape.
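For readers unfamiliar with the terminology, the notion of a geodesic in a space of parameterized distributions can be written out as follows. This is a sketch of the standard Information Geometry definitions, not a statement of the authors' algorithm: the symbols γ, G, θ_A, θ_B, and the discretization into points θ_0, ..., θ_N are notation assumed here for illustration and do not appear in the abstract.

```latex
% Fisher information metric on parameter space (standard definition;
% p(y | x, \theta) is the distribution the network defines over outputs).
G(\theta) = \mathbb{E}_{x}\,\mathbb{E}_{y \sim p(y \mid x, \theta)}
  \left[ \nabla_\theta \log p(y \mid x, \theta)\,
         \nabla_\theta \log p(y \mid x, \theta)^{\top} \right]

% Length of a path \gamma : [0, 1] \to \Theta under this metric.
L(\gamma) = \int_0^1 \sqrt{\dot{\gamma}(t)^{\top}\, G(\gamma(t))\, \dot{\gamma}(t)}\; dt

% A geodesic between two trained models \theta_A and \theta_B (assumed names)
% is a path of minimal length with fixed endpoints.
\gamma^{*} = \arg\min_{\gamma \,:\, \gamma(0) = \theta_A,\ \gamma(1) = \theta_B} L(\gamma)

% Because KL(p_\theta \,\|\, p_{\theta + \delta}) \approx \tfrac{1}{2}\, \delta^{\top} G(\theta)\, \delta
% for small \delta, the length of a finely discretized path
% \theta_0 = \theta_A, \theta_1, \dots, \theta_N = \theta_B can be approximated by
L(\gamma) \approx \sum_{i=0}^{N-1}
  \sqrt{2\, \mathrm{KL}\!\left( p_{\theta_i} \,\|\, p_{\theta_{i+1}} \right)}
```

Under this reading, linear interpolation is a straight line in parameter space, but it is generally not the shortest path under the metric G, which is why the geodesic can appear curved in the loss landscape. Other divergences with the same local quadratic behavior could be used in place of KL in the discretized sum.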
Aug-24-2023