Goto

Collaborating Authors

 projc


Appendix

Neural Information Processing Systems

Let x be a root ofG(,θ), i.e., G(x,θ) = 0. Frank-Wolfeimplementations typically maintain theconvexweights ofthe vertices, which we use to get an approximation ofp?(θ). As a more advanced example, we now describe how to implement the KKT conditions(6). In all experiments, we only show how to compute gradients with the outer objective. There areseveral ways to differentiate this projection. The first is to use the KKT conditions as detailed in 2.2.