
Collaborating Authors


5 Broader impact This submission focuses on foundational and exploratory work, with application to general machine

Neural Information Processing Systems

Our experiments use data sets that are already open-sourced and cited in the references. At present, our implementation of Kruskal's algorithm is incompatible with processing very large batch sizes at train time. This is not the case at inference time: gradients need not be back-propagated, so any implementation of Kruskal's algorithm can be used, such as the union-find implementation. Our implementation of Kruskal's algorithm is tailored to our use: we first initialize both [...] We remark that our implementation takes the form of a single loop, with each step of the loop consisting only of matrix multiplications. This biasing ensures that any edge between points that are constrained to be in the same cluster will always be processed before unconstrained edges.
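As a rough illustration of the inference-time remark above, the following is a minimal sketch of a standard union-find Kruskal's algorithm in which constrained edges are processed before unconstrained ones. It is not the authors' matrix-multiplication implementation. The names `UnionFind`, `kruskal_forest`, `must_link`, and `num_components` are hypothetical, and the ordering is enforced here with a lexicographic sort key rather than a weight bias, which is an assumption made for simplicity.

```python
from typing import FrozenSet, List, Tuple

Edge = Tuple[float, int, int]  # (weight, u, v)


class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.size = [1] * n
        self.components = n

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: int, b: int) -> bool:
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.components -= 1
        return True


def kruskal_forest(
    n: int,
    edges: List[Edge],
    must_link: FrozenSet[Tuple[int, int]] = frozenset(),
    num_components: int = 1,
) -> List[Edge]:
    """Return the edges kept by Kruskal's algorithm, stopping once
    `num_components` connected components remain.

    `must_link` holds (min(u, v), max(u, v)) pairs assumed to be
    constrained to the same cluster (hypothetical input format).
    """

    def key(edge: Edge) -> Tuple[int, float]:
        w, u, v = edge
        constrained = (min(u, v), max(u, v)) in must_link
        # Constrained edges sort first, then by increasing weight.
        return (0 if constrained else 1, w)

    uf = UnionFind(n)
    chosen: List[Edge] = []
    for w, u, v in sorted(edges, key=key):
        if uf.components <= num_components:
            break
        if uf.union(u, v):  # skip edges that would close a cycle
            chosen.append((w, u, v))
    return chosen
```

Calling `kruskal_forest(4, edges, must_link=frozenset({(0, 3)}), num_components=2)` would take the constrained edge between points 0 and 3 first, whatever its weight, and then add the cheapest remaining edges until two components are left.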




Moonshine: Distilling with Cheap Convolutions

Elliot J. Crowley, Gavin Gray, Amos J. Storkey

Neural Information Processing Systems

Using attention transfer, we provide Pareto curves/tables for distillation of residual networks on four benchmark datasets, indicating the memory versus accuracy trade-off.




Weakly Supervised Dense Event Captioning in Videos

Xuguang Duan, Wenbing Huang, Chuang Gan, Jingdong Wang, Wenwu Zhu, Junzhou Huang

Neural Information Processing Systems

Among the wide variety of applications in video understanding, the video captioning task has attracted growing interest in recent years [4, 5, 6, 7, 8, 9, 10, 11].