Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets

Pascal Vincent

Neural Information Processing Systems 

An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in e.g.