Representation Learning Beyond Linear Prediction Functions

Neural Information Processing Systems 

It has become a common practice (Tan et al., 2018) to use a pre-trained network as the representation