Faster Neural Networks Straight from JPEG

Gueguen, Lionel, Sergeev, Alex, Kadlec, Ben, Liu, Rosanne, Yosinski, Jason

Feb-14-2020, 13:42:11 GMT - Neural Information Processing Systems

The simple, elegant approach of training convolutional neural networks (CNNs) directly from RGB pixels has enjoyed overwhelming empirical success. But can more performance be squeezed out of networks by using different input representations? In this paper we propose and explore a simple idea: train CNNs directly on the blockwise discrete cosine transform (DCT) coefficients computed and available in the middle of the JPEG codec. Intuitively, when processing JPEG images using CNNs, it seems unnecessary to decompress a blockwise frequency representation to an expanded pixel representation, shuffle it from CPU to GPU, and then process it with a CNN that will learn something similar to a transform back to frequency representation in its first layers. Why not skip both steps and feed the frequency domain into the network directly?
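As a rough illustration of the representation the abstract refers to, the sketch below computes a blockwise 8x8 DCT of a single image channel with NumPy. It is a simplified stand-in, not the authors' pipeline: a real JPEG codec also converts to YCbCr, subsamples chroma, quantizes the coefficients, and entropy-codes them, and the paper reads the coefficients out of the codec rather than recomputing them. The function names `dct_matrix` and `blockwise_dct` are hypothetical and chosen for this example only.

```python
import numpy as np


def dct_matrix(n=8):
    """Return the orthonormal n x n DCT-II matrix (rows are frequencies)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    i = np.arange(n).reshape(1, -1)  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m


def blockwise_dct(channel, block=8):
    """Apply a 2D DCT to each non-overlapping block of a 2D array.

    Returns an array of shape (H/block, W/block, block*block): one
    vector of frequency coefficients per spatial block.
    """
    h, w = channel.shape
    if h % block or w % block:
        raise ValueError("height and width must be multiples of the block size")
    d = dct_matrix(block)
    # Split into blocks: (H/b, b, W/b, b) -> (H/b, W/b, b, b)
    blocks = channel.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3)
    # 2D DCT of every block at once: D @ X @ D^T (matmul broadcasts over blocks)
    coeffs = d @ blocks @ d.T
    return coeffs.reshape(h // block, w // block, block * block)


# Usage: a 16x24 channel becomes a 2x3 grid of 64-dimensional coefficient vectors.
channel = np.random.default_rng(0).uniform(0, 255, size=(16, 24))
features = blockwise_dct(channel - 128.0)  # JPEG level-shifts pixels by 128 first
print(features.shape)
```

The output has one spatial position per 8x8 block and 64 channels per position, so the spatial resolution seen by a network drops by a factor of 8 in each dimension relative to the pixel input.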

artificial intelligence, machine learning, representation


Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.99)