High Performance Zero-Memory Overhead Direct Convolutions