A Theory of I/O-Efficient Sparse Neural Network Inference