Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News