Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning