Comment on "Predicting reaction performance in C–N cross-coupling using machine learning"

Science 

Ahneman et al. (Reports, 13 April 2018) applied machine learning models to predict C–N cross-coupling reaction yields. The models use atomic, electronic, and vibrational descriptors as input features. However, the experimental design is insufficient to distinguish models trained on chemical features from those trained solely on random-valued features in retrospective and prospective test scenarios, thus failing classical controls in machine learning.

A recent report by Ahneman et al. (1) describes a machine learning approach for modeling chemical reactions with data collected through ultrahigh-throughput experimentation. The Buchwald-Hartwig coupling (2) is used as a model reaction, with a Glorius interference approach (3) to study reaction poisoning by isoxazole additives. Reactions are represented by atomic, electronic, and vibrational descriptors that are automatically calculated through a new computational pipeline.
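To make the random-feature control concrete, the following is a minimal sketch of how a random-valued feature matrix could be built for a combinatorial reaction screen. It is an illustration under stated assumptions, not the code used in either study: the reagent names, the number of features per reagent, and the choice of a standard normal distribution are all hypothetical. Each reagent receives a fixed random vector that carries no chemical information, and a reaction is represented by concatenating the vectors of its components.

```python
import random
from itertools import product


def random_descriptors(names, n_features, rng):
    """Give each reagent a fixed vector of random values (no chemistry).

    Assumption: values are drawn from a standard normal distribution.
    """
    return {
        name: [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for name in names
    }


def featurize(reactions, tables):
    """Concatenate the per-component vectors for every reaction."""
    rows = []
    for reaction in reactions:
        row = []
        for component, table in zip(reaction, tables):
            row.extend(table[component])
        rows.append(row)
    return rows


# Hypothetical reagent labels standing in for the screened components.
halides = ["halide_A", "halide_B", "halide_C"]
additives = ["additive_1", "additive_2"]
bases = ["base_X", "base_Y"]

N_FEATURES = 4  # hypothetical number of random features per reagent
rng = random.Random(0)  # fixed seed so the control is reproducible

tables = [random_descriptors(c, N_FEATURES, rng)
          for c in (halides, additives, bases)]

# Full combinatorial design: every halide x additive x base combination.
reactions = list(product(halides, additives, bases))
X_random = featurize(reactions, tables)
```

A matrix such as `X_random` would then be passed through the same model, data splits, and metrics as the chemical descriptor matrix. If the two give comparable performance, the experiment cannot show that the chemical descriptors carry information beyond reagent identity.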