Deep learning for genomics


Although deep learning holds enormous promise for driving new discoveries in genomics, it should be implemented mindfully and with appropriate caution. Deep learning should be applied to biological datasets of sufficient size, usually on the order of thousands of samples. The 'black box' nature of deep neural networks is an intrinsic property and does not lend itself well to complete understanding or transparency. Subtle variations in the input data can have outsized effects and must be controlled for as well as possible. Importantly, deep learning methods should be compared with simpler machine learning models that have fewer parameters, to ensure that the additional model complexity afforded by deep learning has not led to overfitting of the data.
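
The comparison against a simpler model can be sketched as a k-fold cross-validation in which both models see exactly the same splits. The example below is a minimal illustration under stated assumptions, not a genomics pipeline: the data are synthetic, a nearest-centroid classifier stands in for the simple model with few parameters, and a 1-nearest-neighbour classifier stands in for a high-capacity model that can memorise its training set (it is not a deep network). All function names are hypothetical and introduced here only for illustration.

```python
import random


def make_data(n, n_features, seed):
    """Synthetic two-class data (assumption): only feature 0 carries signal."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += 1.5 * label  # shift class 1 along the single informative feature
        data.append((x, label))
    rng.shuffle(data)
    return data


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_centroid(train):
    """Simple model: one mean vector per class, predict the closest mean."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return lambda x: min(centroids, key=lambda c: sq_dist(x, centroids[c]))


def fit_1nn(train):
    """Flexible stand-in: memorises every training sample."""
    return lambda x: min(train, key=lambda s: sq_dist(x, s[0]))[1]


def accuracy(predict, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in samples) / len(samples)


def cross_validate(fit, data, k=5):
    """Return (mean training accuracy, mean held-out accuracy) over k folds."""
    train_acc, test_acc = [], []
    for fold in range(k):
        test = data[fold::k]
        train = [s for i, s in enumerate(data) if i % k != fold]
        model = fit(train)
        train_acc.append(accuracy(model, train))
        test_acc.append(accuracy(model, test))
    return sum(train_acc) / k, sum(test_acc) / k


data = make_data(n=200, n_features=20, seed=0)
models = [("centroid", fit_centroid), ("1-NN", fit_1nn)]
results = {name: cross_validate(fit, data) for name, fit in models}
for name, (train_score, held_out_score) in results.items():
    print(f"{name}: train={train_score:.2f} held-out={held_out_score:.2f}")
```

The quantity to inspect is the gap between training and held-out accuracy for each model. A flexible model that scores near-perfectly on its training folds but no better than the simple baseline on held-out folds is showing the overfitting pattern the paragraph above warns about; in a real study the same protocol would be applied to the actual deep network and an appropriate simpler baseline.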