On the Ability of Deep Neural Networks to Learn Granger Causality in Multi-Variate Time Series Data