Towards a Better Understanding of Representation Dynamics under TD-learning