Hierarchical clustering with deep Q-learning