Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data