Solving The Dynamic Volatility Fitting Problem: A Deep Reinforcement Learning Approach