A new multilayer optical film optimal method based on deep q-learning