TTOpt: AMaximumVolumeQuantizedTensor Train-basedOptimizationanditsApplicationto ReinforcementLearning

Neural Information Processing Systems 

The vital part of every learning-based algorithm is an optimization procedure, e.g., Stochastic Gradient Descent.