A second-order-like optimizer with adaptive gradient scaling for deep learning