adam optimizer
Supplementary Material for Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition
Our main reconstruction loss is an MSE between the rendered color c and the corresponding pixel in the input image. This loss is then exponentially faded over 100,000 steps to a cosine-weighted MSE: $(x\,(\omega_o \cdot n) - \hat{x}\,(\omega_o \cdot n))^2$. This weighting tends to achieve better BRDF fitting results [4], because harsh grazing highlights from the Fresnel effect are weighted less than regular samples, and because our approximated rendering model is least accurate at grazing angles. The reason for this fading loss scheme is that the normals n are not reliable in the early stages of training.
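The fading scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the blend form, the decay rate `rate`, the clamping of the cosine to zero, and all function names are assumptions introduced here; only the 100,000-step horizon and the cosine weighting come from the text.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def mse(pred, target):
    """Plain mean squared error over the color channels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cosine_weighted_mse(pred, target, view_dir, normal):
    """MSE where both colors are scaled by cos = (omega_o . n).

    Clamping the cosine at zero is an assumption of this sketch.
    """
    cos = max(dot(view_dir, normal), 0.0)
    return sum((p * cos - t * cos) ** 2 for p, t in zip(pred, target)) / len(pred)

def fade_factor(step, fade_steps=100_000, rate=5.0):
    """Weight on the plain MSE: 1 at step 0, decaying exponentially.

    `rate` is a hypothetical decay constant; the source gives no value.
    """
    t = min(max(step / fade_steps, 0.0), 1.0)
    return math.exp(-rate * t)

def reconstruction_loss(pred, target, view_dir, normal, step):
    """Blend from plain MSE (early) toward cosine-weighted MSE (late)."""
    a = fade_factor(step)
    return a * mse(pred, target) + (1.0 - a) * cosine_weighted_mse(
        pred, target, view_dir, normal
    )
```

Early in training the loss ignores the normals entirely, so unreliable normals cannot distort it; later, samples seen at grazing angles contribute progressively less.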
A Training Regime
A.1 Implementation of the GPs
We use the GPyTorch package for the computations of GPs and their kernels. The NN linear kernel is implemented in all experiments as a 1-layer MLP with ReLU activations and hidden dimension 16. For the Spectral Mixture Kernel, we use 4 mixtures.
A.2 Sines Dataset
For the first experiments on sine functions, we use the dataset from [9]. For each task, the input points x are sampled from the range $[-5, 5]$, and the target values y are obtained by applying $y = A \sin(x - \varphi) + \epsilon$, where the amplitude $A$ and phase $\varphi$ are drawn uniformly at random from the ranges $[0.1, 5]$ and $[0, \pi]$, respectively.
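As a rough illustration of the task sampling just described, the sketch below draws one sine task with the Python standard library. It assumes the target has the form y = A sin(x - phase) plus optional Gaussian noise, and that inputs are drawn uniformly from [-5, 5]. The function name and the `noise_std` parameter are hypothetical, and the noise level defaults to zero because the text here does not specify it.

```python
import math
import random

def sample_sine_task(num_points, rng, noise_std=0.0):
    """Draw one regression task: a sine with random amplitude and phase.

    noise_std is an assumed parameter; the noise distribution is not
    given in the passage above.
    """
    amplitude = rng.uniform(0.1, 5.0)   # A ~ U[0.1, 5]
    phase = rng.uniform(0.0, math.pi)   # phase ~ U[0, pi]
    xs = [rng.uniform(-5.0, 5.0) for _ in range(num_points)]
    ys = [
        amplitude * math.sin(x - phase) + rng.gauss(0.0, noise_std)
        for x in xs
    ]
    return xs, ys, amplitude, phase

# Usage: a seeded generator makes the sampled task reproducible.
rng = random.Random(0)
xs, ys, amplitude, phase = sample_sine_task(10, rng)
```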
Resetting the Optimizer in Deep RL: An Empirical Study
We focus on the task of approximating the optimal value function in deep reinforcement learning. This iterative process consists of solving a sequence of optimization problems in which the loss function changes per iteration. The common approach to solving this sequence of problems is to employ modern variants of the stochastic gradient descent algorithm such as Adam. These optimizers maintain their own internal parameters, such as estimates of the first-order and second-order moments of the gradient, and update them over time. Therefore, information obtained in previous iterations is used to solve the optimization problem in the current iteration. We demonstrate that this can contaminate the moment estimates, because the optimization landscape can change arbitrarily from one iteration to the next. To hedge against this negative effect, a simple idea is to reset the internal parameters of the optimizer when starting a new iteration. We empirically investigate this resetting idea by employing various optimizers in conjunction with the Rainbow algorithm. We demonstrate that this simple modification significantly improves the performance of deep RL on the Atari benchmark.
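The resetting idea can be illustrated with a small, self-contained Adam implementation whose moment estimates and step counter are cleared at the start of each iteration. This is a sketch under stated assumptions, not the paper's code: the class, the `solve_sequence` helper, and the toy quadratic losses that stand in for the changing RL objective are all hypothetical.

```python
import math

class Adam:
    """Minimal Adam for a flat list of float parameters."""

    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.reset()

    def reset(self):
        """Discard the first/second moment estimates and the step count."""
        self.m = None
        self.v = None
        self.t = 0

    def step(self, params, grads):
        """Return updated parameters after one bias-corrected Adam step."""
        if self.m is None:
            self.m = [0.0] * len(params)
            self.v = [0.0] * len(params)
        self.t += 1
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.beta1 * self.m[i] + (1 - self.beta1) * g
            self.v[i] = self.beta2 * self.v[i] + (1 - self.beta2) * g * g
            m_hat = self.m[i] / (1 - self.beta1 ** self.t)
            v_hat = self.v[i] / (1 - self.beta2 ** self.t)
            updated.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return updated

def solve_sequence(targets, inner_steps=50, reset_each_iteration=True):
    """Minimize a sequence of toy losses (p - target)^2, one per iteration.

    Each new target plays the role of a changed loss function; the
    optimizer state is optionally reset when the target changes.
    """
    opt = Adam(lr=0.1)
    params = [0.0]
    for target in targets:
        if reset_each_iteration:
            opt.reset()  # moments from the previous loss are dropped
        for _ in range(inner_steps):
            grads = [2.0 * (params[0] - target)]
            params = opt.step(params, grads)
    return params
```

Without the reset, the moment estimates carried into a new iteration still reflect gradients of the previous loss, which is the contamination the abstract refers to.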