Supplementary Material for End-to-End Stochastic Optimization with Energy-Based Model

Neural Information Processing Systems 

We adopt a gradient-based method, Adam [5], to update the model parameters. We use a two-hidden-layer neural network, where each "layer" is a combination of linear, batch norm [4], ReLU, and dropout (p = 0.2) layers with dimension 200. SO-EBM draws 512 samples from the proposal distribution to estimate the gradient of the model parameters. The proposal distribution is a mixture of Gaussians with 3 components, whose variances are {0.02, 0.05, 0.1}. We use a two-layer gated recurrent unit (GRU) with hidden size 128 as the forecasting model.
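As a concrete illustration, the proposal distribution above can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: it assumes equal component weights and that the three Gaussian components share a common mean (a current decision point `y_hat`), since those details are not specified here; the function name and arguments are hypothetical.

```python
import numpy as np

def sample_proposal(y_hat, n_samples=512, variances=(0.02, 0.05, 0.1), rng=None):
    """Draw samples from a 3-component Gaussian mixture centered at y_hat.

    Assumptions (not stated in the text): equal component weights and a
    shared mean y_hat for all components; only the variances differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    variances = np.asarray(variances, dtype=float)
    # Pick a mixture component uniformly at random for each sample.
    comp = rng.integers(len(variances), size=n_samples)
    # Per-sample standard deviation, broadcast over the decision dimension.
    std = np.sqrt(variances[comp])[:, None]
    return y_hat[None, :] + std * rng.standard_normal((n_samples, y_hat.shape[0]))

# 512 proposal samples around a 5-dimensional decision variable.
samples = sample_proposal(np.zeros(5))
print(samples.shape)  # (512, 5)
```

In SO-EBM these samples would be scored by the energy network to form a self-normalized importance-sampling estimate of the parameter gradient; the sketch covers only the sampling step.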
