Supplementary Material for End-to-End Stochastic Optimization with Energy-Based Model

Neural Information Processing Systems 

We adopt a gradient-based method, Adam [5], to update the model parameters. We use a two-hidden-layer neural network, where each "layer" is a combination of linear, batch norm [4], ReLU, and dropout (p = 0.2) layers with dimension 200. SO-EBM draws 512 samples from the proposal distribution to estimate the gradient of the model parameters. The proposal distribution is a mixture of Gaussians with 3 components, whose variances are {0.02, 0.05, 0.1}. We use a two-layer gated recurrent unit (GRU) with hidden size 128 as the forecasting model.
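As a concrete illustration, the proposal distribution above can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: it assumes equal component weights and that the three Gaussian components share a common mean (a current decision point `y_hat`), since those details are not specified here; the function name and arguments are hypothetical.

```python
import numpy as np

def sample_proposal(y_hat, n_samples=512, variances=(0.02, 0.05, 0.1), rng=None):
    """Draw samples from a 3-component Gaussian mixture centered at y_hat.

    Assumptions (not stated in the text): equal component weights and a
    shared mean y_hat for all components; only the variances differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    variances = np.asarray(variances, dtype=float)
    # Pick a mixture component uniformly at random for each sample.
    comp = rng.integers(len(variances), size=n_samples)
    # Per-sample standard deviation, broadcast over the decision dimension.
    std = np.sqrt(variances[comp])[:, None]
    return y_hat[None, :] + std * rng.standard_normal((n_samples, y_hat.shape[0]))

# 512 proposal samples around a 5-dimensional decision variable.
samples = sample_proposal(np.zeros(5))
print(samples.shape)  # (512, 5)
```

In SO-EBM these samples would be scored by the energy network to form a self-normalized importance-sampling estimate of the parameter gradient; the sketch covers only the sampling step.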
