Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Neural Information Processing Systems
Regularization is crucial in modern machine learning, and it can take various forms in algorithmic design: the choice of training set, model family, error function, explicit regularization terms, and optimization method. In particular, the learning rate, which can be interpreted as a temperature-like parameter within the statistical mechanics of learning, plays a crucial role in neural network training. Indeed, many widely adopted training strategies amount to schedules that decay the learning rate over time. This process can be interpreted as decreasing a temperature, using either a global learning rate (for the entire model) or a learning rate that varies for each parameter. This paper proposes TempBalance, a straightforward yet effective layer-wise learning rate method. TempBalance is based on Heavy-Tailed Self-Regularization (HT-SR) Theory, which characterizes the implicit self-regularization of individual layers in trained models. We demonstrate the efficacy of using HT-SR-motivated metrics to guide the scheduling and balancing of temperature across all network layers during training, yielding improved test performance.
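As a concrete illustration of how an HT-SR metric can drive layer-wise temperature balancing, the sketch below estimates a power-law exponent alpha for each layer's empirical spectral density (via a Hill-type estimator) and maps it to a per-layer learning-rate multiplier. This is a minimal sketch assuming PyTorch; the estimator, the linear alpha-to-LR mapping, and all names (`hill_alpha`, `balanced_lrs`, `k_frac`, `lr_range`) are illustrative assumptions, not the paper's exact TempBalance algorithm.

```python
# Illustrative HT-SR-style layer-wise learning-rate balancing.
# The Hill estimator and linear alpha-to-LR mapping are common choices in
# the HT-SR literature; the published TempBalance method may differ.
import torch


def hill_alpha(weight: torch.Tensor, k_frac: float = 0.1) -> float:
    """Estimate the power-law exponent alpha of the empirical spectral
    density of W^T W using a Hill-type estimator on the top-k eigenvalues."""
    W = weight.detach().flatten(1) if weight.dim() > 2 else weight.detach()
    eigs = torch.linalg.svdvals(W) ** 2  # eigenvalues of W^T W, descending
    k = max(2, min(len(eigs), int(len(eigs) * k_frac)))  # tail size of fit
    tail = eigs[:k]
    # Hill / power-law MLE: alpha = 1 + k / sum(log(lambda_i / lambda_min))
    return 1.0 + k / torch.log(tail / tail[k - 1]).sum().item()


def balanced_lrs(model: torch.nn.Module, base_lr: float,
                 lr_range: tuple = (0.5, 1.5)) -> list:
    """Build optimizer parameter groups whose learning rates (temperatures)
    are scaled by each layer's alpha, normalized across layers: layers with
    larger alpha (less heavy-tailed, presumed less self-regularized) get a
    higher temperature under this mapping."""
    layers = [(n, p) for n, p in model.named_parameters() if p.dim() >= 2]
    alphas = [hill_alpha(p) for _, p in layers]
    lo, hi = min(alphas), max(alphas)
    groups = []
    for (name, p), a in zip(layers, alphas):
        t = 0.5 if hi == lo else (a - lo) / (hi - lo)  # alpha in [0, 1]
        scale = lr_range[0] + t * (lr_range[1] - lr_range[0])
        groups.append({"params": [p], "lr": base_lr * scale})
    return groups
```

The returned parameter groups can be passed directly to a standard optimizer, e.g. `torch.optim.SGD(balanced_lrs(model, 0.1), momentum=0.9)`, and the multipliers can be recomputed periodically during training as the layer spectra evolve.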