On Warm-Starting Neural Network Training

Dec-23-2025, 21:31:33 GMT–Neural Information Processing Systems

In many real-world deployments of machine learning systems, data arrive piecemeal. These learning scenarios may be passive, where data arrive incrementally due to structural properties of the problem (e.g., daily financial data), or active, where samples are selected according to a measure of their quality (e.g., experimental design). In both cases, we are building a sequence of models that incorporate an increasing amount of data. We would like each model in the sequence to be performant and to take advantage of all the data available up to that point. Conventional intuition suggests that when solving a sequence of related optimization problems of this form, it should be possible to initialize using the solution of the previous iterate, that is, to "warm start" the optimization rather than initialize from scratch, and see reductions in wall-clock time. However, in practice this warm-starting seems to yield poorer generalization performance than models that have fresh random initializations, even though the final training losses are similar.
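
To make the two initialization strategies concrete, here is a minimal sketch, assuming a toy logistic regression trained by full-batch gradient descent on synthetic data. Everything in it (the `train` helper, `TRUE_W`, the learning rate, the L2 penalty, and the batch sizes) is a hypothetical choice for illustration and is not taken from the paper. Because this toy problem is convex, both runs reach essentially the same solution, so it only shows the mechanics of warm-starting versus fresh initialization; it does not reproduce the generalization gap reported for neural networks.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(data, init, lr=0.5, l2=1e-2, tol=1e-4, max_steps=10_000):
    """Full-batch gradient descent on L2-regularized logistic loss.

    Returns the final weights and the number of update steps taken.
    """
    w = list(init)
    n = len(data)
    for step in range(max_steps):
        grad = [l2 * wi for wi in w]
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        if max(abs(g) for g in grad) < tol:
            return w, step
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w, max_steps


rng = random.Random(0)
TRUE_W = [0.5, 2.0, -1.0]  # hypothetical data-generating weights (bias first)


def sample(n):
    # Draw n points with a constant bias feature and noisy binary labels.
    rows = []
    for _ in range(n):
        x = [1.0, rng.gauss(0, 1), rng.gauss(0, 1)]
        p = sigmoid(sum(w * xi for w, xi in zip(TRUE_W, x)))
        rows.append((x, 1.0 if rng.random() < p else 0.0))
    return rows


# Data arrive piecemeal: a first batch, then a second batch later.
first_batch = sample(100)
all_data = first_batch + sample(100)

# Model trained when only the first batch was available.
w_prev, _ = train(first_batch, [0.0, 0.0, 0.0])

# Warm start: continue from the previous solution on the enlarged dataset.
w_warm, warm_steps = train(all_data, w_prev)

# Fresh start: small random initialization on the same enlarged dataset.
w_cold, cold_steps = train(all_data, [rng.gauss(0, 0.1) for _ in range(3)])

print("warm-start steps:", warm_steps)
print("fresh-start steps:", cold_steps)
```

In a convex setting like this one, the warm-started run typically needs fewer steps because it begins close to the new optimum. The abstract's observation is that for neural networks this saving can come with worse generalization, even when final training losses are similar.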


