
Large-scale L-BFGS using MapReduce

Weizhu Chen, Zhenghao Wang, Jingren Zhou

Neural Information Processing Systems

L-BFGS has been applied as an effective parameter estimation method for various machine learning algorithms since the 1980s. With an increasing demand to deal with massive instances and variables, it is important to scale up and parallelize L-BFGS effectively in a distributed system. In this paper, we study the problem of parallelizing the L-BFGS algorithm in large clusters of tens of thousands of shared-nothing commodity machines. First, we show that a naive implementation of L-BFGS using MapReduce requires either a significant amount of memory or a large number of map-reduce steps with a negative performance impact. Second, we propose a new L-BFGS algorithm, called Vector-free L-BFGS, which avoids the expensive dot product operations in the two-loop recursion and greatly improves computation efficiency with a high degree of parallelism. The algorithm scales very well and enables a variety of machine learning algorithms to handle a massive number of variables over large datasets. We prove the mathematical equivalence of the new Vector-free L-BFGS and demonstrate its excellent performance and scalability using real-world machine learning problems with billions of variables in production clusters.
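For context, the classic two-loop recursion the abstract refers to can be sketched as below. This is the standard textbook formulation (not the paper's vector-free variant); every loop iteration performs dot products over full-length vectors, which is exactly the cost the paper's reformulation avoids.

```python
import numpy as np

def two_loop_recursion(grad, s_list, y_list):
    """Classic L-BFGS two-loop recursion.

    Computes the search direction -H_k @ grad from the m most recent
    curvature pairs s_i = x_{i+1} - x_i, y_i = g_{i+1} - g_i. Both
    loops require dot products over full-length vectors, which
    Vector-free L-BFGS reformulates for distributed execution.
    """
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest curvature pair to oldest.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        alpha = rho * np.dot(s, q)
        alphas.append(alpha)
        q -= alpha * y
    # Initial Hessian scaling: gamma = (s_k . y_k) / (y_k . y_k).
    s, y = s_list[-1], y_list[-1]
    r = (np.dot(s, y) / np.dot(y, y)) * q
    # Second loop: oldest curvature pair to newest.
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos), reversed(alphas)):
        beta = rho * np.dot(y, r)
        r += (alpha - beta) * s
    return -r  # descent direction
```

On a quadratic with identity Hessian (where y = s), the recursion recovers the steepest-descent/Newton direction -grad.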



Practical Quasi-Newton Methods for Training Deep Neural Networks

Neural Information Processing Systems

In our proposed methods, we approximate the Hessian by a block-diagonal matrix and use the structure of the gradient and Hessian to further approximate these blocks, each of which corresponds to a layer, as the Kronecker product of two much smaller matrices.
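The practical payoff of the Kronecker-product structure is that a per-layer Hessian block never has to be materialized. A minimal sketch of the underlying vec identity (assumed generic shapes, not the paper's exact factor construction):

```python
import numpy as np

# For a layer whose Hessian block of size (m*n x m*n) is approximated
# as kron(A, B) with A (n x n) and B (m x m), a Hessian-vector product
# uses the identity (with column-major vec):
#   kron(A, B) @ vec(X) == vec(B @ X @ A.T)

def kron_matvec(A, B, v):
    """Apply kron(A, B) to v without forming the Kronecker product."""
    n, m = A.shape[0], B.shape[0]
    X = v.reshape(m, n, order="F")        # invert column-major vec
    return (B @ X @ A.T).reshape(-1, order="F")
```

This reduces the cost of one block matvec from O(m^2 n^2) to O(mn(m + n)), which is what makes block-diagonal Kronecker-factored approximations tractable for large layers.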




c4ede56bbd98819ae6112b20ac6bf145-AuthorFeedback.pdf

Neural Information Processing Systems

Author Response for: "Inverting Gradients - How easy is it to break privacy in federated learning?" We thank all reviewers for their valuable feedback and interest in this attack. Some questions arose about the theoretical analysis for fully connected layers. Finally, knowledge of the feature representation already enables attacks such as that of Melis et al. This non-uniformity is a significant result for the privacy of gradient batches. Fig. 4 of [35] looks better because the attack scenario there is easier.


PREIG: Physics-informed and Reinforcement-driven Interpretable GRU for Commodity Demand Forecasting

Ma, Hongwei, Gao, Junbin, Tran, Minh-Ngoc

arXiv.org Artificial Intelligence

Accurately forecasting commodity demand remains a critical challenge due to volatile market dynamics, nonlinear dependencies, and the need for economically consistent predictions. This paper introduces PREIG (Physics-informed and Reinforcement-driven Interpretable GRU), a novel deep learning framework tailored for commodity demand forecasting. A domain constraint is enforced through a customized loss function that penalizes violations of the physical rule, ensuring that model predictions remain interpretable and aligned with economic theory. To further enhance predictive performance and stability, PREIG incorporates a hybrid optimization strategy that couples NAdam and L-BFGS with Population-Based Training (POP), a reinforcement-learning-inspired mechanism that dynamically tunes hyperparameters via evolutionary exploration and exploitation. Experiments across multiple commodity datasets demonstrate that PREIG significantly outperforms traditional econometric models (ARIMA, GARCH) and deep learning baselines (BPNN, RNN) in both RMSE and MAPE. Compared with a plain GRU, PREIG maintains good explainability while still performing well in prediction. By bridging domain knowledge, optimization theory, and deep learning, PREIG provides a robust, interpretable, and scalable solution for high-dimensional nonlinear time series forecasting in economics.
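The penalty-based loss described in the abstract can be illustrated as below. This is a hedged sketch only: the abstract does not spell out the exact economic rule, so the example assumes the law of demand (demand should not increase with price) and a hypothetical penalty weight `lam`; the paper's actual formulation may differ.

```python
import numpy as np

def physics_penalized_loss(y_pred, y_true, d_demand_d_price, lam=1.0):
    """Illustrative physics-informed loss in the spirit of PREIG.

    Assumes the enforced rule is the law of demand: the model's
    demand sensitivity to price should be non-positive. Positive
    sensitivities are treated as rule violations and penalized.
    """
    mse = np.mean((y_pred - y_true) ** 2)
    # Penalize only positive price sensitivities (rule violations).
    violation = np.mean(np.maximum(d_demand_d_price, 0.0))
    return mse + lam * violation
```

With this structure, a prediction that fits the data but implies demand rising with price still incurs a nonzero loss, steering training toward economically consistent solutions.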