AITopics | multi-batch l-bfg method

A Multi-Batch L-BFGS Method for Machine Learning

Neural Information Processing Systems, Nov-21-2025, 15:03:21 GMT

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.
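To make the role of the gradient differences concrete, here is a minimal sketch of the standard L-BFGS two-loop recursion, which turns stored curvature pairs (s, y) into a search direction. It is an illustration only, not the authors' implementation; the names `dot` and `lbfgs_direction` are hypothetical, and the initial scaling `gamma` is the common textbook choice, assumed here rather than taken from the abstract.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def lbfgs_direction(grad, pairs):
    """Return the L-BFGS search direction -H * grad.

    `pairs` is a list of curvature pairs (s, y), oldest first, where
    s is a difference of iterates and y is a difference of gradients.
    Every pair is assumed to satisfy dot(s, y) > 0.
    """
    q = list(grad)
    history = []
    # First loop: newest pair to oldest.
    for s, y in reversed(pairs):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        history.append((alpha, rho))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Initial Hessian approximation: a scaled identity (assumed choice).
    if pairs:
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest.
    for (s, y), (alpha, rho) in zip(pairs, reversed(history)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]
```

Each y enters through the quotient 1 / dot(y, s), so if the two gradients behind y are evaluated on different data points, sampling noise can make dot(y, s) small or negative and distort the direction. This is the instability the abstract refers to.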

machine learning, multi-batch l-bfg method, name change, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)

A Multi-Batch L-BFGS Method for Machine Learning

Albert S. Berahas, Jorge Nocedal, Martin Takac

Neural Information Processing Systems, Nov-21-2025, 07:59:08 GMT

In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration.

artificial intelligence, iteration, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

A Multi-Batch L-BFGS Method for Machine Learning

Neural Information Processing Systems, Feb-11-2025, 19:40:14 GMT

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.

iteration, machine learning, multi-batch l-bfg method

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

Reviews: A Multi-Batch L-BFGS Method for Machine Learning

Neural Information Processing Systems, Jan-20-2025, 15:44:50 GMT

In supervised learning, one is interested in minimizing the empirical risk, for which efficient optimization algorithms are key. First-order methods such as stochastic gradient descent and its variants are reasonably well understood and admit efficient implementation and parallelization techniques. However, there has been recent interest in making second-order methods such as Newton's method or the L-BFGS method efficient for such large-scale problems. This paper follows this direction, presenting a new variant of the stochastic L-BFGS method that is efficient and robust in two main settings: the first arises in the presence of node failures in a distributed computing environment; the second occurs when one uses an adaptive batch size that varies over iterations to accelerate learning. The main idea is to form the Hessian estimate based on the overlap between consecutive batches (the intuition for why this works is that there is more freedom in choosing the second-order information matrix than in choosing an estimate of the true gradient).
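The overlap idea described in the review can be sketched as follows: the gradient difference y is computed only on the samples shared by two consecutive batches, so both gradients are evaluated on the same data. This is a minimal illustration under stated assumptions, not the paper's code. The per-sample least-squares loss 0.5 * (a·w - b)^2 and the names `batch_gradient` and `overlap_curvature_pair` are hypothetical choices made for the example.

```python
def batch_gradient(w, data, indices):
    """Average gradient of 0.5 * (a.w - b)^2 over the samples in `indices`.

    `data` is a list of (a, b) pairs, with `a` a list of floats.
    """
    g = [0.0] * len(w)
    for i in indices:
        a, b = data[i]
        residual = sum(aj * wj for aj, wj in zip(a, w)) - b
        g = [gj + residual * aj for gj, aj in zip(g, a)]
    return [gj / len(indices) for gj in g]


def overlap_curvature_pair(w_old, w_new, data, batch_old, batch_new):
    """Build a curvature pair (s, y) from the overlap of two batches.

    Both gradients in y use the same overlap samples, so y reflects
    the change in the iterate rather than the change in the data.
    Returns None when the batches share no samples.
    """
    overlap = sorted(set(batch_old) & set(batch_new))
    if not overlap:
        return None
    s = [n - o for n, o in zip(w_new, w_old)]
    g_old = batch_gradient(w_old, data, overlap)
    g_new = batch_gradient(w_new, data, overlap)
    y = [gn - go for gn, go in zip(g_new, g_old)]
    return s, y
```

For this quadratic loss, the inner product of s and y equals the average of (a·s)^2 over the overlap, so it is never negative. That is the property a stable quasi-Newton update relies on.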

machine learning, multi-batch l-bfg method, review, (3 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.59)

A Multi-Batch L-BFGS Method for Machine Learning

Neural Information Processing Systems, Mar-12-2024, 14:29:50 GMT

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.

artificial intelligence, iteration, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
North America > United States > Pennsylvania > Northampton County > Bethlehem (0.04)
North America > United States > New York (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

A Multi-Batch L-BFGS Method for Machine Learning

Berahas, Albert S., Nocedal, Jorge, Takac, Martin

Neural Information Processing Systems, Feb-14-2020, 07:42:29 GMT

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.

iteration, machine learning, multi-batch l-bfg method

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.66)

A Robust Multi-Batch L-BFGS Method for Machine Learning

Berahas, Albert S., Takáč, Martin

arXiv.org Machine Learning, Jul-26-2017

This paper describes an implementation of the L-BFGS method designed to deal with two adversarial situations. The first occurs in distributed computing environments where some of the computational nodes devoted to the evaluation of the function and gradient are unable to return results on time. A similar challenge occurs in a multi-batch approach in which the data points used to compute function and gradients are purposely changed at each iteration to accelerate the learning process. Difficulties arise because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the updating process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, studies the convergence properties for both convex and nonconvex functions, and illustrates the behavior of the algorithm in a distributed computing platform on binary classification logistic regression and neural network training problems that arise in machine learning.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Machine Learning

1707.08552

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.33)

A Multi-Batch L-BFGS Method for Machine Learning

Berahas, Albert S., Nocedal, Jorge, Takac, Martin

Neural Information Processing Systems, Dec-31-2016

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.

gradient, iteration, l-bfg method, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Evanston (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)

A Multi-Batch L-BFGS Method for Machine Learning

Berahas, Albert S., Nocedal, Jorge, Takáč, Martin

arXiv.org Machine Learning, Oct-23-2016

The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature. In this paper, we focus instead on batch methods that use a sizeable fraction of the training set at each iteration to facilitate parallelism, and that employ second-order information. In order to improve the learning process, we follow a multi-batch approach in which the batch changes at each iteration. This can cause difficulties because L-BFGS employs gradient differences to update the Hessian approximations, and when these gradients are computed using different data points the process can be unstable. This paper shows how to perform stable quasi-Newton updating in the multi-batch setting, illustrates the behavior of the algorithm in a distributed computing platform, and studies its convergence properties for both the convex and nonconvex cases.

artificial intelligence, machine learning, optimization problem, (14 more...)

arXiv.org Machine Learning

1605.06049

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)
