Optimal Brain Surgeon
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
The rising footprint of machine learning has led to a focus on imposing model sparsity as a means of reducing computational and memory costs. For deep neural networks (DNNs), the state-of-the-art accuracy-versus-sparsity trade-off is achieved by heuristics inspired by the classical Optimal Brain Surgeon (OBS) framework [LeCun et al., 1989, Hassibi and Stork, 1992, Hassibi et al., 1993], which leverages loss curvature information to make better pruning decisions. Yet these results still lack a solid theoretical understanding, and it is unclear whether they can be improved by leveraging connections to the wealth of work on sparse recovery algorithms. In this paper, we draw new connections between these two areas and present new sparse recovery algorithms inspired by the OBS framework that come with theoretical guarantees under reasonable assumptions and have strong practical performance. Specifically, our work starts from the observation that curvature information can be leveraged, in OBS-like fashion, within the projection step of classic iterative sparse recovery algorithms such as iterative hard thresholding (IHT). We show for the first time that this leads both to improved convergence bounds in well-behaved settings and to stronger practical convergence.
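To make the connection concrete, here is a minimal, hypothetical NumPy sketch of one IHT step whose projection ranks coordinates by OBS-style curvature saliency instead of raw magnitude; the function name iht_obs_step and the fully corrective re-fit on the kept support are illustrative choices, not necessarily the paper's exact algorithm.

```python
import numpy as np

def iht_obs_step(w, A, b, k):
    """One step of IHT on f(w) = 0.5*||A w - b||^2, where the usual
    magnitude-based hard threshold is replaced by an OBS-style
    projection: score coordinates by second-order saliency, keep the
    k costliest to remove, then re-fit them on the kept support.
    (Hypothetical sketch, not the paper's exact method.)"""
    H = A.T @ A                                    # Hessian of the quadratic loss
    H_inv = np.linalg.inv(H + 1e-6 * np.eye(H.shape[0]))
    grad = A.T @ (A @ w - b)
    z = w - grad / np.linalg.norm(H, 2)            # gradient step with a safe step size
    saliency = z**2 / (2 * np.diag(H_inv))         # OBS saliency per coordinate
    keep = np.argsort(saliency)[-k:]               # k weights most expensive to prune
    w_next = np.zeros_like(w)
    w_next[keep] = np.linalg.lstsq(A[:, keep], b, rcond=None)[0]
    return w_next
```

The least-squares re-fit plays the role of the OBS compensation update: after choosing the support, the surviving weights absorb the effect of the pruned ones.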
Reviews: Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon
Summary: This paper adapts the Optimal Brain Surgeon (OBS) method to a layer-local version, modifying the objective function to be the target activation of each layer. Like OBS, it uses an approximation to compute the Hessian inverse by running through the dataset once. Compared to prior methods, it finishes compression with far fewer retraining iterations. A theoretical bound on the total error, based on the local reconstruction error, is provided. Pros: - The paper explores a local version of OBS and shows the effectiveness of the proposed method in terms of reduced time cost for retraining the pruned network.
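The key reason the layer-wise objective is tractable: for a least-squares reconstruction of a layer's outputs, the Hessian with respect to each row of the weight matrix is just the second moment of the layer's inputs, so a single pass over the calibration data suffices to estimate it. A minimal sketch, assuming a linear layer (the damping term and function name are our additions):

```python
import numpy as np

def layerwise_hessian_inverse(inputs, damp=1e-4):
    """For a layer-wise reconstruction loss 0.5*||W x - W_hat x||^2,
    the Hessian w.r.t. each row of W_hat is H = E[x x^T], estimated
    in one pass over the data.  (Sketch; the paper additionally uses
    a recursive inverse, as in OBS.)"""
    d = inputs.shape[1]
    H = np.zeros((d, d))
    for x in inputs:                   # single pass over the dataset
        H += np.outer(x, x)
    H /= len(inputs)
    H += damp * np.eye(d)              # damping keeps the inverse stable
    return np.linalg.inv(H)
```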
Second order derivatives for network pruning: Optimal Brain Surgeon
We investigate the use of information from all second order derivatives of the error function to perform network pruning (i.e., removing unimportant weights from a trained network) in order to improve generalization, simplify networks, reduce hardware or storage requirements, increase the speed of further training, and in some cases enable rule extraction. Our method, Optimal Brain Surgeon (OBS), is significantly better than magnitude-based methods and Optimal Brain Damage [LeCun, Denker and Solla, 1990], which often remove the wrong weights. OBS permits the pruning of more weights than other methods (for the same error on the training set), and thus yields better generalization on test data. Crucial to OBS is a recursion relation for calculating the inverse Hessian matrix H^-1 from training data and structural information of the net. OBS permits a 90%, a 76%, and a 62% reduction in weights over backpropagation with weight decay on three benchmark MONK's problems [Thrun et al., 1991].
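The recursion mentioned above approximates the Hessian by a running sum of per-sample gradient outer products and keeps its inverse updated with the Sherman-Morrison identity, so no explicit matrix inversion is needed. A sketch of that recursion and of the resulting prune-and-compensate step (variable names are ours; the initialization follows the paper's small-alpha convention H_0 = alpha*I):

```python
import numpy as np

def obs_inverse_hessian(grads, alpha=1e-4):
    """Recursive inverse-Hessian estimate: with H_{m+1} = H_m + x x^T / P,
    Sherman-Morrison gives
    H^-1 <- H^-1 - (H^-1 x)(H^-1 x)^T / (P + x^T H^-1 x)."""
    P = len(grads)
    H_inv = np.eye(grads[0].size) / alpha     # inverse of H_0 = alpha*I
    for x in grads:
        Hx = H_inv @ x
        H_inv -= np.outer(Hx, Hx) / (P + x @ Hx)
    return H_inv

def obs_prune_one(w, H_inv):
    """One OBS step: remove the weight with the smallest saliency
    L_q = w_q^2 / (2 [H^-1]_qq) and compensate the remaining weights
    via delta_w = -(w_q / [H^-1]_qq) * H^-1 e_q."""
    sal = w**2 / (2 * np.diag(H_inv))
    q = int(np.argmin(sal))
    w = w - (w[q] / H_inv[q, q]) * H_inv[:, q]
    w[q] = 0.0                                 # exact zero despite rounding
    return w, q
```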
Optimal Brain Surgeon: Extensions and performance comparisons
We extend Optimal Brain Surgeon (OBS) - a method for pruning networks - to allow for general error measures, and explore a reduced computational and storage implementation via a dominant eigenspace decomposition. Simulations on nonlinear, noisy pattern classification problems reveal that OBS does lead to improved generalization, and performs favorably in comparison with Optimal Brain Damage (OBD). We find that the required retraining steps in OBD may lead to inferior generalization, a result that can be interpreted as due to injecting noise back into the system. A common technique is to stop training of a large network at the minimum validation error. We found that the test error could be reduced even further by means of OBS (but not OBD) pruning.
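One way such an eigenspace decomposition can cut storage: keep only the top-k eigenpairs of the Hessian exactly and treat the discarded spectrum as flat, so only a d-by-k basis, k eigenvalues, and one scalar need to be stored. A sketch of the general idea; the paper's exact construction may differ.

```python
import numpy as np

def dominant_eigenspace_inverse(H, k):
    """Approximate H^-1 from the dominant eigenspace: if
    H ~ V diag(lam) V^T + mu (I - V V^T), then
    H^-1 ~ V diag(1/lam) V^T + (I - V V^T) / mu."""
    vals, vecs = np.linalg.eigh(H)            # ascending eigenvalues
    V, lam = vecs[:, -k:], vals[-k:]          # top-k eigenpairs
    mu = vals[:-k].mean()                     # flat value for the rest
    I = np.eye(H.shape[0])
    return V @ np.diag(1.0 / lam) @ V.T + (I - V @ V.T) / mu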
Recurrent Networks: Second Order Properties and Pruning
Second order properties of cost functions for recurrent networks are investigated. We analyze a layered fully recurrent architecture; the virtue of this architecture is that it features the conventional feedforward architecture as a special case. A detailed description of recursive computation of the full Hessian of the network cost function is provided. We discuss the possibility of invoking simplifying approximations of the Hessian and show how weight decay irons the cost function and thereby greatly assists training. We present tentative pruning results, using Hassibi et al.'s Optimal Brain Surgeon, demonstrating that recurrent networks can construct an efficient internal memory.
Early Brain Damage
Tresp, Volker, Neuneier, Ralph, Zimmermann, Hans-Georg
Optimal Brain Damage (OBD) is a method for reducing the number of weights in a neural network. OBD estimates the increase in cost function if weights are pruned, and is a valid approximation if the learning algorithm has converged to a local minimum. On the other hand, it is often desirable to terminate the learning process before a local minimum is reached (early stopping). In this paper we show that OBD estimates the increase in cost function incorrectly if the network is not in a local minimum. We also show how OBD can be extended such that it can be used in connection with early stopping. We call this new approach Early Brain Damage, EBD. EBD also allows reviving already pruned weights. We demonstrate the improvements achieved by EBD using three publicly available data sets.
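The distinction is easy to state: pruning weight w_q sets delta_w_q = -w_q, so to second order the cost change is delta_E ~ g_q * delta_w_q + 0.5 * h_qq * delta_w_q^2. OBD keeps only the quadratic term, which is valid exactly when the gradient g_q is zero. A minimal illustration (the off-minimum form is the generic Taylor estimate, not necessarily EBD's final formula):

```python
def obd_saliency(w_q, h_qq):
    """OBD estimate: assumes the gradient vanishes (network at a
    local minimum), so pruning w_q costs only the quadratic term."""
    return 0.5 * h_qq * w_q**2

def off_minimum_saliency(w_q, g_q, h_qq):
    """With early stopping, g_q is generally nonzero, so the
    first-order term must be kept; dropping it is exactly the error
    the paper identifies in OBD."""
    return -g_q * w_q + 0.5 * h_qq * w_q**2
```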