AITopics | Pearlmutter, Barak A.

Collaborating Authors

Pearlmutter, Barak A.

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience Report

Bukhari, Syed Ali Asadullah, Flinkow, Thomas, Inkarbekov, Medet, Pearlmutter, Barak A., Monahan, Rosemary

arXiv.org Artificial IntelligenceNov-21-2024

The increased reliance of self-driving vehicles on neural networks opens up the challenge of their verification. In this paper we present an experience report, describing a case study which we undertook to explore the design and training of a neural network on a custom dataset for vision-based autonomous navigation. We are particularly interested in the use of machine learning with differentiable logics to obtain networks satisfying basic safety properties by design, guaranteeing the behaviour of the neural network after training. We motivate the choice of a suitable neural network verifier for our purposes and report our observations on the use of neural network verifiers for self-driving systems.

artificial intelligence, machine learning, neural network, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.4204/EPTCS.411.12

2411.14163

Country:

North America > United States (0.46)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.50)

Industry:

Information Technology (1.00)
Transportation > Ground > Road (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Low-Rank Learning by Design: the Role of Network Architecture and Activation Linearity in Gradient Rank Collapse

Baker, Bradley T., Pearlmutter, Barak A., Miller, Robyn, Calhoun, Vince D., Plis, Sergey M.

arXiv.org Artificial IntelligenceFeb-9-2024

Our understanding of learning dynamics of deep neural networks (DNNs) remains incomplete. Recent research has begun to uncover the mathematical principles underlying these networks, including the phenomenon of "Neural Collapse", where linear classifiers within DNNs converge to specific geometrical structures during late-stage training. However, the role of geometric constraints in learning extends beyond this terminal phase. For instance, gradients in fully-connected layers naturally develop a low-rank structure due to the accumulation of rank-one outer products over a training batch. Despite the attention given to methods that exploit this structure for memory saving or regularization, the emergence of low-rank learning as an inherent aspect of certain DNN architectures has been under-explored. In this paper, we conduct a comprehensive study of gradient rank in DNNs, examining how architectural choices and structure of the data effect gradient rank bounds. Our theoretical analysis provides these bounds for training fully-connected, recurrent, and convolutional neural networks. We also demonstrate, both theoretically and empirically, how design choices like activation function linearity, bottleneck layer introduction, convolutional stride, and sequence truncation influence these bounds. Our findings not only contribute to the understanding of learning dynamics in DNNs, but also provide practical guidance for deep learning engineers to make informed design decisions.

artificial intelligence, machine learning, singular value, (17 more...)

arXiv.org Artificial Intelligence

2402.06751

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Comparing Differentiable Logics for Learning Systems: A Research Preview

Flinkow, Thomas, Pearlmutter, Barak A., Monahan, Rosemary

arXiv.org Artificial IntelligenceNov-16-2023

Extensive research on formal verification of machine learning (ML) systems indicates that learning from data alone often fails to capture underlying background knowledge. A variety of verifiers have been developed to ensure that a machine-learnt model satisfies correctness and safety properties, however, these verifiers typically assume a trained network with fixed weights. ML-enabled autonomous systems are required to not only detect incorrect predictions, but should also possess the ability to self-correct, continuously improving and adapting. A promising approach for creating ML models that inherently satisfy constraints is to encode background knowledge as logical constraints that guide the learning process via so-called differentiable logics. In this research preview, we compare and evaluate various logics from the literature in weakly-supervised contexts, presenting our findings and highlighting open problems for future work. Our experimental results are broadly consistent with results reported previously in literature; however, learning with differentiable logics introduces a new hyperparameter that is difficult to tune and has significant influence on the effectiveness of the logics.

artificial intelligence, constraint, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.4204/EPTCS.395.3

2311.09809

Country: Europe > Spain (0.14)

Genre: Research Report > New Finding (0.49)

Industry: Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.91)

Add feedback

Gradients without Backpropagation

Baydin, Atılım Güneş, Pearlmutter, Barak A., Syme, Don, Wood, Frank, Torr, Philip

arXiv.org Machine LearningFeb-17-2022

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic differentiation algorithms that also includes the forward mode. We present a method to compute gradients based solely on the directional derivative that one can compute exactly and efficiently via the forward mode. We call this formulation the forward gradient, an unbiased estimate of the gradient that can be evaluated in a single forward run of the function, entirely eliminating the need for backpropagation in gradient descent. We demonstrate forward gradient descent in a range of problems, showing substantial savings in computation and enabling training up to twice as fast in some cases.

artificial intelligence, backpropagation, machine learning, (1 more...)

arXiv.org Machine Learning

2202.08587

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (1.00)

Add feedback

Neural Ordinary Differential Equation based Recurrent Neural Network Model

Habiba, Mansura, Pearlmutter, Barak A.

arXiv.org Machine LearningMay-19-2020

Neural differential equations are a promising new member in the neural network family. They show the potential of differential equations for time series data analysis. In this paper, the strength of the ordinary differential equation (ODE) is explored with a new extension. The main goal of this work is to answer the following questions: (i)~can ODE be used to redefine the existing neural network model? (ii)~can Neural ODEs solve the irregular sampling rate challenge of existing neural network models for a continuous time series, i.e., length and dynamic nature, (iii)~how to reduce the training and evaluation time of existing Neural ODE systems? This work leverages the mathematical foundation of ODEs to redesign traditional RNNs such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). The main contribution of this paper is to illustrate the design of two new ODE-based RNN models (GRU-ODE model and LSTM-ODE) which can compute the hidden state and cell state at any point of time using an ODE solver. These models reduce the computation overhead of hidden state and cell state by a vast amount. The performance evaluation of these two new models for learning continuous time series with irregular sampling rate is then demonstrated. Experiments show that these new ODE based RNN models require less training time than Latent ODEs and conventional Neural ODEs. They can achieve higher accuracy quickly, and the design of the neural network is simpler than, previous neural ODE systems.

deep learning, neural network, time series, (14 more...)

arXiv.org Machine Learning

2005.09807

Country: Europe > Ireland (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Tricks from Deep Learning

Baydin, Atılım Güneş, Pearlmutter, Barak A., Siskind, Jeffrey Mark

arXiv.org Machine LearningNov-10-2016

The deep learning community has devised a diverse set of methods to make gradient optimization, using large datasets, of large and highly complex models with deeply cascaded nonlinearities, practical. Taken as a whole, these methods constitute a breakthrough, allowing computational structures which are quite wide, very deep, and with an enormous number and variety of free parameters to be effectively optimized. The result now dominates much of practical machine learning, with applications in machine translation, computer vision, and speech recognition. Many of these methods, viewed through the lens of algorithmic differentiation (AD), can be seen as either addressing issues with the gradient itself, or finding ways of achieving increased efficiency using tricks that are AD-related, but not provided by current AD systems. The goal of this paper is to explain not just those methods of most relevance to AD, but also the technical constraints and mindset which led to their discovery. After explaining this context, we present a "laundry list" of methods developed by the deep learning community. Two of these are discussed in further mathematical detail: a way to dramatically reduce the size of the tape when performing reverse-mode AD on a (theoretically) time-reversible process like an ODE integrator; and a new mathematical insight that allows for the implementation of a stochastic Newton's method.

deep learning, gradient, neural network, (15 more...)

arXiv.org Machine Learning

1611.03777

Country: Europe > France (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Automatic Differentiation of Algorithms for Machine Learning

Baydin, Atilim Gunes, Pearlmutter, Barak A.

arXiv.org Machine LearningApr-28-2014

Automatic differentiation---the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately---dates to the origin of the computer age. Reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. Despite this, practitioners in a variety of fields, including machine learning, have been little influenced by automatic differentiation, and make scant use of available tools. Here we review the technique of automatic differentiation, describe its two main modes, and explain how it can benefit machine learning practitioners. To reach the widest possible audience our treatment assumes only elementary differential calculus, and does not assume any knowledge of linear algebra.

differentiation, neural network, survey article, (14 more...)

arXiv.org Machine Learning

1404.7456

Country: Europe > Germany (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron

Jun, Sung C., Pearlmutter, Barak A.

Neural Information Processing SystemsDec-31-2004

We describe a system that localizes a single dipole to reasonable accuracy from noisy magnetoencephalographic (MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signals and head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. The training dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm. A few iterations of a Levenberg-Marquardt routine using the MLP's output as its initial guess took 15 ms and improved the accuracy to 0.53 cm, only slightly above the statistical limits on accuracy imposed by the noise. We applied these methods to localize single dipole sources from MEG components isolated by blind source separation and compared the estimated locations to those generated by standard manually-assisted commercial software.

mlp, neural network, neurology, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback

Subject-Independent Magnetoencephalographic Source Localization by a Multilayer Perceptron

Jun, Sung C., Pearlmutter, Barak A.

Neural Information Processing SystemsDec-31-2004

We describe a system that localizes a single dipole to reasonable accuracy fromnoisy magnetoencephalographic (MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signalsand head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. Thetraining dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm. A few iterations of a Levenberg-Marquardt routine using the MLP's output as its initial guess took 15 ms and improved theaccuracy to 0.53 cm, only slightly above the statistical limits on accuracy imposed by the noise. We applied these methods to localize single dipole sources from MEG components isolated by blind source separation and compared the estimated locations to those generated by standard manually-assisted commercial software.

mlp, neural network, neurology, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Europe (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Add feedback

Blind Source Separation via Multinode Sparse Representation

Zibulevsky, Michael, Kisilev, Pavel, Zeevi, Yehoshua Y., Pearlmutter, Barak A.

Neural Information Processing SystemsDec-31-2002

We consider a problem of blind source separation from a set of instantaneous linear mixtures, where the mixing matrix is unknown. It was discovered recently, that exploiting the sparsity of sources in an appropriate representation according to some signal dictionary, dramatically improves the quality of separation. In this work we use the property of multi scale transforms, such as wavelet or wavelet packets, to decompose signals into sets of local features with various degrees of sparsity. We use this intrinsic property for selecting the best (most sparse) subsets of features for further separation. The performance of the algorithm is verified on noise-free and noisy data. Experiments with simulated signals, musical sounds and images demonstrate significant improvement of separation quality over previously reported results.

artificial intelligence, coefficient, data quality, (19 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.29)
North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Quality > Data Transformation (0.93)

Add feedback