AITopics | mvm

Collaborating Authors

mvm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra

Neural Information Processing SystemsOct-9-2025, 00:40:22 GMT

artificial intelligence, machine learning, programming language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Software > Programming Languages (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.66)

Add feedback

CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra

Potapczynski, Andres, Finzi, Marc, Pleiss, Geoff, Wilson, Andrew Gordon

arXiv.org Machine LearningNov-29-2023

Many areas of machine learning and science involve large linear algebra problems, such as eigendecompositions, solving linear systems, computing matrix exponentials, and trace estimation. The matrices involved often have Kronecker, convolutional, block diagonal, sum, or product structure. In this paper, we propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA (Compositional Linear Algebra). By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms. Moreover, CoLA provides memory efficient automatic differentiation, low precision computation, and GPU acceleration in both JAX and PyTorch, while also accommodating new objects, operations, and rules in downstream packages via multiple dispatch. CoLA can accelerate many algebraic operations, while making it easy to prototype matrix structures and algorithms, providing an appealing drop-in tool for virtually any computational effort that requires linear algebra. We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.

algorithm, matrix, operator, (17 more...)

arXiv.org Machine Learning

2309.0306

Country:

North America > United States > New York (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Training neural networks with end-to-end optical backpropagation

Spall, James, Guo, Xianxin, Lvovsky, A. I.

arXiv.org Artificial IntelligenceAug-9-2023

Optics is an exciting route for the next generation of computing hardware for machine learning, promising several orders of magnitude enhancement in both computational speed and energy efficiency. However, to reach the full capacity of an optical neural network it is necessary that the computing not only for the inference, but also for the training be implemented optically. The primary algorithm for training a neural network is backpropagation, in which the calculation is performed in the order opposite to the information flow for inference. While straightforward in a digital computer, optical implementation of backpropagation has so far remained elusive, particularly because of the conflicting requirements for the optical element that implements the nonlinear activation function. In this work, we address this challenge for the first time with a surprisingly simple and generic scheme. Saturable absorbers are employed for the role of the activation units, and the required properties are achieved through a pump-probe process, in which the forward propagating signal acts as the pump and backward as the probe. Our approach is adaptable to various analog platforms, materials, and network structures, and it demonstrates the possibility of constructing neural networks entirely reliant on analog optical processes for both training and inference tasks.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Artificial Intelligence

2308.05226

Country: Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.84)

Add feedback

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

Wan, Weier, Kubendran, Rajkumar, Schaefer, Clemens, Eryilmaz, S. Burc, Zhang, Wenqiang, Wu, Dabin, Deiss, Stephen, Raina, Priyanka, Qian, He, Gao, Bin, Joshi, Siddharth, Wu, Huaqiang, Wong, H. -S. Philip, Cauwenberghs, Gert

arXiv.org Artificial IntelligenceAug-17-2021

Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy-efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders of magnitude energy-efficiency improvement by performing computation directly within memory. However, conventional approaches to CIM hardware design limit its functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM - the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy-efficiency $5\times$ - $8\times$ better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks including accuracy of 99.0% on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

accuracy, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2108.07879

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
(5 more...)

Genre: Research Report > New Finding (0.93)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Hardware > Memory (0.84)

Add feedback

SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice for Scalable Gaussian Processes

Kapoor, Sanyam, Finzi, Marc, Wang, Ke Alexander, Wilson, Andrew Gordon

arXiv.org Machine LearningJun-12-2021

State-of-the-art methods for scalable Gaussian processes use iterative algorithms, requiring fast matrix vector multiplies (MVMs) with the covariance kernel. The Structured Kernel Interpolation (SKI) framework accelerates these MVMs by performing efficient MVMs on a grid and interpolating back to the original space. In this work, we develop a connection between SKI and the permutohedral lattice used for high-dimensional fast bilateral filtering. Using a sparse simplicial grid instead of a dense rectangular one, we can perform GP inference exponentially faster in the dimension than SKI. Our approach, Simplex-GP, enables scaling SKI to high dimensions, while maintaining strong predictive performance. We additionally provide a CUDA implementation of Simplex-GP, which enables significant GPU acceleration of MVM based inference.

dataset, gaussian process, lattice, (12 more...)

arXiv.org Machine Learning

2106.06695

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Washington > King County > Bellevue (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(7 more...)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Sports > Skiing (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

One-step regression and classification with cross-point resistive memory arrays

#artificialintelligenceFeb-2-2020, 16:04:48 GMT

Resistive memories, also known as memristors (1), including resistive switching memory (RRAM) and phase-change memory (PCM), are emerging as a novel technology for high-density storage (2, 3), neuromorphic hardware (4, 5), and stochastic security primitives, such as random number generators (6, 7). Thanks to their ability to store analog values and to their excellent programming speed, resistive memories have also been demonstrated for executing in-memory computing (8–17), which eliminates the data transfer between the memory and the processing unit to improve the time and energy efficiency of computation. With a cross-point architecture, resistive memories can be naturally used to perform matrix-vector multiplication (MVM) by exploiting fundamental physical laws such as the Ohm's law and the Kirchhoff's law of electric circuits (8). Cross-point MVM has been shown to accelerate various data-intensive tasks, such as training and inference of deep neural networks (11–14), signal and image processing (15), and the iterative solution of a system of linear equations (16) or a differential equation (17). With a feedback circuit configuration, the cross-point array has been shown to solve systems of linear equations and calculate matrix eigenvectors in one step (18).

configuration, cross-point resistive memory array, one-step regression and classification, (4 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.57)

Add feedback

Multi-View Factorization Machines

Cao, Bokai, Zhou, Hucheng, Li, Guoqiang, Yu, Philip S.

arXiv.org Machine LearningMar-23-2018

For a learning task, data can usually be collected from different sources or be represented from multiple views. For example, laboratory results from different medical examinations are available for disease diagnosis, and each of them can only reflect the health state of a person from a particular aspect/view. Therefore, different views provide complementary information for learning tasks. An effective integration of the multi-view information is expected to facilitate the learning performance. In this paper, we propose a general predictor, named multi-view machines (MVMs), that can effectively include all the possible interactions between features from multiple views. A joint factorization is embedded for the full-order interaction parameters which allows parameter estimation under sparsity. Moreover, MVMs can work in conjunction with different loss functions for a variety of machine learning tasks. A stochastic gradient descent method is presented to learn the MVM model. We further illustrate the advantages of MVMs through comparison with other methods for multi-view classification, including support vector machines (SVMs), support tensor machines (STMs) and factorization machines (FMs).

artificial intelligence, interaction, machine learning, (19 more...)

arXiv.org Machine Learning

doi: 10.1145/2835776.2835777

1506.0111

Country:

North America > United States (0.68)
Asia (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback

Product Kernel Interpolation for Scalable Gaussian Processes

Gardner, Jacob R., Pleiss, Geoff, Wu, Ruihan, Weinberger, Kilian Q., Wilson, Andrew Gordon

arXiv.org Machine LearningFeb-24-2018

Recent work shows that inference for Gaussian processes can be performed efficiently using iterative methods that rely only on matrix-vector multiplications (MVMs). Structured Kernel Interpolation (SKI) exploits these techniques by deriving approximate kernels with very fast MVMs. Unfortunately, such strategies suffer badly from the curse of dimensionality. We develop a new technique for MVM based learning that exploits product kernel structure. We demonstrate that this technique is broadly applicable, resulting in linear rather than exponential runtime with dimension for SKI, as well as state-of-the-art asymptotic complexity for multi-task GPs.

artificial intelligence, kernel, machine learning, (15 more...)

arXiv.org Machine Learning

1802.08903

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

Large Linear Multi-output Gaussian Process Learning

Feinberg, Vladimir, Cheng, Li-Fang, Li, Kai, Engelhardt, Barbara E

arXiv.org Machine LearningOct-23-2017

Gaussian processes (GPs), or distributions over arbitrary functions in a continuous domain, can be generalized to the multi-output case: a linear model of coregionalization (LMC) is one approach. LMCs estimate and exploit correlations across the multiple outputs. While model estimation can be performed efficiently for single-output GPs, these assume stationarity, but in the multi-output case the cross-covariance interaction is not stationary. We propose Large Linear GP (LLGP), which circumvents the need for stationarity by inducing structure in the LMC kernel through a common grid of inputs shared between outputs, enabling optimization of GP hyperparameters for multi-dimensional outputs and low-dimensional inputs. When applied to synthetic two-dimensional and real time series data, we find our theoretical improvement relative to the current solutions for multi-output GPs is realized with LLGP reducing training time while improving or maintaining predictive mean accuracy. Moreover, by using a direct likelihood approximation rather than a variational one, model confidence estimates are significantly improved.

artificial intelligence, kernel, machine learning, (18 more...)

arXiv.org Machine Learning

1705.10813

Genre: Research Report (0.51)

Industry: Banking & Finance (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Cross-Validation Optimization for Large Scale Hierarchical Classification Kernel Methods

Seeger, Matthias

Neural Information Processing SystemsDec-31-2007

We propose a highly efficient framework for kernel multi-class models with a large and structured set of classes. Kernel parameters are learned automatically by maximizing the cross-validation log likelihood, and predictive probabilities are estimated. We demonstrate our approach on large scale text classification tasks with hierarchical class structure, achieving state-of-the-art results in an order of magnitude less time than previous work.

classification, hyperparameter, mvm, (15 more...)

Neural Information Processing Systems

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.61)

Add feedback