strassen
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
Anytime-Valid, Bayes-Assisted, Prediction-Powered Inference
Kilian, Valentin, Cortinovis, Stefano, Caron, François
Given a large pool of unlabelled data and a smaller number of labels, prediction-powered inference (PPI) leverages machine learning predictions to increase the statistical efficiency of standard confidence interval procedures based solely on labelled data, while preserving their fixed-time validity. In this paper, we extend the PPI framework to the sequential setting, where labelled and unlabelled datasets grow over time. Exploiting Ville's inequality and the method of mixtures, we propose prediction-powered confidence sequence procedures that are valid uniformly over time and naturally accommodate prior knowledge on the quality of the predictions to further boost efficiency. We carefully illustrate the design choices behind our method and demonstrate its effectiveness in real and synthetic examples.
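For intuition, here is a minimal NumPy sketch of the classical fixed-time prediction-powered point estimate of a mean, the building block that the paper's sequential, anytime-valid procedure extends. It is not the authors' confidence-sequence construction, and the name `ppi_mean_estimate` is my own.

```python
import numpy as np

def ppi_mean_estimate(y_lab, f_lab, f_unlab):
    """Fixed-time PPI point estimate of a mean (hypothetical helper).

    y_lab   : labels on the small labelled set
    f_lab   : model predictions on the same labelled set
    f_unlab : model predictions on the large unlabelled pool
    """
    # Predictions give a low-variance estimate on the big pool;
    # the labelled residuals correct the predictor's bias.
    return f_unlab.mean() + (y_lab - f_lab).mean()

rng = np.random.default_rng(0)
x_unlab = rng.normal(size=10_000)
x_lab = rng.normal(size=100)
f = lambda x: x + 0.1                           # an imperfect predictor
y_lab = x_lab + rng.normal(scale=0.5, size=100)

print(ppi_mean_estimate(y_lab, f(x_lab), f(x_unlab)))
```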
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > Middle East > Jordan (0.04)
$XX^{t}$ Can Be Faster
Rybin, Dmitry, Zhang, Yushun, Luo, Zhi-Quan
We present RXTX, a new algorithm for computing the product of a matrix with its transpose, $XX^{t}$ for $X\in \mathbb{R}^{n\times m}$. RXTX uses $5\%$ fewer multiplications and $5\%$ fewer operations (additions and multiplications) than state-of-the-art algorithms. Notably, the acceleration holds not only asymptotically for large matrices as $n \rightarrow \infty$, but also for small matrices, including $n = 4$. The algorithm was discovered by combining machine-learning-based search methods with combinatorial optimization.
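The abstract does not spell out the RXTX construction itself, so the sketch below only fixes notation: it computes $XX^{t}$ entry by entry and exploits the symmetry $XX^{t} = (XX^{t})^{t}$ that structured algorithms build on. `gram_naive` is an illustrative baseline, not RXTX.

```python
import numpy as np

def gram_naive(X):
    """Baseline X @ X.T, entry by entry. Not the RXTX algorithm:
    it only shows the operation and the symmetry (n(n+1)/2 of the
    n^2 output entries are distinct) that savings can come from."""
    n, m = X.shape
    G = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):                # upper triangle only
            G[i, j] = G[j, i] = X[i] @ X[j]  # one length-m dot product
    return G

X = np.random.default_rng(1).normal(size=(4, 6))
assert np.allclose(gram_naive(X), X @ X.T)
```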
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
Ailon, Nir, Bercovich, Akhiad, Weinstein, Omri
We propose a cheaper alternative bilinear operator to matrix multiplication in deep neural networks (DNNs). Unlike many stubborn attempts to accelerate MatMuls in DNN inference, this operator is supported by the capabilities of existing GPU hardware, most notably NVIDIA TensorCores. To our knowledge, this is the first GPU-native acceleration technique which does not decrease (in fact, increases) the number of trainable parameters of the network, mitigating the accuracy loss of compression-based techniques. Hence, this operator is at the same time more expressive than MatMul, yet requires substantially fewer FLOPs to evaluate. We term this new operator Strassen-Tile (STL). The main idea behind STL$(X,W)$ is a local change of basis (learnable encoder) on weight and activation tiles, after which we perform batched elementwise products between tiles, followed by a final decoding transformation (inspired by algebraic pipelines from fast matrix and polynomial multiplication). We compare STL against two benchmarks. The first is the SoTA T2T-ViT on ImageNet-1K, where replacing all linear layers with STL and training from scratch yields a 2.7x reduction in FLOPs with a 0.5% accuracy improvement. Our second speed-accuracy benchmark for pretrained LLMs is the most practical GPU-acceleration technique, 2:4 structured sparsity. Finetuning TinyLlama [tinyllama24] with STL layers on the SlimPajama dataset achieves accuracy similar to 2:4, with a 2.2x FLOP speedup versus 1.7x for the latter. Finally, we discuss a group-theoretic approach for discovering universal encoders for STL, which could lead to fast black-box acceleration via approximate matrix multiplication (AMM).
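Reading only the abstract, an STL-style tile product might look like the toy sketch below: encode each $t\times t$ tile with a linear map, multiply the encodings elementwise, and decode the accumulated result. The shapes, the rank $r$, and the function `stl_like` are my assumptions for illustration; the paper's actual operator, encoders, and training setup may differ.

```python
import numpy as np

def stl_like(X, W, EA, EB, D, t):
    """Toy reconstruction of a Strassen-Tile-style bilinear operator.
    EA, EB: (r, t*t) tile encoders; D: (t*t, r) decoder. With r < t**3
    each tile-pair product uses fewer multiplications than the t**3
    of exact tile MatMul; exactness is traded for learnability."""
    n, m = X.shape
    _, p = W.shape
    C = np.zeros((n, p))
    for i in range(0, n, t):
        for j in range(0, p, t):
            acc = np.zeros(t * t)
            for k in range(0, m, t):
                a = EA @ X[i:i+t, k:k+t].ravel()  # encode activation tile
                b = EB @ W[k:k+t, j:j+t].ravel()  # encode weight tile
                acc += D @ (a * b)                # r elementwise products
            C[i:i+t, j:j+t] = acc.reshape(t, t)
    return C

t, r = 2, 6                           # r < t**3 = 8
rng = np.random.default_rng(0)
EA, EB = rng.normal(size=(r, t*t)), rng.normal(size=(r, t*t))
D = rng.normal(size=(t*t, r))         # learnable in the paper; random here
X, W = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
print(stl_like(X, W, EA, EB, D, t).shape)  # (4, 4); a bilinear op, not exact MatMul
```

With $r = 7$, $t = 2$, and Strassen's classical encoders, this pipeline would reproduce exact tile multiplication, which is presumably the sense in which the authors cite fast matrix multiplication pipelines as inspiration.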
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (5 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Strassen Multisystolic Array Hardware Architectures
Pogue, Trevor E., Nicolici, Nicola
While Strassen's matrix multiplication algorithm reduces the complexity of naive matrix multiplication, general-purpose hardware is not suitable for achieving the algorithm's promised theoretical speedups. This leaves open the question of whether it could be better exploited in custom hardware architectures designed specifically for executing the algorithm. However, prior work on this is limited, and it is not immediately clear how to derive such architectures or whether they can ultimately lead to real improvements. We bridge this gap, presenting and evaluating new systolic array architectures that efficiently translate the theoretical complexity reductions of Strassen's algorithm directly into hardware resource savings. Furthermore, the architectures are multisystolic array designs that can multiply smaller matrices with higher utilization than single-systolic array designs. The proposed designs implemented on FPGA reduce DSP requirements by a factor of $1.14^r$ for $r$ implemented Strassen recursion levels, and otherwise require overall similar soft logic resources when instantiated to support matrix sizes down to 32x32 and 24x24 at 1-2 levels of Strassen recursion, respectively. We evaluate the proposed designs both in isolation and in an end-to-end machine learning accelerator, comparing against baseline designs and prior works and achieving state-of-the-art performance.
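A quick sanity check on the $1.14^r$ DSP factor, under the standard reading that each Strassen recursion level replaces 8 block multiplications with 7 (an assumption on my part; the paper's exact accounting may differ):

```python
# One Strassen level needs 7/8 of the multipliers, so r levels need
# (7/8)**r of them: a DSP saving of (8/7)**r, i.e. about 1.14**r.
for r in (1, 2, 3):
    print(f"r={r}: DSP reduction factor {(8/7)**r:.3f}")
# r=1: 1.143, r=2: 1.306, r=3: 1.493
```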
- North America > Canada > Ontario > Hamilton (0.28)
- Europe > United Kingdom > England > Hampshire > Southampton (0.04)
- Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.04)
Fast Matrix Multiplication Without Tears: A Constraint Programming Approach
Deza, Arnaud, Liu, Chang, Vaezipoor, Pashootan, Khalil, Elias B.
It is known that the multiplication of an $N \times M$ matrix with an $M \times P$ matrix can be performed using fewer multiplications than what the naive $NMP$ approach suggests. The most famous instance of this is Strassen's algorithm for multiplying two $2\times 2$ matrices in 7 instead of 8 multiplications. This gives rise to the constraint satisfaction problem of fast matrix multiplication, where a set of $R < NMP$ multiplication terms must be chosen and combined such that they satisfy correctness constraints on the output matrix. Despite its highly combinatorial nature, this problem has not been exhaustively examined from that perspective, as evidenced for example by the recent deep reinforcement learning approach of AlphaTensor. In this work, we propose a simple yet novel Constraint Programming approach to find non-commutative algorithms for fast matrix multiplication or provide proof of infeasibility otherwise. We propose a set of symmetry-breaking constraints and valid inequalities that are particularly helpful in proving infeasibility. On the feasible side, we find that exploiting solver performance variability in conjunction with a sparsity-based problem decomposition enables finding solutions for larger (feasible) instances of fast matrix multiplication. Our experimental results using CP Optimizer demonstrate that we can find fast matrix multiplication algorithms for matrices up to $3\times 3$ in a short amount of time.
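As a concrete instance of a scheme with $R = 7 < NMP = 8$ multiplication terms, the sketch below verifies Strassen's classical $2\times 2$ algorithm with matrix-valued (hence non-commutative) block entries. It is a plain correctness check for intuition, not the CP Optimizer model from the paper.

```python
import numpy as np

def strassen_2x2(A, B):
    """Strassen's 7-multiplication scheme for 2x2 block matrices:
    the kind of R = 7 solution the CP formulation searches for."""
    (A11, A12), (A21, A22) = A
    (B11, B12), (B21, B22) = B
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    return [[M1 + M4 - M5 + M7, M3 + M5],
            [M2 + M4, M1 - M2 + M3 + M6]]

rng = np.random.default_rng(2)
A = [[rng.normal(size=(3, 3)) for _ in range(2)] for _ in range(2)]
B = [[rng.normal(size=(3, 3)) for _ in range(2)] for _ in range(2)]
C = strassen_2x2(A, B)
for i in range(2):
    for j in range(2):
        ref = A[i][0] @ B[0][j] + A[i][1] @ B[1][j]
        assert np.allclose(C[i][j], ref)   # matches the naive block product
print("Strassen identity holds non-commutatively")
```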
Better Algorithms through Faster Math
Developing faster algorithms is an important but elusive goal for data scientists. The ability to accelerate complex computing tasks and reduce latency has far-reaching ramifications in areas such as natural language processing, video streaming, autonomous robotics, gaming, and extended reality. Yet for all the hype surrounding computer algorithms and the increasingly sophisticated ways they operate, a basic fact stands out: these algorithms are typically built atop matrix multiplication, a fundamental operation in linear algebra. That underlying mathematical framework has not changed a great deal since the inception of computing, and finding more efficient formulas has proved difficult. The issue is attracting growing attention, particularly as machine learning (ML), deep learning (DL), artificial intelligence (AI), and machine automation advance into the mainstream.
- Africa > Senegal > Kolda Region > Kolda (0.05)
- North America > United States > Oregon > Clackamas County > West Linn (0.04)
- North America > United States > New York (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Information Technology (0.48)
- Leisure & Entertainment > Games (0.30)