Relational Self-Attention: What's Missing in Attention for Video Understanding (Supplementary Material)

Neural Information Processing Systems

For the bottlenecks including RSA layers, we randomly initialize weights using MSRA initialization [3] and set the gamma parameter of the last batch normalization layer to zero. We implement our model based on TSN in PyTorch under the BSD 2-Clause license. All of the benchmarks we use are datasets commonly used for academic purposes. Unless specified otherwise, the training and testing details are the same as those in Sec. 5.1. Since each RSA kernel generated by each query captures a distinct motion pattern, the model can learn diverse motion features (see Figure 3). In this experiment, we choose L = 8 as the default.
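The initialization scheme described above can be sketched in plain NumPy. This is a hedged illustration of MSRA (He) initialization and the zero-gamma trick, not the paper's code; the names `msra_init` and `gamma_last_bn` are ours, and the fan sizes are arbitrary:

```python
import numpy as np

def msra_init(fan_in, fan_out, rng=None):
    """MSRA/He initialization: weights ~ N(0, sqrt(2 / fan_in)), suited to ReLU nets."""
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(fan_out, fan_in))

# Zero-initializing the scale (gamma) of a block's last batch-norm layer makes
# each residual branch start out as (approximately) the identity, which tends
# to stabilize early training of deep residual networks.
gamma_last_bn = np.zeros(64)

W = msra_init(fan_in=256, fan_out=64)  # sample std close to sqrt(2/256)
```

In PyTorch the same effect is obtained with `torch.nn.init.kaiming_normal_` on the weights and zeroing the final BatchNorm's `weight` parameter.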



Funnel-Transformer: Supplementary Material

Neural Information Processing Systems

For easier derivation, we have introduced the notation q_i. Sequence-level prediction: this is essentially the case we consider in most of our experiments, where we want to obtain a vectorial representation of the input sequence, as in text classification. Finally, although we focus our discussion on NLP tasks in this paper, Funnel-Transformer could be applied to any task dealing with sequential data, such as time series and video stream analysis. B.1 Preprocessing & Tokenization: for all experiments conducted in this work, we simply adapt the "uncased" word piece model originally used by BERT [2], where the vocabulary size is about 30K. Specifically, we find that training can be unstable when the depth goes beyond 24 layers (in the case of B10-10-10H1024) at base scale, especially for the MLM objective.
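Sequence-level prediction as described above collapses a matrix of per-token hidden states into one vector before the classifier head. A minimal NumPy sketch of the two common pooling choices (the function name and `pooling` argument are illustrative, not Funnel-Transformer's actual API):

```python
import numpy as np

def sequence_representation(hidden_states, pooling="first"):
    """Collapse a (seq_len, d_model) matrix of token states into one vector.

    'first' mimics BERT-style [CLS]-token pooling; 'mean' averages all tokens.
    """
    if pooling == "first":
        return hidden_states[0]
    if pooling == "mean":
        return hidden_states.mean(axis=0)
    raise ValueError(f"unknown pooling: {pooling}")

H = np.arange(12.0).reshape(4, 3)          # 4 tokens, d_model = 3
v = sequence_representation(H, "mean")     # one d_model-sized vector
```

The resulting vector is then fed to a task-specific head, e.g. a linear layer for text classification.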



Collapsing Taylor Mode Automatic Differentiation

Dangel, Felix, Siebert, Tim, Zeinhofer, Marius, Walther, Andrea

arXiv.org Artificial Intelligence

Computing partial differential equation (PDE) operators via nested backpropagation is expensive, yet popular, and severely restricts their utility for scientific machine learning. Recent advances, like the forward Laplacian and randomizing Taylor mode automatic differentiation (AD), propose forward schemes to address this. We introduce an optimization technique for Taylor mode that 'collapses' derivatives by rewriting the computational graph, and demonstrate how to apply it to general linear PDE operators, and randomized Taylor mode. The modifications simply require propagating a sum up the computational graph, which could -- or should -- be done by a machine learning compiler, without exposing complexity to users. We implement our collapsing procedure and evaluate it on popular PDE operators, confirming it accelerates Taylor mode and outperforms nested backpropagation.
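The forward schemes mentioned in the abstract avoid nested backpropagation by propagating higher-order information alongside the value. A toy sketch of the forward-Laplacian idea (not the paper's implementation): each node carries a triple (value, gradient, Laplacian), and elementary operations update all three via the product and chain rules.

```python
import numpy as np

class Jet:
    """Forward 'Laplacian jet': value u, gradient g, Laplacian l.

    Propagating (u, g, l) through elementary ops yields Δf in a single
    forward pass, with no nested backpropagation.
    """
    def __init__(self, u, g, l):
        self.u, self.g, self.l = u, np.asarray(g, float), l

    def __add__(self, other):
        return Jet(self.u + other.u, self.g + other.g, self.l + other.l)

    def __mul__(self, other):
        # Product rule: Δ(ab) = a·Δb + b·Δa + 2 ∇a·∇b
        return Jet(self.u * other.u,
                   self.u * other.g + other.u * self.g,
                   self.u * other.l + other.u * self.l + 2.0 * (self.g @ other.g))

def sin(a):
    # Chain rule: Δ sin(a) = cos(a)·Δa − sin(a)·|∇a|²
    return Jet(np.sin(a.u), np.cos(a.u) * a.g,
               np.cos(a.u) * a.l - np.sin(a.u) * (a.g @ a.g))

def seed(x):
    """One jet per input coordinate: unit gradient, zero Laplacian."""
    n = len(x)
    return [Jet(x[i], np.eye(n)[i], 0.0) for i in range(n)]

x1, x2 = seed([0.3, 2.0])
f = sin(x1) + x2 * x2   # f(x) = sin(x1) + x2**2, so Δf = -sin(x1) + 2
```

Production systems such as `jax.experimental.jet` implement full Taylor-mode propagation; the collapsing technique in the paper rewrites this kind of computational graph so that only the sums needed for the PDE operator are carried forward.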


Autoconj: Recognizing and Exploiting Conjugacy Without a Domain-Specific Language

Matthew D. Hoffman

Neural Information Processing Systems

Deriving conditional and marginal distributions using conjugacy relationships can be time consuming and error prone. In this paper, we propose a strategy for automating such derivations. Unlike previous systems which focus on relationships between pairs of random variables, our system (which we call Autoconj) operates directly on Python functions that compute log-joint distribution functions. Autoconj provides support for conjugacy-exploiting algorithms in any Python-embedded PPL. This paves the way for accelerating development of novel inference algorithms and structure-exploiting modeling strategies.
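To illustrate the kind of derivation being automated: a Beta prior is conjugate to a Bernoulli likelihood, so the posterior is again a Beta with updated counts. The sketch below states that update by hand for a Python log-joint function; it is an illustration of the conjugacy relationship itself, not of Autoconj's API.

```python
import math

def log_joint(theta, data, a=2.0, b=2.0):
    """Log p(theta, data) for a Beta(a, b) prior and Bernoulli likelihood.

    Autoconj-style systems inspect exactly this kind of Python log-joint
    function to recognize conjugacy automatically.
    """
    lp = (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
    lp += sum(x * math.log(theta) + (1 - x) * math.log(1 - theta) for x in data)
    return lp

def conjugate_posterior(data, a=2.0, b=2.0):
    """Beta-Bernoulli conjugacy: posterior is Beta(a + #heads, b + #tails)."""
    heads = sum(data)
    return a + heads, b + len(data) - heads

data = [1, 1, 0, 1]
a_post, b_post = conjugate_posterior(data)   # Beta(5.0, 3.0)
```

The log-joint above equals the log of the unnormalized Beta(a_post, b_post) density in theta, which is precisely the structure a conjugacy-recognizing system extracts.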