AITopics | theory

Collaborating Authors

theory

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

AI and Theory of Mind: an interview with Nitay Alon

AIHubMar-16-2026, 09:32:21 GMT

How did you arrive at this research area?

machine learning, natural language, theory, (21 more...)

AIHub

Country:

Asia > Singapore (0.05)
North America > United States (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Personal > Interview (1.00)

Industry: Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.99)
Information Technology > Artificial Intelligence > Natural Language (0.72)
Information Technology > Communications > Social Media (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.69)

Add feedback

Entropy Rate Estimation for Markov Chains with Large State Space

Yanjun Han, Jiantao Jiao, Chuan-Zheng Lee, Tsachy Weissman, Yihong Wu, Tiancheng Yu

Neural Information Processing SystemsFeb-13-2026, 21:02:34 GMT

Entropy estimation is one of the prototypical problems in distribution property testing.

artificial intelligence, machine learning, markov chain, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Add feedback

Robust Learning of Fixed-Structure Bayesian Networks

Yu Cheng, Ilias Diakonikolas, Daniel Kane, Alistair Stewart

Neural Information Processing SystemsFeb-12-2026, 17:58:26 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, bayesian network, diakonikola, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)

Add feedback

A Theory of Link Prediction via Relational Weisfeiler-Leman on Knowledge Graphs

Neural Information Processing SystemsDec-24-2025, 20:13:01 GMT

Graph neural networks are prominent models for representation learning over graph-structured data. While the capabilities and limitations of these models are well-understood for simple graphs, our understanding remains incomplete in the context of knowledge graphs. Our goal is to provide a systematic understanding of the landscape of graph neural networks for knowledge graphs pertaining to the prominent task of link prediction. Our analysis entails a unifying perspective on seemingly unrelated models and unlocks a series of other models. The expressive power of various models is characterized via a corresponding relational Weisfeiler-Leman algorithm. This analysis is extended to provide a precise logical characterization of the class of functions captured by a class of graph neural networks. The theoretical findings presented in this paper explain the benefits of some widely employed practical design choices, which are validated empirically.

link prediction, name change, relational weisfeiler-leman, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.76)

Add feedback

A Theory of Transfer-Based Black-Box Attacks: Explanation and Implications

Neural Information Processing SystemsDec-24-2025, 09:37:47 GMT

Transfer-based attacks are a practical method of black-box adversarial attacks, in which the attacker aims to craft adversarial examples from a source (surrogate) model that is transferable to the target model. A wide range of empirical works has tried to explain the transferability of adversarial examples from different angles. However, these works only provide ad hoc explanations without quantitative analyses. The theory behind transfer-based attacks remains a mystery.This paper studies transfer-based attacks under a unified theoretical framework. We propose an explanatory model, called the manifold attack model, that formalizes popular beliefs and explains the existing empirical results. Our model explains why adversarial examples are transferable even when the source model is inaccurate. Moreover, our model implies that the existence of transferable adversarial examples depends on the "curvature" of the data manifold, which quantitatively explains why the success rates of transfer-based attacks are hard to improve. We also discuss the expressive power and the possible extensions of our model in general applications.

explanation and implication, name change, transfer-based black-box attack, (6 more...)

Neural Information Processing Systems

Industry: Transportation > Air (0.67)

Technology: Information Technology > Artificial Intelligence (0.64)

Add feedback

A Theory of PAC Learnability under Transformation Invariances

Neural Information Processing SystemsDec-24-2025, 06:33:06 GMT

Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color is still identified as a car. Data augmentation, which adds the transformed data into the training set and trains a model on the augmented data, is one commonly used technique to build these invariances into the learning process. However, it is unclear how data augmentation performs theoretically and what the optimal algorithm is in presence of transformation invariances. In this paper, we study PAC learnability under transformation invariances in three settings according to different levels of realizability: (i) A hypothesis fits the augmented data; (ii) A hypothesis fits only the original data and the transformed data lying in the support of the data distribution; (iii) Agnostic case. One interesting observation is that distinguishing between the original data and the transformed data is necessary to achieve optimal accuracy in setting (ii) and (iii), which implies that any algorithm not differentiating between the original and transformed data (including data augmentation) is not optimal.

name change, pac learnability, transformation invariance, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Add feedback

Labeling Trick: A Theory of Using Graph Neural Networks for Multi-Node Representation Learning

Neural Information Processing SystemsDec-24-2025, 02:21:24 GMT

In this paper, we provide a theory of using graph neural networks (GNNs) for multi-node representation learning (where we are interested in learning a representation for a set of more than one node, such as link). We know that GNN is designed to learn single-node representations. When we want to learn a node set representation involving multiple nodes, a common practice in previous works is to directly aggregate the single-node representations obtained by a GNN into a joint node set representation. In this paper, we show a fundamental constraint of such an approach, namely the inability to capture the dependence between nodes in the node set, and argue that directly aggregating individual node representations does not lead to an effective joint representation for multiple nodes. Then, we notice that a few previous successful works for multi-node representation learning, including SEAL, Distance Encoding, and ID-GNN, all used node labeling.

graph neural network, node, representation, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.60)

Add feedback

On the Theory of Transfer Learning: The Importance of Task Diversity

Neural Information Processing SystemsDec-24-2025, 02:01:08 GMT

We provide new statistical guarantees for transfer learning via representation learning--when transfer is achieved by learning a feature representation shared across different tasks. This enables learning on new tasks using far less data than is required to learn them in isolation. Formally, we consider $t+1$ tasks parameterized by functions of the form $f_j \circ h$ in a general function class $F \circ H$, where each $f_j$ is a task-specific function in $F$ and $h$ is the shared representation in $H$. Letting $C(\cdot)$ denote the complexity measure of the function class, we show that for diverse training tasks (1) the sample complexity needed to learn the shared representation across the first $t$ training tasks scales as $C(H) + t C(F)$, despite no explicit access to a signal from the feature representation and (2) with an accurate estimate of the representation, the sample complexity needed to learn a new task scales only with $C(F)$. Our results depend upon a new general notion of task diversity--applicable to models with general tasks, features, and losses--as well as a novel chain rule for Gaussian complexities.

name change, representation, transfer learning, (10 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.07)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.43)

Add feedback

On the Theory of Reinforcement Learning with Once-per-Episode Feedback

Neural Information Processing SystemsDec-23-2025, 20:09:15 GMT

We study a theory of reinforcement learning (RL) in which the learner receives binary feedback only once at the end of an episode. While this is an extreme test case for theory, it is also arguably more representative of real-world applications than the traditional requirement in RL practice that the learner receive feedback at every time step. Indeed, in many real-world applications of reinforcement learning, such as self-driving cars and robotics, it is easier to evaluate whether a learner's complete trajectory was either bad,'' but harder to provide a reward signal at each step. To show that learning is possible in this more challenging setting, we study the case where trajectory labels are generated by an unknown parametric model, and provide a statistically and computationally efficient algorithm that achieves sublinear regret.

name change, once-per-episode feedback, reinforcement learning, (5 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.08)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.56)

Add feedback

Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation

Martin Azizyan, Aarti Singh, Larry Wasserman

Neural Information Processing SystemsOct-9-2025, 15:07:27 GMT

While several papers have investigated computationally and statistically efficient methods for learning Gaussian mixtures, precise minimax bounds for their statistical performance as well as fundamental limits in high-dimensional settings are not well-understood. In this paper, we provide precise information theoretic bounds on the clustering accuracy and sample complexity of learning a mixture of two isotropic Gaussians in high dimensions under small mean separation. If there is a sparse subset of relevant dimensions that determine the mean separation, then the sample complexity only depends on the number of relevant dimensions and mean separation, and can be achieved by a simple computationally efficient procedure. Our results provide the first step of a theoretical basis for recent methods that combine feature selection and clustering.

dimension, mean separation, separation, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.62)

Add feedback