AITopics | operatorname

Collaborating Authors

operatorname

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Distributed Multi-Agent Bandits Over Erdős-Rényi Random Networks

Neural Information Processing SystemsJun-14-2026, 06:51:13 GMT

We study the distributed multi-agent multi-armed bandit problem with heterogeneous rewards over random communication graphs. Uniquely, at each time step $t$ agents communicate over a time-varying random graph $\mathcal{G}\_t$ generated by applying the Erdős-Rényi model to a fixed connected base graph $\mathcal{G}$ (for classical Erdos-Rényi graphs, $\mathcal{G}$ is a complete graph), where each potential edge in $\mathcal{G}$ is randomly and independently present with the link probability $p$. Notably, the resulting random graph is not necessarily connected at each time step. Each agent's arm rewards follow time-invariant distributions, and the reward distribution for the same arm may differ across agents. The goal is to minimize the cumulative expected regret relative to the global mean reward of each arm, defined as the average of that arm's mean rewards across all agents. To this end, we propose a fully distributed algorithm that integrates the arm elimination strategy with the random gossip algorithm. We theoretically show that the regret upper bound is of order $\log T$ and is highly interpretable, where $T$ is the time horizon.

artificial intelligence, big data, data mining, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.94)

Add feedback

Variance-Reduced Long-Term Rehearsal Learning with Quadratic Programming Reformulation

Neural Information Processing SystemsJun-13-2026, 15:53:27 GMT

In machine learning, a critical class of decision-making problems involves *Avoiding Undesired Future* (AUF): given a predicted undesired outcome, how can one make decision about actions to prevent it? Recently, the *rehearsal learning* framework has been proposed to address AUF problem. While existing methods offer reliable decisions for single-round success, this paper considers long-term settings that involve coordinating multiple future outcomes, which is often required in real-world tasks. Specifically, we generalize the AUF objective to characterize a long-term decision target that incorporates cross-temporal relations among variables. As directly optimizing the *AUF probability* $\mathbb{P}_{\operatorname{AUF}}$ over this objective remains challenging, we derive an explicit expression for the objective and further propose a quadratic programming (QP) reformulation that transforms the intractable probabilistic AUF optimization into a tractable one. Under mild assumptions, we show that solutions to the QP reformulation are equivalent to those of the original AUF optimization, based on which we develop two novel rehearsal learning methods for long-term decision-making: (i) a *greedy* method that maximizes the single-round $\mathbb{P}_{\operatorname{AUF}}$ at each step, and (ii) a *far-sighted* method that accounts for future consequences in each decision, yielding a higher overall $\mathbb{P}_{\operatorname{AUF}}$ through an $L/(L+1)$ variance reduction in the AUF objective. We further establish an $\mathcal{O}(1/\sqrt{N})$ excess risk bound for decisions based on estimated parameters, ensuring reliable practical applicability with finite data.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.75)

Add feedback

Kernel-based Equalized Odds: A Quantification of Accuracy-Fairness Trade-off in Fair Representation Learning

Neural Information Processing SystemsJun-13-2026, 03:06:35 GMT

This paper introduces a novel kernel-based formulation of the Equalized Odds (EO) criterion, denoted as $\operatorname{EO}_k$, for fair representation learning (FRL) in supervised settings. The central goal of FRL is to mitigate discrimination regarding a sensitive attribute $S$ while preserving prediction accuracy for the target variable $Y$. Our proposed criterion enables a rigorous and interpretable quantification of three core fairness objectives: independence ($\widehat{Y} \perp S$), separation--also known as equalized odds ($\widehat{Y} \perp S \mid Y$), and calibration ($Y \perp S \mid \widehat{Y}$). Under both unbiased ($Y \perp S$) and biased ($Y \not \perp S$) conditions, we show that $\operatorname{EO}_k$ satisfies both independence and separation in the former, and uniquely preserves predictive accuracy while lower bounding independence and calibration in the latter, thereby offering a unified analytical characterization of the tradeoffs among these fairness criteria. We further define the empirical counterpart, $\widehat{\operatorname{EO}}_k$, a kernel-based statistic that can be computed in quadratic time, with linear-time approximations also available. A concentration inequality for $\widehat{\operatorname{EO}}_k$ is derived, providing performance guarantees and error bounds, which serve as practical certificates of fairness compliance. While our focus is on theoretical development, the results lay essential groundwork for principled and provably fair algorithmic design in future empirical studies.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Neural Information Processing SystemsMar-22-2026, 15:03:30 GMT

Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch.This paper introduces Gated Slot Attention (GSA), which enhances Attention with Bounded-memory-Control (ABC) by incorporating a gating mechanism inspired by Gated Linear Attention (GLA).Essentially, GSA comprises a two-layer GLA linked via $\operatorname{softmax}$, utilizing context-aware memory reading and adaptive forgetting to improve memory capacity while maintaining compact recurrent state size.This design greatly enhances both training and inference efficiency through GLA's hardware-efficient training algorithm and reduced state size.Additionally, retaining the $\operatorname{softmax}$ operation is particularly beneficial in ``finetuning pretrained Transformers to RNNs'' (T2R) settings, reducing the need for extensive training from scratch.Extensive experiments confirm GSA's superior performance in scenarios requiring in-context recall and in T2R settings.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Consumer Health (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Monomial Matrix Group Equivariant Neural Functional Networks

Neural Information Processing SystemsMar-20-2026, 17:02:07 GMT

Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representation. Previous NFN designs often depend on permutation symmetries in neural networks' weights, which traditionally arise from the unordered arrangement of neurons in hidden layers. However, these designs do not take into account the weight scaling symmetries of $\operatorname{ReLU}$ networks, and the weight sign flipping symmetries of $\operatorname{sin}$ or $\operatorname{Tanh}$ networks. In this paper, we extend the study of the group action on the network weights from the group of permutation matrices to the group of monomial matrices by incorporating scaling/sign-flipping symmetries.

artificial intelligence, machine learning, symmetry, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Constant Regret, Generalized Mixability, and Mirror Descent

Neural Information Processing SystemsMar-16-2026, 23:32:27 GMT

We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and for the right choice of loss function and ``mixing'' algorithm, it is possible for the learner to achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret can be achieved for \emph{mixable} losses using the \emph{aggregating algorithm}. The \emph{Generalized Aggregating Algorithm} (GAA) is a name for a family of algorithms parameterized by convex functions on simplices (entropies), which reduce to the aggregating algorithm when using the \emph{Shannon entropy} $\operatorname{S}$. For a given entropy $\Phi$, losses for which a constant regret is possible using the \textsc{GAA} are called $\Phi$-mixable. Which losses are $\Phi$-mixable was previously left as an open question. We fully characterize $\Phi$-mixability and answer other open questions posed by \cite{Reid2015}. We show that the Shannon entropy $\operatorname{S}$ is fundamental in nature when it comes to mixability; any $\Phi$-mixable loss is necessarily $\operatorname{S}$-mixable, and the lowest worst-case regret of the \textsc{GAA} is achieved using the Shannon entropy. Finally, by leveraging the connection between the \emph{mirror descent algorithm} and the update step of the GAA, we suggest a new \emph{adaptive} generalized aggregating algorithm and analyze its performance in terms of the regret bound.

artificial intelligence, machine learning, proceedings, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Local-Global MCMC kernels: the best of both worlds

Neural Information Processing SystemsDec-23-2025, 21:51:06 GMT

Recent works leveraging learning to enhance sampling have shown promising results, in particular by designing effective non-local moves and global proposals. However, learning accuracy is inevitably limited in regions where little data is available such as in the tails of distributions as well as in high-dimensional problems. In the present paper we study an Explore-Exploit Markov chain Monte Carlo strategy ($\operatorname{Ex^2MCMC}$) that combines local and global samplers showing that it enjoys the advantages of both approaches. We prove $V$-uniform geometric ergodicity of $\operatorname{Ex^2MCMC}$ without requiring a uniform adaptation of the global sampler to the target distribution. We also compute explicit bounds on the mixing rate of the Explore-Exploit strategy under realistic conditions. Moreover, we propose an adaptive version of the strategy ($\operatorname{FlEx^2MCMC}$) where a normalizing flow is trained while sampling to serve as a proposal for global moves. We illustrate the efficiency of $\operatorname{Ex^2MCMC}$ and its adaptive version on classical sampling benchmarks as well as in sampling high-dimensional distributions defined by Generative Adversarial Networks seen as Energy Based Models.

local-global mcmc kernel, name change, operatorname, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)

Add feedback

Constant Regret, Generalized Mixability, and Mirror Descent

Neural Information Processing SystemsNov-20-2025, 22:48:36 GMT

constant regret, entropy, generalized mixability, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Neural Information Processing SystemsMay-27-2025, 18:04:29 GMT

Linear attention Transformers and their gated variants, celebrated for enabling parallel training and efficient recurrent inference, still fall short in recall-intensive tasks compared to traditional Transformers and demand significant resources for training from scratch.This paper introduces Gated Slot Attention (GSA), which enhances Attention with Bounded-memory-Control (ABC) by incorporating a gating mechanism inspired by Gated Linear Attention (GLA).Essentially, GSA comprises a two-layer GLA linked via \operatorname{softmax}, utilizing context-aware memory reading and adaptive forgetting to improve memory capacity while maintaining compact recurrent state size.This design greatly enhances both training and inference efficiency through GLA's hardware-efficient training algorithm and reduced state size.Additionally, retaining the \operatorname{softmax} operation is particularly beneficial in finetuning pretrained Transformers to RNNs'' (T2R) settings, reducing the need for extensive training from scratch.Extensive experiments confirm GSA's superior performance in scenarios requiring in-context recall and in T2R settings.

attention, efficient linear-time sequence modeling, gated slot attention, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Monomial Matrix Group Equivariant Neural Functional Networks

Neural Information Processing SystemsMay-27-2025, 01:50:33 GMT

Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representation. Previous NFN designs often depend on permutation symmetries in neural networks' weights, which traditionally arise from the unordered arrangement of neurons in hidden layers. However, these designs do not take into account the weight scaling symmetries of \operatorname{ReLU} networks, and the weight sign flipping symmetries of \operatorname{sin} or \operatorname{Tanh} networks. In this paper, we extend the study of the group action on the network weights from the group of permutation matrices to the group of monomial matrices by incorporating scaling/sign-flipping symmetries. We name our new family of NFNs the Monomial Matrix Group Equivariant Neural Functional Networks (Monomial-NFN).

artificial intelligence, group equivariant neural functional network, machine learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback