AITopics | pyq

Collaborating Authors

pyq

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

6bb56208f672af0dd65451f869fedfd9-Supplemental.pdf

Neural Information Processing SystemsFeb-8-2026, 19:03:29 GMT

In most applications,E " Y to begin with (ally are potential maximizer for some vector of costs, otherwise theyare not included in the set), and all points inY havepositive mass. Ittherefore also satisfies this property. We recall that we assume thatθ yields a unique maximum to the linear program onC. As a consequence, all convergent subsequences ofyn converge to the same limity pθq: it is the unique accumulation point of this sequence.Itfollowsdirectlythat yn convergesto y pθq,asitlivesinacompactset,whichyieldsthe desired result. Using different reference vectorsv yield different perturbed operations, andv " p1,2,...,dq is commonlyused.

artificial intelligence, asaconsequence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin (0.05)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.95)

Add feedback

SubgaussianandDifferentiableImportanceSampling forOff-PolicyEvaluationandLearning

Neural Information Processing SystemsFeb-8-2026, 10:06:36 GMT

Inthispaper,weanalyze the theoretical properties of the IS estimator by deriving a novel anticoncentration bound that formalizes the intuition behind itsundesired behavior. Then, we propose anew class of IS transformations, based on the notion of power mean. To the best of our knowledge, the resulting estimator is the first to achieve, under certainconditions, twokeyproperties: (i)itdisplays asubgaussianconcentration rate; (ii) it preserves the differentiability in the target distribution.

artificial intelligence, estimator, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.68)

Add feedback

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

Shi, Ruizhe, Song, Minhak, Zhou, Runlong, Zhang, Zihan, Fazel, Maryam, Du, Simon S.

arXiv.org Artificial IntelligenceOct-6-2025

We present a fine-grained theoretical analysis of the performance gap between reinforcement learning from human feedback (RLHF) and direct preference optimization (DPO) under a representation gap. Our study decomposes this gap into two sources: an explicit representation gap under exact optimization and an implicit representation gap under finite samples. In the exact optimization setting, we characterize how the relative capacities of the reward and policy model classes influence the final policy qualities. We show that RLHF, DPO, or online DPO can outperform one another depending on type of model mis-specifications. Notably, online DPO can outperform both RLHF and standard DPO when the reward and policy model classes are isomorphic and both mis-specified. In the approximate optimization setting, we provide a concrete construction where the ground-truth reward is implicitly sparse and show that RLHF requires significantly fewer samples than DPO to recover an effective reward model -- highlighting a statistical advantage of two-stage learning. Together, these results provide a comprehensive understanding of the performance gap between RLHF and DPO under various settings, and offer practical insights into when each method is preferred.

artificial intelligence, machine learning, py 1, (19 more...)

arXiv.org Artificial Intelligence

2505.1977

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On the contraction properties of Sinkhorn semigroups

Akyildiz, O. Deniz, del Moral, Pierre, Miguez, Joaquin

arXiv.org Machine LearningMar-12-2025

We develop a novel semigroup contraction analysis based on Lyapunov techniques to prove the exponential convergence of Sinkhorn equations on weighted Banach spaces. This operator-theoretic framework yields exponential decays of Sinkhorn iterates towards Schr\"odinger bridges with respect to general classes of $\phi$-divergences as well as in weighted Banach spaces. To the best of our knowledge, these are the first results of this type in the literature on entropic transport and the Sinkhorn algorithm. We also illustrate the impact of these results in the context of multivariate linear Gaussian models as well as statistical finite mixture models including Gaussian-kernel density estimation of complex data distributions arising in generative models.

contraction coefficient, mq 1, py 1, (15 more...)

arXiv.org Machine Learning

2503.09887

Country:

North America > United States > New York (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
(3 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)

Add feedback

Lower Bounds on the Expressivity of Recurrent Neural Language Models

Svete, Anej, Nowak, Franz, Sahabdeen, Anisha Mohamed, Cotterell, Ryan

arXiv.org Artificial IntelligenceJun-18-2024

The recent successes and spread of large neural language models (LMs) call for a thorough understanding of their computational ability. Describing their computational abilities through LMs' \emph{representational capacity} is a lively area of research. However, investigation into the representational capacity of neural LMs has predominantly focused on their ability to \emph{recognize} formal languages. For example, recurrent neural networks (RNNs) with Heaviside activations are tightly linked to regular languages, i.e., languages defined by finite-state automata (FSAs). Such results, however, fall short of describing the capabilities of RNN \emph{language models} (LMs), which are definitionally \emph{distributions} over strings. We take a fresh look at the representational capacity of RNN LMs by connecting them to \emph{probabilistic} FSAs and demonstrate that RNN LMs with linearly bounded precision can express arbitrary regular LMs.

language model, lms, pfsa, (14 more...)

arXiv.org Artificial Intelligence

2405.19222

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
(10 more...)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

On Efficiently Representing Regular Languages as RNNs

Svete, Anej, Chan, Robin Shing Moon, Cotterell, Ryan

arXiv.org Artificial IntelligenceJun-18-2024

Recent work by Hewitt et al. (2020) provides an interpretation of the empirical success of recurrent neural networks (RNNs) as language models (LMs). It shows that RNNs can efficiently represent bounded hierarchical structures that are prevalent in human language. This suggests that RNNs' success might be linked to their ability to model hierarchy. However, a closer inspection of Hewitt et al.'s (2020) construction shows that it is not inherently limited to hierarchical structures. This poses a natural question: What other classes of LMs can RNNs efficiently represent? To this end, we generalize Hewitt et al.'s (2020) construction and show that RNNs can efficiently represent a larger class of LMs than previously claimed -- specifically, those that can be represented by a pushdown automaton with a bounded stack and a specific stack update function. Altogether, the efficiency of representing this diverse class of LMs with RNN LMs suggests novel interpretations of their inductive bias.

bpda, lms, rnn, (16 more...)

arXiv.org Artificial Intelligence

2402.15814

Country:

North America > United States > Maryland > Prince George's County > College Park (0.04)
North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.04)
North America > Canada > Ontario > Toronto (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

Wei, Honghao, Liu, Xin, Ying, Lei

arXiv.org Artificial IntelligenceDec-22-2023

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step. Existing studies have considered safe RL with hard instantaneous constraints, but their approaches rely on several key assumptions: $(i)$ the RL agent knows a safe action set for {\it every} state or knows a {\it safe graph} in which all the state-action-state triples are safe, and $(ii)$ the constraint/cost functions are {\it linear}. In this paper, we consider safe RL with instantaneous hard constraints without assumption $(i)$ and generalize $(ii)$ to Reproducing Kernel Hilbert Space (RKHS). Our proposed algorithm, LSVI-AE, achieves $\tilde{\cO}(\sqrt{d^3H^4K})$ regret and $\tilde{\cO}(H \sqrt{dK})$ hard constraint violation when the cost function is linear and $\cO(H\gamma_K \sqrt{K})$ hard constraint violation when the cost function belongs to RKHS. Here $K$ is the learning horizon, $H$ is the length of each episode, and $\gamma_K$ is the information gain w.r.t the kernel used to approximate cost functions. Our results achieve the optimal dependency on the learning horizon $K$, matching the lower bound we provide in this paper and demonstrating the efficiency of LSVI-AE. Notably, the design of our approach encourages aggressive policy exploration, providing a unique perspective on safe RL with general cost functions and no prior knowledge of safe actions, which may be of independent interest.

constraint, probability, violation, (14 more...)

arXiv.org Artificial Intelligence

2312.1447

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Precise Asymptotics for Spectral Methods in Mixed Generalized Linear Models

Zhang, Yihan, Mondelli, Marco, Venkataramanan, Ramji

arXiv.org Artificial IntelligenceDec-9-2022

In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output the top two eigenvectors of a suitable data-dependent matrix. However, despite the wide applicability, their design is still obtained via heuristic considerations, and the number of samples $n$ needed to guarantee recovery is super-linear in the signal dimension $d$. In this paper, we develop exact asymptotics on spectral methods in the challenging proportional regime in which $n, d$ grow large and their ratio converges to a finite constant. By doing so, we are able to optimize the design of the spectral method, and combine it with a simple linear estimator, in order to minimize the estimation error. Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms. Numerical simulations for mixed linear regression and phase retrieval display the advantage enabled by our analysis over existing designs of spectral methods.

artificial intelligence, equation, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2211.11368

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Austria > Styria > Graz (0.04)
(2 more...)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

Compositional Active Inference II: Polynomial Dynamics. Approximate Inference Doctrines

Smithe, Toby St. Clere

arXiv.org Artificial IntelligenceAug-25-2022

We develop the compositional theory of active inference by introducing activity, functorially relating statistical games to the dynamical systems which play them, using the new notion of approximate inference doctrine. In order to exhibit such functors, we first develop the necessary theory of dynamical systems, using a generalization of the language of polynomial functors to supply compositional interfaces of the required types: with the resulting polynomially indexed categories of coalgebras, we construct monoidal bicategories of differential and dynamical ``hierarchical inference systems'', in which approximate inference doctrines have semantics. We then describe ``externally parameterized'' statistical games, and use them to construct two approximate inference doctrines found in the computational neuroscience literature, which we call the `Laplace' and the `Hebb-Laplace' doctrines: the former produces dynamical systems which optimize the posteriors of Gaussian models; and the latter produces systems which additionally optimize the parameters (or `weights') which determine their predictions.

artificial intelligence, category, dynamical system, (16 more...)

arXiv.org Artificial Intelligence

2208.12173

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Add feedback

Post-processing of Differentially Private Data: A Fairness Perspective

Zhu, Keyu, Fioretto, Ferdinando, Van Hentenryck, Pascal

arXiv.org Artificial IntelligenceJan-23-2022

Post-processing immunity is a fundamental property of differential privacy: it enables arbitrary data-independent transformations to differentially private outputs without affecting their privacy guarantees. Post-processing is routinely applied in data-release applications, including census data, which are then used to make allocations with substantial societal impacts. This paper shows that post-processing causes disparate impacts on individuals or groups and analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions, such as the allocation of funds informed by US Census data. In the first setting, the paper proposes tight bounds on the unfairness of traditional post-processing mechanisms, giving a unique tool to decision-makers to quantify the disparate impacts introduced by their release. In the second setting, this paper proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics, either reducing fairness issues substantially or reducing the cost of privacy. The theoretical analysis is complemented with numerical simulations on Census data.

equation, mechanism, post-processing mechanism, (15 more...)

arXiv.org Artificial Intelligence

2201.09425

Country:

North America > United States > Hawaii (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.87)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback