AITopics | second inequality

Collaborating Authors

second inequality

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient

Shang, Shuning, Strauss, Hubert, Wei, Stanley, Arora, Sanjeev, Razin, Noam

arXiv.org Machine LearningApr-29-2026

Training language models via reinforcement learning often relies on imperfect proxy rewards, since ground truth rewards that precisely define the intended behavior are rarely available. Standard metrics for assessing the quality of proxy rewards, such as ranking accuracy, treat incorrect rewards as strictly harmful. In this work, however, we highlight that not all deviations from the ground truth are equal. By theoretically analyzing which outputs attract probability during policy gradient optimization, we categorize reward errors according to their effect on the increase in ground truth reward. The analysis establishes that reward errors, though conventionally viewed as harmful, can also be benign or even beneficial by preventing the policy from stalling around outputs with mediocre ground truth reward. We then present two practical implications of our theory. First, for reinforcement learning from human feedback (RLHF), we develop reward model evaluation metrics that account for the harmfulness of reward errors. Compared to standard ranking accuracy, these metrics typically correlate better with the performance of a language model after RLHF, yet gaps remain in robustly evaluating reward models. Second, we provide insights for reward design in settings with verifiable rewards. A key theme underlying our results is that the effectiveness of a proxy reward function depends heavily on its interaction with the initial policy and learning algorithm.

artificial intelligence, inequality, machine learning, (17 more...)

arXiv.org Machine Learning

2604.25872

Country: Asia (0.27)

Genre: Research Report > New Finding (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

3e6260b81898beacda3d16db379ed329-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 14:03:49 GMT

artificial intelligence, inequality, probability, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards Appendix AFormal Definition of Inhomogeneous Poisson Process

Neural Information Processing SystemsApr-24-2026, 14:15:01 GMT

The inhomogeneous Poisson (point) process is a Poisson point process with a Poisson parameter set as some time-dependent function r(τ). Let N(a,b) represent the number of points of inhomogeneous Poisson process with intensity function r(t) occurring in the interval [a,b], then the probability of n points existing in the interval [a,b] is given by, P(N(a,b) = n) Λ(a,b)n n! In this paper, the points mean the conversions and the time-dependent intensity function r() is defined in Eq. (2) and it depends on the realization of the conversions and parameter θ. Suppose X1, Xn are independent, mean-zero, subexponential random variables, and a = (a1,,an) is an ndimensional constanst vector. We first introduce the main idea of the the PAMM algorithm.

bft, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Add feedback

Detection of local geometry in random graphs: information-theoretic and computational limits

Bok, Jinho, Li, Shuangping, Yu, Sophie H.

arXiv.org Machine LearningMar-26-2026

We study the problem of detecting local geometry in random graphs. We introduce a model $\mathcal{G}(n, p, d, k)$, where a hidden community of average size $k$ has edges drawn as a random geometric graph on $\mathbb{S}^{d-1}$, while all remaining edges follow the Erdős--Rényi model $\mathcal{G}(n, p)$. The random geometric graph is generated by thresholding inner products of latent vectors on $\mathbb{S}^{d-1}$, with each edge having marginal probability equal to $p$. This implies that $\mathcal{G}(n, p, d, k)$ and $\mathcal{G}(n, p)$ are indistinguishable at the level of the marginals, and the signal lies entirely in the edge dependencies induced by the local geometry. We investigate both the information-theoretic and computational limits of detection. On the information-theoretic side, our upper bounds follow from three tests based on signed triangle counts: a global test, a scan test, and a constrained scan test; our lower bounds follow from two complementary methods: truncated second moment via Wishart--GOE comparison, and tensorization of KL divergence. These results together settle the detection threshold at $d = \widetildeΘ(k^2 \vee k^6/n^3)$ for fixed $p$, and extend the state-of-the-art bounds from the full model (i.e., $k = n$) for vanishing $p$. On the computational side, we identify a computational--statistical gap and provide evidence via the low-degree polynomial framework, as well as the suboptimality of signed cycle counts of length $\ell \geq 4$.

artificial intelligence, inequality, machine learning, (19 more...)

arXiv.org Machine Learning

2603.24545

Country:

North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Hungary > Hajdú-Bihar County > Debrecen (0.04)

Genre:

Research Report (0.63)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.61)

Add feedback

Noise-Adaptive Thompson Sampling for Linear Contextual Bandits

Neural Information Processing SystemsFeb-19-2026, 10:21:24 GMT

Linear contextual bandits represent a fundamental class of models with numerous real-world applications, and it is critical to developing algorithms that can effectively manage noise with unknown variance, ensuring provable guarantees for both worst-case constant-variance noise and deterministic reward scenarios.

artificial intelligence, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country: