AITopics | maximal inequality

Collaborating Authors

maximal inequality

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Theory for Kernel Bilevel Optimization

Neural Information Processing SystemsJun-15-2026, 01:57:30 GMT

Bilevel optimization has emerged as a technique for addressing a wide range of machine learning problems that involve an outer objective implicitly determined by the minimizer of an inner problem. While prior works have primarily focused on the parametric setting, a learning-theoretic foundation for bilevel optimization in the nonparametric case remains relatively unexplored. In this paper, we take a first step toward bridging this gap by studying Kernel Bilevel Optimization (KBO), where the inner objective is optimized over a reproducing kernel Hilbert space. This setting enables rich function approximation while providing a foundation for rigorous theoretical analysis. In this context, we derive novel finite-sample generalization bounds for KBO, leveraging tools from empirical process theory. These bounds further allow us to assess the statistical accuracy of gradient-based methods applied to the empirical discretization of KBO. We numerically illustrate our theoretical findings on a synthetic instrumental variable regression task.

artificial intelligence, inequality, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > France (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

9a0684d9dad4967ddd09594511de2c52-Paper.pdf

Neural Information Processing SystemsFeb-13-2026, 03:01:48 GMT

aggregate regret, algorithm, probability, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Indiana (0.04)
North America > United States > Illinois (0.04)
North America > Canada (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Add feedback

MinimaxOptimalNonparametricEstimationof HeterogeneousTreatmentEffects

Neural Information Processing SystemsFeb-11-2026, 04:22:13 GMT

Causal inference aims todrawacausal relationship between some treatment and targetresponses.

artificial intelligence, covariate, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

597c7b407a02cc0a92167e7a371eca25-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-8-2026, 12:46:25 GMT

algorithm, inequality, state space, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.34)

Add feedback

Online Policy Learning via a Self-Normalized Maximal Inequality

Girard, Samuel, Bibaut, Aurélien, Zenati, Houssam

arXiv.org Machine LearningOct-20-2025

Adaptive experiments produce dependent data that break i.i.d. assumptions that underlie classical concentration bounds and invalidate standard learning guarantees. In this paper, we develop a self-normalized maximal inequality for martingale empirical processes. Building on this, we first propose an adaptive sample-variance penalization procedure which balances empirical loss and sample variance, valid for general dependent data. Next, this allows us to derive a new variance-regularized pessimistic off-policy learning objective, for which we establish excess-risk guarantees. Subsequently, we show that, when combined with sequential updates and under standard complexity and margin conditions, the resulting estimator achieves fast convergence rates in both parametric and nonparametric regimes, improving over the usual $1/\sqrt{n}$ baseline. We complement our theoretical findings with numerical simulations that illustrate the practical gains of our approach.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2510.15483

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Sardinia (0.04)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

Thresholding Bandit with Optimal Aggregate Regret

Neural Information Processing SystemsOct-3-2025, 07:06:41 GMT

We introduce LSA, a new, simple and anytime algorithm that aims to minimize the aggregate regret (or the expected number of mis-classified arms). We prove that our algorithm is instance-wise asymptotically optimal.

aggregate regret, algorithm, probability, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Indiana (0.04)
North America > United States > Illinois (0.04)
North America > Canada (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Add feedback

597c7b407a02cc0a92167e7a371eca25-AuthorFeedback.pdf

Neural Information Processing SystemsOct-2-2025, 23:58:58 GMT

artificial intelligence, inequality, state space, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.34)

Add feedback

a0ae15571eb4a97ac1c34a114f1bb179-Paper.pdf

Neural Information Processing SystemsAug-16-2025, 11:51:45 GMT

artificial intelligence, data mining, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Cameroon (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Add feedback

Learning Theory for Kernel Bilevel Optimization

Khoury, Fares El, Pauwels, Edouard, Vaiter, Samuel, Arbel, Michael

arXiv.org Artificial IntelligenceFeb-12-2025

Bilevel optimization has emerged as a technique for addressing a wide range of machine learning problems that involve an outer objective implicitly determined by the minimizer of an inner problem. In this paper, we investigate the generalization properties for kernel bilevel optimization problems where the inner objective is optimized over a Reproducing Kernel Hilbert Space. This setting enables rich function approximation while providing a foundation for rigorous theoretical analysis. In this context, we establish novel generalization error bounds for the bilevel problem under finite-sample approximation. Our approach adopts a functional perspective, inspired by (Petrulionyte et al., 2024), and leverages tools from empirical process theory and maximal inequalities for degenerate $U$-processes to derive uniform error bounds. These generalization error estimates allow to characterize the statistical accuracy of gradient-based methods applied to the empirical discretization of the bilevel problem.

artificial intelligence, inequality, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.08457

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)

Add feedback

Trade-off Between Dependence and Complexity for Nonparametric Learning -- an Empirical Process Approach

Deb, Nabarun, Mukherjee, Debarghya

arXiv.org Machine LearningJan-17-2024

Empirical process theory for i.i.d. observations has emerged as a ubiquitous tool for understanding the generalization properties of various statistical problems. However, in many applications where the data exhibit temporal dependencies (e.g., in finance, medical imaging, weather forecasting etc.), the corresponding empirical processes are much less understood. Motivated by this observation, we present a general bound on the expected supremum of empirical processes under standard $\beta/\rho$-mixing assumptions. Unlike most prior work, our results cover both the long and the short-range regimes of dependence. Our main result shows that a non-trivial trade-off between the complexity of the underlying function class and the dependence among the observations characterizes the learning rate in a large class of nonparametric problems. This trade-off reveals a new phenomenon, namely that even under long-range dependence, it is possible to attain the same rates as in the i.i.d. setting, provided the underlying function class is complex enough. We demonstrate the practical implications of our findings by analyzing various statistical estimators in both fixed and growing dimensions. Our main examples include a comprehensive case study of generalization error bounds in nonparametric regression over smoothness classes in fixed as well as growing dimension using neural nets, shape-restricted multivariate convex regression, estimating the optimal transport (Wasserstein) distance between two probability distributions, and classification under the Mammen-Tsybakov margin condition -- all under appropriate mixing assumptions. In the process, we also develop bounds on $L_r$ ($1\le r\le 2$)-localized empirical processes with dependent observations, which we then leverage to get faster rates for (a) tuning-free adaptation, and (b) set-structured learning problems.

empirical process, inequality, maximal inequality, (17 more...)

arXiv.org Machine Learning

2401.08978

Country: