AITopics

2605.29371

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (0.68)
Transportation > Electric Vehicle (0.68)
Automobiles & Trucks (0.68)
Energy > Energy Storage (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Neural Information Processing SystemsApr-25-2026, 09:21:32 GMT

8 Supplementary Material

Calculation of T Given data D, disaggregate Y into M equal-size bins, and the m-th bin is denoted as Bm. Let m = |Bm| denote the number of samples in Bm. For distribution p 2 (V A Y) conditioned on y in Bm, pV,A|ym, pV|ym and pA|ym are denoted as the joint distribution of (V,A), marginal distribution of V and A, respectively. As detailed in Section 5.1 of [33] and Algorithm 4 of [32], Um could be calculated through U-statistic. Specifically, in [33], they consider designing kernel as ij(av)= I(Ai = a,Vi = v) I(Ai = a)I(Vi = v), for i and j-th sample in Dt.

artificial intelligence, machine learning, prediction interval, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Neural Information Processing SystemsFeb-8-2026, 09:36:26 GMT

8 Supplementary Material 8.1 Details and Proofs for the Proposed EOC 8.1.1 Calculation of T Given data D

Fourier transform of a power of a Euclidean distance, i.e., According to Jensen's inequality and Lipschitzness assumption, we have X According to the law of total probability and Theorem 4.1, we have P { Y A feasible solution to Equation (1) can be quickly found. Pseudocode for Algorithm 2 The pseudocode for the constrained optimization is detailed in Algorithm 2. 18 Algorithm 2 Robust Optimization Method with EOC Constraint Input: Initiate Array A of shape 2 A M that stores the max possible step. Our proposed algorithm is highly computationally efficient.

artificial intelligence, machine learning, prediction interval, (15 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.54)

Wei, Aaron, Jalali, Milad, Sutherland, Danica J.

Maximum Mean Discrepancy with Unequal Sample Sizes via Generalized U-Statistics

arXiv.org Machine LearningDec-17-2025

Existing two-sample testing techniques, particularly those based on choosing a kernel for the Maximum Mean Discrepancy (MMD), often assume equal sample sizes from the two distributions. Applying these methods in practice can require discarding valuable data, unnecessarily reducing test power. W e address this long-standing limitation by extending the theory of generalized U-statistics and applying it to the usual MMD estimator, resulting in new characterization of the asymptotic distributions of the MMD estimator with unequal sample sizes (particularly outside the proportional regimes required by previous partial results). This generalization also provides a new criterion for optimizing the power of an MMD test with unequal sample sizes. Our approach preserves all available data, enhancing test accuracy and applicability in realistic settings. Along the way, we give much cleaner characterizations of the variance of MMD estimators, revealing something that might be surprising to those in the area: while zero MMD implies a degenerate estimator, it is sometimes possible to have a degenerate estimator with nonzero MMD as well; we give a construction and a proof that it does not happen in common situations.

estimator, kernel, variance, (15 more...)

2512.13997

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > Canada > Quebec (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Neural Information Processing SystemsNov-21-2025, 11:53:43 GMT

979d472a84804b9f647bc185a877a8b5-Paper.pdf

artificial intelligence, fssd 2, machine learning, (16 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Bae, Gimun, Shin, Seung Jun

Scaling Up ROC-Optimizing Support Vector Machines

arXiv.org Machine LearningNov-10-2025

Binary classification is a fundamental problem in machine learning. Given a pair (X, Y), where X is a p-dimensional predictor and Y is a binary response taking values in { 1, 1}, the goal is to learn a decision function f of X that predicts Y by ˆ Y = sign{f(X)}. A canonical approach is to choose f that minimizes the classification error, or equivalently, maximizes the accuracy. For instance, the support vector machine (SVM; Vapnik, 1999) determines the decision function by maximizing the geometric margin, which effectively aligns with maximizing accuracy [Lin, 2002]. However, in imbalanced settings where one class is substantially underrepresented, accuracy can be a misleading measure of performance. Even a trivial classifier that always predicts the majority class can achieve high accuracy while completely failing to detect samples from the minor class. As an alternative, the receiver operating characteristic (ROC) curve is widely used to evaluate classifier performance under class imbalance. By definition, the ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) to summarize classification performance, and the area under the ROC curve (AUC) serves as a popular scalar summary. A classifier with a larger AUC value is generally regarded as having better classification performance.

approximation, artificial intelligence, machine learning, (19 more...)

2511.04979

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial IntelligenceAug-22-2025

A U-Statistic-based random forest approach for genetic interaction study

Li, Ming, Peng, Ruo-Sin, Wei, Changshuai, Lu, Qing

Variations in complex traits are influenced by multiple genetic variants, environmental risk factors, and their interactions. Though substantial progress has been made in identifying single genetic variants associated with complex traits, detecting the gene-gene and gene-environment interactions remains a great challenge. When a large number of genetic variants and environmental risk factors are involved, searching for interactions is limited to pair-wise interactions due to the exponentially increased feature space and computational intensity. Alternatively, recursive partitioning approaches, such as random forests, have gained popularity in high-dimensional genetic association studies. In this article, we propose a U-Statistic-based random forest approach, referred to as Forest U-Test, for genetic association studies with quantitative traits. Through simulation studies, we showed that the Forest U-Test outperformed existing methods. The proposed method was also applied to study Cannabis Dependence CD, using three independent datasets from the Study of Addiction: Genetics and Environment. A significant joint association was detected with an empirical p-value less than 0.001. The finding was also replicated in two independent datasets with p-values of 5.93e-19 and 4.70e-17, respectively.

artificial intelligence, decision tree learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.2741/e576

2508.14924

Country: North America > United States > Michigan (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.86)

arXiv.org Machine LearningMar-4-2025

A Practical Introduction to Kernel Discrepancies: MMD, HSIC & KSD

Schrab, Antonin

This article provides a practical introduction to kernel discrepancies, focusing on the Maximum Mean Discrepancy (MMD), the Hilbert-Schmidt Independence Criterion (HSIC), and the Kernel Stein Discrepancy (KSD). Various estimators for these discrepancies are presented, including the commonly-used V-statistics and U-statistics, as well as several forms of the more computationallyefficient incomplete U-statistics. The importance of the choice of kernel bandwidth is stressed, showing how it affects the behaviour of the discrepancy estimation. Adaptive estimators are introduced, which combine multiple estimators with various kernels, addressing the problem of kernel selection. This paper corresponds to the introduction of my PhD thesis (Schrab, 2025a, Chapter 2) and is presented as a standalone article to introduce the reader to kernel discrepancies estimators. First, in Section 1, we define kernels, Reproducing Kernel Hilbert Spaces, mean embeddings and cross-covariance operators, and present kernel properties such as characteristicity, universality and translation invariance. Then, in Section 2, we introduce the Maximum Mean Discprecancy, the Hilbert-Schmidt Independence Criterion, and the Kernel Stein Discrepancy, as well as their estimators, and we discuss the importance of the choice of kernel for such measures. We then introduce a collection of statistics in Section 3, including the commonly-used complete statistics, as well as their incomplete counterparts which trade accuracy for computational efficiency. Finally, in Section 4, we construct adaptive estimators combining multiple statistics with various kernels, which is one method to address the problem of kernel selection.

kernel, kernel discrepancy, practical introduction, (13 more...)

2503.0482

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Mun, Jongmin, Kwak, Seungwoo, Kim, Ilmun

Minimax Optimal Two-Sample Testing under Local Differential Privacy

arXiv.org Machine LearningNov-22-2024

We explore the trade-off between privacy and statistical utility in private two-sample testing under local differential privacy (LDP) for both multinomial and continuous data. We begin by addressing the multinomial case, where we introduce private permutation tests using practical privacy mechanisms such as Laplace, discrete Laplace, and Google's RAPPOR. We then extend our multinomial approach to continuous data via binning and study its uniform separation rates under LDP over H\"older and Besov smoothness classes. The proposed tests for both discrete and continuous cases rigorously control the type I error for any finite sample size, strictly adhere to LDP constraints, and achieve minimax separation rates under LDP. The attained minimax rates reveal inherent privacy-utility trade-offs that are unavoidable in private testing. To address scenarios with unknown smoothness parameters in density testing, we propose an adaptive test based on a Bonferroni-type approach that ensures robust performance without prior knowledge of the smoothness parameters. We validate our theoretical findings with extensive numerical experiments and demonstrate the practical relevance and effectiveness of our proposed methods.

artificial intelligence, machine learning, mechanism, (18 more...)

2411.09064

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Neural Information Processing SystemsOct-3-2024, 23:20:40 GMT

A Linear-Time Kernel Goodness-of-Fit Test

We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. We learn the test features that best indicate the differences between observed samples and a reference model, by minimizing the false negative rate. These features are constructed via Stein's method, meaning that it is not necessary to compute the normalising constant of the model. We analyse the asymptotic Bahadur efficiency of the new test, and prove that under a mean-shift alternative, our test always has greater relative efficiency than a previous linear-time kernel test, regardless of the choice of parameters for that test. In experiments, the performance of our method exceeds that of the earlier linear-time test, and matches or exceeds the power of a quadratic-time kernel test. In high dimensions and where model structure may be exploited, our goodness of fit test performs far better than a quadratic-time two-sample test based on the Maximum Mean Discrepancy, with samples drawn from the model.

efficiency, fssd 2, test power, (13 more...)

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)