AITopics | Statistical Learning

We focus on the task of learning a single index model σ(w x) with respect to the isotropic Gaussian distribution in d dimensions. Prior work has shown that the sample complexity of learning w is governed by the information exponent k of the link function σ, which is defined as the index of the first nonzero Hermite coefficient of σ.

artificial intelligence, gradient descent, machine learning, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.30)

Add feedback

Defending against Data-Free Model Extraction by Distributionally Robust Defensive Training

Neural Information Processing SystemsApr-24-2026, 05:29:47 GMT

Data-Free Model Extraction (DFME) aims to clone a black-box model without knowing its original training data distribution, making it much easier for attackers to steal commercial models. Defense against DFME faces several challenges: (i) effectiveness; (ii) efficiency; (iii) no prior on the attacker's query data distribution and strategy. However, existing defense methods: (1) are highly computation and memory inefficient; or (2) need strong assumptions about attack data distribution; or (3) can only delay the attack or prove a model theft after the model stealing has happened. In this work, we propose a Memory and Computation efficient defense approach, named MeCo, to prevent DFME from happening while maintaining the model utility simultaneously by distributionally robust defensive training on the target victim model. Specifically, we randomize the input so that it: (1) causes a mismatch of the knowledge distillation loss for attackers; (2) disturbs the zerothorder gradient estimation; (3) changes the label prediction for the attack query data. Therefore, the attacker can only extract misleading information from the black-box model. Extensive experiments on defending against both decision-based and scorebased DFME demonstrate that MeCo can significantly reduce the effectiveness of existing DFME methods and substantially improve running efficiency.

artificial intelligence, data mining, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Maryland (0.28)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Security & Privacy (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

ASimple Solution for Offline Imitation from Observations and Examples with Possibly Incomplete Trajectories

Neural Information Processing SystemsApr-24-2026, 05:29:30 GMT

Offline imitation from observations aims to solve MDPs where only task-specific expert states and task-agnostic non-expert state-action pairs are available. Offline imitation is useful in real-world scenarios where arbitrary interactions are costly and expert actions are unavailable. The state-of-the-art'DIstribution Correction Estimation' (DICE) methods minimize divergence of state occupancy between expert and learner policies and retrieve a policy with weighted behavior cloning; however, their results are unstable when learning from incomplete trajectories, due to a non-robust optimization in the dual domain. To address the issue, in this paper, we propose Trajectory-Aware Imitation Learning from Observations (TAILO). TAILO uses a discounted sum along the future trajectory as the weight for weighted behavior cloning. The terms for the sum are scaled by the output of a discriminator, which aims to identify expert states. Despite simplicity, TAILO works well if there exist trajectories or segments of expert behavior in the task-agnostic data, a common assumption in prior work. In experiments across multiple testbeds, we find TAILO to be more robust and effective, particularly with incomplete trajectories.

machine learning, natural language, trajectory, (17 more...)

Neural Information Processing Systems

Genre:

Instructional Material (0.46)
Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

015a8c69bedcb0a7b2ed2e1678f34399-Paper-Conference.pdf

Neural Information Processing SystemsApr-24-2026, 05:03:36 GMT

annotator, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Scalable Membership Inference Attacks via Quantile Regression

Neural Information Processing SystemsApr-24-2026, 05:03:03 GMT

Membership inference attacks are designed to determine, using black box access to trained models, whether a particular example was used in training or not. Membership inference can be formalized as a hypothesis testing problem. The most effective existing attacks estimate the distribution of some test statistic (usually the model's confidence on the true label) on points that were (and were not) used in training by training many shadow models--i.e.

artificial intelligence, machine learning, membership inference attack, (14 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Add feedback

Provable Adversarial Robustness for Group Equivariant Tasks: Graphs, Point Clouds, Molecules, and More

Neural Information Processing SystemsApr-24-2026, 04:43:57 GMT

A machine learning model is traditionally considered robust if its prediction remains (almost) constant under input perturbations with small norm. However, real-world tasks like molecular property prediction or point cloud segmentation have inherent equivariances, such as rotation or permutation equivariance. In such tasks, even perturbations with large norm do not necessarily change an input's semantic content. Furthermore, there are perturbations for which a model's prediction explicitly needs to change. For the first time, we propose a sound notion of adversarial robustness that accounts for task equivariance.

artificial intelligence, machine learning, robustness, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (0.68)
Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)
Energy (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Adaptive Selective Sampling for Online Prediction with Experts

Neural Information Processing SystemsApr-24-2026, 04:42:40 GMT

We consider online prediction of a binary sequence with expert advice. For this setting, we devise label-efficient forecasting algorithms, which use a selective sampling scheme that enables collecting much fewer labels than standard procedures. For the general case without a perfect expert, we prove best-of-both-worlds guarantees, demonstrating that the proposed forecasting algorithm always queries sufficiently many labels in the worst case to obtain optimal regret guarantees, while simultaneously querying much fewer labels in more benign settings. Specifically, for a scenario where one expert is strictly better than the others in expectation, we show that the label complexity of the label-efficient forecaster is roughly upperbounded by the square root of the number of rounds. Finally, we present numerical experiments empirically showing that the normalized regret of the label-efficient forecaster can asymptotically match known minimax rates for pool-based active learning, suggesting it can optimally adapt to benign settings.

artificial intelligence, forecaster, machine learning, (14 more...)

Neural Information Processing Systems

Country: