AITopics | shuffling

Collaborating Authors

shuffling

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Outline

Neural Information Processing SystemsApr-26-2026, 12:59:44 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Data Science > Data Mining (0.67)

Add feedback

65ccdfe02045fa0b823c5fa7ffd56b66-Paper-Conference.pdf

Neural Information Processing SystemsApr-26-2026, 12:59:40 GMT

artificial intelligence, data mining, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.98)
Information Technology > Data Science > Data Mining (0.93)

Add feedback

Scalable DP-SGD: Shuffling vs. Poisson Subsampling

Neural Information Processing SystemsMar-21-2026, 08:33:47 GMT

We provide new lower bounds on the privacy guarantee of Adaptive Batch Linear Queries (ABLQ) mechanism with, demonstrating substantial gaps when compared to; prior analysis was limited to a single epoch.Since the privacy analysis of Differentially Private Stochastic Gradient Descent (DP-SGD) is obtained by analyzing the ABLQ mechanism, this brings into serious question the common practice of implementing Shuffling based DP-SGD, but reporting privacy parameters as if Poisson subsampling was used.To understand the impact of this gap on the utility of trained machine learning models, we introduce a novel practical approach to implement Poisson subsampling using massively parallel computation, and efficiently train models with the same.We provide a comparison between the utility of models trained with Poisson subsampling based DP-SGD, and the optimistic estimates of utility when using shuffling, via our new lower bounds on the privacy guarantee of ABLQ with shuffling.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)

Add feedback

Scalable DP-SGD: Shuffling vs. Poisson Subsampling

Neural Information Processing SystemsFeb-16-2026, 04:55:30 GMT

A common approach for private training of differentiable models, such as neural networks, is to apply first-order methods with noisy gradients.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

65ccdfe02045fa0b823c5fa7ffd56b66-Paper-Conference.pdf

Neural Information Processing SystemsFeb-9-2026, 11:46:45 GMT

We show the utility of our method by applying it to gradient descent with shuffling and mini-batch gradient descent, reaffirming key results from existing literature under a unified framework.

artificial intelligence, machine learning, markov chain, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

Add feedback

PrivacyAmplificationviaRandomCheck-Ins

Neural Information Processing SystemsFeb-8-2026, 00:15:21 GMT

In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL) wherein the data is distributed among many devices(clients).

amplification, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Florida > Broward County > Fort Lauderdale (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(3 more...)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.93)

Add feedback

Same model, better performance: the impact of shuffling on DNA Language Models benchmarking

Greco, Davide, Rawlik, Konrad

arXiv.org Artificial IntelligenceDec-12-2025

Large Language Models are increasingly popular in genomics due to their potential to decode complex biological sequences. Hence, researchers require a standardized benchmark to evaluate DNA Language Models (DNA LMs) capabilities. However, evaluating DNA LMs is a complex task that intersects genomic's domain-specific challenges and machine learning methodologies, where seemingly minor implementation details can significantly compromise benchmark validity. We demonstrate this through BEND (Benchmarking DNA Language Models), where hardware-dependent hyperparameters -- number of data loading workers and buffer sizes -- create spurious performance variations of up to 4% for identical models. The problem stems from inadequate data shuffling interacting with domain specific data characteristics. Experiments with three DNA language models (HyenaDNA, DNABERT-2, ResNet-LM) show these artifacts affect both absolute performance and relative model rankings. We propose a simple solution: pre-shuffling data before storage eliminates hardware dependencies while maintaining efficiency. This work highlights how standard ML practices can interact unexpectedly with domain-specific data characteristics, with broader implications for benchmark design in specialized domains.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.12617

Genre: Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

Add feedback

81c252b45b3bd7d9bf080eb27794b762-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 07:43:06 GMT

batch, dp-sgd, poisson, (16 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Data Science (0.67)

Add feedback

Oblivious Sampling Algorithms for Private Data Analysis

Sajin Sasy, Olga Ohrimenko

Neural Information Processing SystemsOct-3-2025, 00:02:58 GMT

Trusted execution environments (TEEs) can be used to protect the content of the data during query computation, while supporting differential-private (DP) queries in TEEs provides record privacy when query output is revealed.

data mining, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (1.00)

Technology: