Data Diet


Deep Learning on a Data Diet: Finding Important Examples Early in Training

Neural Information Processing Systems

Recent success in deep learning has partially been driven by training increasingly overparametrized networks on ever larger datasets. It is therefore natural to ask: how much of the data is superfluous, which examples are important for generalization, and how do we find them? In this work, we make the striking observation that, in standard vision datasets, simple scores averaged over several weight initializations can be used to identify important examples very early in training. We propose two such scores--the Gradient Normed (GraNd) and the Error L2-Norm (EL2N) scores--and demonstrate their efficacy on a range of architectures and datasets by pruning significant fractions of training data without sacrificing test accuracy. In fact, using EL2N scores calculated a few epochs into training, we can prune half of the CIFAR10 training set while slightly improving test accuracy.
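The EL2N score described above (the L2 norm of the error vector, i.e. the softmax output minus the one-hot label, averaged over several runs) can be sketched in a few lines. A minimal NumPy sketch, assuming softmax outputs have already been collected from several independently initialized models; function and variable names are illustrative, not from the paper's codebase:

```python
import numpy as np

def el2n_scores(probs, labels, num_classes):
    """EL2N score: L2 norm of the error vector (softmax output minus
    one-hot label), averaged over independent runs.

    probs:  array of shape (runs, n_examples, num_classes) holding softmax
            outputs from several independently initialized/trained models.
    labels: integer array of shape (n_examples,).
    """
    onehot = np.eye(num_classes)[labels]        # (n_examples, num_classes)
    errors = probs - onehot[None, :, :]         # broadcast over runs
    norms = np.linalg.norm(errors, axis=-1)     # (runs, n_examples)
    return norms.mean(axis=0)                   # average over runs

def prune_by_score(scores, keep_fraction):
    """Keep only the highest-scoring (hardest) fraction of examples."""
    k = int(len(scores) * keep_fraction)
    return np.argsort(scores)[-k:]
```

A perfectly predicted example has error vector zero and hence score zero, so pruning by score discards the examples the models already find easy.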


Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks

Neural Information Processing Systems

A striking observation about iterative magnitude pruning (IMP; Frankle et al. 2020) is that--after just a few hundred steps of dense training--the method can find a sparse sub-network that can be trained to the same accuracy as the dense network. However, the same does not hold at step 0, i.e. random initialization. In this work, we seek to understand how this early phase of pre-training leads to a good initialization for IMP both through the lens of the data distribution and the loss landscape geometry. Empirically we observe that, holding the number of pre-training iterations constant, training on a small fraction of (randomly chosen) data suffices to obtain an equally good initialization for IMP. We additionally observe that by pre-training only on "easy" training data, we can decrease the number of steps necessary to find a good initialization for IMP compared to training on the full dataset or a randomly chosen subset.


Self-Driving Cars Are Being Put on a Data Diet

WIRED

For self-driving-car developers, like many iPhone and Google Photos users, the growing cost of storing files on the cloud has become a nagging headache. Early on, robocar companies pursued a brute-force approach to maximize miles and data. "We could take all the data the cars have seen over time, the hundreds of thousands of pedestrians, cyclists, and vehicles, [and] take from that a model of how we expect them to move," said Chris Urmson, an early leader of Google's self-driving project, in a 2015 TED Talk. Urmson spoke at a time when autonomous vehicle prototypes were relatively few and the handful of companies testing them could afford to keep almost every data point they scooped up from the road. But nearly a decade later, Google's project and many others have fallen far behind their own predictions of the timeline for success.


Does "Deep Learning on a Data Diet" reproduce? Overall yes, but GraNd at Initialization does not

Kirsch, Andreas

arXiv.org Artificial Intelligence

The paper 'Deep Learning on a Data Diet' by Paul et al. (2021) introduces two innovative metrics for pruning datasets during the training of neural networks. While we are able to replicate the results for the EL2N score at epoch 20, the same cannot be said for the GraNd score at initialization. The GraNd scores later in training provide useful pruning signals, however. The GraNd score at initialization calculates the average gradient norm of an input sample across multiple randomly initialized models before any training has taken place. Our analysis reveals a strong correlation between the GraNd score at initialization and the input norm of a sample, suggesting that the latter could have been a cheap new baseline for data pruning. Unfortunately, neither the GraNd score at initialization nor the input norm surpasses random pruning in performance. This contradicts one of the findings in Paul et al. (2021). We were unable to reproduce their CIFAR-10 results using both an updated version of the original JAX repository and a newly implemented PyTorch codebase. An investigation of the underlying JAX/FLAX code from 2021 surfaced a bug in the checkpoint-restoring code that was fixed in April 2021 (https://github.com/google/flax/commit/28fbd95500f4bf2f9924d2560062fa50e919b1a5).
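The reported correlation between GraNd at initialization and the input norm has a simple analytic analogue in the linear case, which may help build intuition: for a linear softmax classifier, the cross-entropy gradient with respect to the weight matrix is the outer product (p - y)x^T, whose Frobenius norm factorizes exactly as ||p - y|| * ||x||. At a small random initialization p is close to uniform for every example, so the input norm dominates the score. A minimal NumPy sketch of this factorization (all names illustrative; this is a simplified linear model, not the networks studied in the paper):

```python
import numpy as np

def softmax(z):
    z = z - z.max()               # numerical stability
    e = np.exp(z)
    return e / e.sum()

def grand_at_init_linear(x, y_onehot, W):
    """GraNd for a linear softmax classifier: the cross-entropy gradient
    w.r.t. W for one example is the outer product (p - y) x^T, so its
    Frobenius norm factorizes as ||p - y|| * ||x||."""
    p = softmax(W @ x)
    grad = np.outer(p - y_onehot, x)
    return np.linalg.norm(grad)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(10, 32))   # small random init: p near uniform
x = rng.normal(size=32)
y = np.eye(10)[3]
score = grand_at_init_linear(x, y, W)
```

Because ||p - y|| is nearly constant across examples at such an initialization, the score is essentially a rescaled input norm, mirroring the correlation the authors observe empirically.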


Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks

Paul, Mansheej, Larsen, Brett W., Ganguli, Surya, Frankle, Jonathan, Dziugaite, Gintare Karolina

arXiv.org Machine Learning

A striking observation about iterative magnitude pruning (IMP; Frankle et al. 2020) is that--after just a few hundred steps of dense training--the method can find a sparse sub-network that can be trained to the same accuracy as the dense network. However, the same does not hold at step 0, i.e. random initialization. In this work, we seek to understand how this early phase of pre-training leads to a good initialization for IMP both through the lens of the data distribution and the loss landscape geometry. Empirically we observe that, holding the number of pre-training iterations constant, training on a small fraction of (randomly chosen) data suffices to obtain an equally good initialization for IMP. We additionally observe that by pre-training only on "easy" training data, we can decrease the number of steps necessary to find a good initialization for IMP compared to training on the full dataset or a randomly chosen subset. Finally, we identify novel properties of the loss landscape of dense networks that are predictive of IMP performance, showing in particular that more examples being linearly mode connected in the dense network correlates well with good initializations for IMP. Combined, these results provide new insight into the role played by the early phase training in IMP.
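The IMP-with-rewinding loop the abstract refers to (train, prune the smallest-magnitude weights, rewind the survivors to an early checkpoint, repeat) can be sketched on a flat weight vector. This is a simplified illustration under stated assumptions, not the authors' implementation: `train_fn` is a placeholder for (sparse) training, and `init_weights` stands in for the saved early-training checkpoint:

```python
import numpy as np

def magnitude_prune(weights, mask, prune_frac):
    """Remove the smallest-magnitude fraction of the still-alive weights."""
    alive = weights[mask]
    k = int(len(alive) * prune_frac)
    if k == 0:
        return mask
    threshold = np.sort(np.abs(alive))[k - 1]
    return mask & (np.abs(weights) > threshold)

def imp_with_rewinding(init_weights, train_fn, rounds=3, prune_frac=0.2):
    """IMP with weight rewinding: after each round of training, prune by
    magnitude and rewind the surviving weights to the saved checkpoint
    (`init_weights`), then retrain under the new mask."""
    mask = np.ones_like(init_weights, dtype=bool)
    for _ in range(rounds):
        trained = train_fn(init_weights * mask, mask)  # placeholder training
        mask = magnitude_prune(trained, mask, prune_frac)
    return init_weights * mask, mask
```

Each round shrinks the mask multiplicatively (here by 20%), which is why a few rounds already yield a substantially sparse sub-network while the surviving weights remain tied to the early checkpoint.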


An Intelligence In Our Image: The Risks Of Bias And Errors In Artificial Intelligence - Liwaiwai

#artificialintelligence

Right now, artificial intelligence (AI) and countless algorithms are integrated into our daily lives. Because of the efficiency they bring to the table, the use of AI is only expected to widen. With humanity becoming more and more reliant on this technology, it is only natural to think about the implications. In contrast to the common impression that AI and algorithms are impartial and infallible, these technologies can fail miserably. William Welser IV and Osonde Osoba's An Intelligence in Our Image: The Risks of Bias and Errors in Artificial Intelligence evaluates algorithms and AI -- which they group together under the moniker "artificial agents" -- examining their shortcomings and how they can be combated.