AITopics | Engstrom, Logan

Collaborating Authors

Engstrom, Logan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Optimizing ML Training with Metagradient Descent

Engstrom, Logan, Ilyas, Andrew, Chen, Benjamin, Feldmann, Axel, Moses, William, Madry, Aleksander

arXiv.org Machine LearningMar-17-2025

A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based approach to this problem. We first introduce an algorithm for efficiently calculating metagradients -- gradients through model training -- at scale. We then introduce a "smooth model training" framework that enables effective optimization using metagradients. With metagradient descent (MGD), we greatly improve on existing dataset selection methods, outperform accuracy-degrading data poisoning attacks by an order of magnitude, and automatically find competitive learning rate schedules.

artificial intelligence, machine learning, metagradient, (16 more...)

arXiv.org Machine Learning

2503.13751

Country:

Asia (0.28)
North America > United States (0.28)
Europe (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.46)
Information Technology (0.45)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

DsDm: Model-Aware Dataset Selection with Datamodels

Engstrom, Logan, Feldmann, Axel, Madry, Aleksander

arXiv.org Artificial IntelligenceJan-23-2024

When selecting data for training large-scale models, standard practice is to filter for examples that match human notions of data quality. Such filtering yields qualitatively clean datapoints that intuitively should improve model behavior. However, in practice the opposite can often happen: we find that selecting according to similarity with "high quality" data sources may not increase (and can even hurt) performance compared to randomly selecting data. To develop better methods for selecting data, we start by framing dataset selection as an optimization problem that we can directly solve for: given target tasks, a learning algorithm, and candidate data, select the subset that maximizes model performance. This framework thus avoids handpicked notions of data quality, and instead models explicitly how the learning process uses train datapoints to predict on the target tasks. Our resulting method greatly improves language model (LM) performance on both pre-specified tasks and previously unseen tasks. Specifically, choosing target tasks representative of standard LM problems and evaluating on diverse held-out benchmarks, our selected datasets provide a 2x compute multiplier over baseline methods.

large language model, machine learning, target task, (20 more...)

arXiv.org Artificial Intelligence

2401.12926

Country:

Oceania (1.00)
Asia (1.00)
Africa (1.00)
(2 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Leisure & Entertainment (1.00)
Education (1.00)
Media (0.92)
(4 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
(2 more...)

Add feedback

FFCV: Accelerating Training by Removing Data Bottlenecks

Leclerc, Guillaume, Ilyas, Andrew, Engstrom, Logan, Park, Sung Min, Salman, Hadi, Madry, Aleksander

arXiv.org Artificial IntelligenceJun-21-2023

We present FFCV, a library for easy and fast machine learning model training. FFCV speeds up model training by eliminating (often subtle) data bottlenecks from the training process. In particular, we combine techniques such as an efficient file storage format, caching, data pre-loading, asynchronous data transfer, and just-in-time compilation to (a) make data loading and transfer significantly more efficient, ensuring that GPUs can reach full utilization; and (b) offload as much data processing as possible to the CPU asynchronously, freeing GPU cycles for training. Using FFCV, we train ResNet-18 and ResNet-50 on the ImageNet dataset with competitive tradeoff between accuracy and training time. For example, we are able to train an ImageNet ResNet-50 model to 75\% in only 20 mins on a single machine. We demonstrate FFCV's performance, ease-of-use, extensibility, and ability to adapt to resource constraints through several case studies. Detailed installation instructions, documentation, and Slack support channel are available at https://ffcv.io/ .

artificial intelligence, ffcv, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.12517

Country: North America > United States (0.29)

Genre: Research Report (0.64)

Industry: Information Technology (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation

Vendrow, Joshua, Jain, Saachi, Engstrom, Logan, Madry, Aleksander

arXiv.org Artificial IntelligenceJun-19-2023

Distribution shift is a major source of failure for machine learning models. However, evaluating model reliability under distribution shift can be challenging, especially since it may be difficult to acquire counterfactual examples that exhibit a specified shift. In this work, we introduce the notion of a dataset interface: a framework that, given an input dataset and a user-specified shift, returns instances from that input distribution that exhibit the desired shift. We study a number of natural implementations for such an interface, and find that they often introduce confounding shifts that complicate model evaluation. Motivated by this, we propose a dataset interface implementation that leverages Textual Inversion to tailor generation to the input distribution. We then demonstrate how applying this dataset interface to the ImageNet dataset enables studying model behavior across a diverse array of distribution shifts, including variations in background, lighting, and attributes of the objects. Code available at https://github.com/MadryLab/dataset-interfaces.

artificial intelligence, distribution shift, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.07865

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Government > Military (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

Datamodels: Predicting Predictions from Training Data

Ilyas, Andrew, Park, Sung Min, Engstrom, Logan, Leclerc, Guillaume, Madry, Aleksander

arXiv.org Machine LearningFeb-1-2022

We present a conceptual framework, datamodeling, for analyzing the behavior of a model class in terms of the training data. For any fixed "target" example $x$, training set $S$, and learning algorithm, a datamodel is a parameterized function $2^S \to \mathbb{R}$ that for any subset of $S' \subset S$ -- using only information about which examples of $S$ are contained in $S'$ -- predicts the outcome of training a model on $S'$ and evaluating on $x$. Despite the potential complexity of the underlying process being approximated (e.g., end-to-end training and evaluation of deep neural networks), we show that even simple linear datamodels can successfully predict model outputs. We then demonstrate that datamodels give rise to a variety of applications, such as: accurately predicting the effect of dataset counterfactuals; identifying brittle predictions; finding semantically similar examples; quantifying train-test leakage; and embedding data into a well-behaved and feature-rich representation space. Data for this paper (including pre-computed datamodels as well as raw predictions from four million trained deep neural networks) is available at https://github.com/MadryLab/datamodels-data .

artificial intelligence, datamodel, machine learning, (21 more...)

arXiv.org Machine Learning

2202.00622

Country:

North America > United States (0.46)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (0.92)

Industry: Transportation > Ground (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

3DB: A Framework for Debugging Computer Vision Models

Leclerc, Guillaume, Salman, Hadi, Ilyas, Andrew, Vemprala, Sai, Engstrom, Logan, Vineet, Vibhav, Xiao, Kai, Zhang, Pengchuan, Santurkar, Shibani, Yang, Greg, Kapoor, Ashish, Madry, Aleksander

arXiv.org Machine LearningJun-7-2021

We introduce 3DB: an extendable, unified framework for testing and debugging vision models using photorealistic simulation. We demonstrate, through a wide range of use cases, that 3DB allows users to discover vulnerabilities in computer vision systems and gain insights into how models make decisions. 3DB captures and generalizes many robustness analyses from prior work, and enables one to study their interplay. Finally, we find that the insights generated by the system transfer to the physical world. We are releasing 3DB as a library (https://github.com/3db/3db) alongside a set of example analyses, guides, and documentation: https://3db.github.io/3db/ .

deep learning, neural network, us government, (19 more...)

arXiv.org Machine Learning

2106.03805

Country: North America > United States (0.94)

Genre: Research Report > Experimental Study (0.46)

Industry:

Leisure & Entertainment (0.94)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Identifying Statistical Bias in Dataset Replication

Engstrom, Logan, Ilyas, Andrew, Santurkar, Shibani, Tsipras, Dimitris, Steinhardt, Jacob, Madry, Aleksander

arXiv.org Machine LearningSep-2-2020

The primary objective of supervised learning is to develop models that generalize robustly to unseen data. Benchmark test sets provide a proxy for out-of-sample performance, but can outlive their usefulness in some cases. For example, evaluating on benchmarks alone may steer us towards models that adaptively overfit [Reu03; RFR08; Dwo 15] to the finite test set and do not generalize. Alternatively, we might select for models that are sensitive to insignificant aspects of the dataset creation process and thus do not generalize robustly (e.g., models that are sensitive to the exact set of humans who annotated the test set). To diagnose these issues, recent work has generated new, previously "unseen" testbeds for standard datasets through a process known as dataset replication. Though not yet widespread in machine learning, dataset replication is a natural analogue to experimental replication studies in the natural sciences (cf.

artificial intelligence, machine learning, selection frequency, (19 more...)

arXiv.org Machine Learning

2005.09619

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.46)
Transportation (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Do Adversarially Robust ImageNet Models Transfer Better?

Salman, Hadi, Ilyas, Andrew, Engstrom, Logan, Kapoor, Ashish, Madry, Aleksander

arXiv.org Machine LearningJul-16-2020

Transfer learning is a widely-used paradigm in deep learning, where models pre-trained on standard datasets can be efficiently adapted to downstream tasks. Typically, better pre-trained models yield better transfer results, suggesting that initial accuracy is a key aspect of transfer learning performance. In this work, we identify another such aspect: we find that adversarially robust models, while less accurate, often perform better than their standard-trained counterparts when used for transfer learning. Specifically, we focus on adversarially robust ImageNet classifiers, and show that they yield improved accuracy on a standard suite of downstream classification tasks. Further analysis uncovers more differences between robust and standard models in the context of transfer learning. Our results are consistent with (and in fact, add to) recent hypotheses stating that robustness leads to improved feature representations. Our code and models are available at https://github.com/Microsoft/robust-models-transfer .

accuracy, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

2007.08489

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

Engstrom, Logan, Ilyas, Andrew, Santurkar, Shibani, Tsipras, Dimitris, Janoos, Firdaus, Rudolph, Larry, Madry, Aleksander

arXiv.org Machine LearningMay-25-2020

We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms: Proximal Policy Optimization (PPO) and Trust Region Policy Optimization (TRPO). Specifically, we investigate the consequences of "code-level optimizations:" algorithm augmentations found only in implementations or described as auxiliary details to the core algorithm. Seemingly of secondary importance, such optimizations turn out to have a major impact on agent behavior. Our results show that they (a) are responsible for most of PPO's gain in cumulative reward over TRPO, and (b) fundamentally change how RL methods function. These insights show the difficulty and importance of attributing performance gains in deep reinforcement learning. Code for reproducing our results is available at https://github.com/MadryLab/implementation-matters .

artificial intelligence, optimization, reinforcement learning, (18 more...)

arXiv.org Machine Learning

2005.12729

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Add feedback

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Tsipras, Dimitris, Santurkar, Shibani, Engstrom, Logan, Ilyas, Andrew, Madry, Aleksander

arXiv.org Machine LearningMay-22-2020

Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset---including the introduction of biases that state-of-the-art models exploit. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for. Finally, our findings emphasize the need to augment our current model training and evaluation toolkit to take such misalignments into account. To facilitate further research, we release our refined ImageNet annotations at https://github.com/MadryLab/ImageNetMultiLabel.

crowdsourcing, imagenet label, neural network, (22 more...)

arXiv.org Machine Learning

2005.11295

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.88)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Leisure & Entertainment > Sports (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback