AITopics

2502.09858

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceFeb-3-2025

s1: Simple test-time scaling

Muennighoff, Niklas, Yang, Zitong, Shi, Weijia, Li, Xiang Lisa, Fei-Fei, Li, Hajishirzi, Hannaneh, Zettlemoyer, Luke, Liang, Percy, Candès, Emmanuel, Hashimoto, Tatsunori

Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI's o1 model showed this capability but did not publicly share its methodology, leading to many replication efforts. We seek the simplest approach to achieve test-time scaling and strong reasoning performance. First, we curate a small dataset s1K of 1,000 questions paired with reasoning traces relying on three criteria we validate through ablations: difficulty, diversity, and quality. Second, we develop budget forcing to control test-time compute by forcefully terminating the model's thinking process or lengthening it by appending "Wait" multiple times to the model's generation when it tries to end. This can lead the model to double-check its answer, often fixing incorrect reasoning steps. After supervised finetuning the Qwen2.5-32B-Instruct language model on s1K and equipping it with budget forcing, our model s1-32B exceeds o1-preview on competition math questions by up to 27% (MATH and AIME24). Further, scaling s1-32B with budget forcing allows extrapolating beyond its performance without test-time intervention: from 50% to 57% on AIME24. Our model, data, and code are open-source at https://github.com/simplescaling/s1

large language model, machine learning, natural language, (19 more...)

2501.19393

Country: North America > United States (0.45)

Genre:

Research Report (0.52)
Workflow (0.46)

Industry:

Leisure & Entertainment > Games (0.46)
Education > Educational Setting > K-12 Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

arXiv.org Machine LearningMar-1-2024

A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features

Zeger, Emi, Wang, Yifei, Mishkin, Aaron, Ergen, Tolga, Candès, Emmanuel, Pilanci, Mert

We prove that training neural networks on 1-D data is equivalent to solving a convex Lasso problem with a fixed, explicitly defined dictionary matrix of features. The specific dictionary depends on the activation and depth. We consider 2-layer networks with piecewise linear activations, deep narrow ReLU networks with up to 4 layers, and rectangular and tree networks with sign activation and arbitrary depth. Interestingly in ReLU networks, a fourth layer creates features that represent reflections of training data about themselves.

artificial intelligence, deep learning, machine learning, (18 more...)

2403.01046

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceFeb-9-2024

Bellman Conformal Inference: Calibrating Prediction Intervals For Time Series

Yang, Zitong, Candès, Emmanuel, Lei, Lihua

Uncertainty quantification for time series nowcasting and forecasting is crucial in many areas such as climate science, epidemiology, industrial engineering, and macroeconomics. Ideally, the forecaster would generate a prediction interval at each time period that is calibrated in the sense that the fraction of intervals covering the true outcomes is approximately equal to the target coverage level in the long run. Classical approaches for generating prediction intervals are mostly model-based Box and Jenkins [1976], Engle [1982a], Stock and Watson [2010], Brown [1964], Jorda [2005]. However, time series models are often mis-specified due to nonstationarity or changing environments. As a result, the model-based prediction intervals tend to be poorly calibrated (see for instance the gray curves in Figure 1). Moreover, many forecasters have upgraded their workflows by incorporating black-box machine learning algorithms [e.g. Taylor and Letham, 2018, Makridakis et al., 2018, Herzen et al., 2022], for which valid uncertainty quantification proves to be challenging.

artificial intelligence, machine learning, prediction interval, (16 more...)

2402.05203

Country:

Europe (0.68)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report (0.64)
Workflow (0.48)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine LearningOct-30-2023

Uncertainty Quantification over Graph with Conformalized Graph Neural Networks

Huang, Kexin, Jin, Ying, Candès, Emmanuel, Leskovec, Jure

Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data. However, GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant. We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates. Given an entity in the graph, CF-GNN produces a prediction set/interval that provably contains the true label with pre-defined coverage probability (e.g. 90%). We establish a permutation invariance condition that enables the validity of CP on graph data and provide an exact characterization of the test-time coverage. Moreover, besides valid coverage, it is crucial to reduce the prediction set size/interval length for practical use. We observe a key connection between non-conformity scores and network structures, which motivates us to develop a topology-aware output correction model that learns to update the prediction and produces more efficient prediction sets/intervals. Extensive experiments show that CF-GNN achieves any pre-defined target marginal coverage while significantly reducing the prediction set/interval size by up to 74% over the baselines. It also empirically achieves satisfactory conditional coverage over various raw and network features.

artificial intelligence, machine learning, prediction, (18 more...)

2305.14535

Country: North America > United States (0.46)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.67)
Information Technology > Networks (0.48)
Telecommunications > Networks (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

arXiv.org Artificial IntelligenceOct-5-2023

Conformal Inference for Online Prediction with Arbitrary Distribution Shifts

Gibbs, Isaac, Candès, Emmanuel

We consider the problem of forming prediction sets in an online setting where the distribution generating the data is allowed to vary over time. Previous approaches to this problem suffer from over-weighting historical data and thus may fail to quickly react to the underlying dynamics. Here we correct this issue and develop a novel procedure with provably small regret over all local time intervals of a given width. We achieve this by modifying the adaptive conformal inference (ACI) algorithm of Gibbs and Cand\`{e}s (2021) to contain an additional step in which the step-size parameter of ACI's gradient descent update is tuned over time. Crucially, this means that unlike ACI, which requires knowledge of the rate of change of the data-generating mechanism, our new procedure is adaptive to both the size and type of the distribution shift. Our methods are highly flexible and can be used in combination with any baseline predictive algorithm that produces point estimates or estimated quantiles of the target without the need for distributional assumptions. We test our techniques on two real-world datasets aimed at predicting stock market volatility and COVID-19 case counts and find that they are robust and adaptive to real-world distribution shifts.

artificial intelligence, machine learning, prediction, (16 more...)

2208.08401

Country: North America > United States (1.00)

Genre: Research Report > Promising Solution (0.34)

Industry: Banking & Finance > Trading (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine LearningOct-28-2021

Adaptive Conformal Inference Under Distribution Shift

Gibbs, Isaac, Candès, Emmanuel

We develop methods for forming prediction sets in an online setting where the data generating distribution is allowed to vary over time in an unknown fashion. Our framework builds on ideas from conformal inference to provide a general wrapper that can be combined with any black box method that produces point predictions of the unseen label or estimated quantiles of its distribution. While previous conformal inference methods rely on the assumption that the data points are exchangeable, our adaptive approach provably achieves the desired coverage frequency over long-time intervals irrespective of the true data generating process. We accomplish this by modelling the distribution shift as a learning problem in a single parameter whose optimal value is varying over time and must be continuously re-estimated. We test our method, adaptive conformal inference, on two real world datasets and find that its predictions are robust to visible and significant distribution shifts.

artificial intelligence, machine learning, prediction, (21 more...)

2106.0017

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Banking & Finance > Trading (1.00)
Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Machine LearningApr-19-2021

Testing for Outliers with Conformal p-values

Bates, Stephen, Candès, Emmanuel, Lei, Lihua, Romano, Yaniv, Sesia, Matteo

This paper studies the construction of p-values for nonparametric outlier detection, taking a multiple-testing perspective. The goal is to test whether new independent samples belong to the same distribution as a reference data set or are outliers. We propose a solution based on conformal inference, a broadly applicable framework which yields p-values that are marginally valid but mutually dependent for different test points. We prove these p-values are positively dependent and enable exact false discovery rate control, although in a relatively weak marginal sense. We then introduce a new method to compute p-values that are both valid conditionally on the training data and independent of each other for different test points; this paves the way to stronger type-I error guarantees. Our results depart from classical conformal inference as we leverage concentration inequalities rather than combinatorial arguments to establish our finite-sample guarantees. Furthermore, our techniques also yield a uniform confidence bound for the false positive rate of any outlier detection algorithm, as a function of the threshold applied to its raw statistics. Finally, the relevance of our results is demonstrated by numerical experiments on real and simulated data.

artificial intelligence, conformal p-value, health & medicine, (19 more...)

2104.08279

Country: North America > United States > California (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Government (0.67)
Energy > Oil & Gas (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)