AITopics | learning curve

Collaborating Authors

learning curve

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

How Patterns Dictate Learnability in Sequential Data

Neural Information Processing SystemsJun-15-2026, 15:03:34 GMT

Sequential data--ranging from financial time series to natural language--has driven the growing adoption of autoregressive models. However, these algorithms rely on the presence of underlying patterns in the data, and their identification often depends heavily on human expertise. Misinterpreting these patterns can lead to model misspecification, resulting in increased generalization error and degraded performance. The recently proposed evolving pattern (EvoRate) metric addresses this by using the mutual information between the next data point and its past to guide regression order estimation and feature selection. Building on this idea, we introduce a general framework based on predictive information--the mutual information between the past and the future, I(Xpast;Xfuture). This quantity naturally defines an information-theoretic learning curve, which quantifies the amount of predictive information available as the observation window grows. Using this formalism, we show that the presence or absence of temporal patterns fundamentally constrains the learnability of sequential models: even an optimal predictor cannot outperform the intrinsic information limit imposed by the data. We validate our framework through experiments on synthetic data, demonstrating its ability to assess model adequacy, quantify the inherent complexity of a dataset, and reveal interpretable structure in sequential data.

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

1543d6d5cb976e4f9fbfaedf2e257967-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing SystemsJun-15-2026, 01:41:12 GMT

Sample-wise learning curves plot performance versus training set size. They are useful for studying scaling laws and speeding up hyperparameter tuning and model selection. Learning curves are often assumed to be well-behaved: monotone (i.e.

artificial intelligence, lcdb 1, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America (0.28)
Asia (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.92)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
(2 more...)

Add feedback

fbd8e65962da06f83f3f28b52774ffd0-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-28-2026, 11:03:33 GMT

artificial intelligence, machine learning, point maze, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.70)

Add feedback

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Lu, Weihao, Lin, Qian, Xia, Yingcun, Huang, Dongming

arXiv.org Machine LearningApr-28-2026

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of spectral algorithms in the largedimensional setting where the sample size and dimension are of comparable order, i.e., n dγ for some γ > 0. We first consider inner-product kernels on the sphere Sd 1 and establish a sharp asymptotic characterization of the excess risk across the full regularization path under various source conditions s 0, where smeasures the relative smoothness of the regression function. Our results reveal that the learning curve is not simply U-shaped but instead consists of three distinct regimes: over-regularized, under-regularized, and interpolation regimes. This characterization allows us to fully capture the benign overfitting phenomenon, demonstrating that benign overfitting arises consistently across both the under-regularized and interpolation regimes whenever sis positive but no larger than a critical threshold. We further show that, in the sufficiently regularized regime, the kernel learning curve is recovered by an associated sequence model. Finally, we extend the learning-curve analysis to large-dimensional KRR for a class of kernels on general domains in Rd whose low-degree eigenspaces satisfy spectral-scaling and hyper-contractivity conditions. Keywords: Spectral algorithms, learning curves, high dimension, benign overfitting. 1 Introduction Nonparametric regression studies the estimation of an unknown function f: Rd R from ni.i.d.

artificial intelligence, learning curve, machine learning, (18 more...)

arXiv.org Machine Learning

2604.23212

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Add feedback

20e6b4dd2b1f82bc599c593882f67f75-Supplemental-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 19:40:32 GMT

artificial intelligence, foveal observation size, machine learning, (15 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games > Computer Games (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Add feedback

example where multi step outperforms one step

Neural Information Processing SystemsApr-25-2026, 04:38:49 GMT

As explained in the main text, this section presents an example that is only a slight modification of the one in Figure 4, but where a multi-step approach is clearly preferred over just one step. The data-generating and learning processes are exactly the same (100 trajectories of length 100, discount 0.9, α = 0.1for reverse KL regularization). The only difference is that rather than using a behavior that is a mixture of optimal and uniform, we use a behavior that is a mixture of maximally suboptimal and uniform. If we call the suboptimal policy π (which always goes down and left in our gridworld), then the behavior for the modified example is β = 0.2 π +0.8 u, where uis uniform. Results are shown in Figure 7. Figure 7: A gridworld example with modified behavior where multi-step is much better than one-step.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

Neural Information Processing SystemsApr-24-2026, 23:33:53 GMT

As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either largesample asymptotics (m!1) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension (m / d). There is a wide gulf between these two regimes, including all higher-order scaling relations m / dr, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the mean of the test error, bias, and variance, for data drawn uniformly from the sphere with isotropic random labels in the rth-order asymptotic scaling regime m!1 with m/dr held constant. We observe a peak in the learning curve whenever m dr/r! for any integer r, leading to multiple sample-wise descent and nontrivial behavior at multiple scales. We include a colab2 notebook that reproduces the essential results of the paper.

artificial intelligence, kernel, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Add feedback

AMore Discussion

Neural Information Processing SystemsApr-24-2026, 21:47:06 GMT

Why One-step and IQL are imitation-based methods? The core difference between RL-based and imitation-based methods is that RL-based methods learn a value function of policy π while imitation-based methods don't. Learning the value function of π requires off-policy evaluation of π (i.e., learning Qπ or Vπ), which is prone to distribution shift. The policy evaluation and policy improvement will also affect each other as they are coupled. Imitation-based methods don't learn Qπ or Vπ, but some of them do learn a value function.

artificial intelligence, iteration, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.37)

Add feedback

Filters

Collaborating Authors

learning curve

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

How Patterns Dictate Learnability in Sequential Data

1543d6d5cb976e4f9fbfaedf2e257967-Paper-Datasets_and_Benchmarks_Track.pdf

fbd8e65962da06f83f3f28b52774ffd0-Supplemental-Conference.pdf

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

20e6b4dd2b1f82bc599c593882f67f75-Supplemental-Conference.pdf

example where multi step outperforms one step

1d3591b6746204b332acb464b775d38d-Supplemental-Conference.pdf

Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

AMore Discussion

8078e8c3055303a884ffae2d3ea00338-Supplemental-Conference.pdf