On the Asymptotics of Self-Supervised Pre-training: Two-Stage M-Estimation and Representation Symmetry

Tinati, Mohammad, Tu, Stephen

arXiv.org Machine Learning

Self-supervised pre-training, where large corpora of unlabeled data are used to learn representations for downstream fine-tuning, has become a cornerstone of modern machine learning. While a growing body of theoretical work has begun to analyze this paradigm, existing bounds leave open the question of how sharp the current rates are, and whether they accurately capture the complex interaction between pre-training and fine-tuning. In this paper, we address this gap by developing an asymptotic theory of pre-training via two-stage M-estimation. A key challenge is that the pre-training estimator is often identifiable only up to a group symmetry, a feature common in representation learning that requires careful treatment. We address this issue using tools from Riemannian geometry to study the intrinsic parameters of the pre-training representation, which we link with the downstream predictor through a notion of orbit-invariance, precisely characterizing the limiting distribution of the downstream test risk. We apply our main result to several case studies, including spectral pre-training, factor models, and Gaussian mixture models, and obtain substantial improvements in problem-specific factors over prior art when applicable.
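As a rough sketch of the setup the abstract describes (in generic notation; the losses, the group G, and the sample sizes n, m are placeholders rather than the paper's exact definitions), two-stage M-estimation with a pre-training symmetry can be written as

\[
\hat{\theta}_n \in \operatorname*{arg\,min}_{\theta \in \Theta} \frac{1}{n} \sum_{i=1}^{n} \ell_{\mathrm{pre}}(z_i; \theta),
\qquad
\hat{\beta}_m \in \operatorname*{arg\,min}_{\beta \in B} \frac{1}{m} \sum_{j=1}^{m} \ell_{\mathrm{ft}}\big(x_j, y_j; \hat{\theta}_n, \beta\big),
\]

where the pre-training loss satisfies \(\ell_{\mathrm{pre}}(z; g \cdot \theta) = \ell_{\mathrm{pre}}(z; \theta)\) for every \(g\) in a group \(G\) acting on \(\Theta\), so the first stage identifies \(\theta\) only up to its orbit \(\{ g \cdot \theta : g \in G \}\), which is the identifiability issue the paper handles with Riemannian-geometric tools.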


On the Convergence of Encoder-only Shallow Transformers

Neural Information Processing Systems

In addition, a neural tangent kernel (NTK) based analysis is given, which facilitates a comprehensive comparison. Our theory demonstrates the separation in importance between different scaling schemes and initializations.




A Concept uniqueness and granularity

Neural Information Processing Systems

Here, we report statistics about the uniqueness of neuron concepts as we increase the maximum formula length of our explanations. Figure S1: number of repeated concepts across probed vision and NLI models, by maximum formula length. Table S1: for probed Image Classification and NLI models, the average number of occurrences of each detected concept and the percentage of detected concepts that are unique (i.e., that occur exactly once). A.1 Image Classification: Figure S1 (left) plots the number of times each unique concept appears across the 512 units of ResNet-18 as the maximum formula length increases. Table S1 displays the mean number of occurrences per concept and the percentage of detected concepts that are unique.
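As an illustration of how such uniqueness statistics can be computed, a minimal sketch in Python (hypothetical input; it assumes the per-unit explanations are already available as one concept string per probed unit, e.g. 512 entries for ResNet-18):

    from collections import Counter

    # Hypothetical input: one detected concept (formula string) per probed unit.
    concepts = ["water", "water", "sky OR cloud", "grass", "grass", "grass", "dog"]

    counts = Counter(concepts)

    # Concepts detected for more than one unit ("repeated" concepts, as in Figure S1).
    num_repeated = sum(1 for c in counts.values() if c > 1)

    # Average number of occurrences per detected concept (as in Table S1).
    mean_occurrences = sum(counts.values()) / len(counts)

    # Percentage of detected concepts that are unique, i.e. occur exactly once.
    pct_unique = 100.0 * sum(1 for c in counts.values() if c == 1) / len(counts)

    print(num_repeated, mean_occurrences, pct_unique)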



492114f6915a69aa3dd005aa4233ef51-Supplemental.pdf

Neural Information Processing Systems

A deterministic path uses self-attention and cross-attention to summarize contexts. B.1 1D Regression Architectures: for models without attention (CNP, NP, BNP), we set ℓ_pre = 4, ℓ_post = 2, ℓ_dec = 3, and d_h = 128. For NP we set d_z = 128. For Student-t noise, we added ε ~ γ · T(2.1) to the curves generated from a GP with RBF kernel, where T(2.1) is a Student's t distribution with 2.1 degrees of freedom and γ ~ Unif(0, 0.15). After realizing them, the functions drawn from the prior are optimized via Bayesian optimization.
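A minimal sketch of the heavy-tailed noise scheme described above, assuming a standard RBF-kernel GP prior (the lengthscale, output scale, and grid are illustrative choices, not the paper's settings):

    import numpy as np

    rng = np.random.default_rng(0)

    # Sample one curve from a GP prior with an RBF kernel.
    x = np.linspace(-2.0, 2.0, 100)
    lengthscale, outputscale = 0.5, 1.0  # illustrative hyperparameters
    K = outputscale * np.exp(-0.5 * ((x[:, None] - x[None, :]) / lengthscale) ** 2)
    y = rng.multivariate_normal(np.zeros_like(x), K + 1e-6 * np.eye(len(x)))

    # Heavy-tailed corruption: eps ~ gamma * T(2.1), with gamma ~ Unif(0, 0.15).
    gamma = rng.uniform(0.0, 0.15)
    eps = gamma * rng.standard_t(df=2.1, size=len(x))
    y_noisy = y + eps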




Fine Tuning a Simulation-Driven Estimator

Lakshminarayanan, Braghadeesh, Guerrero, Margarita A., Rojas, Cristian R.

arXiv.org Machine Learning

Many industries now deploy high-fidelity simulators (digital twins) to represent physical systems, yet their parameters must be calibrated to match the true system. This motivated the construction of simulation-driven parameter estimators, built by generating synthetic observations for sampled parameter values and learning a supervised mapping from observations to parameters. However, when the true parameters lie outside the sampled range, predictions suffer from an out-of-distribution (OOD) error. This paper introduces a fine-tuning approach for the Two-Stage estimator that mitigates OOD effects and improves accuracy. The effectiveness of the proposed method is verified through numerical simulations.
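To illustrate the general simulation-driven recipe the abstract describes (not the authors' exact Two-Stage estimator or fine-tuning rule), a hedged sketch: sample parameters, simulate observations, fit a supervised map from observations to parameters, then continue training on parameters re-sampled around a preliminary estimate so the estimator adapts when the true parameter lies outside the original sampling range. The simulator, ranges, and network sizes below are placeholders.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)

    def simulate(theta, n=50):
        # Placeholder simulator (digital twin): noisy exponential decay with rate theta.
        t = np.linspace(0.0, 1.0, n)
        return np.exp(-theta * t) + 0.01 * rng.standard_normal(n)

    # Stage 1: learn a supervised mapping observations -> parameter on an assumed range.
    thetas = rng.uniform(0.5, 2.0, size=1000)
    X = np.stack([simulate(th) for th in thetas])
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, warm_start=True)
    model.fit(X, thetas)

    # Observed data whose true parameter may lie outside [0.5, 2.0] (OOD).
    y_obs = simulate(theta=2.5)
    theta_hat = float(model.predict(y_obs[None, :])[0])

    # Fine-tuning: re-simulate in a neighbourhood of the preliminary estimate and
    # continue training (warm_start=True resumes from the current weights).
    thetas_ft = rng.uniform(theta_hat - 0.5, theta_hat + 0.5, size=200)
    X_ft = np.stack([simulate(th) for th in thetas_ft])
    model.fit(X_ft, thetas_ft)

    theta_refined = float(model.predict(y_obs[None, :])[0])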