AITopics | nullv

Collaborating Authors

nullv

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Scaling Laws for Precision in High-Dimensional Linear Regression

Zhang, Dechen, Tang, Xuan, Liang, Yingyu, Zou, Difan

arXiv.org Machine LearningFeb-27-2026

Low-precision training is critical for optimizing the trade-off between model quality and training costs, necessitating the joint allocation of model size, dataset size, and numerical precision. While empirical scaling laws suggest that quantization impacts effective model and data capacities or acts as an additive error, the theoretical mechanisms governing these effects remain largely unexplored. In this work, we initiate a theoretical study of scaling laws for low-precision training within a high-dimensional sketched linear regression framework. By analyzing multiplicative (signal-dependent) and additive (signal-independent) quantization, we identify a critical dichotomy in their scaling behaviors. Our analysis reveals that while both schemes introduce an additive error and degrade the effective data size, they exhibit distinct effects on effective model size: multiplicative quantization maintains the full-precision model size, whereas additive quantization reduces the effective model size. Numerical experiments validate our theoretical findings. By rigorously characterizing the complex interplay among model scale, dataset size, and quantization error, our work provides a principled theoretical basis for optimizing training protocols under practical hardware constraints.

artificial intelligence, assumption 3, machine learning, (16 more...)

arXiv.org Machine Learning

2602.19241

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

Add feedback

A Unified Discretization Framework for Differential Equation Approach with Lyapunov Arguments for Convex Optimization

Neural Information Processing SystemsFeb-11-2026, 20:48:03 GMT

The differential equation (DE) approach for convex optimization, which relates optimization methods to specific continuous DEs with rate-revealing Lyapunov functionals, has gained increasing interest since the seminal paper by Su-Boyd-Candès (2014).

artificial intelligence, optimization method, optimization problem, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Bandit Phase Retrieval

Tor Lattimore

Neural Information Processing SystemsFeb-10-2026, 05:24:47 GMT

Roy, 2014] are not sufficient for optimal regret.

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

8be627bc543fd91be4d7f26ee86f5ee9-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 19:01:42 GMT

exp, inequality, nullv, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.46)

Add feedback

Dimension-free error estimate for diffusion model and optimal scheduling

de Bortoli, Valentin, Elie, Romuald, Kazeykina, Anna, Ren, Zhenjie, Zhang, Jiacheng

arXiv.org Machine LearningDec-2-2025

Diffusion generative models have emerged as powerful tools for producing synthetic data from an empirically observed distribution. A common approach involves simulating the time-reversal of an Ornstein-Uhlenbeck (OU) process initialized at the true data distribution. Since the score function associated with the OU process is typically unknown, it is approximated using a trained neural network. This approximation, along with finite time simulation, time discretization and statistical approximation, introduce several sources of error whose impact on the generated samples must be carefully understood. Previous analyses have quantified the error between the generated and the true data distributions in terms of Wasserstein distance or Kullback-Leibler (KL) divergence. However, both metrics present limitations: KL divergence requires absolute continuity between distributions, while Wasserstein distance, though more general, leads to error bounds that scale poorly with dimension, rendering them impractical in high-dimensional settings. In this work, we derive an explicit, dimension-free bound on the discrepancy between the generated and the true data distributions. The bound is expressed in terms of a smooth test functional with bounded first and second derivatives. The key novelty lies in the use of this weaker, functional metric to obtain dimension-independent guarantees, at the cost of higher regularity on the test functions. As an application, we formulate and solve a variational problem to minimize the time-discretization error, leading to the derivation of an optimal time-scheduling strategy for the reverse-time diffusion. Interestingly, this scheduler has appeared previously in the literature in a different context; our analysis provides a new justification for its optimality, now grounded in minimizing the discretization bias in generative sampling.

arxiv, data distribution, diffusion model, (14 more...)

arXiv.org Machine Learning

2512.0182

Country:

Europe > France (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Minimizing Quadratic Functions in Constant Time

Kohei Hayashi, Yuichi Yoshida

Neural Information Processing SystemsNov-21-2025, 11:08:53 GMT

A sampling-based optimization method for quadratic functions is proposed.

artificial intelligence, machine learning, sequence, (19 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Africa > Sudan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

Add feedback

Block Coordinate Descent for Neural Networks Provably Finds Global Minima

Akiyama, Shunta

arXiv.org Machine LearningOct-28-2025

In this paper, we consider a block coordinate descent (BCD) algorithm for training deep neural networks and provide a new global convergence guarantee under strictly monotonically increasing activation functions. While existing works demonstrate convergence to stationary points for BCD in neural networks, our contribution is the first to prove convergence to global minima, ensuring arbitrarily small loss. We show that the loss with respect to the output layer decreases exponentially while the loss with respect to the hidden layers remains well-controlled. Additionally, we derive generalization bounds using the Rademacher complexity framework, demonstrating that BCD not only achieves strong optimization guarantees but also provides favorable generalization performance. Moreover, we propose a modified BCD algorithm with skip connections and non-negative projection, extending our convergence guarantees to ReLU activation, which are not strictly monotonic. Empirical experiments confirm our theoretical findings, showing that the BCD algorithm achieves a small loss for strictly monotonic and ReLU activations.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2510.22667

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

A Globally Optimal Portfolio for m-Sparse Sharpe Ratio Maximization

Neural Information Processing SystemsOct-9-2025, 20:29:35 GMT

The convergence rates of PGA are also provided.

constraint, optimal solution, portfolio, (17 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
Europe (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.96)

Add feedback

A Unified Discretization Framework for Differential Equation Approach with Lyapunov Arguments for Convex Optimization

Neural Information Processing SystemsOct-8-2025, 17:01:03 GMT

artificial intelligence, optimization method, optimization problem, (16 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

On the Limitation of Spectral Methods: From the Gaussian Hidden Clique Problem to Rank-One Perturbations of Gaussian Tensors

Andrea Montanari, Daniel Reichman, Ofer Zeitouni

Neural Information Processing SystemsOct-2-2025, 13:43:25 GMT

We consider the following detection problem: given a realization of a symmetric matrix X of dimension n, distinguish between the hypothesis that all upper triangular variables are i.i.d. Gaussians variables with mean 0 and variance 1 and the hypothesis that there is a planted principal submatrix B of dimension L for which all upper triangular variables are i.i.d. Gaussians with mean 1 and variance 1, whereas all other upper triangular elements of X not in B are i.i.d.

clique problem, detection problem, matrix, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback