AITopics | square regression

square regression

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

2c29d89cc56cdb191c60db2f0bae796b-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 06:53:48 GMT

A.1 Does our neural regression method work? To ensure our neural regression method works, we verify its efficacy on a known benchmark: the activity of 256 cells in the V4 and IT regions of two Rhesus macaque monkeys, a core component of BrainScore [4]. BrainScore's in-house method involves a combination of principal components analysis (for dimensionality reduction) and k-fold cross-validated partial least squares regression (for the linear mapping of model to brain activity). Here, we exchange principal components analysis for sparse random projection and partial least squares regression for ridge regression with generalized cross-validation. We compute the scores for each benchmark in the same fashion as BrainScore: as the Pearson correlation coefficient between the actual and predicted (cross-validated) activity of the biological neurons in the V4 and IT samples.
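The pipeline described above (sparse random projection, then ridge regression with generalized cross-validation, then a Pearson correlation score) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the synthetic data, the helper names `sparse_random_projection` and `ridge_gcv_fit`, the projection density, the alpha grid, and the single train/test split (in place of the cross-validated predictions mentioned in the abstract) are all assumptions made for the example.

```python
import numpy as np

def sparse_random_projection(n_features, n_components, rng, density=None):
    """Sparse projection matrix with entries in {-c, 0, +c} (hypothetical helper)."""
    if density is None:
        density = 1.0 / np.sqrt(n_features)        # assumed default density
    scale = np.sqrt(1.0 / (density * n_components))
    signs = rng.choice([-1.0, 1.0], size=(n_features, n_components))
    mask = rng.random((n_features, n_components)) < density
    return scale * signs * mask

def ridge_gcv_fit(X, y, alphas):
    """Ridge regression for one target; alpha picked by generalized cross-validation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Uty = U.T @ yc
    n = X.shape[0]
    best = None
    for alpha in alphas:
        shrink = s**2 / (s**2 + alpha)             # eigenvalues of the hat matrix
        resid = yc - U @ (shrink * Uty)
        # GCV criterion; the extra 1 accounts for the intercept.
        gcv = n * np.sum(resid**2) / (n - 1 - shrink.sum()) ** 2
        if best is None or gcv < best[0]:
            coef = Vt.T @ ((s / (s**2 + alpha)) * Uty)
            best = (gcv, alpha, coef)
    _, alpha, coef = best
    return coef, y_mean - x_mean @ coef, alpha

# Synthetic stand-ins for model features and one neuron's responses (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(400, 10))
features = latent @ rng.normal(size=(10, 1000)) + 0.1 * rng.normal(size=(400, 1000))
response = latent @ rng.normal(size=10) + 0.1 * rng.normal(size=400)

Z = features @ sparse_random_projection(1000, 64, rng)   # reduce 1000 -> 64 dims
train, test = slice(0, 300), slice(300, 400)
alphas = np.logspace(-3, 3, 13)
coef, intercept, alpha = ridge_gcv_fit(Z[train], response[train], alphas)

pred = Z[test] @ coef + intercept
score = np.corrcoef(pred, response[test])[0, 1]          # Pearson correlation
print(f"alpha={alpha:g}, Pearson r={score:.3f}")
```

With real recordings, the same fit would be repeated per neuron and the held-out predictions would come from cross-validation folds rather than a single split.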

artificial intelligence, machine learning, regression, (18 more...)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.45)

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

Neural Information Processing Systems, Apr-24-2026, 21:10:17 GMT

In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration. Randomized sketching has emerged as a powerful technique for constructing estimates of the Hessian which can be used to perform approximate Newton steps. This involves multiplication by a random sketching matrix, which introduces a trade-off between the computational cost of sketching and the convergence rate of the optimization algorithm. A theoretically desirable but practically much too expensive choice is to use a dense Gaussian sketching matrix, which produces unbiased estimates of the exact Newton step and which offers strong problem-independent convergence guarantees. We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching, without substantially affecting its convergence properties. This approach, called Newton-LESS, is based on a recently introduced sketching technique: LEverage Score Sparsified (LESS) embeddings. We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings, not just up to constant factors but even down to lower order terms, for a large class of optimization tasks. In particular, this leads to a new state-of-the-art convergence result for an iterative least squares solver. Finally, we extend LESS embeddings to include uniformly sparsified random sign matrices which can be implemented efficiently and which perform well in numerical experiments.
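To make the idea of a sketched Newton update concrete, here is a small sketch for the least squares case, where the Hessian is A^T A and a sketch S gives the estimate (SA)^T (SA). It uses a uniformly sparsified random sign matrix, one of the variants the abstract mentions. The function names, the problem sizes, the number of nonzeros per row, and the damping factor 1 - d/m are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def sparse_sign_sketch(A, m, s, rng):
    """Return S @ A, where each of the m rows of S holds s random +/- entries."""
    n = A.shape[0]
    idx = rng.integers(0, n, size=(m, s))          # nonzero positions in each row
    signs = rng.choice([-1.0, 1.0], size=(m, s))
    scale = np.sqrt(n / (m * s))                   # chosen so that E[S.T @ S] = I
    return scale * np.einsum("ms,msd->md", signs, A[idx])

def sketched_newton_lstsq(A, b, m, s, n_iter, rng):
    """Minimise 0.5 * ||A x - b||^2 with sketched Hessians and exact gradients."""
    d = A.shape[1]
    x = np.zeros(d)
    step = 1.0 - d / m                             # assumed damping factor
    for _ in range(n_iter):
        SA = sparse_sign_sketch(A, m, s, rng)      # fresh sketch every iteration
        hessian_estimate = SA.T @ SA               # approximates A.T @ A
        grad = A.T @ (A @ x - b)
        x = x - step * np.linalg.solve(hessian_estimate, grad)
    return x

rng = np.random.default_rng(0)
n, d, m, s = 2000, 10, 200, 10
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

x_star = np.linalg.lstsq(A, b, rcond=None)[0]      # exact solution for reference
x_hat = sketched_newton_lstsq(A, b, m, s, n_iter=25, rng=rng)
rel_err = np.linalg.norm(x_hat - x_star) / np.linalg.norm(x_star)
print(f"relative error after 25 sketched steps: {rel_err:.2e}")
```

Each iteration touches only m * s rows of A to build the Hessian estimate, which is where the saving over a dense Gaussian sketch would come from.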

artificial intelligence, convergence rate, machine learning, (17 more...)

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Kernel Truncated Randomized Ridge Regression: Optimal Rates and Low Noise Acceleration

Kwang-Sung Jun, Ashok Cutkosky, Francesco Orabona

Neural Information Processing Systems, Feb-14-2026, 07:07:25 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, assumption, regression, (12 more...)

Country:

North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada (0.04)
(2 more...)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Generalization Error Analysis of Quantized Compressive Learning

Xiaoyun Li, Ping Li

Neural Information Processing Systems, Feb-11-2026, 14:46:39 GMT

In this paper, we consider the learning problem where the projected data is further compressed by scalar quantization, which is called quantized compressive learning. Generalization error bounds are derived for three models: nearest neighbor (NN) classifier, linear classifier and least squares regression.

artificial intelligence, classifier, machine learning, (16 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.05)
North America > United States > Ohio > Franklin County > Columbus (0.04)
(6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Asymptotics for Sketching in Least Squares Regression

Neural Information Processing Systems, Dec-25-2025, 02:51:48 GMT

We consider a least squares regression problem where the data has been generated from a linear model, and we are interested in learning the unknown regression parameters. We consider sketch-and-solve methods that randomly project the data first, and do regression after. Previous works have analyzed the statistical and computational performance of such methods. However, the existing analysis is not fine-grained enough to show the fundamental differences between various methods, such as the Subsampled Randomized Hadamard Transform (SRHT) and Gaussian projections. In this paper, we make progress on this problem, working in an asymptotic framework where the number of datapoints and the dimension of features go to infinity. We find the limits of the accuracy loss (for estimation and test error) incurred by popular sketching methods. We show separation between different methods, so that SRHT is better than Gaussian projections. Our theoretical results are verified on both real and synthetic data. The analysis of SRHT relies on novel methods from random matrix theory that may be of independent interest.
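A minimal sketch of the sketch-and-solve idea, comparing a Gaussian projection with an SRHT on synthetic data, might look like the following. The helper names, the dense Sylvester construction of the Hadamard matrix (a fast transform would be used in practice), and the problem sizes are assumptions for illustration; a single run like this does not reproduce the asymptotic comparison made in the abstract.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of 2)."""
    H = np.ones((1, 1))
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def gaussian_sketch(M, m, rng):
    """Project the rows of M with an m x n matrix of i.i.d. N(0, 1/m) entries."""
    S = rng.normal(size=(m, M.shape[0])) / np.sqrt(m)
    return S @ M

def srht_sketch(M, m, rng):
    """Random sign flips, orthonormal Hadamard mixing, then uniform row sampling."""
    n = M.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    rows = rng.choice(n, size=m, replace=False)
    mixed = hadamard(n) @ (signs[:, None] * M) / np.sqrt(n)
    return np.sqrt(n / m) * mixed[rows]

def sketch_and_solve(X, y, m, sketch, rng):
    """Sketch X and y with the same random map, then solve the small problem."""
    Xy = sketch(np.column_stack([X, y]), m, rng)
    return np.linalg.lstsq(Xy[:, :-1], Xy[:, -1], rcond=None)[0]

rng = np.random.default_rng(0)
n, d, m = 512, 8, 128                     # n must be a power of 2 for the SRHT
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + rng.normal(size=n)

def rss(beta):
    """Residual sum of squares on the full, unsketched data."""
    return float(np.sum((y - X @ beta) ** 2))

rss_full = rss(np.linalg.lstsq(X, y, rcond=None)[0])
rss_gauss = rss(sketch_and_solve(X, y, m, gaussian_sketch, rng))
rss_srht = rss(sketch_and_solve(X, y, m, srht_sketch, rng))
print(f"full: {rss_full:.1f}  gaussian: {rss_gauss:.1f}  srht: {rss_srht:.1f}")
```

The ratio of each sketched residual to the full residual is one simple way to read off the accuracy loss that the abstract refers to.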

asymptotic, name change, sketching, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Least Squares Regression Can Exhibit Under-Parameterized Double Descent

Neural Information Processing Systems, Dec-24-2025, 17:11:30 GMT

The relationship between the number of training data points, the number of parameters, and the generalization capabilities of models has been widely studied. Previous work has shown that double descent can occur in the over-parameterized regime and that the standard bias-variance trade-off holds in the under-parameterized regime. These works provide multiple reasons for the existence of the peak. We postulate that the location of the peak depends on the technical properties of both the spectrum and the eigenvectors of the sample covariance. We present two simple examples that provably exhibit double descent in the under-parameterized regime and do not seem to occur for reasons provided in prior work.

artificial intelligence, machine learning, square regression, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.42)

Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

Neural Information Processing Systems, Dec-24-2025, 13:45:52 GMT

We study the problem of least squares linear regression where the datapoints are dependent and are sampled from a Markov chain. We establish sharp information theoretic minimax lower bounds for this problem in terms of $\tau_{\mathrm{mix}}$, the mixing time of the underlying Markov chain, under different noise settings. Our results establish that in general, optimization with Markovian data is strictly harder than optimization with independent data and a trivial algorithm (SGD-DD) that works with only one in every $\tau_{\mathrm{mix}}$ samples, which are approximately independent, is minimax optimal. In fact, it is strictly better than the popular Stochastic Gradient Descent (SGD) method with constant step-size which is otherwise minimax optimal in the regression with independent data setting. Beyond a worst case analysis, we investigate whether structured datasets seen in practice such as Gaussian auto-regressive dynamics can admit more efficient optimization schemes. Surprisingly, even in this specific and natural setting, Stochastic Gradient Descent (SGD) with constant step-size is still no better than SGD-DD. Instead, we propose an algorithm based on experience replay--a popular reinforcement learning technique--that achieves a significantly better error rate. Our improved rate serves as one of the first results where an algorithm outperforms SGD-DD on an interesting Markov chain and also provides one of the first theoretical analyses to support the use of experience replay in practice.
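The data-drop idea (update on one sample out of every block, so that the samples actually used are approximately independent) can be sketched as follows on Gaussian auto-regressive covariates. The function names, the AR(1) coefficient, the gap of 20 used as a stand-in for the mixing time, and the step size are assumptions for illustration; the paper's exact algorithm and its experience-replay method are not reproduced here.

```python
import numpy as np

def ar1_stream(w_star, T, rho, noise_std, rng):
    """Covariates follow a stationary Gaussian AR(1) chain; labels are linear plus noise."""
    d = w_star.shape[0]
    X = np.empty((T, d))
    x = rng.normal(size=d)                      # start from the stationary law N(0, I)
    for t in range(T):
        X[t] = x
        x = rho * x + np.sqrt(1.0 - rho**2) * rng.normal(size=d)
    y = X @ w_star + noise_std * rng.normal(size=T)
    return X, y

def sgd_data_drop(X, y, gap, step):
    """Constant step-size SGD that uses only every `gap`-th sample of the stream."""
    w = np.zeros(X.shape[1])
    for t in range(0, X.shape[0], gap):
        residual = X[t] @ w - y[t]
        w -= step * residual * X[t]             # gradient of 0.5 * residual**2
    return w

rng = np.random.default_rng(0)
w_star = rng.normal(size=5)
X, y = ar1_stream(w_star, T=20000, rho=0.9, noise_std=0.1, rng=rng)

w_dd = sgd_data_drop(X, y, gap=20, step=0.05)   # gap plays the role of the mixing time
err = np.linalg.norm(w_dd - w_star)
print(f"parameter error with data drop: {err:.3f}")
```

Setting `gap=1` in the same function gives plain constant step-size SGD on the dependent stream, which is the baseline the abstract compares against.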

fundamental limit and algorithm, markovian data, square regression, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Optimal Subsampling with Influence Functions

Daniel Ting, Eric Brochu

Neural Information Processing Systems, Nov-20-2025, 16:36:48 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, influence function, machine learning, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > Canada > Quebec > Montreal (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Kernel Truncated Randomized Ridge Regression: Optimal Rates and Low Noise Acceleration

Kwang-Sung Jun, Ashok Cutkosky, Francesco Orabona

Neural Information Processing Systems, Aug-20-2025, 03:29:01 GMT

In this paper, we consider the nonparametric least square regression in a Reproducing Kernel Hilbert Space (RKHS).

algorithm, assumption, regression, (12 more...)

Country:

North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada (0.04)
(2 more...)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)

Review for NeurIPS paper: Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

Neural Information Processing Systems, May-31-2025, 17:37:34 GMT

Summary and Contributions: The authors theoretically study the problem of least squares regression where it is assumed that the data is generated from a Markov chain that has reached stationarity. In this setting, the authors first establish information theoretic lower bounds for the minimax excess risk. It is shown that the convergence rate suffers by a factor $\tau_{\mathrm{mix}}$, the mixing time, showing that the problem with Markovian data is intrinsically harder. It is also established that the lower bounds are tight by showing that, for different noise settings, SGD with data drop and Parallel SGD achieve the rate up to a logarithmic factor. It is also shown that, for both noise settings, vanilla SGD with constant step size is sub-optimal. This is shown by constructing an example where updating at each step leads to a constant bias.

fundamental limit and algorithm, markovian data, square regression, (5 more...)

Technology: Information Technology > Artificial Intelligence (0.68)
