A List of definitions and notations

Neural Information Processing Systems

For the convenience of the reader, we summarize a list of notations below. In Appendix B.1, we present a general statement of Theorem 3.1 (a) along with its proof; Theorem 3.1 (a) states the order recovery guarantee for a specified parameter. We summarize the bounds for the terms (I) and (II) in Lemma B.1 and Lemma B.2. Collecting the results in Lemma B.1 and Lemma B.2 and reorganizing the terms in the inequalities, we obtain the stated conclusion. We bound the first term using a concentration bound on chi-squared random variables. For the non-identifiable models, we can use Lemma H.1 in a similar way to obtain a bound that holds with high probability.


A fast algorithm for solving the lasso problem exactly without homotopy using differential inclusions

Langlois, Gabriel P., Darbon, Jérôme

arXiv.org Artificial Intelligence

We prove in this work that the well-known lasso problem can be solved exactly without homotopy using novel differential inclusions techniques. Specifically, we show that a selection principle from the theory of differential inclusions transforms the dual lasso problem into the problem of calculating the trajectory of a projected dynamical system that we prove is integrable. Our analysis yields an exact algorithm for the lasso problem, numerically up to machine precision, that is amenable to computing regularization paths and is very fast. Moreover, we show that the continuation of solutions to the integrable projected dynamical system with respect to the hyperparameter naturally yields a rigorous homotopy algorithm. Numerical experiments confirm that our algorithm outperforms the state-of-the-art algorithms in both efficiency and accuracy. Beyond this work, we expect that our results and analysis can be adapted to compute exact or approximate solutions to a broader class of polyhedral-constrained optimization problems.
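For readers who want a concrete baseline to compare against, the lasso problem the paper solves, min_x (1/2)||Ax - b||^2 + lam*||x||_1, can be solved approximately along a regularization path with plain coordinate descent. This is a generic sketch, not the authors' differential-inclusion method; all names and the lambda grid are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(A, b, lam, n_iter=500):
    """Coordinate descent for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    n, p = A.shape
    x = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0)   # per-column squared norms
    r = b - A @ x                   # running residual
    for _ in range(n_iter):
        for j in range(p):
            r = r + A[:, j] * x[j]  # remove coordinate j's contribution
            x[j] = soft_threshold(A[:, j] @ r, lam) / col_sq[j]
            r = r - A[:, j] * x[j]  # add the updated contribution back
    return x

# Solve along a decreasing grid of lambda values (a crude regularization path).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)
lam_max = np.max(np.abs(A.T @ b))   # smallest lambda giving an all-zero solution
path = [lasso_cd(A, b, f * lam_max) for f in (0.5, 0.1, 0.01)]
```

At the smallest lambda on the grid the recovered coefficients approach the true sparse vector; exact methods such as the one in the paper avoid the iteration-count/accuracy trade-off this sketch has.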



Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Memory can be an issue when estimating a large-scale inverse covariance matrix. In the setting where an l^1 penalty is used, the authors propose a block approach that significantly reduces memory usage while satisfying convergence guarantees.



Adaptive Iterative Soft-Thresholding Algorithm with the Median Absolute Deviation

Feng, Yining, Selesnick, Ivan

arXiv.org Machine Learning

Abstract--The adaptive Iterative Soft-Thresholding Algorithm (ISTA) has been a popular algorithm for finding a desirable solution to the LASSO problem without explicitly tuning the regularization parameter λ. Although the adaptive ISTA is a successful practical algorithm, few theoretical results exist. In this paper, we present a theoretical analysis of the adaptive ISTA with the thresholding strategy of estimating the noise level by the median absolute deviation. We show properties of the fixed points of the algorithm, including scale equivariance, non-uniqueness, and local stability; prove a local linear convergence guarantee; and characterize its global convergence behavior. Many sparse approximation problems in machine learning and signal processing can be obtained as solutions to the LASSO problem, which can be solved by ISTA. Despite its popularity, tuning the regularization parameter λ remains nontrivial. The obtained LASSO solution is optimal in the mean-squared-error (MSE) sense under minimal assumptions, but LARS is not competitive in terms of computation time for large-scale problems [7].
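A minimal sketch of the kind of scheme described above: ISTA whose threshold is re-estimated at every iteration from a median-absolute-deviation (MAD) noise estimate. The constant k, the choice to apply the MAD to the gradient-step output, and the fixed iteration count are illustrative assumptions, not the authors' exact specification.

```python
import numpy as np

def soft(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def adaptive_ista(A, b, n_iter=200, k=3.0):
    """ISTA with a threshold re-estimated each iteration via the MAD
    of the gradient-step output z (a sketch, not the authors' code)."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - b) / L          # plain gradient step
        sigma = np.median(np.abs(z)) / 0.6745  # MAD noise estimate (Gaussian scaling)
        x = soft(z, k * sigma)                 # threshold adapts to the estimate
    return x

# Sparse recovery demo: 3 large coefficients out of 20.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 20))
x_true = np.zeros(20)
x_true[[0, 1, 2]] = [5.0, -5.0, 5.0]
b = A @ x_true + 0.1 * rng.standard_normal(100)
x_hat = adaptive_ista(A, b)
```

Note the feedback the paper studies: the threshold depends on the current iterate, so standard fixed-λ ISTA convergence arguments do not directly apply, which is why fixed-point and stability analysis is needed.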


Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

Summary: This paper presents an unprecedentedly fast method for eliminating variables during iterations of multi-task group lasso that is provably safe, meaning that all variables that are eliminated would ultimately obtain zero weights when running vanilla group lasso. This method is iterative; as the primal solver converges, it eliminates an increasing number of variables. The authors compare their method to previous methods and demonstrate that all previous methods are either unsafe or are substantially slower for small duality gaps. The authors describe how their method should be applied to specific cases of group lasso, including l1 and l1/l2 regularized logistic regression, and present real applications where their method achieves a substantial speed-up over vanilla group lasso and an existing method for small duality gap thresholds. The primary situation where this method obtains substantial speed improvements over alternatives is where the duality gap threshold is extremely small.


Black Boxes and Looking Glasses: Multilevel Symmetries, Reflection Planes, and Convex Optimization in Deep Networks

Zeger, Emi, Pilanci, Mert

arXiv.org Machine Learning

We show that training deep neural networks (DNNs) with absolute value activation and arbitrary input dimension can be formulated as equivalent convex Lasso problems with novel features expressed using geometric algebra. This formulation reveals geometric structures encoding symmetry in neural networks. Using the equivalent Lasso form of DNNs, we formally prove a fundamental distinction between deep and shallow networks: deep networks inherently favor symmetric structures in their fitted functions, with greater depth enabling multilevel symmetries, i.e., symmetries within symmetries. Moreover, Lasso features represent distances to hyperplanes that are reflected across training points. These reflection hyperplanes are spanned by training data and are orthogonal to optimal weight vectors. Numerical experiments support the theory and demonstrate theoretically predicted features when training networks using embeddings generated by Large Language Models. Recent advancements have demonstrated that deep neural networks are powerful models that can perform tasks including natural language processing, synthetic data and image generation, classification, and regression. However, the research literature still lacks an intuitive understanding of why deep networks are so powerful: what they "look for" in data, or in other words, how each layer extracts features. We are interested in the following question: Is there a fundamental difference in the nature of functions learned by deep networks, as opposed to shallow networks? We answer this question by transforming non-convex training problems into convex formulations and analyzing their structure.


Lasso Screening Rules via Dual Polytope Projection

Neural Information Processing Systems

Lasso is a widely used regression technique to find sparse representations. When the dimension of the feature space and the number of samples are extremely large, solving the Lasso problem remains challenging. To improve the efficiency of solving large-scale Lasso problems, El Ghaoui and his colleagues have proposed the SAFE rules, which are able to quickly identify the inactive predictors, i.e., predictors that have zero components in the solution vector. The inactive predictors or features can then be removed from the optimization problem to reduce its scale. By transforming the standard Lasso to its dual form, it can be shown that the inactive predictors include the set of inactive constraints on the optimal dual solution.
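The basic SAFE test mentioned above can be sketched as follows: for the lasso min (1/2)||y - Xβ||^2 + λ||β||_1, feature j is provably inactive when |x_j^T y| < λ - ||x_j|| ||y|| (λ_max - λ)/λ_max, with λ_max = max_j |x_j^T y|. This is a minimal sketch of the El Ghaoui et al. rule; `safe_screen` is an illustrative name, and the dual-polytope-projection rules of the paper are tighter.

```python
import numpy as np

def safe_screen(X, y, lam):
    """Boolean mask of features certified inactive (zero coefficient)
    in the lasso solution at level lam, via the basic SAFE test."""
    scores = np.abs(X.T @ y)
    lam_max = scores.max()                    # smallest lam with all-zero solution
    col_norms = np.linalg.norm(X, axis=0)
    bound = lam - col_norms * np.linalg.norm(y) * (lam_max - lam) / lam_max
    return scores < bound                     # True => safe to discard feature

rng = np.random.default_rng(2)
X = rng.standard_normal((30, 8))
y = rng.standard_normal(30)
lam_max = np.max(np.abs(X.T @ y))
mask = safe_screen(X, y, lam_max)  # at lam = lam_max only the top feature survives
```

Because the certificate only discards features whose dual constraints are provably inactive, solving the reduced problem on the surviving columns gives exactly the same lasso solution as solving the full problem.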


A Comparative Framework for Preconditioned Lasso Algorithms

Neural Information Processing Systems

The Lasso is a cornerstone of modern multivariate data analysis, yet its performance suffers in the common situation in which covariates are correlated.