elasticnet


A novel hybrid approach for positive-valued DAG learning

Zhao, Yao

arXiv.org Machine Learning

Causal discovery from observational data remains a fundamental challenge in machine learning and statistics, particularly when variables represent inherently positive quantities such as gene expression levels, asset prices, company revenues, or population counts, which often follow multiplicative rather than additive dynamics. We propose the Hybrid Moment-Ratio Scoring (H-MRS) algorithm, a novel method for learning directed acyclic graphs (DAGs) from positive-valued data by combining moment-based scoring with log-scale regression. The key idea is that for positive-valued variables, the moment ratio $\frac{\mathbb{E}[X_j^2]}{\mathbb{E}[(\mathbb{E}[X_j \mid S])^2]}$, where $S$ denotes a candidate parent set, provides an effective criterion for causal ordering. H-MRS integrates log-scale Ridge regression for moment-ratio estimation with a greedy ordering procedure based on raw-scale moment ratios, followed by Elastic Net-based parent selection to recover the final DAG structure. Experiments on synthetic log-linear data demonstrate competitive precision and recall. The proposed method is computationally efficient and naturally respects positivity constraints, making it suitable for applications in genomics and economics. These results suggest that combining log-scale modeling with raw-scale moment ratios provides a practical framework for causal discovery in positive-valued domains.
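
The abstract names the three stages (log-scale Ridge for conditional means, greedy ordering by raw-scale moment ratios, Elastic Net parent selection) without spelling out the estimators, so the sketch below is one plausible reading rather than the authors' implementation. The helper names (`moment_ratio`, `hmrs_order`, `select_parents`), the hyperparameters, and the greedy direction (appending the variable with the smallest ratio, i.e. the one best explained by already-ordered candidates) are all our assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge, ElasticNetCV

def moment_ratio(X, j, S, alpha=1.0):
    """Estimate E[X_j^2] / E[(E[X_j | X_S])^2] for positive-valued data.
    E[X_j | X_S] is approximated by Ridge regression on log-transformed
    predictors (a stand-in for the paper's log-scale estimator)."""
    y = X[:, j]
    if not S:                               # empty conditioning set
        cond_mean = np.full_like(y, y.mean())
    else:
        Z = np.log(X[:, S])                 # positivity makes the log well-defined
        cond_mean = Ridge(alpha=alpha).fit(Z, y).predict(Z)
    return np.mean(y ** 2) / np.mean(cond_mean ** 2)

def hmrs_order(X):
    """Greedy ordering: repeatedly append the variable whose raw-scale
    moment ratio given the already-ordered set is smallest (assumed
    tie-in to the paper's ordering criterion)."""
    order, remaining = [], list(range(X.shape[1]))
    while remaining:
        nxt = min(remaining, key=lambda j: moment_ratio(X, j, order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def select_parents(X, order):
    """Elastic Net parent selection on the log scale: regress each
    variable on its predecessors and keep nonzero coefficients."""
    parents = {order[0]: []}
    for k in range(1, len(order)):
        j, preds = order[k], order[:k]
        enet = ElasticNetCV().fit(np.log(X[:, preds]), np.log(X[:, j]))
        parents[j] = [p for p, c in zip(preds, enet.coef_) if abs(c) > 1e-8]
    return parents
```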


Choosing the Right Regularizer for Applied ML: Simulation Benchmarks of Popular Scikit-learn Regularization Frameworks

Knight, Benjamin S., Bajaj, Ahsaas

arXiv.org Machine Learning

This study surveys the historical development of regularization, tracing its evolution from stepwise regression in the 1960s to recent advancements in formal error control, structured penalties for non-independent features, Bayesian methods, and $\ell_0$-based regularization (among other techniques). We empirically evaluate the performance of four canonical frameworks -- Ridge, Lasso, ElasticNet, and Post-Lasso OLS -- across 134,400 simulations spanning a 7-dimensional manifold grounded in eight production-grade machine learning models. Our findings demonstrate that for prediction accuracy, when the sample-to-feature ratio is sufficient ($n/p \geq 78$), Ridge, Lasso, and ElasticNet are nearly interchangeable. However, we find that Lasso recall is highly fragile under multicollinearity: at high condition numbers ($\kappa$) and low SNR, Lasso recall collapses to 0.18 while ElasticNet maintains 0.93. Consequently, we advise practitioners against using Lasso or Post-Lasso OLS at high $\kappa$ with small sample sizes. The analysis concludes with an objective-driven decision guide to assist machine learning engineers in selecting the optimal scikit-learn-supported framework based on observable feature space attributes.
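
The qualitative effect behind the recall numbers is easy to reproduce with scikit-learn. The sketch below is not the paper's 134,400-run benchmark: the equicorrelated design (rho = 0.95), the noise scale, and the alpha values are illustrative assumptions chosen only to induce a high condition number and low SNR.

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p, k = 100, 40, 5                           # small sample, 5 true signals

# Equicorrelated design: rho near 1 drives the condition number up.
rho = 0.95
Sigma = rho * np.ones((p, p)) + (1 - rho) * np.eye(p)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
print("condition number:", round(np.linalg.cond(Sigma), 1))

beta = np.r_[np.ones(k), np.zeros(p - k)]
y = X @ beta + rng.normal(scale=5.0, size=n)   # large noise -> low SNR

def support_recall(model):
    """Fraction of the k true features the fitted model keeps."""
    coef = model.fit(X, y).coef_
    return float((np.abs(coef[:k]) > 1e-8).mean())

print("Lasso recall:     ", support_recall(Lasso(alpha=0.1)))
print("ElasticNet recall:", support_recall(ElasticNet(alpha=0.1, l1_ratio=0.5)))
```

Under such designs the pure l1 penalty tends to keep one feature from each correlated cluster and drop the rest, while the l2 component of ElasticNet spreads weight across correlated true features, which is the mechanism behind the reported recall gap.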



Provably tuning the ElasticNet across instances

Neural Information Processing Systems

An important unresolved challenge in the theory of regularization is to set the regularization coefficients of popular techniques like the ElasticNet with general provable guarantees. We consider the problem of tuning the regularization parameters of Ridge regression, LASSO, and the ElasticNet across multiple problem instances, a setting that encompasses both cross-validation and multi-task hyperparameter optimization. We obtain a novel structural result for the ElasticNet which characterizes the loss, as a function of the tuning parameters, as a piecewise-rational function with algebraic boundaries. We use this to bound the structural complexity of the regularized loss functions and show generalization guarantees for tuning the ElasticNet regression coefficients in the statistical setting. We also consider the more challenging online learning setting, where we show vanishing average expected regret relative to the optimal parameter pair. We further extend our results to tuning classification algorithms obtained by thresholding regression fits regularized by Ridge, LASSO, or ElasticNet. Our results are the first general learning-theoretic guarantees for this important class of problems that avoid strong assumptions on the data distribution. Furthermore, our guarantees hold for both validation and popular information criterion objectives.
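
The paper's contribution is theoretical, but the setting it studies, choosing one parameter pair that works well across many related instances, is simple to illustrate. The grid-search sketch below uses scikit-learn's (alpha, l1_ratio) reparameterization of the l1/l2 penalty pair; the instance generator, grid values, and helper names are illustrative assumptions, and a plain grid search carries none of the paper's guarantees.

```python
import numpy as np
from itertools import product
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

def make_instance(n=80, p=20, noise=0.5):
    """One regression problem instance; instances share the same support."""
    X = rng.normal(size=(n, p))
    beta = np.r_[np.ones(3), np.zeros(p - 3)]
    y = X @ beta + rng.normal(scale=noise, size=n)
    return train_test_split(X, y, test_size=0.3, random_state=0)

instances = [make_instance() for _ in range(10)]

# One shared (alpha, l1_ratio) choice is scored across all instances
# by its average held-out squared error.
grid = list(product([0.01, 0.1, 1.0], [0.1, 0.5, 0.9]))

def avg_val_loss(alpha, l1_ratio):
    losses = [
        np.mean((ElasticNet(alpha=alpha, l1_ratio=l1_ratio)
                 .fit(Xtr, ytr).predict(Xval) - yval) ** 2)
        for Xtr, Xval, ytr, yval in instances
    ]
    return np.mean(losses)

best = min(grid, key=lambda g: avg_val_loss(*g))
print("best shared (alpha, l1_ratio):", best)
```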






