group lasso
Consistent feature selection for analytic deep neural networks
One of the most important steps toward interpretability and explainability of neural network models is feature selection, which aims to identify the subset of relevant features. Theoretical results in the field have mostly focused on the prediction aspect of the problem, with virtually no work on feature-selection consistency for deep neural networks due to the model's severe nonlinearity and unidentifiability. This lack of theoretical foundation casts doubt on the applicability of deep learning to contexts where correct interpretations of the features play a central role. In this work, we investigate the problem of feature selection for analytic deep networks. We prove that for a wide class of networks, including deep feed-forward neural networks, convolutional neural networks and a major sub-class of residual neural networks, the Adaptive Group Lasso selection procedure with Group Lasso as the base estimator is selection-consistent. The work provides further evidence that Group Lasso might be inefficient for feature selection with neural networks and advocates the use of Adaptive Group Lasso over the popular Group Lasso.
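To make the two-stage procedure concrete, here is a minimal sketch (not the paper's exact estimator) of Group Lasso followed by Adaptive Group Lasso on the input layer of a small feed-forward network. PyTorch, the toy data, the regularization strengths, and the selection threshold are all illustrative assumptions.

```python
import torch
import torch.nn as nn

def input_group_norms(first_layer_weight):
    # first_layer_weight has shape (hidden_dim, n_features); group j is
    # column j, i.e. all outgoing weights of input feature j.
    return first_layer_weight.norm(dim=0)

def fit(net, X, y, lam, adaptive_weights=None, epochs=500, lr=1e-2):
    # Penalized training: adaptive_weights=None gives plain Group Lasso,
    # otherwise the Adaptive Group Lasso refit with data-driven weights.
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        norms = input_group_norms(net[0].weight)
        penalty = norms.sum() if adaptive_weights is None else (adaptive_weights * norms).sum()
        loss = nn.functional.mse_loss(net(X).squeeze(-1), y) + lam * penalty
        loss.backward()
        opt.step()
    return net

torch.manual_seed(0)
X = torch.randn(200, 10)                      # toy data: only features 0-2 matter
y = torch.sin(X[:, 0]) + X[:, 1] * X[:, 2]

def make_net():
    return nn.Sequential(nn.Linear(10, 16), nn.Tanh(), nn.Linear(16, 1))

base = fit(make_net(), X, y, lam=1e-2)        # stage 1: Group Lasso base estimator
base_norms = input_group_norms(base[0].weight).detach()

adaptive_weights = 1.0 / (base_norms + 1e-6)  # stage 2: Adaptive Group Lasso
adaptive = fit(make_net(), X, y, lam=1e-2, adaptive_weights=adaptive_weights)
selected = (input_group_norms(adaptive[0].weight) > 1e-3).nonzero().flatten()
print("selected features:", selected.tolist())
```

Grouping the first-layer weights by input feature means that driving a whole group to (near) zero removes that feature from the network, which is what makes the group-level penalty act as a feature-selection device.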
Smooth Bilevel Programming for Sparse Regularization
Iteratively reweighted least squares (IRLS) is a popular approach to solving sparsity-enforcing regression problems in machine learning. State-of-the-art approaches are more efficient but typically rely on specific coordinate pruning schemes. In this work, we show how a surprisingly simple re-parametrization of IRLS, coupled with a bilevel resolution (instead of an alternating scheme), is able to achieve top performance across a wide range of sparsity-inducing penalties (such as Lasso, group Lasso and trace norm regularizations), regularization strengths (including hard constraints), and design matrices (ranging from correlated designs to differential operators). Like IRLS, our method only involves the resolution of linear systems, but in sharp contrast, it corresponds to the minimization of a smooth function. Despite being non-convex, we show that there are no spurious minima and that saddle points are "ridable", so that there always exists a descent direction. We thus advocate the use of a BFGS quasi-Newton solver, which makes our approach simple, robust and efficient. We perform a numerical benchmark of the convergence speed of our algorithm against state-of-the-art solvers for Lasso, group Lasso, trace norm and linearly constrained problems. These results highlight the versatility of our approach, removing the need to use different solvers depending on the specificity of the ML problem under study.
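As a concrete illustration of the bilevel idea, the sketch below applies a Hadamard-product re-parametrization beta = u * v to the plain Lasso case: for a fixed u the inner problem in v is a ridge regression solved by one linear system, and the resulting outer function of u is smooth and can be handed to a quasi-Newton solver. This is only a schematic rendering of the approach described above; the choice of the Lasso penalty, the variable names, and the use of scipy.optimize.minimize with finite-difference gradients are assumptions made for brevity.

```python
import numpy as np
from scipy.optimize import minimize

def lasso_bilevel(X, y, lam):
    """Bilevel re-parametrized Lasso:
    min_{u,v} 0.5*||X(u*v) - y||^2 + 0.5*lam*(||u||^2 + ||v||^2),
    which shares its minimizers with the Lasso penalty lam*||beta||_1, beta = u*v."""
    n, p = X.shape

    def inner_solve(u):
        # For fixed u the problem in v is ridge regression with design X @ diag(u):
        # v*(u) = (diag(u) X^T X diag(u) + lam*I)^{-1} diag(u) X^T y
        Xu = X * u
        return np.linalg.solve(Xu.T @ Xu + lam * np.eye(p), Xu.T @ y)

    def outer(u):
        # Smooth value function of the inner problem, minimized over u only.
        v = inner_solve(u)
        r = X @ (u * v) - y
        return 0.5 * r @ r + 0.5 * lam * (u @ u + v @ v)

    res = minimize(outer, np.ones(p), method="BFGS")  # quasi-Newton on the smooth outer problem
    u = res.x
    return u * inner_solve(u)

# Toy usage: a sparse ground truth recovered from noisy observations.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.5, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(50)
print(np.round(lasso_bilevel(X, y, lam=1.0), 2))
```

Each evaluation of the outer function costs one linear solve, so the whole procedure stays in the "only linear systems" regime mentioned in the abstract while handing a smooth objective to BFGS.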
Efficient Group Lasso Regularized Rank Regression with Data-Driven Parameter Determination
Lin, Meixia, Shi, Meijiao, Xiao, Yunhai, Zhang, Qian
High-dimensional regression often suffers from heavy-tailed noise and outliers, which can severely undermine the reliability of least-squares-based methods. To improve robustness, we adopt a non-smooth, Wilcoxon-score-based rank objective and incorporate structured group sparsity regularization, a natural generalization of the lasso, yielding a group lasso regularized rank regression method. By extending the tuning-free parameter selection scheme originally developed for the lasso, we introduce a data-driven, simulation-based tuning rule and further establish a finite-sample error bound for the resulting estimator. On the computational side, we develop a proximal augmented Lagrangian method for solving the associated optimization problem, which eliminates the singularity issues encountered in existing methods and thereby enables efficient semismooth Newton updates for the subproblems. Extensive numerical experiments demonstrate the robustness and effectiveness of the proposed estimator against alternatives, and showcase the scalability of the algorithm across both simulated and real-data settings.
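The estimator itself can be written down directly. The sketch below states the pairwise (Wilcoxon-type) rank loss with a group lasso penalty and solves it with CVXPY for a small problem; this is a convex-programming illustration of the objective, not the proximal augmented Lagrangian / semismooth Newton algorithm developed in the paper, and the sqrt(|g|) group weights and the CVXPY interface are assumptions.

```python
import numpy as np
import cvxpy as cp

def rank_group_lasso(X, y, groups, lam):
    """Wilcoxon-type rank regression with a group lasso penalty.

    groups: list of index lists, one per (non-overlapping) group.
    The rank loss is written in its pairwise form
    sum_{i<j} |e_i - e_j| / (n*(n-1)), where e = y - X @ beta.
    """
    n, p = X.shape
    beta = cp.Variable(p)
    e = y - X @ beta
    # All pairwise residual differences e_i - e_j for i < j.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    D = np.zeros((len(pairs), n))
    for k, (i, j) in enumerate(pairs):
        D[k, i], D[k, j] = 1.0, -1.0
    loss = cp.sum(cp.abs(D @ e)) / (n * (n - 1))
    penalty = sum(np.sqrt(len(g)) * cp.norm(beta[g], 2) for g in groups)
    cp.Problem(cp.Minimize(loss + lam * penalty)).solve()
    return beta.value
```

Both the rank loss and the penalty are convex but non-smooth, so a generic conic solver handles small instances; the algorithm described in the abstract is what makes the estimator scalable beyond that.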
Structure-Aware Compound-Protein Affinity Prediction via Graph Neural Network with Group Lasso Regularization
Shi, Zanyu, Wang, Yang, Weerawarna, Pathum, Zhang, Jie, Richardson, Timothy, Wang, Yijie, Huang, Kun
Explainable artificial intelligence (XAI) approaches have been increasingly applied in drug discovery to learn molecular representations and identify substructures driving property predictions. However, building end-to-end explainable models for structure-activity relationship (SAR) modeling of compound properties faces many challenges, such as the limited amount of compound-protein interaction activity data for specific protein targets, and the fact that subtle changes at molecular configuration sites can significantly affect molecular properties. We exploit pairs of molecules with activity cliffs, which share scaffolds but differ at substituent sites and exhibit large potency differences for specific protein targets. We propose a framework that uses graph neural networks (GNNs) to leverage property and structure information from activity-cliff pairs to predict compound-protein affinity (i.e., the half-maximal inhibitory concentration, IC50). To enhance model performance and explainability, we train GNNs with structure-aware loss functions using group lasso and sparse group lasso regularizations, which prune and highlight molecular subgraphs relevant to activity differences. We applied this framework to activity-cliff data of molecules targeting three proto-oncogene tyrosine-protein kinase Src proteins (PDB IDs: 1O42, 2H8H, 4MXO). Our approach improved property prediction by integrating common and uncommon node information with the sparse group lasso, as reflected in a reduced root mean squared error (RMSE) and an improved Pearson correlation coefficient (PCC). Applying these regularizations also enhances feature attribution for the GNN by boosting graph-level global direction scores and improving atom-level coloring accuracy. These advances strengthen model interpretability in drug discovery pipelines, particularly for identifying critical molecular substructures in lead optimization.
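A minimal sketch of the regularizer itself is given below: a sparse group lasso penalty applied to node-level parameters split into groups (e.g. the shared scaffold atoms versus the differing substituent atoms of an activity-cliff pair). The grouping, the mixing parameter alpha, and the placeholder training snippet are assumptions; the paper's GNN architecture and loss details are not reproduced here.

```python
import torch

def sparse_group_lasso(params_by_group, alpha=0.5):
    """Sparse group lasso over node-level parameters.

    params_by_group: list of 1-D tensors, one per group of atoms
    (e.g. common scaffold atoms vs. uncommon substituent atoms of an
    activity-cliff pair -- this grouping is an assumption for illustration).
    alpha mixes the element-wise l1 term and the group-level l2 term;
    alpha=0 recovers the plain group lasso penalty.
    """
    l1 = sum(g.abs().sum() for g in params_by_group)
    l2 = sum((g.numel() ** 0.5) * g.norm() for g in params_by_group)
    return alpha * l1 + (1.0 - alpha) * l2

# Illustrative use inside a training step (model, masks, target and lam are placeholders):
# pred = model(graph)                                   # predicted IC50 / pIC50
# loss = torch.nn.functional.mse_loss(pred, target) \
#        + lam * sparse_group_lasso([mask_common, mask_uncommon])
```

The group term encourages whole substructures to drop out of the attribution, while the l1 term additionally sparsifies atoms within the groups that survive.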
Exclusive Feature Learning on Arbitrary Structures via $\ell_{1,2}$-norm
Deguang Kong, Ryohei Fujimaki, Ji Liu, Feiping Nie, Chris Ding
Group LASSO is widely used to enforce structural sparsity, achieving sparsity at the inter-group level. In this paper, we propose a new formulation called "exclusive group LASSO", which induces sparsity at the intra-group level in the context of feature selection. The proposed exclusive group LASSO is applicable to arbitrary feature structures, whether overlapping or non-overlapping. We analyze the properties of the exclusive group LASSO and propose an effective iteratively re-weighted algorithm to solve the corresponding optimization problem, with a rigorous convergence analysis. We show applications of the exclusive group LASSO to uncorrelated feature selection. Extensive experiments on both synthetic and real-world datasets validate the proposed method.
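For concreteness, here is a sketch of the re-weighting idea for the least-squares case: the exclusive group lasso penalty sum_g ||w_g||_1^2 leads to a diagonal re-weighting matrix that is refreshed between ridge-type linear solves. The smoothing constant eps, the fixed iteration count, and the least-squares loss are assumptions made for illustration; the paper's algorithm and its convergence analysis are not reproduced.

```python
import numpy as np

def exclusive_group_lasso(X, y, groups, lam, n_iter=100, eps=1e-8):
    """Iteratively re-weighted least squares for the penalty sum_g ||w_g||_1^2
    (the l_{1,2} regularizer). groups is a list of index lists and may overlap."""
    n, p = X.shape
    w = np.linalg.lstsq(X, y, rcond=None)[0]   # unpenalized initialization
    XtX, Xty = X.T @ X, X.T @ y
    for _ in range(n_iter):
        # Diagonal re-weighting: F_jj = sum over groups g containing j
        # of ||w_g||_1 / |w_j| (eps-smoothed to avoid division by zero).
        f = np.zeros(p)
        for g in groups:
            f[g] += np.abs(w[g]).sum() / (np.abs(w[g]) + eps)
        # Ridge-type linear solve with the current re-weighting.
        w = np.linalg.solve(XtX + lam * np.diag(f), Xty)
    return w
```

Because every group containing a feature contributes to that feature's re-weighting, features compete within each group, which is what produces the intra-group sparsity the formulation is named for.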