
Collaborating Authors

 Wang, Xueqin


DeepSuM: Deep Sufficient Modality Learning Framework

arXiv.org Artificial Intelligence

Multimodal learning has become a pivotal approach in developing robust learning models with applications spanning multimedia, robotics, large language models, and healthcare. The efficiency of multimodal systems is a critical concern, given the varying costs and resource demands of different modalities. This underscores the necessity for effective modality selection to balance performance gains against resource expenditures. In this study, we propose a novel framework for modality selection that independently learns the representation of each modality. This approach allows for the assessment of each modality's significance within its unique representation space, enabling the development of tailored encoders and facilitating the joint analysis of modalities with distinct characteristics. Our framework aims to enhance the efficiency and effectiveness of multimodal learning by optimizing modality integration and selection.
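
To make the idea of scoring each modality in its own representation space concrete, the toy sketch below fits a separate linear encoder (here, plain ridge regression) per modality and ranks modalities by held-out predictive utility. This is a generic illustration of per-modality assessment under a cost budget, not the authors' actual DeepSuM architecture or selection criterion; the modality names and the 0.1 utility threshold are hypothetical.

import numpy as np

def ridge_fit(X, y, lam=1.0):
    # closed-form ridge regression: a stand-in for a tailored per-modality encoder
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def modality_utility(X_tr, y_tr, X_va, y_va):
    # utility = validation R^2 obtained from this modality alone
    w = ridge_fit(X_tr, y_tr)
    resid = y_va - X_va @ w
    return 1.0 - resid.var() / y_va.var()

rng = np.random.default_rng(0)
n = 300
y = rng.standard_normal(n)
# three hypothetical modalities with different dimensions and signal strengths
modalities = {
    "audio": np.column_stack([y + 0.3 * rng.standard_normal(n) for _ in range(5)]),
    "image": np.column_stack([y + 1.0 * rng.standard_normal(n) for _ in range(20)]),
    "text": rng.standard_normal((n, 10)),  # pure noise: should rank last
}
scores = {name: modality_utility(X[:200], y[:200], X[200:], y[200:])
          for name, X in modalities.items()}
# keep only modalities whose standalone utility justifies their cost
selected = [name for name, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s > 0.1]
print(scores, selected)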


Distributed Primal-Dual Algorithms: Unification, Connections, and Insights

arXiv.org Machine Learning

We study primal-dual algorithms for general empirical risk minimization problems in distributed settings, focusing on two prominent classes of algorithms. The first class is the communication-efficient distributed dual coordinate ascent (CoCoA), derived from the coordinate ascent method for solving the dual problem. The second class is the alternating direction method of multipliers (ADMM), including consensus ADMM, linearized ADMM, and proximal ADMM. We demonstrate that both classes of algorithms can be transformed into a unified update form that involves only primal and dual variables. This discovery reveals key connections between the two classes: CoCoA can be interpreted as a special case of proximal ADMM for solving the dual problem, while consensus ADMM is closely related to a proximal ADMM algorithm. It also yields a practical insight: by adjusting the augmented Lagrangian parameter, the ADMM variants can easily be made to outperform the CoCoA variants. We further explore linearized versions of ADMM and analyze the effects of tuning parameters on these ADMM variants in the distributed setting. Our theoretical findings are supported by extensive simulation studies and real-world data analysis.
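
For reference, the standard consensus ADMM updates for a distributed empirical risk minimization problem \min_x \sum_{k=1}^K f_k(x) take the form below (in the usual scaled form, e.g., as in Boyd et al.; this is a generic sketch rather than the exact parameterization analyzed in the paper):

x_k^{t+1} = \arg\min_x \; f_k(x) + \tfrac{\rho}{2} \| x - z^t + u_k^t \|_2^2,
z^{t+1} = \tfrac{1}{K} \sum_{k=1}^K \left( x_k^{t+1} + u_k^t \right),
u_k^{t+1} = u_k^t + x_k^{t+1} - z^{t+1},

where \rho > 0 is the augmented Lagrangian parameter whose tuning is discussed in the abstract.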


Sparsity-Constraint Optimization via Splicing Iteration

arXiv.org Machine Learning

Sparsity-constraint optimization has wide applicability in signal processing, statistics, and machine learning. Existing fast algorithms require burdensome parameter tuning, such as choosing a step size or implementing precise stopping criteria, which may be challenging in practice. To address this issue, we develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEration (SCOPE) to optimize nonlinear differentiable objective functions that are strongly convex and smooth on low-dimensional subspaces. Algorithmically, SCOPE converges effectively without tuning parameters. Theoretically, SCOPE has a linear convergence rate and converges to a solution that recovers the true support set when the sparsity level is correctly specified. We also develop parallel theoretical results without restricted-isometry-property-type conditions. We demonstrate SCOPE's versatility and power by applying it to sparse quadratic optimization, learning sparse classifiers, and recovering sparse Markov networks for binary variables. The numerical results on these tasks show that SCOPE perfectly identifies the true support set with a 10--1000x speedup over the standard exact solver, confirming SCOPE's algorithmic and theoretical merits. Our open-source Python package skscope, based on a C++ implementation, is publicly available on GitHub and achieves a ten-fold speedup over competing convex relaxation methods implemented with the cvxpy library.
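
Concretely, the problem class that SCOPE targets can be written as

\min_{\theta \in \mathbb{R}^p} f(\theta) \quad \text{subject to} \quad \|\theta\|_0 \le s,

where \|\theta\|_0 counts the nonzero entries of \theta, s is the sparsity level, and f is a differentiable objective assumed strongly convex and smooth on low-dimensional subspaces, as stated in the abstract.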


skscope: Fast Sparsity-Constrained Optimization in Python

arXiv.org Machine Learning

Applying iterative solvers to sparsity-constrained optimization (SCO) requires tedious mathematical derivation and careful programming/debugging that hinder these solvers' broad impact. In this paper, the library skscope is introduced to overcome this obstacle. With skscope, users can solve SCO problems simply by programming the objective function. The convenience of skscope is demonstrated through two examples in the paper, where sparse linear regression and trend filtering are addressed with just four lines of code. More importantly, skscope's efficient implementation allows state-of-the-art solvers to quickly attain a sparse solution regardless of the high dimensionality of the parameter space. Numerical experiments reveal that the solvers available in skscope can achieve up to an 80x speedup over the competing relaxation solutions obtained via the benchmarked convex solver.
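
As an illustration of the "program only the objective" workflow, a sparse linear regression fit might look like the sketch below. Names such as ScopeSolver, its dimension and sparsity arguments, and the solve method follow the skscope documentation at the time of writing; consult the package docs for the exact interface.

import numpy as np
import jax.numpy as jnp
from skscope import ScopeSolver

# toy data: 5 of 500 coefficients are truly nonzero
rng = np.random.default_rng(0)
n, p, k = 100, 500, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

def objective(params):
    # least-squares loss written with jax.numpy so the solver can auto-differentiate it
    return jnp.sum((y - X @ params) ** 2)

solver = ScopeSolver(dimension=p, sparsity=k)  # p parameters, at most k nonzero
beta_hat = solver.solve(objective)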


The impact of spatio-temporal travel distance on epidemics using an interpretable attention-based sequence-to-sequence model

arXiv.org Artificial Intelligence

Amidst the COVID-19 pandemic, travel restrictions have emerged as crucial interventions for mitigating the spread of the virus. In this study, we enhance the predictive capabilities of our model, Sequence-to-Sequence Epidemic Attention Network (S2SEA-Net), by incorporating an attention module, allowing us to assess the impact of distinct classes of travel distances on epidemic dynamics. Furthermore, our model provides forecasts for new confirmed cases and deaths. To achieve this, we leverage daily data on population movement across various travel distance categories, coupled with county-level epidemic data in the United States. Our findings illuminate a compelling relationship between the volume of travelers at different distance ranges and the trajectories of COVID-19. Notably, a discernible spatial pattern emerges with respect to these travel distance categories on a national scale. We unveil the geographical variations in the influence of population movement at different travel distances on the dynamics of epidemic spread. This will contribute to the formulation of strategies for future epidemic prevention and public health policies.
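
At the level of mechanism, the interpretability of an attention module comes from its normalized weights. The generic sketch below (plain softmax attention over hypothetical travel-distance categories, not the actual S2SEA-Net module) shows how those weights act as importance scores when forming the context vector used for forecasting.

import numpy as np

def attention_weights(scores):
    # softmax-normalized weights: a larger weight means a larger estimated influence
    e = np.exp(scores - scores.max())
    return e / e.sum()

# hypothetical relevance scores for five travel-distance categories at one time step
scores = np.array([0.2, 1.5, 0.7, -0.3, 0.9])
weights = attention_weights(scores)

# context vector = weighted combination of the five category embeddings (toy 16-dim embeddings)
embeddings = np.random.default_rng(0).standard_normal((5, 16))
context = weights @ embeddings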


A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

arXiv.org Machine Learning

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and best subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while best subset selection aims to find a sparse model from a large set of predictors. However, best subset selection in high-dimensional models is known to be computationally intractable. Existing methods tend to relax the selection problem and hence do not yield the best subset solution. In this paper, we directly tackle this intractability by proposing the first provably scalable algorithm for best subset selection in high-dimensional SIMs. Our algorithmic solution enjoys subset selection consistency and the oracle property with high probability. The algorithm incorporates a generalized information criterion to determine the support size of the regression coefficients, eliminating model selection tuning. Moreover, our method does not assume an error distribution or a specific link function and is hence flexible to apply. Extensive simulation results demonstrate that our method is not only computationally efficient but also able to exactly recover the best subset in various settings (e.g., linear regression, Poisson regression, heteroscedastic models).
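
For context, a single index model assumes the response depends on the covariates only through a single linear combination; a generic formulation (the paper itself leaves the link function and error distribution unspecified) is

y = g(x^\top \beta^*, \varepsilon), \qquad \|\beta^*\|_0 \le s,

where g is an unknown link, \varepsilon is a noise term, and best subset selection seeks the s nonzero components of \beta^*.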


Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique

arXiv.org Artificial Intelligence

In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although best subset selection has been widely regarded as the Holy Grail of problems of this type, achieving either computational efficiency or statistical guarantees is challenging. In this article, we intend to surmount this obstacle by utilizing a fast algorithm to select the best subset with high certainty. We propose and illustrate an algorithm for best subset recovery under regularity conditions. Under mild conditions, the computational complexity of our algorithm scales polynomially with the sample size and dimension. In addition to demonstrating the statistical properties of our method, extensive numerical experiments reveal that it outperforms existing methods for variable selection and coefficient estimation. The runtime analysis shows that our implementation achieves approximately a fourfold speedup compared to popular variable selection toolkits like glmnet and ncvreg.
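
In symbols, the best-subset problem for a generalized linear model with negative log-likelihood \ell_n can be stated generically as

\min_{\beta \in \mathbb{R}^p} \; \ell_n(\beta) \quad \text{subject to} \quad \|\beta\|_0 \le s.

Roughly speaking, the splicing technique searches over candidate support sets by exchanging the least useful active variables with the most promising inactive ones, stopping when no exchange reduces the loss.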


Simultaneous Best Subset Selection and Dimension Reduction via Primal-Dual Iterations

arXiv.org Artificial Intelligence

Sparse reduced rank regression is an essential statistical learning method. In the contemporary literature, estimation is typically formulated as a nonconvex optimization that often yields a local optimum in numerical computation. Yet the theoretical analysis is usually centered on the global optimum, resulting in a discrepancy between the statistical guarantee and the numerical computation. In this research, we offer a new algorithm to address the problem and establish an almost optimal rate for the algorithmic solution. We also demonstrate that the algorithm achieves this estimation within a polynomial number of iterations. In addition, we present a generalized information criterion that simultaneously ensures the consistency of support set recovery and rank estimation. Under the proposed criterion, we show that our algorithm achieves the oracle reduced rank estimation with high probability. Numerical studies and an application to ovarian cancer genetic data demonstrate the effectiveness and scalability of our approach.
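
A common way to state the problem (given here generically; the exact constraints analyzed in the paper may differ) is

\min_{C \in \mathbb{R}^{p \times q}} \; \tfrac{1}{2} \| Y - X C \|_F^2 \quad \text{subject to} \quad \operatorname{rank}(C) \le r, \;\; \| C \|_{2,0} \le s,

where \| C \|_{2,0} counts the nonzero rows of C, so the row support gives the selected predictors and the rank constraint performs the dimension reduction.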


Interpreting and predicting the economy flows: A time-varying parameter global vector autoregressive integrated the machine learning model

arXiv.org Machine Learning

The paper proposes a time-varying parameter global vector autoregressive (TVP-GVAR) framework for predicting and analysing economic variables across developed regions. We aim to provide an easily accessible approach for economic applications, in which a variety of machine learning models can be incorporated for out-of-sample prediction. A LASSO-type technique is adopted for numerically efficient model selection based on mean squared errors (MSEs). We show the convincing in-sample performance of the proposed model for all economic variables and its relatively high-precision out-of-sample predictions with different-frequency economic inputs. Furthermore, the time-varying orthogonal impulse responses provide novel insights into the connectedness of economic variables at critical time points across developed regions. We also derive the corresponding asymptotic bands (confidence intervals) for the orthogonal impulse response functions under standard assumptions.
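
For orientation, a heavily simplified time-varying parameter VAR of order one (a common textbook form, not necessarily the paper's GVAR specification, which additionally links region-specific models through foreign variables) is

y_t = A_t y_{t-1} + \varepsilon_t, \qquad \operatorname{vec}(A_t) = \operatorname{vec}(A_{t-1}) + \eta_t,

where the coefficient matrix A_t drifts over time (here modelled as a random walk, a standard assumption) and the time-varying orthogonal impulse responses mentioned above are computed from A_t and the error covariance at a chosen time point.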


abess: A Fast Best Subset Selection Library in Python and R

arXiv.org Machine Learning

We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. In particular, abess certifiably obtains the optimal solution in polynomial time under the linear model. Our efficient implementation allows abess to attain the solution of best-subset selection problems as fast as or even 100x faster than existing competing variable (model) selection toolboxes. Furthermore, it supports common variants like best group subset selection and $\ell_2$ regularized best-subset selection. The core of the library is programmed in C++. For ease of use, a Python library is designed to integrate conveniently with scikit-learn, and it can be installed from the Python Package Index. In addition, a user-friendly R library is available on the Comprehensive R Archive Network. The source code is available at: https://github.com/abess-team/abess.
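
Because abess follows the scikit-learn interface, a best-subset linear regression fit might look like the sketch below. Class and argument names such as LinearRegression and support_size follow the abess Python documentation; check the current docs for the exact API.

import numpy as np
from abess import LinearRegression

# toy data with 3 of 100 truly active predictors
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[[2, 40, 77]] = [1.5, -2.0, 1.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

model = LinearRegression(support_size=3)  # request a best subset of size 3
model.fit(X, y)
selected = np.nonzero(model.coef_)[0]     # indices of the selected predictors
print(selected)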