AITopics

doi: 10.3390/bdcc8110143

2412.06837

Country:

Europe > United Kingdom (0.04)
Asia > Singapore (0.04)
Africa > Botswana (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Wycoff, Nathan, Singh, Lisa O., Arab, Ali, Donato, Katharine M.

Proximal Iteration for Nonlinear Adaptive Lasso

arXiv.org Machine Learning Dec-7-2024

Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to efficiently conduct estimation and variable selection simultaneously in sophisticated models and can be efficiently implemented using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated in magnitude, motivating techniques such as the Adaptive Lasso which endow each parameter with its own penalty coefficient. But it's not clear how these parameter-specific penalties should be set in complex models. In this article, we study the approach of treating the penalty coefficients as additional decision variables to be learned in a \textit{Maximum a Posteriori} manner, developing a proximal gradient approach to joint optimization of these together with the parameters of any differentiable cost function. Beyond reducing bias in estimates, this procedure can also encourage arbitrary sparsity structure via a prior on the penalty coefficients. We compare our method to implementations of specific sparsity structures for non-Gaussian regression on synthetic and real datasets, finding our more general method to be competitive in terms of both speed and accuracy. We then consider nonlinear models for two case studies: COVID-19 vaccination behavior and international refugee movement, highlighting the applicability of this approach to complex problems and intricate sparsity structures.
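As a point of reference for the abstract above, the following is a minimal sketch of plain proximal gradient descent (ISTA) for an $\ell_1$-penalized least-squares cost. It is not the authors' method: there is one fixed penalty coefficient `lam` rather than learned parameter-specific penalties, and the function names, step size, and synthetic data are all illustrative assumptions.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista_lasso(X, y, lam, step, n_iter=300):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 by proximal gradient.

    `step` is assumed small enough for the smooth part (at most 1/L, where
    L is the largest eigenvalue of X^T X / n).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Gradient step on the smooth least-squares term.
        resid = [sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(p)]
        # Proximal step on the l1 penalty (coordinate-wise soft-thresholding).
        beta = [soft_threshold(beta[j] - step * grad[j], step * lam) for j in range(p)]
    return beta

# Hypothetical synthetic data: two active coefficients, three inactive ones.
random.seed(0)
n, true_beta = 200, [2.0, -1.5, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in true_beta] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0.0, 0.1) for row in X]
beta_hat = ista_lasso(X, y, lam=0.1, step=0.5)
print([round(b, 3) for b in beta_hat])
```

In this setup the inactive coefficients should be set to zero, while the active ones should come out somewhat smaller in magnitude than 2.0 and 1.5. That shrinkage is the bias the abstract refers to, and it is what parameter-specific penalties aim to reduce.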

artificial intelligence, data mining, machine learning, (20 more...)

2412.05726

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Modeling & Simulation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
(3 more...)

arXiv.org Artificial Intelligence Dec-7-2024

Combining Observational Data and Language for Species Range Estimation

Hamilton, Max, Lange, Christian, Cole, Elijah, Shepard, Alexander, Heinrich, Samuel, Mac Aodha, Oisin, Van Horn, Grant, Maji, Subhransu

Species range maps (SRMs) are essential tools for research and policy-making in ecology, conservation, and environmental management. However, traditional SRMs rely on the availability of environmental covariates and high-quality species location observation data, both of which can be challenging to obtain due to geographic inaccessibility and resource constraints. We propose a novel approach combining millions of citizen science species observations with textual descriptions from Wikipedia, covering habitat preferences and range descriptions for tens of thousands of species. Our framework maps locations, species, and text descriptions into a common space, facilitating the learning of rich spatial covariates at a global scale and enabling zero-shot range estimation from textual descriptions. Evaluated on held-out species, our zero-shot SRMs significantly outperform baselines and match the performance of SRMs obtained using tens of observations. Our approach also acts as a strong prior when combined with observational data, resulting in more accurate range estimation with less data. We present extensive quantitative and qualitative analyses of the learned representations in the context of range estimation and other spatial tasks, demonstrating the effectiveness of our approach.

large language model, machine learning, natural language, (18 more...)

2410.10931

Country:

Asia > Taiwan (0.05)
South America > Colombia (0.04)
South America > Venezuela (0.04)
(36 more...)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Sun, Hao, Ertefaie, Ashkan, Duttweiler, Luke, Johnson, Brent A.

Constructing optimal treatment length strategies to maximize quality-adjusted lifetimes

arXiv.org Machine Learning Dec-6-2024

Real-world clinical decision making is a complex process that involves balancing the risks and benefits of treatments. Quality-adjusted lifetime is a composite outcome that combines patient quantity and quality of life, making it an attractive outcome in clinical research. We propose methods for constructing optimal treatment length strategies to maximize this outcome. Existing methods for estimating optimal treatment strategies for survival outcomes cannot be applied to a quality-adjusted lifetime due to induced informative censoring. We propose a weighted estimating equation that adjusts for both confounding and informative censoring. We also propose a nonparametric estimator of the mean counterfactual quality-adjusted lifetime survival curve under a given treatment length strategy, where the weights are estimated using an undersmoothed sieve-based estimator. We show that the estimator is asymptotically linear and provide a data-dependent undersmoothing criterion. We apply our method to obtain the optimal time for percutaneous endoscopic gastrostomy insertion in patients with amyotrophic lateral sclerosis.

artificial intelligence, machine learning, modeling & simulation, (16 more...)

2412.05108

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Rheumatology (0.88)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.88)
Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Modeling & Simulation (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Kim, Gyu Min, Jeon, Jeong Min

Hybrid deep additive neural networks

arXiv.org Machine Learning Dec-6-2024

Traditional neural networks (multi-layer perceptrons) have become an important tool in data science due to their success across a wide range of tasks. However, their performance is sometimes unsatisfactory, and they often require a large number of parameters, primarily due to their reliance on the linear combination structure. Meanwhile, additive regression has been a popular alternative to linear regression in statistics. In this work, we introduce novel deep neural networks that incorporate the idea of additive regression. Our neural networks share architectural similarities with Kolmogorov-Arnold networks but are based on simpler yet flexible activation and basis functions. Additionally, we introduce several hybrid neural networks that combine this architecture with that of traditional neural networks. We derive their universal approximation properties and demonstrate their effectiveness through simulation studies and a real-data application. The numerical results indicate that our neural networks generally achieve better performance than traditional neural networks while using fewer parameters.

artificial intelligence, machine learning, poly 0, (19 more...)

2411.09175

Country:

North America > United States > California (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Machine Learning Dec-6-2024

Another look at inference after prediction

Gronsbell, Jessica, Gao, Jianhui, Shi, Yaqi, McCaw, Zachary R., Cheng, David

Prediction-based (PB) inference is increasingly used in applications where the outcome of interest is difficult to obtain, but its predictors are readily available. Unlike traditional inference, PB inference performs statistical inference using a partially observed outcome and a set of covariates by leveraging a prediction of the outcome generated from a machine learning (ML) model. Motwani and Witten (2023) recently revisited two innovative PB inference approaches for ordinary least squares. They found that the method proposed by Wang et al. (2020) yields a consistent estimator for the association of interest when the ML model perfectly captures the underlying regression function. Conversely, the prediction-powered inference (PPI) method proposed by Angelopoulos et al. (2023) yields valid inference regardless of the model's accuracy. In this paper, we study the statistical efficiency of the PPI estimator. Our analysis reveals that a more efficient estimator, proposed 25 years ago by Chen and Chen (2000), can be obtained by simply adding a weight to the PPI estimator. We also contextualize PB inference with methods from the economics and statistics literature dating back to the 1960s. Our extensive theoretical and numerical analyses indicate that the Chen and Chen (CC) estimator offers a balance between robustness to ML model specification and statistical efficiency, making it the preferred choice for use in practice.
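To make the idea of "adding a weight to the PPI estimator" concrete, here is a minimal sketch for the simplest possible target, a population mean, rather than the ordinary least squares setting the abstract discusses. The function names and the toy numbers are hypothetical. The weight formula is the variance-minimizing choice for the mean when the labeled and unlabeled samples are independent; it is an assumption made for this illustration and is not claimed to be the exact Chen and Chen (2000) estimator.

```python
from statistics import mean

def ppi_mean(y_lab, f_lab, f_unlab):
    """PPI-style mean: average prediction on unlabeled data plus a
    bias correction estimated on the labeled data."""
    return mean(f_unlab) + mean(y - f for y, f in zip(y_lab, f_lab))

def weighted_pb_mean(y_lab, f_lab, f_unlab, w):
    """Weighted variant: w = 1 recovers ppi_mean, w = 0 is the labeled-only mean."""
    return mean(y_lab) + w * (mean(f_unlab) - mean(f_lab))

def variance_minimizing_weight(y_lab, f_lab, n_unlab):
    """Weight minimizing the variance of weighted_pb_mean (illustrative assumption):
    w = Cov(Y, f) / (Var(f) * (1 + n / N))."""
    n = len(y_lab)
    my, mf = mean(y_lab), mean(f_lab)
    cov = sum((y - my) * (f - mf) for y, f in zip(y_lab, f_lab)) / (n - 1)
    var_f = sum((f - mf) ** 2 for f in f_lab) / (n - 1)
    return cov / (var_f * (1 + n / n_unlab))

# Hypothetical toy data: observed outcomes, predictions on the labeled set,
# and predictions on an unlabeled set.
y_lab = [1.0, 2.0, 3.0, 4.0]
f_lab = [1.5, 2.5, 2.5, 4.5]
f_unlab = [2.0, 3.0, 4.0, 5.0]

w = variance_minimizing_weight(y_lab, f_lab, len(f_unlab))
print(ppi_mean(y_lab, f_lab, f_unlab))
print(weighted_pb_mean(y_lab, f_lab, f_unlab, w))
```

The weight shrinks toward zero when the predictions are only weakly correlated with the outcome, so a poor ML model contributes little, while an accurate model is used more heavily.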

artificial intelligence, inference, machine learning, (16 more...)

2411.19908

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

arXiv.org Artificial Intelligence Dec-5-2024

Labeling questions inside issue trackers

Rasti, Aidin

One of the issues faced by the maintainers of popular open source software is the triage of newly reported issues. Many of the issues submitted to issue trackers are questions: people ask about their problem on the issue tracker instead of using a proper Q&A website like StackOverflow. This may seem insignificant, but for many big projects with thousands of users it leads to spamming of the issue tracker. Reading and labeling these unrelated issues manually is a seriously time-consuming task, and these unrelated questions add to the burden. In fact, maintainers most often ask users not to submit questions in the issue tracker. To address this problem, we first leveraged dozens of patterns to clean the text of issues, removing noise such as logs, stack traces, environment variables, and error messages. Second, we implemented a classification-based approach to automatically label unrelated questions. Empirical evaluations on a dataset of more than 102,000 records show that our approach can label questions with an accuracy of over 81%.

data mining, machine learning, natural language, (16 more...)

2412.04523

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia (0.04)
North America > Canada > Ontario > National Capital Region > Ottawa (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.94)
(3 more...)

Gao, Zhaoxing, Tsay, Ruey S.

Modeling High-Dimensional Dependent Data in the Presence of Many Explanatory Variables and Weak Signals

arXiv.org Machine Learning Dec-5-2024

This article considers a novel and widely applicable approach to modeling high-dimensional dependent data when a large number of explanatory variables are available and the signal-to-noise ratio is low. We postulate that a $p$-dimensional response series is the sum of a linear regression with many observable explanatory variables and an error term driven by some latent common factors and an idiosyncratic noise. The common factors have dynamic dependence whereas the covariance matrix of the idiosyncratic noise can have diverging eigenvalues to handle the situation of low signal-to-noise ratio commonly encountered in applications. The regression coefficient matrix is estimated using penalized methods when the dimensions involved are high. We apply factor modeling to the regression residuals, employ a high-dimensional white noise testing procedure to determine the number of common factors, and adopt a projected Principal Component Analysis when the signal-to-noise ratio is low. We establish asymptotic properties of the proposed method, both for fixed and diverging numbers of regressors, as $p$ and the sample size $T$ approach infinity. Finally, we use simulations and empirical applications to demonstrate the efficacy of the proposed approach in finite samples.

artificial intelligence, gao and tsay, machine learning, (17 more...)

2412.04736

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

arXiv.org Artificial Intelligence Dec-5-2024

Deep Causal Inference for Point-referenced Spatial Data with Continuous Treatments

Jiang, Ziyang, Calhoun, Zach, Liu, Yiling, Duan, Lei, Carlson, David

Causal reasoning is often challenging with spatial data, particularly when handling high-dimensional inputs. To address this, we propose a neural network (NN) based framework integrated with an approximate Gaussian process to manage spatial interference and unobserved confounding. Additionally, we adopt a generalized propensity-score-based approach to address partially observed outcomes when estimating causal effects with continuous treatments. We evaluate our framework using synthetic, semi-synthetic, and real-world data inferred from satellite imagery. Our results demonstrate that NN-based models significantly outperform linear spatial regression models in estimating causal effects. Furthermore, in real-world case studies, NN-based models offer more reasonable predictions of causal effects, facilitating decision-making in relevant applications.

causal inference, confounder, indirect effect, (14 more...)

2412.04285

Country:

North America > United States > North Carolina > Durham County > Durham (0.14)
Oceania > Australia > New South Wales (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

arXiv.org Machine Learning Dec-5-2024

Iterative Reweighted Framework Based Algorithms for Sparse Linear Regression with Generalized Elastic Net Penalty

Ding, Yanyun, Yao, Zhenghua, Li, Peili, Xiao, Yunhai

The elastic net penalty is frequently employed in high-dimensional statistics for parameter regression and variable selection. It is particularly beneficial compared to lasso when the number of predictors greatly surpasses the number of observations. However, empirical evidence has shown that the $\ell_q$-norm penalty (where $0 < q < 1$) often provides better regression compared to the $\ell_1$-norm penalty, demonstrating enhanced robustness in various scenarios. In this paper, we explore a generalized elastic net model that employs an $\ell_r$-norm (where $r \geq 1$) in the loss function to accommodate various types of noise, and employs an $\ell_q$-norm (where $0 < q < 1$) to replace the $\ell_1$-norm in the elastic net penalty. Theoretically, we establish the computable lower bounds for the nonzero entries of the generalized first-order stationary points of the proposed generalized elastic net model. For implementation, we develop two efficient algorithms based on the locally Lipschitz continuous $\epsilon$-approximation to the $\ell_q$-norm. The first algorithm employs an alternating direction method of multipliers (ADMM), while the second utilizes a proximal majorization-minimization method (PMM), where the subproblems are addressed using the semismooth Newton method (SSN). We also perform extensive numerical experiments with both simulated and real data, showing that both algorithms demonstrate superior performance. Notably, the PMM-SSN is more efficient than ADMM, even though the latter provides a simpler implementation.

admm, algorithm, generalized first-order stationary point, (14 more...)

2411.14875

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Henan Province > Zhengzhou (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.51)