AITopics

2401.02152

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
Europe > Finland > Uusimaa > Helsinki (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Siciliano, Federico, Maiano, Luca, Papa, Lorenzo, Baccini, Federica, Amerini, Irene, Silvestri, Fabrizio

Adversarial Data Poisoning for Fake News Detection: How to Make a Model Misclassify a Target News without Modifying It

arXiv.org Artificial IntelligenceJan-4-2024

Fake news detection models are critical to countering disinformation but can be manipulated through adversarial attacks. In this position paper, we analyze how an attacker can compromise the performance of an online learning detector on specific news content without being able to manipulate the original target news. In some contexts, such as social networks, where the attacker cannot exert complete control over all the information, this scenario can indeed be quite plausible. Therefore, we show how an attacker could potentially introduce poisoning data into the training data to manipulate the behavior of an online learning method. Our initial findings reveal varying susceptibility of logistic regression models based on complexity and attack type.

data poisoning attack, online, target sample, (11 more...)

2312.15228

Country: Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report (0.93)

Industry:

Media > News (1.00)
Information Technology (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

arXiv.org Artificial IntelligenceJan-4-2024

Predicting Drug Solubility Using Different Machine Learning Methods -- Linear Regression Model with Extracted Chemical Features vs Graph Convolutional Neural Network

Ho, John, Yin, Zhao-Heng, Zhang, Colin, Guo, Nicole, Ha, Yang

Predicting the solubility of given molecules remains crucial in the pharmaceutical industry. In this study, we revisited this extensively studied topic, leveraging the capabilities of contemporary computing resources. We employed two machine learning models: a linear regression model and a graph convolutional neural network (GCNN) model, using various experimental datasets. Both methods yielded reasonable predictions, with the GCNN model exhibiting the highest level of performance. However, the present GCNN model has limited interpretability while the linear regression model allows scientists for a greater in-depth analysis of the underlying factors through feature importance analysis, although more human inputs and evaluations on the overall dataset is required. From the perspective of chemistry, using the linear regression model, we elucidated the impact of individual atom species and functional groups on overall solubility, highlighting the significance of comprehending how chemical structure influences chemical properties in the drug development process. It is learned that introducing oxygen atoms can increase the solubility of organic molecules, while almost all other hetero atoms except oxygen and nitrogen tend to decrease solubility.

linear regression model, molecule, solubility, (13 more...)

2308.12325

Country:

North America > United States > California > Alameda County > Berkeley (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Cheng, Kai, Zimmermann, Ralf

Sliced gradient-enhanced Kriging for high-dimensional function approximation

arXiv.org Machine LearningJan-4-2024

Gradient-enhanced Kriging (GE-Kriging) is a well-established surrogate modelling technique for approximating expensive computational models. However, it tends to get impractical for high-dimensional problems due to the size of the inherent correlation matrix and the associated high-dimensional hyper-parameter tuning problem. To address these issues, a new method, called sliced GE-Kriging (SGE-Kriging), is developed in this paper for reducing both the size of the correlation matrix and the number of hyper-parameters. We first split the training sample set into multiple slices, and invoke Bayes' theorem to approximate the full likelihood function via a sliced likelihood function, in which multiple small correlation matrices are utilized to describe the correlation of the sample set rather than one large one. Then, we replace the original high-dimensional hyper-parameter tuning problem with a low-dimensional counterpart by learning the relationship between the hyper-parameters and the derivative-based global sensitivity indices. The performance of SGE-Kriging is finally validated by means of numerical experiments with several benchmarks and a high-dimensional aerodynamic modeling problem. The results show that the SGE-Kriging model features an accuracy and robustness that is comparable to the standard one but comes at much less training costs. The benefits are most evident for high-dimensional problems with tens of variables.

artificial intelligence, machine learning, sge-kriging, (17 more...)

doi: 10.1137/22M154315X

2204.03562

Country:

Europe > Denmark > Southern Denmark (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
(2 more...)

arXiv.org Machine LearningJan-3-2024

Observable adjustments in single-index models for regularized M-estimators

Bellec, Pierre C

We consider observations $(X,y)$ from single index models with unknown link function, Gaussian covariates and a regularized M-estimator $\hat\beta$ constructed from convex loss function and regularizer. In the regime where sample size $n$ and dimension $p$ are both increasing such that $p/n$ has a finite limit, the behavior of the empirical distribution of $\hat\beta$ and the predicted values $X\hat\beta$ has been previously characterized in a number of models: The empirical distributions are known to converge to proximal operators of the loss and penalty in a related Gaussian sequence model, which captures the interplay between ratio $p/n$, loss, regularization and the data generating process. This connection between$(\hat\beta,X\hat\beta)$ and the corresponding proximal operators require solving fixed-point equations that typically involve unobservable quantities such as the prior distribution on the index or the link function. This paper develops a different theory to describe the empirical distribution of $\hat\beta$ and $X\hat\beta$: Approximations of $(\hat\beta,X\hat\beta)$ in terms of proximal operators are provided that only involve observable adjustments. These proposed observable adjustments are data-driven, e.g., do not require prior knowledge of the index or the link function. These new adjustments yield confidence intervals for individual components of the index, as well as estimators of the correlation of $\hat\beta$ with the index. The interplay between loss, regularization and the model is thus captured in a data-driven manner, without solving the fixed-point equations studied in previous works. The results apply to both strongly convex regularizers and unregularized M-estimation. Simulations are provided for the square and logistic loss in single index models including logistic regression and 1-bit compressed sensing with 20\% corrupted bits.

bellec observable adjustment, right-hand side, theorem 5, (15 more...)

2204.0699

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Ahrens, Achim, Hansen, Christian B., Schaffer, Mark E., Wiemann, Thomas

Model Averaging and Double Machine Learning

arXiv.org Machine LearningJan-3-2024

This paper discusses pairing double/debiased machine learning (DDML) with stacking, a model averaging method for combining multiple candidate learners, to estimate structural parameters. We introduce two new stacking approaches for DDML: short-stacking exploits the cross-fitting step of DDML to substantially reduce the computational burden and pooled stacking enforces common stacking weights over cross-fitting folds. Using calibrated simulation studies and two applications estimating gender gaps in citations and wages, we show that DDML with stacking is more robust to partially unknown functional forms than common alternative approaches based on single pre-selected learners. We provide Stata and R software implementing our proposals.

candidate learner, learner, regularization, (14 more...)

2401.01645

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Mexico > Oaxaca (0.04)
North America > United States > New York (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Artificial IntelligenceJan-3-2024

Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay

Li, Yicheng, Gan, Weiye, Shi, Zuoqiang, Lin, Qian

The neural tangent kernel (NTK) theory (Jacot et al., 2018), which shows that the gradient kernel regression approximates the over-parametrized neural network trained by gradient descent well (Jacot et al., 2018; Allen-Zhu et al., 2019; Lee et al., 2019), brings us a natural surrogate to understand the generalization behavior of the neural networks in certain circumstances. This surrogate has led to recent renaissance of the study of kernel methods. For example, one would ask whether overfitting could harm the generalization (Bartlett et al., 2020), how the smoothness of the underlying regression function would affect the generalization error (Li et al., 2023), or if one can determine the lower bound of the generalization error at a specific function? All these problems can be answered by the generalization error curve which aims at determining the exact generalization error of a certain kernel regression method with respect to the kernel, the regression function, the noise level and the choice of the regularization parameter. It is clear that such a generalization error curve would provide a comprehensive picture of the generalization ability of the corresponding kernel regression method (Bordelon et al., 2020; Cui et al., 2021; Li et al., 2023).

generalization, nullt 1 2, spectral algorithm, (12 more...)

2401.01599

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Fiedler, Felix, Lucia, Sergio

Improved uncertainty quantification for neural networks with Bayesian last layer

arXiv.org Artificial IntelligenceJan-3-2024

Uncertainty quantification is an important task in machine learning - a task in which standardneural networks (NNs) have traditionally not excelled. This can be a limitation for safety-critical applications, where uncertainty-aware methods like Gaussian processes or Bayesian linear regression are often preferred. Bayesian neural networks are an approach to address this limitation. They assume probability distributions for all parameters and yield distributed predictions. However, training and inference are typically intractable and approximations must be employed. A promising approximation is NNs with Bayesian last layer (BLL). They assume distributed weights only in the linear output layer and yield a normally distributed prediction. To approximate the intractable Bayesian neural network, point estimates of the distributed weights in all but the last layer should be obtained by maximizing the marginal likelihood. This has previously been challenging, as the marginal likelihood is expensive to evaluate in this setting. We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation. Furthermore, we address the challenge of uncertainty quantification for extrapolation points. We provide a metric to quantify the degree of extrapolation and derive a method to improve the uncertainty quantification for these points. Our methods are derived for the multivariate case and demonstrated in a simulation study. In comparison to Bayesian linear regression with fixed features, and a Bayesian neural network trained with variational inference, our proposed method achieves the highest log-predictive density on test data.

affine cost, extrapolation, neural network, (16 more...)

doi: 10.1109/ACCESS.2023.3329685

2302.10975

Country: Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

arXiv.org Machine LearningJan-2-2024

Efficient Sparse Least Absolute Deviation Regression with Differential Privacy

Liu, Weidong, Mao, Xiaojun, Zhang, Xiaofei, Zhang, Xin

In recent years, privacy-preserving machine learning algorithms have attracted increasing attention because of their important applications in many scientific fields. However, in the literature, most privacy-preserving algorithms demand learning objectives to be strongly convex and Lipschitz smooth, which thus cannot cover a wide class of robust loss functions (e.g., quantile/least absolute loss). In this work, we aim to develop a fast privacy-preserving learning solution for a sparse robust regression problem. Our learning loss consists of a robust least absolute loss and an $\ell_1$ sparse penalty term. To fast solve the non-smooth loss under a given privacy budget, we develop a Fast Robust And Privacy-Preserving Estimation (FRAPPE) algorithm for least absolute deviation regression. Our algorithm achieves a fast estimation by reformulating the sparse LAD problem as a penalized least square estimation problem and adopts a three-stage noise injection to guarantee the $(\epsilon,\delta)$-differential privacy. We show that our algorithm can achieve better privacy and statistical accuracy trade-off compared with the state-of-the-art privacy-preserving regression algorithms. In the end, we conduct experiments to verify the efficiency of our proposed FRAPPE algorithm.

algorithm, privacy, regression, (16 more...)

doi: 10.1109/TIFS.2023.3349054

2401.01294

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Iowa (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

arXiv.org Artificial IntelligenceJan-2-2024

Uncertainty Regularized Evidential Regression

Ye, Kai, Chen, Tiejin, Wei, Hua, Zhan, Liang

The Evidential Regression Network (ERN) represents a novel approach that integrates deep learning with Dempster-Shafer's theory to predict a target and quantify the associated uncertainty. Guided by the underlying theory, specific activation functions must be employed to enforce non-negative values, which is a constraint that compromises model performance by limiting its ability to learn from all samples. This paper provides a theoretical analysis of this limitation and introduces an improvement to overcome it. Initially, we define the region where the models can't effectively learn from the samples. Following this, we thoroughly analyze the ERN and investigate this constraint. Leveraging the insights from our analysis, we address the limitation by introducing a novel regularization term that empowers the ERN to learn from the whole training set. Our extensive experiments substantiate our theoretical findings and demonstrate the effectiveness of the proposed solution.

ern, hua, prediction, (16 more...)

2401.01484

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.48)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)