AITopics

2212.00856

Country:

North America > Canada (0.27)
North America > United States > New York (0.14)
North America > United States > California (0.14)
(2 more...)

Genre: Overview (1.00)

Industry:

Banking & Finance (1.00)
Transportation (0.68)
Information Technology (0.67)
(3 more...)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.93)
(3 more...)

arXiv.org Artificial IntelligenceSep-27-2023

Vertical Federated Learning: Concepts, Advances and Challenges

Liu, Yang, Kang, Yan, Zou, Tianyuan, Pu, Yanhong, He, Yuanqin, Ye, Xiaozhou, Ouyang, Ye, Zhang, Ya-Qin, Yang, Qiang

Federated Learning (FL) [1] is a novel machine learning paradigm where multiple parties collaboratively build machine learning models without centralizing their data. The concept of FL was first proposed by Google in 2016 [2] to describe a cross-device scenario where millions of mobile devices are coordinated by a central server while local data are not transferred. This concept is soon extended to a cross-silo collaboration scenario among organizations [3], where a small number of reliable organizations join a federation to train a machine learning model. In [3], FL is, for the first time, categorized into three categories based on how data is partitioned in the sample and feature space: Horizontal Federated Learning (HFL), Vertical Federated Learning (VFL) and Federated Transfer Learning (FTL) (See Figure 1). HFL refers to the FL setting where participants share the same feature space while holding different samples. For example, Google uses HFL to allow mobile phone users to use their dataset to collaboratively train a next-word prediction model [2]. VFL refers to the FL setting where datasets share the same samples/users while holding different features. For example, Webank uses VFL to collaborate with an invoice agency to build financial risk models for their enterprise customers [4].

federated learning, learning, vertical federated learning, (14 more...)

doi: 10.1109/TKDE.2024.3352628

2211.12814

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > China > Hong Kong (0.04)
Asia > China > Beijing > Beijing (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Banking & Finance (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
(3 more...)

Vilucchio, Matteo, Troiani, Emanuele, Erba, Vittorio, Krzakala, Florent

Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers

arXiv.org Machine LearningSep-27-2023

We study robust linear regression in high-dimension, when both the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $\alpha=n/d$, and study a data model that includes outliers. We provide exact asymptotics for the performances of the empirical risk minimisation (ERM) using $\ell_2$-regularised $\ell_2$, $\ell_1$, and Huber losses, which are the standard approach to such problems. We focus on two metrics for the performance: the generalisation error to similar datasets with outliers, and the estimation error of the original, unpolluted function. Our results are compared with the information theoretic Bayes-optimal estimation bound. For the generalization error, we find that optimally-regularised ERM is asymptotically consistent in the large sample complexity limit if one perform a simple calibration, and compute the rates of convergence. For the estimation error however, we show that due to a norm calibration mismatch, the consistency of the estimator requires an oracle estimate of the optimal norm, or the presence of a cross-validation set not corrupted by the outliers. We examine in detail how performance depends on the loss function and on the degree of outlier corruption in the training set and identify a region of parameters where the optimal performance of the Huber loss is identical to that of the $\ell_2$ loss, offering insights into the use cases of different loss functions.

artificial intelligence, generalisation error, machine learning, (16 more...)

2305.18974

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report (1.00)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Schmidt-Hieber, Johannes, Koolen, Wouter M

Hebbian learning inspired estimation of the linear regression parameters from queries

Local learning rules in biological neural networks (BNNs) are commonly referred to as Hebbian learning. [26] links a biologically motivated Hebbian learning rule to a specific zeroth-order optimization method. In this work, we study a variation of this Hebbian learning rule to recover the regression vector in the linear regression model. Zeroth-order optimization methods are known to converge with suboptimal rate for large parameter dimension compared to first-order methods like gradient descent, and are therefore thought to be in general inferior. By establishing upper and lower bounds, we show, however, that such methods achieve near-optimal rates if only queries of the linear regression loss are available. Moreover, we prove that this Hebbian learning rule can achieve considerably faster rates than any non-adaptive method that selects the queries independently of the data.

estimation, hebbian, linear regression parameter, (1 more...)

2311.03483

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

Pacchiardi, Lorenzo, Chan, Alex J., Mindermann, Sören, Moscovitz, Ilan, Pan, Alexa Y., Gal, Yarin, Evans, Owain, Brauner, Jan

Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a predefined set of unrelated follow-up questions after a suspected lie, and feeding the LLM's yes/no answers into a logistic regression classifier. Despite its simplicity, this lie detector is highly accurate and surprisingly general. When trained on examples from a single setting -- prompting GPT-3.5 to lie about factual questions -- the detector generalises out-of-distribution to (1) other LLM architectures, (2) LLMs fine-tuned to lie, (3) sycophantic lies, and (4) lies emerging in real-life scenarios such as sales. These results indicate that LLMs have distinctive lie-related behavioural patterns, consistent across architectures and contexts, which could enable general-purpose lie detection.

black-box llm, lie detection, unrelated question, (1 more...)

2309.1584

Genre: Research Report (0.89)

Industry: Transportation > Air (0.60)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.87)

Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave Optimization

Achab, Mastane

This paper extends the classic theory of convex optimization to the minimization of functions that are equal to the negated logarithm of what we term as a sum-log-concave function, i.e., a sum of log-concave functions. In particular, we show that such functions are in general not convex but still satisfy generalized convexity inequalities. These inequalities unveil the key importance of a certain vector that we call the cross-gradient and that is, in general, distinct from the usual gradient. Thus, we propose the Cross Gradient Descent (XGD) algorithm moving in the opposite direction of the cross-gradient and derive a convergence analysis. As an application of our sum-log-concave framework, we introduce the so-called checkered regression method relying on a sum-log-concave function. This classifier extends (multiclass) logistic regression to non-linearly separable problems since it is capable of tessellating the feature space by using any given number of hyperplanes, creating a checkerboard-like pattern of decision regions.

log-concavity, sum-log-concave optimization, theory and algorithm

2309.15298

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)

Measurement Models For Sailboats Price vs. Features And Regional Areas

Weng, Jiaqi, Feng, Chunlin, Shao, Yihan

In this study, we investigated the relationship between sailboat technical specifications and their prices, as well as regional pricing influences. Utilizing a dataset encompassing characteristics like length, beam, draft, displacement, sail area, and waterline, we applied multiple machine learning models to predict sailboat prices. The gradient descent model demonstrated superior performance, producing the lowest MSE and MAE. Our analysis revealed that monohulled boats are generally more affordable than catamarans, and that certain specifications such as length, beam, displacement, and sail area directly correlate with higher prices. Interestingly, lower draft was associated with higher listing prices. We also explored regional price determinants and found that the United States tops the list in average sailboat prices, followed by Europe, Hong Kong, and the Caribbean. Contrary to our initial hypothesis, a country's GDP showed no direct correlation with sailboat prices. Utilizing a 50% cross-validation method, our models yielded consistent results across test groups. Our research offers a machine learning-enhanced perspective on sailboat pricing, aiding prospective buyers in making informed decisions.

listing price, regression model, sailboat, (12 more...)

2309.14994

Country:

North America > United States (0.49)
Asia > China > Hong Kong (0.29)
Europe (0.25)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Sports > Sailing (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

Jamshidi, Shahrzad, Kang, Eric, Petrović, Sonja

Predicting the cardinality and maximum degree of a reduced Gr\"obner basis

arXiv.org Artificial IntelligenceSep-25-2023

We construct neural network regression models to predict key metrics of complexity for Gr\"obner bases of binomial ideals. This work illustrates why predictions with neural networks from Gr\"obner computations are not a straightforward process. Using two probabilistic models for random binomial ideals, we generate and make available a large data set that is able to capture sufficient variability in Gr\"obner complexity. We use this data to train neural networks and predict the cardinality of a reduced Gr\"obner basis and the maximum total degree of its elements. While the cardinality prediction problem is unlike classical problems tackled by machine learning, our simulations show that neural networks, providing performance statistics such as $r^2 = 0.401$, outperform naive guess or multiple regression models with $r^2 = 0.180$.

algorithm, bner basis, neural network, (15 more...)

2302.05364

Country:

North America > United States > Illinois (0.04)
North America > United States > California > Yolo County > Davis (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Vanderbeek, Alyssa M., Vidovszky, Anna A., Ross, Jessica L., Sabbaghi, Arman, Walsh, Jonathan R., Fisher, Charles K., Disease, the Critical Path for Alzheimer's, Initiative, the Alzheimer's Disease Neuroimaging, Disease, the European Prevention of Alzheimer's, Consortium, null, Study, the Alzheimer's Disease Cooperative

A Weighted Prognostic Covariate Adjustment Method for Efficient and Powerful Treatment Effect Inferences in Randomized Controlled Trials

arXiv.org Machine LearningSep-25-2023

A crucial task for a randomized controlled trial (RCT) is to specify a statistical method that can yield an efficient estimator and powerful test for the treatment effect. A novel and effective strategy to obtain efficient and powerful treatment effect inferences is to incorporate predictions from generative artificial intelligence (AI) algorithms into covariate adjustment for the regression analysis of a RCT. Training a generative AI algorithm on historical control data enables one to construct a digital twin generator (DTG) for RCT participants, which utilizes a participant's baseline covariates to generate a probability distribution for their potential control outcome. Summaries of the probability distribution from the DTG are highly predictive of the trial outcome, and adjusting for these features via regression can thus improve the quality of treatment effect inferences, while satisfying regulatory guidelines on statistical analyses, for a RCT. However, a critical assumption in this strategy is homoskedasticity, or constant variance of the outcome conditional on the covariates. In the case of heteroskedasticity, existing covariate adjustment methods yield inefficient estimators and underpowered tests. We propose to address heteroskedasticity via a weighted prognostic covariate adjustment methodology (Weighted PROCOVA) that adjusts for both the mean and variance of the regression model using information obtained from the DTG. We prove that our method yields unbiased treatment effect estimators, and demonstrate via comprehensive simulation studies and case studies from Alzheimer's disease that it can reduce the variance of the treatment effect estimator, maintain the Type I error rate, and increase the power of the test for the treatment effect from 80% to 85%~90% when the variances from the DTG can explain 5%~10% of the variation in the RCT participants' outcomes.

artificial intelligence, deep learning, machine learning, (17 more...)

2309.14256

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > Canada (0.04)
(2 more...)

Genre:

Research Report > Strength High (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)

van de Wiel, Mark A., Amestoy, Matteo, Hoogland, Jeroen

Linked shrinkage to improve estimation of interaction effects in regression models

arXiv.org Machine LearningSep-25-2023

We address a classical problem in statistics: adding two-way interaction terms to a regression model. As the covariate dimension increases quadratically, we develop an estimator that adapts well to this increase, while providing accurate estimates and appropriate inference. Existing strategies overcome the dimensionality problem by only allowing interactions between relevant main effects. Building on this philosophy, we implement a softer link between the two types of effects using a local shrinkage model. We empirically show that borrowing strength between the amount of shrinkage for main effects and their interactions can strongly improve estimation of the regression coefficients. Moreover, we evaluate the potential of the model for inference, which is notoriously hard for selection strategies. Large-scale cohort data are used to provide realistic illustrations and evaluations. Comparisons with other methods are provided. The evaluation of variable importance is not trivial in regression models with many interaction terms. Therefore, we derive a new analytical formula for the Shapley value, which enables rapid assessment of individual-specific variable importance scores and their uncertainties. Finally, while not targeting for prediction, we do show that our models can be very competitive to a more advanced machine learner, like random forest, even for fairly large sample sizes. The implementation of our method in RStan is fairly straightforward, allowing for adjustments to specific needs.

artificial intelligence, machine learning, noise, (18 more...)

2309.13998

Country: Europe > Netherlands > North Holland > Amsterdam (0.05)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Health Care Providers & Services (0.46)
Health & Medicine > Therapeutic Area (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)