AITopics

Country: North America > United States (0.27)

Genre: Research Report > Experimental Study (0.40)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.60)

#artificialintelligence Oct-16-2021, 23:37:29 GMT

La veille de la cybersécurité

On top of recommending the excellent autobiography of Stanislaw Ulam, this post is about using the software Stan, not to perform inference directly, but to obtain R functions that evaluate a target's probability density function and its gradient. With these functions, one can implement custom methods while still benefiting from the great work of the Stan team on the "modeling language" side. As a proof of concept I have implemented a plain Hamiltonian Monte Carlo sampler for a random effect logistic regression model (taken from a course on Multilevel Models by Germán Rodríguez), a coupling of that HMC algorithm (as in "Unbiased Hamiltonian Monte Carlo with couplings", see also this very recent article on the topic of coupling HMC), and then upper bounds on the total variation distance between the chain and its limiting distribution, as in "Estimating Convergence of Markov chains with L-Lag Couplings".
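A plain HMC sampler needs only two callables: the target's log-density and its gradient. The post obtains these as R functions from Stan; the sketch below is an illustrative Python stand-in, not the post's code. The names `hmc_step` and `run_chain`, the identity mass matrix, and the toy standard normal target are all assumptions made for the example.

```python
import math
import random


def hmc_step(x, logpdf, grad, step_size, n_leapfrog, rng):
    """One plain HMC transition (identity mass matrix) targeting exp(logpdf)."""
    p0 = [rng.gauss(0.0, 1.0) for _ in x]  # fresh Gaussian momentum
    x_new = list(x)
    # Leapfrog: half momentum step, alternating full steps, final half step.
    p_new = [pi + 0.5 * step_size * gi for pi, gi in zip(p0, grad(x_new))]
    for i in range(n_leapfrog):
        x_new = [xi + step_size * pi for xi, pi in zip(x_new, p_new)]
        scale = step_size if i < n_leapfrog - 1 else 0.5 * step_size
        p_new = [pi + scale * gi for pi, gi in zip(p_new, grad(x_new))]
    # Accept or reject based on the change in total energy.
    h_old = -logpdf(x) + 0.5 * sum(pi * pi for pi in p0)
    h_new = -logpdf(x_new) + 0.5 * sum(pi * pi for pi in p_new)
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return x_new, True
    return list(x), False


def run_chain(x0, logpdf, grad, n_iter, step_size, n_leapfrog, seed=0):
    """Run n_iter HMC transitions; return the draws and the acceptance rate."""
    rng = random.Random(seed)
    x, draws, n_accept = list(x0), [], 0
    for _ in range(n_iter):
        x, accepted = hmc_step(x, logpdf, grad, step_size, n_leapfrog, rng)
        draws.append(x)
        n_accept += accepted
    return draws, n_accept / n_iter


# Toy target (assumption): standard bivariate normal, standing in for the
# log-density and gradient functions that Stan would provide.
def std_normal_logpdf(x):
    return -0.5 * sum(xi * xi for xi in x)


def std_normal_grad(x):
    return [-xi for xi in x]
```

Calling `run_chain([3.0, -3.0], std_normal_logpdf, std_normal_grad, 3000, 0.2, 8)` returns the list of draws and the acceptance rate. The step size and number of leapfrog steps here are hand-picked for the toy target and would need tuning for a real model.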

coupling, veille

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

#artificialintelligence Oct-16-2021, 14:45:57 GMT

Linear regression simple approach.

This article discusses how to graph, organize, and set up data using sklearn, pandas, and NumPy with reference to the Kaggle project. I will be using JupyterLab, and the code is based on that environment. Sklearn: scikit-learn (sklearn) is a machine learning library for Python. Its main features support statistical modeling for topics such as regression. The sklearn API can be referenced here.

dataset, library, linear regression simple approach, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.54)

arXiv.org Machine Learning Oct-16-2021

Noise-Augmented Privacy-Preserving Empirical Risk Minimization with Dual-purpose Regularizer and Privacy Budget Retrieval and Recycling

Li, Yinan, Liu, Fang

Empirical risk minimization (ERM) is a principle in statistical learning. Through ERM, we can empirically measure the performance of a family of learning algorithms on a set of observed training data, without knowing the true distribution of the data, and derive theoretical bounds on that performance. ERM is routinely applied in a wide range of learning problems such as regression, classification, and clustering. In recent years, with the increasing popularity of privacy-preserving machine learning that satisfies formal privacy guarantees such as differential privacy (DP) [10], the topic of privacy-preserving ERM has also been investigated. Generally speaking, given an ERM problem, differentially private empirical risk minimization (DP-ERM) can be realized by perturbing the output (estimation or prediction), by perturbing the objective function (input), or by perturbing iteratively during the algorithmic optimization. For output perturbation, randomization mechanisms need to be applied every time a new output is released; for iterative algorithmic perturbation, each iteration incurs a privacy loss, so careful planning and implementation of privacy accounting methods to minimize the overall privacy loss are critical. In this paper, we focus on differentially private perturbation of objective functions. Once an objective function is perturbed, the subsequent optimization does not incur additional privacy loss and all outputs generated from the optimization are also differentially private.
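To make the output-perturbation route concrete, the sketch below releases a differentially private mean with the textbook Laplace mechanism. It is not the paper's objective-perturbation method. The function name `private_mean`, the clipping bounds, and the replace-one-record notion of neighboring datasets are assumptions made for illustration.

```python
import random


def private_mean(values, lower, upper, epsilon, rng):
    """Release an epsilon-DP mean of values clipped to [lower, upper].

    Assumes the dataset size is public and neighbors differ in one record,
    so the sensitivity of the clipped mean is (upper - lower) / n.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_mean + noise
```

Every call spends a fresh `epsilon` of privacy budget, which is the cost the abstract points out: with output perturbation, noise must be added again each time a new output is released.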

positive rate true positive rate, rate true positive rate, regularization, (11 more...)

2110.08676

Country: North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Meng, Xiangming, Obuchi, Tomoyuki, Kabashima, Yoshiyuki

On Model Selection Consistency of Lasso for High-Dimensional Ising Models on Tree-like Graphs

arXiv.org Machine Learning Oct-16-2021

We consider the problem of high-dimensional Ising model selection using neighborhood-based least absolute shrinkage and selection operator (Lasso). It is rigorously proved that under some mild coherence conditions on the population covariance matrix of the Ising model, consistent model selection can be achieved with sample sizes $n=\Omega{(d^3\log{p})}$ for any tree-like graph in the paramagnetic phase, where $p$ is the number of variables and $d$ is the maximum node degree. When the same conditions are imposed directly on the sample covariance matrices, it is shown that a reduced sample size $n=\Omega{(d^2\log{p})}$ suffices. The obtained sufficient conditions for consistent model selection with Lasso are the same in the scaling of the sample complexity as that of $\ell_1$-regularized logistic regression. Given the popularity and efficiency of Lasso, our rigorous analysis provides a theoretical backing for its practical use in Ising model selection.
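Neighborhood-based Lasso regresses each spin on all the others and reads the estimated neighbors off the nonzero coefficients. The sketch below shows this on a toy three-spin chain using a small coordinate-descent Lasso. It is an illustration under stated assumptions, not the paper's experimental setup: the helper names, the chain sampler, and the hand-picked penalty `lam` are all hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=50):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    residual = list(y)  # residual = y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


def estimate_neighbors(samples, node, lam):
    """Regress spin `node` on all other spins; nonzero coefficients are neighbors."""
    others = [j for j in range(len(samples[0])) if j != node]
    X = [[s[j] for j in others] for s in samples]
    y = [s[node] for s in samples]
    coef = lasso_cd(X, y, lam)
    return [j for j, c in zip(others, coef) if abs(c) > 1e-8]


def sample_chain(n_samples, n_spins, coupling, rng):
    """Exact samples from a zero-field Ising chain (a tree) with uniform coupling."""
    keep = 0.5 * (1.0 + math.tanh(coupling))  # P(next spin equals current spin)
    samples = []
    for _ in range(n_samples):
        spins = [rng.choice((-1, 1))]
        for _ in range(n_spins - 1):
            spins.append(spins[-1] if rng.random() < keep else -spins[-1])
        samples.append(spins)
    return samples
```

For example, with `samples = sample_chain(4000, 3, 0.5, random.Random(0))`, calling `estimate_neighbors(samples, 1, 0.15)` regresses the middle spin on the two end spins. Choosing `lam` in practice is what the sample-size conditions in the abstract govern; the value here is only a convenient choice for the toy chain.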

graph, ising model, tanh 2, (13 more...)

2110.085

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
Asia > Middle East > Jordan (0.04)
(3 more...)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

#artificialintelligence Oct-15-2021, 08:01:03 GMT

Population modeling with machine learning can enhance measures of mental health

Figure 1 – Figure supplement 1: Learning curves on the random split-half validation used for model building. To facilitate comparisons, we evaluated predictions of age, fluid intelligence and neuroticism from a complete set of socio-demographic variables without brain imaging, using the coefficient of determination ($R^2$) metric (y-axis) to compare results obtained from 100 to 3000 training samples (x-axis). The cross-validation (CV) distribution was obtained from 100 Monte Carlo splits. Across targets, performance started to plateau after around 1000 training samples with scores virtually identical to the final model used in subsequent analyses. These benchmarks suggest that inclusion of additional training samples would not have led to substantial improvements in performance.
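The two ingredients of this evaluation, the coefficient of determination and repeated random (Monte Carlo) train/test splits, can be sketched in a few lines. The function names below are hypothetical, and the code is a generic illustration, not the study's pipeline.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def monte_carlo_splits(n_items, n_train, n_splits, rng):
    """Yield (train_idx, test_idx) pairs from independent random shuffles."""
    for _ in range(n_splits):
        idx = list(range(n_items))
        rng.shuffle(idx)
        yield idx[:n_train], idx[n_train:]
```

A learning curve is then obtained by repeating the splits for increasing `n_train`, fitting a model on each training subset, and collecting the distribution of `r2_score` values on the held-out part.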

neuroticism, prediction, proxy measure, (14 more...)

Country:

Europe > United Kingdom (0.04)
North America > Canada (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Biswas, Arpan, Fuentes, Claudio, Hoyle, Christopher

A Nested Weighted Tchebycheff Multi-Objective Bayesian Optimization Approach for Flexibility of Unknown Utopia Estimation in Expensive Black-box Design Problems

arXiv.org Machine Learning Oct-15-2021

We propose a nested weighted Tchebycheff multi-objective Bayesian optimization (MOBO) framework in which we build a regression model selection procedure from an ensemble of models, towards better estimation of the uncertain parameters of the weighted-Tchebycheff expensive black-box multi-objective function. In existing work, a weighted Tchebycheff MOBO approach has been demonstrated which attempts to estimate the unknown utopia in formulating the acquisition function, through calibration using an a priori selected regression model. However, the existing MOBO model lacks flexibility in selecting the appropriate regression models given the guided sampled data and can therefore under-fit or over-fit as the iterations of the MOBO progress, reducing the overall MOBO performance. As it is too complex to guarantee a best model a priori in general, this motivates us to consider a portfolio of different families of predictive models fitted with current training data, guided by the WTB MOBO; the best model is selected following a user-defined approach based on prediction root-mean-square error. The proposed approach is applied to a multi-modal benchmark problem and to a thin tube design under constant temperature-pressure loading, minimizing the risk of creep-fatigue failure and the design cost. Finally, the performance of the nested weighted Tchebycheff MOBO model is compared with that of different MOBO frameworks with respect to accuracy in parameter estimation, Pareto-optimal solutions and function evaluation cost. The method is general enough to consider different families of predictive models in the portfolio for best model selection, where the overall design architecture allows for solving any high-dimensional (multiple functions) complex black-box problem and can be extended to any other global criterion multi-objective optimization method where prior knowledge of utopia is required.
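Two building blocks named in the abstract can be written down directly: the standard weighted Tchebycheff scalarization of several objectives against a utopia point, and selection of the lowest-RMSE model from a portfolio. The sketch below assumes the usual textbook form of the scalarization; the function names and the dictionary-of-predictions interface are hypothetical, not the paper's implementation.

```python
import math


def weighted_tchebycheff(objectives, weights, utopia):
    """Scalarize objectives as max_i w_i * |f_i - z_i| for utopia point z."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, utopia))


def rmse(y_true, y_pred):
    """Root-mean-square prediction error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def select_best_model(predictions_by_model, y_true):
    """Return the name of the model whose predictions have the lowest RMSE."""
    return min(predictions_by_model, key=lambda name: rmse(y_true, predictions_by_model[name]))
```

For instance, among candidate designs with objective values `(1, 4)`, `(2, 2)` and `(4, 1)`, equal weights and utopia `(0, 0)`, the scalarized values are 2.0, 1.0 and 2.0, so the balanced design `(2, 2)` is preferred.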

architecture, objective, regression model, (17 more...)

2110.1107

Country:

North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > British Columbia (0.04)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Transportation > Air (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

#artificialintelligence Oct-14-2021, 04:58:38 GMT

Deep Learning Prerequisites: Linear Regression in Python

We show you how to code your own linear regression module in Python. Linear regression is the simplest machine learning model you can learn, y

data science, linear regression, python, (5 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
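As a rough idea of what coding linear regression by hand can look like, here is a minimal one-variable ordinary least squares fit using the closed-form solution. It is an illustrative sketch only; the function names are hypothetical and it is not taken from the course.

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form OLS for y = slope * x + intercept with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]
```

Fitting the points `(0, 1), (1, 3), (2, 5), (3, 7)`, which lie exactly on a line, recovers a slope of 2 and an intercept of 1.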

Jhunjhunwala, Divyansh, Mallick, Ankur, Gadhikar, Advait, Kadhe, Swanand, Joshi, Gauri

Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation

arXiv.org Machine Learning Oct-14-2021

We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be prohibitive, and it may be imperative for them to use sparsification techniques. While most existing work on sparsified mean estimation is agnostic to the characteristics of the data vectors, in many practical applications such as federated learning, there may be spatial correlations (similarities in the vectors sent by different nodes) or temporal correlations (similarities in the data sent by a single node over different iterations of the algorithm) in the data vectors. We leverage these correlations by simply modifying the decoding method used by the server to estimate the mean. We provide an analysis of the resulting estimation error as well as experiments for PCA, K-Means and Logistic Regression, which show that our estimators consistently outperform more sophisticated and expensive sparsification methods.
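For context, the correlation-agnostic baseline that such work builds on can be sketched as plain rand-k sparsification: each node sends k randomly chosen coordinates, and the server rescales by d/k and averages. The code below shows only this baseline decoder, not the paper's correlation-aware estimators; the function names are hypothetical.

```python
import random


def rand_k_encode(vector, k, rng):
    """Keep k uniformly chosen coordinates; return them as {index: value}."""
    indices = rng.sample(range(len(vector)), k)
    return {i: vector[i] for i in indices}


def decode_mean(messages, dim, k):
    """Unbiased server-side estimate: rescale by dim / k, then average over nodes."""
    estimate = [0.0] * dim
    for message in messages:
        for i, value in message.items():
            estimate[i] += (dim / k) * value
    return [e / len(messages) for e in estimate]
```

With `k` equal to the dimension, every coordinate is sent and the estimate equals the exact mean. With smaller `k`, the communication cost per node drops to `k` values at the price of extra variance, which is the gap the modified decoding in the abstract aims to reduce.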

estimator, rand-k-spatial, vector, (12 more...)