AITopics

#artificialintelligence Oct-30-2019, 14:35:11 GMT

Statistical Modeling -- The Full Pragmatic Guide

Continuing our series of posts on how to interpret machine learning algorithms and predictions. Part 0 (optional): What is Data Science and the Data Scientist. Part 1: Introduction to Interpretability. Part 1.5 (optional): A Brief History of Statistics (may be useful for understanding this post). Part 2 (this post): Interpreting models of high bias and low variance. Part 4: Is it possible to resolve the trade-off between bias and variance? Using Shapley to finally open the black box! In this post we focus on interpreting high-bias, low-variance models. As explained in the previous post, these algorithms are the easiest to interpret, and they assume several prerequisites in the data.

brazil, correlation, regression, (14 more...)

#artificialintelligence

Country:

South America > Brazil (0.09)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Krishnamurthy, Akshay, Mazumdar, Arya, McGregor, Andrew, Pal, Soumyabrata

Sample Complexity of Learning Mixtures of Sparse Linear Regressions

arXiv.org Machine Learning Oct-30-2019

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection. This setting is quite expressive and has been studied both in terms of practical applications and for the sake of establishing theoretical guarantees. In this paper, we consider the case where the signal vectors are sparse; this generalizes the popular compressed sensing paradigm. We improve upon the state-of-the-art results as follows: In the noisy case, we resolve an open question of Yin et al. (IEEE Transactions on Information Theory, 2019) by showing how to handle collections of more than two vectors and present the first robust reconstruction algorithm, i.e., if the signals are not perfectly sparse, we still learn a good sparse approximation of the signals. In the noiseless case, as well as in the noisy case, we show how to circumvent the need for a restrictive assumption required in the previous work. Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of error-correcting codes.
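The measurement model described in this abstract can be sketched as a small simulator. This is only an illustration of the data-generating process, not the paper's recovery algorithm. The Gaussian query vectors, the function name `sample_measurements`, and the two example signals are assumptions introduced here.

```python
import random

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(u * v for u, v in zip(a, b))

def sample_measurements(signals, num_samples, noise_std=0.0, seed=0):
    """Draw (query, response) pairs from a mixture of linear regressions."""
    rng = random.Random(seed)
    dim = len(signals[0])
    data = []
    for _ in range(num_samples):
        # Assumption: Gaussian query vectors; the abstract does not fix this.
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Each measurement uses one signal drawn uniformly from the collection.
        # The learner never observes which one was used.
        beta = rng.choice(signals)
        y = dot(x, beta) + rng.gauss(0.0, noise_std)
        data.append((x, y))
    return data

# Two hypothetical 2-sparse signals in dimension 6.
signals = [
    [1.0, 0.0, 0.0, -2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 0.0, 0.0, 0.5],
]
measurements = sample_measurements(signals, num_samples=5)
```

Setting `noise_std` above zero gives the noisy case; with the default of zero every response equals the inner product of the query with exactly one of the signals.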

probability, triplet, vector, (16 more...)

1910.14106

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)

arXiv.org Machine Learning Oct-29-2019

Bounding Data-driven Model Errors in Power Grid Analysis

Liu, Yuxiao, Xu, Bolun, Botterud, Audun, Zhang, Ning, Kang, Chongqing

Data-driven models analyze power grids under incomplete physical information, and their accuracy has been mostly validated empirically using certain training and testing datasets. This paper explores error bounds for data-driven models under all possible training and testing scenarios, and proposes an evaluation implementation based on Rademacher complexity theory. We answer key questions for data-driven models: how much training data is required to guarantee a certain error bound, and how partial physical knowledge can be utilized to reduce the required amount of data. Our results are crucial for the evaluation and application of data-driven models in power grid analysis. We demonstrate the proposed method by finding generalization error bounds for two applications, i.e. branch flow linearization and external network equivalent under different degrees of physical knowledge. Results identify how the bounds decrease with additional power grid physical knowledge or more training data.
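To make the Rademacher complexity idea concrete, here is a minimal sketch that estimates the empirical Rademacher complexity of a generic norm-bounded linear class by Monte Carlo. The hypothesis class, the function name, and the unit-circle inputs are assumptions chosen for illustration; the paper's power grid models are not reproduced here. For linear functions with weight norm at most `bound`, the supremum over the class has a closed form by the Cauchy-Schwarz inequality, which the code uses.

```python
import math
import random

def empirical_rademacher_linear(X, bound, draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity of
    {x -> w.x : ||w||_2 <= bound} on the sample X."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    total = 0.0
    for _ in range(draws):
        # Random signs, one per training point.
        sigma = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # s = sum_i sigma_i * x_i
        s = [sum(sigma[i] * X[i][j] for i in range(n)) for j in range(d)]
        # sup over ||w|| <= bound of (1/n) * w.s equals bound * ||s|| / n.
        total += bound * math.sqrt(sum(v * v for v in s)) / n
    return total / draws

def unit_circle_points(n):
    """Hypothetical inputs: n points with unit Euclidean norm."""
    return [[math.cos(i), math.sin(i)] for i in range(n)]

r_small = empirical_rademacher_linear(unit_circle_points(10), bound=1.0)
r_large = empirical_rademacher_linear(unit_circle_points(160), bound=1.0)
```

Under these assumptions the estimate shrinks roughly like one over the square root of the sample size, which matches the abstract's statement that the bounds decrease with more training data.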

complexity, generalization error, rademacher complexity, (14 more...)

1910.13613

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (0.70)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Bernstein, Garrett, Sheldon, Daniel

Differentially Private Bayesian Linear Regression

arXiv.org Machine Learning Oct-29-2019

Linear regression is an important tool across many fields that work with sensitive human-sourced data. Significant prior work has focused on producing differentially private point estimates, which provide a privacy guarantee to individuals while still allowing modelers to draw insights from data by estimating regression coefficients. We investigate the problem of Bayesian linear regression, with the goal of computing posterior distributions that correctly quantify uncertainty given privately released statistics. We show that a naive approach that ignores the noise injected by the privacy mechanism does a poor job in realistic data settings. We then develop noise-aware methods that perform inference over the privacy mechanism and produce correct posteriors across a wide range of scenarios.

posterior, statistics, sufficient statistics, (16 more...)

1910.13153

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

He, Yuzi, Burghardt, Keith, Lerman, Kristina

Learning Fair and Interpretable Representations via Linear Orthogonalization

To reduce human error and prejudice, many high-stakes decisions have been turned over to machine algorithms. However, recent research suggests that this does not remove discrimination, and can perpetuate harmful stereotypes. While algorithms have been developed to improve fairness, they typically face at least one of three shortcomings: they are not interpretable, they lose significant accuracy compared to unbiased equivalents, or they are not transferable across models. To address these issues, we propose a geometric method that removes correlations between data and any number of protected variables. Further, we can control the strength of debiasing through an adjustable parameter to address the tradeoff between model accuracy and fairness. The resulting features are interpretable and can be used with many popular models, such as linear regression, random forest and multilayer perceptrons. The resulting predictions are found to be more accurate and fair than several comparable fair AI algorithms across a variety of benchmark datasets. Our work shows that debiasing data is a simple and effective solution toward improving fairness.
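One way to picture removing correlations between a feature and protected variables is a projection onto the orthogonal complement of the protected columns. The sketch below is a generic projection-based illustration and is not claimed to be the authors' exact construction. The `strength` parameter is an assumed form of the adjustable debiasing parameter, and all names and data are hypothetical.

```python
import math

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def center(col):
    """Subtract the mean so that inner products act like covariances."""
    m = sum(col) / len(col)
    return [v - m for v in col]

def orthonormal_basis(columns):
    """Gram-Schmidt on the centered protected columns."""
    basis = []
    for col in columns:
        v = center(col)
        for b in basis:
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-12:  # skip constant or linearly dependent columns
            basis.append([vi / norm for vi in v])
    return basis

def debias_feature(feature, protected_columns, strength=1.0):
    """Remove a fraction `strength` of the feature's component along the
    protected variables (0.0 = centered feature unchanged, 1.0 = uncorrelated)."""
    x = center(feature)
    for b in orthonormal_basis(protected_columns):
        c = dot(x, b)
        x = [xi - strength * c * bi for xi, bi in zip(x, b)]
    return x

# Hypothetical data: one binary protected variable and one correlated feature.
protected = [0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
feature = [1.0, 2.0, 4.0, 5.0, 1.5, 4.5]
fully_debiased = debias_feature(feature, [protected], strength=1.0)
```

Intermediate values of `strength` leave part of the correlation in place, which is one simple way to trade accuracy against fairness.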

accuracy, fairness, representation, (15 more...)

1910.12854

Country:

North America > United States > California (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Li, Jiajin, Huang, Sen, So, Anthony Man-Cho

A First-Order Algorithmic Framework for Wasserstein Distributionally Robust Logistic Regression

Wasserstein distance-based distributionally robust optimization (DRO) has received much attention lately due to its ability to provide a robustness interpretation of various learning models. Moreover, many of the DRO problems that arise in the learning context admit exact convex reformulations and hence can be tackled by off-the-shelf solvers. Nevertheless, the use of such solvers severely limits the applicability of DRO in large-scale learning problems, as they often rely on general purpose interior-point algorithms. On the other hand, there are very few works that attempt to develop fast iterative methods to solve these DRO problems, which typically possess complicated structures. In this paper, we take a first step towards resolving the above difficulty by developing a first-order algorithmic framework for tackling a class of Wasserstein distance-based distributionally robust logistic regression (DRLR) problems. Specifically, we propose a novel linearized proximal ADMM to solve the DRLR problem, whose objective is convex but consists of a smooth term plus two non-separable non-smooth terms. We prove that our method enjoys a sublinear convergence rate. Furthermore, we conduct three different experiments to show its superb performance on both synthetic and real-world datasets. In particular, our method can achieve the same accuracy up to 800+ times faster than the standard off-the-shelf solver.

algorithmic framework, lp-admm, sequence, (12 more...)

1910.12778

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.70)

Nam, Seung Joon, Kim, Han Min, Kang, Thomas, Park, Cheol Young

The Study of Machine Learning Models in Predicting the Intention of Adolescents to Smoke Cigarettes

The use of electronic cigarettes (e-cigarettes) is increasing among adolescents. This is problematic since consuming nicotine at an early age can harm a developing teenager's brain and health. Additionally, e-cigarette use can lead to the use of cigarettes, which is more severe. Much prior research on e-cigarettes and cigarettes has focused on finding and analyzing the causes of smoking using conventional statistics. However, there is a lack of research on developing prediction models for e-cigarette and cigarette use, which are more applicable to anti-smoking campaigns. In this paper, we study prediction models that can be used to predict the intention of an individual (e-cigarette user or non-user) to smoke cigarettes, so that the individual can be informed early about the risk of going down the path of smoking cigarettes. To construct the prediction models, five machine learning (ML) algorithms are exploited and tested for their accuracy in predicting the intention to smoke cigarettes among never smokers using data from the 2018 National Youth Tobacco Survey (NYTS). In our investigation, the Gradient Boosting Classifier shows the highest accuracy of all the models. Also, with the best prediction model, we made a public website that enables users to input information to predict their intentions of smoking cigarettes.

cigarette, classifier, intention, (14 more...)

1910.12748

Country:

North America > United States > District of Columbia (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (0.95)

Industry:

Health & Medicine > Public Health (1.00)
Health & Medicine > Consumer Health (1.00)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Exploring Kernel Functions in the Softmax Layer for Contextual Word Classification

Gao, Yingbo, Herold, Christian, Wang, Weiyue, Ney, Hermann

Prominently used in support vector machines and logistic regressions, kernel functions (kernels) can implicitly map data points into high dimensional spaces and make it easier to learn complex decision boundaries. In this work, by replacing the inner product function in the softmax layer, we explore the use of kernels for contextual word classification. In order to compare the individual kernels, experiments are conducted on standard language modeling and machine translation tasks. We observe a wide range of performances across different kernel settings. Extending the results, we look at the gradient properties, investigate various mixture strategies and examine the disambiguation abilities.
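The idea of replacing the inner product in the softmax layer can be sketched in a few lines. The abstract does not list the kernels that were tested, so the RBF kernel, the function names, and the toy vectors below are assumptions for illustration only.

```python
import math

def inner_product(h, w):
    """Standard softmax-layer score: the dot product of context and word vector."""
    return sum(a * b for a, b in zip(h, w))

def rbf_kernel(h, w, gamma=0.5):
    """Assumed example kernel: Gaussian (RBF) similarity."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(h, w)))

def kernel_softmax(h, word_vectors, kernel):
    """Softmax over the vocabulary with `kernel` in place of the inner product."""
    scores = [kernel(h, w) for w in word_vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical context vector and three word embeddings.
context = [1.0, 0.0]
word_vectors = [[1.0, 0.1], [3.0, 0.0], [-1.0, 0.5]]

p_inner = kernel_softmax(context, word_vectors, inner_product)
p_rbf = kernel_softmax(context, word_vectors, rbf_kernel)
```

In this toy setup the inner product favors the word vector with the largest norm along the context direction, while the RBF kernel favors the vector closest to the context, so the two layers rank the same words differently.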

individual kernel, international conference, kernel, (12 more...)

1910.12554

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre: Research Report > New Finding (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

#artificialintelligence Oct-27-2019, 21:21:39 GMT

A Gentle Introduction to Logistic Regression With Maximum Likelihood Estimation

Logistic regression is a model for binary classification predictive modeling. The parameters of a logistic regression model can be estimated by the probabilistic framework called maximum likelihood estimation. Under this framework, a probability distribution for the target variable (class label) must be assumed and then a likelihood function defined that calculates the probability of observing the outcome given the input data and the model. This function can then be optimized to find the set of parameters that results in the largest sum likelihood over the training dataset. The maximum likelihood approach to fitting a logistic regression model both aids in better understanding the form of the logistic regression model and provides a template that can be used for fitting classification models more generally.
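The fitting procedure summarized above can be sketched as follows, assuming a Bernoulli distribution for the class label and plain gradient ascent on the log-likelihood as the optimizer. The toy dataset, the learning rate, and the function names are assumptions introduced here, not taken from the article.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, x):
    """P(y = 1 | x) with weights[0] as the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def log_likelihood(weights, X, y):
    """Sum over the training set of log P(y_i | x_i) under a Bernoulli model."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(weights, xi)
        total += yi * math.log(p + eps) + (1 - yi) * math.log(1.0 - p + eps)
    return total

def fit(X, y, lr=0.1, steps=5000):
    """Maximize the log-likelihood by gradient ascent."""
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(weights, xi)  # gradient of the log-likelihood
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

# Hypothetical one-feature dataset; the classes overlap, so the estimate is finite.
X = [[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]]
y = [0, 0, 1, 0, 1, 1, 1, 1]
weights = fit(X, y)
```

Maximizing the sum of log-probabilities is equivalent to maximizing the product of the per-example probabilities, and it is the numerically safer choice.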

logistic regression model, probability, regression, (9 more...)

#artificialintelligence

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)