AITopics

2006.16619

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > Oklahoma > Payne County > Cushing (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
(2 more...)

Chhaya, Rachit, Dasgupta, Anirban, Shit, Supratim

On Coresets For Regularized Regression

arXiv.org Machine LearningJun-30-2020

We study the effect of norm based regularization on the size of coresets for regression problems. Specifically, given a matrix $ \mathbf{A} \in {\mathbb{R}}^{n \times d}$ with $n\gg d$ and a vector $\mathbf{b} \in \mathbb{R} ^ n $ and $\lambda > 0$, we analyze the size of coresets for regularized versions of regression of the form $\|\mathbf{Ax}-\mathbf{b}\|_p^r + \lambda\|{\mathbf{x}}\|_q^s$ . Prior work has shown that for ridge regression (where $p,q,r,s=2$) we can obtain a coreset that is smaller than the coreset for the unregularized counterpart i.e. least squares regression (Avron et al). We show that when $r \neq s$, no coreset for regularized regression can have size smaller than the optimal coreset of the unregularized version. The well known lasso problem falls under this category and hence does not allow a coreset smaller than the one for least squares regression. We propose a modified version of the lasso problem and obtain for it a coreset of size smaller than the least square regression. We empirically show that the modified version of lasso also induces sparsity in solution, similar to the original lasso. We also obtain smaller coresets for $\ell_p$ regression with $\ell_p$ regularization. We extend our methods to multi response regularized regression. Finally, we empirically demonstrate the coreset performance for the modified lasso and the $\ell_1$ regression with $\ell_1$ regularization.

artificial intelligence, coreset, machine learning, (17 more...)

2006.0544

Country:

Asia > India > Gujarat > Gandhinagar (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Huber, Florian, Rossini, Luca

Inference in Bayesian Additive Vector Autoregressive Tree Models

Vector autoregressive (VAR) models assume linearity between the endogenous variables and their lags. This linearity assumption might be overly restrictive and could have a deleterious impact on forecasting accuracy. As a solution, we propose combining VAR with Bayesian additive regression tree (BART) models. The resulting Bayesian additive vector autoregressive tree (BAVART) model is capable of capturing arbitrary non-linear relations between the endogenous variables and the covariates without much input from the researcher. Since controlling for heteroscedasticity is key for producing precise density forecasts, our model allows for stochastic volatility in the errors. Using synthetic and real data, we demonstrate the advantages of our methods. For Eurozone data, we show that our nonparametric approach improves upon commonly used forecasting models and that it produces impulse responses to an uncertainty shock that are consistent with established findings in the literature.

bayesian inference, decision tree learning, forecast, (23 more...)

2006.16333

Country:

Europe > Netherlands (0.14)
Europe > Austria (0.14)

Genre: Research Report (0.82)

Industry:

Banking & Finance > Economy (0.95)
Energy > Oil & Gas (0.68)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Fast OSCAR and OWL Regression via Safe Screening Rules

Bao, Runxue, Gu, Bin, Huang, Heng

Ordered Weighted $L_{1}$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning. Proximal gradient methods are used as standard approaches to solve OWL regression. However, it is still a burning issue to solve OWL regression due to considerable computational cost and memory usage when the feature or sample size is large. In this paper, we propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure via an iterative strategy, which overcomes the difficulties of tackling the non-separable regularizer. It effectively avoids the updates of the parameters whose coefficients must be zero during the learning process. More importantly, the proposed screening rule can be easily applied to standard and stochastic proximal gradient methods. Moreover, we prove that the algorithms with our screening rule are guaranteed to have identical results with the original algorithms. Experimental results on a variety of datasets show that our screening rule leads to a significant computational gain without any loss of accuracy, compared to existing competitive algorithms.

algorithm, artificial intelligence, machine learning, (13 more...)

2006.16433

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Esposito, Roberto, Cerrato, Mattia, Locatelli, Marco

Partitioned Least Squares

In this paper we propose a variant of the linear least squares model allowing practitioners to partition the input features into groups of variables that they require to contribute similarly to the final result. The output allows practitioners to assess the importance of each group and of each variable in the group. We formally show that the new formulation is not convex and provide two alternative methods to deal with the problem: one non-exact method based on an alternating least squares approach; and one exact method based on a reformulation of the problem using an exponential number of sub-problems whose minimum is guaranteed to be the optimal solution. We formally show the correctness of the exact method and also compare the two solutions showing that the exact solution provides better results in a fraction of the time required by the alternating least squares solution (assuming that the number of partitions is small). For the sake of completeness, we also provide an alternative branch and bound algorithm that can be used in place of the exact method when the number of partitions is too large, and a proof of NP-completeness of the optimization problem introduced in this paper.

algorithm, artificial intelligence, machine learning, (17 more...)

2006.16202

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Berg, Axel, Oskarsson, Magnus, O'Connor, Mark

Deep Ordinal Regression with Label Diversity

Regression via classification (RvC) is a common method used for regression problems in deep learning, where the target variable belongs to a set of continuous values. By discretizing the target into a set of non-overlapping classes, it has been shown that training a classifier can improve neural network accuracy compared to using a standard regression approach. However, it is not clear how the set of discrete classes should be chosen and how it affects the overall solution. In this work, we propose that using several discrete data representations simultaneously can improve neural network learning compared to a single representation. Our approach is end-to-end differentiable and can be added as a simple extension to conventional learning methods, such as deep neural networks. We test our method on three challenging tasks and show that our method reduces the prediction error compared to a baseline RvC approach while maintaining a similar model complexity.

artificial intelligence, estimation, machine learning, (19 more...)

2006.15864

Country: Europe > Sweden (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.55)

#artificialintelligenceJun-28-2020, 20:28:20 GMT

Why Do So Many Practicing Data Scientists Not Understand Logistic Regression?

The U.S. Weather Service has always phrased rain forecasts as probabilities. I do not want a classification of "it will rain today." There is a slight loss/disutility of carrying an umbrella, and I want to be the one to make the tradeoff. This is coming from personal experience and from multiple contexts, but it seems that many data scientists simply do not understand logistic regression, or binomials and multinomials in general. The problem arises from logistic regression often being taught as a "classification" algorithm in the machine learning world.

artificial intelligence, logistic regression, machine learning, (16 more...)

Country: North America > United States (0.75)

Genre:

Research Report > New Finding (0.98)
Research Report > Experimental Study (0.98)

Industry: Government > Regional Government > North America Government > United States Government (0.75)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.98)

#artificialintelligenceJun-28-2020, 20:25:54 GMT

Accelerating Linear Models for Machine Learning

If you have ever used Python and scikit-learn to build machine learning (ML) models from large data sets, you may have also wished that you could make these computations go faster. What if I told you that altering a single line of code could accelerate your ML computations? What if I also told you that getting faster results doesn't require specialized hardware? In this article, I will teach you how to train ridge regression models using a version of scikit-learn that is optimized for Intel CPUs, then compare the performance and accuracy of these models trained with the vanilla scikit-learn library. This article continues our series on accelerated ML algorithms.

artificial intelligence, intel distribution, machine learning, (13 more...)

Country: North America > United States (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.42)

#artificialintelligenceJun-28-2020, 16:56:22 GMT

Machine Learning Simplified

As we discussed previously, Machine Learning refers to algorithms that are used to identify patterns within data. But what exactly do we mean by "patterns", what all can we do with ML, and what is all this jargon about "models" and "training" them. In this article, I'll try to explain all this without getting too technical, and what you, as a business-user, should know about Machine Learning. Supervised Learning implies use-cases where we have a target we're trying to predict given the data. Supervised algorithms enable us to predict the target (for example the estimated credit limit, tractor sales, if the customer will churn, or the mail category) using the input data (customer's credit history, weather and macroeconomic conditions, customer's activity on the platform, mail specifications). There are models both for Regression and Classification problems, i.e. algorithms which can solve these types of problems.

artificial intelligence, machine learning, regression, (13 more...)

Industry: Banking & Finance > Credit (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

#artificialintelligenceJun-28-2020, 13:25:29 GMT

Handling Missing Data For Advanced Machine Learning

Throughout this article, you will become good at spotting, understanding, and imputing missing data. We demonstrate various imputation techniques on a real-world logistic regression task using Python. Properly handling missing data has an improving effect on inferences and predictions. This is not to be ignored. The first part of this article presents the framework for understanding missing data.

data quality, machine learning, predictor, (18 more...)

Industry: Health & Medicine (0.77)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.53)