AITopics

2006.05158

Country:

North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.37)

#artificialintelligenceJun-8-2020, 12:32:08 GMT

Geometric Interpretation of Linear Regression

Linear regression is a linear approach to modeling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). In geometric interpretation terms, the linear regression algorithm tries to find a plane or line that best fits the data points as well as possible. Linear regression is a regression technique that predicts real value. What does the term "finding plane that best fits the data points" mean? For the above-given sample 2-dimension dataset (Image 1), the general equation of the line that covers as numbers of points as possible is y m*x c, where m is the slope of the line, and c is the intercept term.

artificial intelligence, equation, machine learning, (10 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

CLAIMED: A CLAssification-Incorporated Minimum Energy Design to explore a multivariate response surface with feasibility constraints

Song, Yao, Sengul, Mert Y., He, Linglin, van Duin, Adri C. T., Hung, Ying, Dasgupta, Tirthankar

Motivated by the problem of optimization of force-field systems in physics using large-scale computer simulations, we consider exploration of a deterministic complex multivariate response surface. The objective is to find input combinations that generate output close to some desired or "target" vector. In spite of reducing the problem to exploration of the input space with respect to a one-dimensional loss function, the search is nontrivial and challenging due to infeasible input combinations, high dimensionalities of the input and output space and multiple "desirable" regions in the input space and the difficulty of emulating the objective function well with a surrogate model. We propose an approach that is based on combining machine learning techniques with smart experimental design ideas to locate multiple good regions in the input space.

algorithm, artificial intelligence, machine learning, (18 more...)

2006.05021

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

A Semiparametric Approach to Interpretable Machine Learning

Sani, Numair, Lee, Jaron, Nabi, Razieh, Shpitser, Ilya

Black-box models in machine learning have demonstrated excellent predictive performance in complex problems and high-dimensional settings. However, their lack of transparency and interpretability restrict the applicability of such models in critical decision-making processes. In order to combat this shortcoming, we propose a novel approach to trading off interpretability and performance in prediction models using ideas from semiparametric statistics, allowing us to combine the interpretability of parametric regression models with performance of nonparametric methods. We achieve this by utilizing a two-piece model: the first piece is interpretable and parametric, to which a second, uninterpretable residual piece is added. The performance of the overall model is optimized using methods from the sufficient dimension reduction literature. Influence function based estimators are derived and shown to be doubly robust. This allows for use of approaches such as Double Machine Learning in estimating our model parameters. We illustrate the utility of our approach via simulation studies and a data application based on predicting the length of stay in the intensive care unit among surgery patients.

estimator, int, uint, (17 more...)

2006.04732

Country:

North America > United States > Maryland > Baltimore (0.04)
North America > United States > New York (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Liao, Zhenyu, Couillet, Romain, Mahoney, Michael W.

A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent

This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable. In this regime, the random RFF Gram matrix no longer converges to the well-known limiting Gaussian kernel matrix (as it does when $N \to \infty$ alone), but it still has a tractable behavior that is captured by our analysis. This analysis also provides accurate estimates of training and test regression errors for large $n,p,N$. Based on these estimates, a precise characterization of two qualitatively different phases of learning, including the phase transition between them, is provided; and the corresponding double descent test error curve is derived from this phase transition behavior. These results do not depend on strong assumptions on the data distribution, and they perfectly match empirical results on finite-dimensional real-world data sets.

artificial intelligence, co 1, machine learning, (16 more...)

2006.05013

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Liu, Frederick, Najmi, Amir, Sundararajan, Mukund

The Penalty Imposed by Ablated Data Augmentation

There is a set of data augmentation techniques that ablate parts of the input at random. These include input dropout, cutout, and random erasing. We term these techniques ablated data augmentation. Though these techniques seems similar in spirit and have shown success in improving model performance in a variety of domains, we do not yet have a mathematical understanding of the differences between these techniques like we do for other regularization techniques like L1 or L2. First, we study a formal model of mean ablated data augmentation and inverted dropout for linear regression. We prove that ablated data augmentation is equivalent to optimizing the ordinary least squares objective along with a penalty that we call the Contribution Covariance Penalty and inverted dropout, a more common implementation than dropout in popular frameworks, is equivalent to optimizing the ordinary least squares objective along with Modified L2. For deep networks, we demonstrate an empirical version of the result if we replace contributions with attributions and coefficients with average gradients, i.e., the Contribution Covariance Penalty and Modified L2 Penalty drop with the increase of the corresponding ablated data augmentation across a variety of networks.

data augmentation, machine learning, natural language, (14 more...)

2006.04769

Genre: Research Report (0.50)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Machine LearningJun-7-2020

Sparse learning with CART

Klusowski, Jason M.

Decision trees with binary splits are popularly constructed using Classification and Regression Trees (CART) methodology. For regression models, this approach recursively divides the data into two near-homogenous daughter nodes according to a split point that maximizes the reduction in sum of squares error (the impurity) along a particular variable. This paper aims to study the statistical properties of regression trees constructed with CART methodology. In doing so, we find that the training error is governed by the Pearson correlation between the optimal decision stump and response data in each node, which we bound by constructing a prior distribution on the split points and solving a nonlinear optimization problem. We leverage this connection between the training error and Pearson correlation to show that CART with cost-complexity pruning achieves an optimal complexity/goodnessof-fit tradeoff when the depth scales with the logarithm of the sample size. Data dependent quantities, which adapt to the dimensionality and latent structure of the regression model, are seen to govern the rates of convergence of the prediction error.

artificial intelligence, machine learning, training error, (18 more...)

2006.04266

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Trivedi, Shubhendu, Wang, J.

The Expected Jacobian Outerproduct: Theory and Empirics

arXiv.org Machine LearningJun-5-2020

The expected gradient outerproduct (EGOP) of an unknown regression function is an operator that arises in the theory of multi-index regression, and is known to recover those directions that are most relevant to predicting the output. However, work on the EGOP, including that on its cheap estimators, is restricted to the regression setting. In this work, we adapt this operator to the multi-class setting, which we dub the expected Jacobian outerproduct (EJOP). Moreover, we propose a simple rough estimator of the EJOP and show that somewhat surprisingly, it remains statistically consistent under mild assumptions. Furthermore, we show that the eigenvalues and eigenspaces also remain consistent. Finally, we show that the estimated EJOP can be used as a metric to yield improvements in real-world non-parametric classification tasks: both by its use as a metric, and also as cheap initialization in metric learning tasks.

ejop, matrix, probability, (16 more...)

2006.0355

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Gurumoorthy, Karthik S., Sanyal, Subhajit, Chaoji, Vineet

Think out of the package: Recommending package types for e-commerce shipments

arXiv.org Machine LearningJun-5-2020

Multiple product attributes like dimensions, weight, fragility, liquid content etc. determine the package type used by e-commerce companies to ship products. Sub-optimal package types lead to damaged shipments, incurring huge damage related costs and adversely impacting the company's reputation for safe delivery. Items can be shipped in more protective packages to reduce damage costs, however this increases the shipment costs due to expensive packaging and higher transportation costs. In this work, we propose a multi-stage approach that trades-off between shipment and damage costs for each product, and accurately assigns the optimal package type using a scalable, computationally efficient linear time algorithm. A simple binary search algorithm is presented to find the hyper-parameter that balances between the shipment and damage costs. Our approach when applied to choosing package type for Amazon shipments, leads to significant cost savings of tens of millions of dollars in emerging marketplaces, by decreasing both the overall shipment cost and the number of in-transit damages. Our algorithm is live and deployed in the production system where, package types for more than 130,000 products have been modified based on the model's recommendation, realizing a reduction in damage rate of 24%.

artificial intelligence, machine learning, package type, (16 more...)

2006.03239

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Spain > Castilla-La Mancha (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Services > e-Commerce Services (0.62)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

arXiv.org Machine LearningJun-4-2020

Median regression with differential privacy

Chen, E, Miao, Ying, Tang, Yu

Personal privacy information may be exposed with the unprecedented availability of datasets, so there is increasing requirement that statistical analysis of such datasets should protect individual privacy. As [6] describes, differential privacy addresses the paradox of learning nothing about an individual while learning useful information about a population. Over the past few years, differential privacy has been investigated in machine learning [1] and has been applied in the real world, see for example [8]. Recently, [3] formulates a general lower bound argument for minimax risks with differential privacy constraints, and applies this argument to high-dimensional mean estimation and linear regression problems. In this paper, three privacy preserving methods are proposed for median regression, which is a special case of quantile regression. Quantile regression was first introduced in [12], which aims to estimate and conduct inference about conditional quantile functions. In recent years, quantile regression has become a comprehensive method for statistical analysis of response models and it has been widely used in reality, such as survival analysis and economics, see for example, [14], [20] and [15]. The fact that the median regression takes least absolute deviation as its objective function to estimate parameters has been known among statisticians [12].

algorithm, privacy, regression, (15 more...)

2006.02983

Country:

Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)