AITopics

1903.09688

Country:

Europe (0.68)
North America > United States (0.28)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Renewable (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.64)

arXiv.org Machine Learning Mar-22-2019

Implicit Regularization via Hadamard Product Over-Parametrization in High-Dimensional Linear Regression

Zhao, Peng, Yang, Yun, He, Qiao-Chu

We consider Hadamard product parametrization as a change-of-variable (over-parametrization) technique for solving least squares problems in the context of linear regression. Despite the non-convexity and exponentially many saddle points induced by the change-of-variable, we show that under certain conditions, this over-parametrization leads to implicit regularization: if we directly apply gradient descent to the residual sum of squares with sufficiently small initial values, then under a proper early stopping rule, the iterates converge to a nearly sparse rate-optimal solution with relatively better accuracy than explicitly regularized approaches. In particular, the resulting estimator does not suffer from extra bias due to explicit penalties, and can achieve the parametric root-$n$ rate (independent of the dimension) under proper conditions on the signal-to-noise ratio. We perform simulations to compare our methods with high-dimensional linear regression with explicit regularization. Our results illustrate the advantages of using implicit regularization via gradient descent after over-parametrization in sparse vector estimation.
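The idea in this abstract can be illustrated with a minimal sketch. It assumes the parametrization beta = g * l (elementwise), the loss (1/(2n)) * ||y - X beta||^2, an initialization of g at a small constant `alpha` and l at zero, and a fixed step count standing in for the early stopping rule. None of these choices, nor the name `hadamard_gd`, are taken from the abstract; they are illustrative assumptions only.

```python
def hadamard_gd(X, y, alpha=1e-3, eta=0.1, steps=150):
    """Gradient descent on (1/(2n)) * ||y - X (g*l)||^2 over g and l.

    Assumed setup (not from the abstract): beta = g * l elementwise,
    g starts at alpha, l starts at 0, and `steps` plays the role of
    an early stopping rule.
    """
    n, p = len(X), len(X[0])
    g = [alpha] * p
    l = [0.0] * p
    for _ in range(steps):
        beta = [g[j] * l[j] for j in range(p)]
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)]
        # c = X^T resid / n is the negative gradient with respect to beta.
        c = [sum(X[i][j] * resid[i] for i in range(n)) / n
             for j in range(p)]
        # Chain rule: dL/dg = -c * l and dL/dl = -c * g (elementwise).
        # Both updates use the old values of g and l.
        g, l = ([g[j] + eta * c[j] * l[j] for j in range(p)],
                [l[j] + eta * c[j] * g[j] for j in range(p)])
    return [g[j] * l[j] for j in range(p)]


# Toy orthogonal design: one strong coordinate, two weak noise coordinates.
X_demo = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y_demo = [2.0, 0.1, -0.1, 0.0]
beta_demo = hadamard_gd(X_demo, y_demo)
```

In this toy setting each coordinate grows roughly geometrically at a rate set by its own correlation with the residual, so the strong coordinate is fitted long before the weak ones leave the neighborhood of zero. Stopping at that point gives a nearly sparse estimate without an explicit penalty.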

artificial intelligence, gradient descent, machine learning, (16 more...)

1903.09367

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.90)

Wang, Taiyao, Paschalidis, Ioannis Ch.

Convergence of Parameter Estimates for Regularized Mixed Linear Regression Models

arXiv.org Machine Learning Mar-21-2019

We consider Mixed Linear Regression (MLR), where training data have been generated from a mixture of distinct linear models (or clusters) and we seek to identify the corresponding coefficient vectors. We introduce a Mixed Integer Programming (MIP) formulation for MLR subject to regularization constraints on the coefficient vectors. We establish that as the number of training samples grows large, the MIP solution converges to the true coefficient vectors in the absence of noise. Subject to slightly stronger assumptions, we also establish that the MIP identifies the clusters from which the training samples were generated. In the special case where training data come from a single cluster, we establish that the corresponding MIP yields a solution that converges to the true coefficient vector even when training data are perturbed by (martingale difference) noise. We provide a counterexample indicating that in the presence of noise, the MIP may fail to produce the true coefficient vectors for more than one cluster. We also provide numerical results testing the MIP solutions in synthetic examples with noise.

artificial intelligence, machine learning, regression, (15 more...)

1903.09235

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Suggala, Arun Sai, Bhatia, Kush, Ravikumar, Pradeep, Jain, Prateek

Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

arXiv.org Machine Learning Mar-19-2019

We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting either don't guarantee consistent estimates or can only handle a small fraction of corruptions. We also extend our estimator to robust sparse linear regression and show that similar guarantees hold in this setting. Finally, we apply our estimator to the problem of linear regression with heavy-tailed noise and show that our estimator consistently estimates the regression vector even when the noise has unbounded variance (e.g., Cauchy distribution), for which most existing results don't even apply. Our estimator is based on a novel variant of outlier removal via hard thresholding in which the threshold is chosen adaptively and crucially relies on randomness to escape bad fixed points of the non-convex hard thresholding operation.
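As a rough illustration of outlier removal via hard thresholding, the sketch below alternates between fitting on a kept subset and re-selecting the points with the smallest residuals. It is a deliberately simplified stand-in: it keeps a fixed number of points (`keep`), uses a one-dimensional model without intercept, and involves no randomness, so it does not reproduce the adaptive threshold or the randomized escape from bad fixed points described in the abstract. The function names are hypothetical.

```python
def fit_slope(xs, ys):
    """Least-squares slope for the no-intercept model y = w * x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def hard_threshold_regression(xs, ys, keep, rounds=10):
    """Alternate between fitting and keeping the `keep` best-fitting points.

    Simplified assumption: the number of kept points is fixed in advance,
    unlike the adaptive threshold mentioned in the abstract.
    """
    active = list(range(len(xs)))
    w = 0.0
    for _ in range(rounds):
        w = fit_slope([xs[i] for i in active], [ys[i] for i in active])
        # Hard thresholding step: rank all points by absolute residual
        # and keep only the `keep` smallest.
        by_residual = sorted(range(len(xs)),
                             key=lambda i: abs(ys[i] - w * xs[i]))
        new_active = sorted(by_residual[:keep])
        if new_active == active:
            break  # the kept set is a fixed point of the iteration
        active = new_active
    return w, active


# Clean responses follow y = 2x; three responses are corrupted.
xs_demo = [float(i) for i in range(1, 11)]
ys_demo = [2.0 * x for x in xs_demo]
ys_demo[1], ys_demo[4], ys_demo[8] = 50.0, -40.0, 100.0
w_demo, kept_demo = hard_threshold_regression(xs_demo, ys_demo, keep=7)
```

A fixed `keep` requires knowing the corruption level in advance, and the plain iteration can stall at a poor fixed point when corruptions are numerous. Those are the limitations that the adaptive, randomized variant in the abstract is said to address.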

artificial intelligence, estimator, machine learning, (17 more...)

1903.08192

Country: North America > United States (0.45)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Rezaei, Ashkan, Fathony, Rizal, Memarrast, Omid, Ziebart, Brian

Fair Logistic Regression: An Adversarial Perspective

arXiv.org Machine Learning Mar-19-2019

Fair prediction methods have primarily been built around existing classification techniques using pre-processing methods, post-hoc adjustments, reduction-based constructions, or deep learning procedures. We investigate a new approach to fair data-driven decision making by designing predictors with fairness requirements integrated into their core formulations. We augment a game-theoretic construction of the logistic regression model with fairness constraints, producing a novel prediction model that robustly and fairly minimizes the logarithmic loss.

In this paper we focus on group fairness measures, namely the three prevalent measures of demographic parity (Calders et al., 2009), equalized odds (Hardt et al., 2016), and equalized opportunity (Hardt et al., 2016). Techniques for constructing predictors that provide these fairness guarantees largely leverage existing classification methods as black boxes. Preprocessing methods such as reweighting and relabeling (Kamiran & Calders, 2012) transform the input data to remove dependence between the class and protected attribute according to a predefined fairness

artificial intelligence, constraint, machine learning, (16 more...)

1903.0391

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.50)

Industry:

Health & Medicine (1.00)
Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)

Afrabandpey, Homayun, Peltola, Tomi, Kaski, Samuel

Human-in-the-loop Active Covariance Learning for Improving Prediction in Small Data Sets

Learning predictive models from small high-dimensional data sets is a key problem in high-dimensional statistics. Expert knowledge elicitation can help, and a strong line of work focuses on directly eliciting informative prior distributions for parameters. This either requires considerable statistical expertise or is laborious, as the emphasis has been on accuracy and not on efficiency of the process. Another line of work queries about importance of features one at a time, assuming them to be independent and hence missing covariance information. In contrast, we propose eliciting expert knowledge about pairwise feature similarities, to borrow statistical strength in the predictions, and using sequential decision making techniques to minimize the effort of the expert. Empirical results demonstrate improvement in predictive performance on both simulated and real data, in high-dimensional linear regression tasks, where we learn the covariance structure with a Gaussian process, based on sequential elicitation.

artificial intelligence, knowledge, machine learning, (19 more...)

1902.09834

Country:

North America > United States (0.04)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Dixon, Matthew F., Polson, Nicholas G.

Deep Fundamental Factor Models

Deep fundamental factor models are developed to interpret and capture non-linearity, interaction effects and non-parametric shocks in financial econometrics. Uncertainty quantification provides interpretability with interval estimation, ranking of factor importances and estimation of interaction effects. Estimating factor realizations under either homoscedastic or heteroscedastic error is also available. With no hidden layers we recover a linear factor model and for one or more hidden layers, uncertainty bands for the sensitivity to each input naturally arise from the network weights. To illustrate our methodology, we construct a six-factor model of assets in the S&P 500 index and generate information ratios that are three times greater than generalized linear regression. We show that the factor importances are materially different from the linear factor model when accounting for non-linearity. Finally, we conclude with directions for future research.

artificial intelligence, machine learning, neural network, (19 more...)

1903.07677

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)

Yang, Yingrui, Wang, Molin

Semiparametric Methods for Exposure Misclassification in Propensity Score-Based Time-to-Event Data Analysis

In epidemiology, identifying the effect of exposure variables in relation to a time-to-event outcome is a classical research area of practical importance. Incorporating propensity score in the Cox regression model, as a measure to control for confounding, has certain advantages when outcome is rare. However, in situations involving exposure measured with moderate to substantial error, identifying the exposure effect using propensity score in Cox models remains a challenging yet unresolved problem. In this paper, we propose an estimating equation method to correct for the exposure misclassification-caused bias in the estimation of exposure-outcome associations. We also discuss the asymptotic properties and derive the asymptotic variances of the proposed estimators. We conduct a simulation study to evaluate the performance of the proposed estimators in various settings. As an illustration, we apply our method to correct for the misclassification-caused bias in estimating the association of PM2.5 level with lung cancer mortality using a nationwide prospective cohort, the Nurses' Health Study (NHS). The proposed methodology can be applied using our user-friendly R function published online.

artificial intelligence, machine learning, validation study, (15 more...)

1903.07782

Country: Europe > United Kingdom (0.48)

Genre:

Research Report > Strength Medium (1.00)
Research Report > Experimental Study (1.00)
Research Report > New Finding (0.89)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Hierarchical Routing Mixture of Experts

Zhao, Wenbo, Gao, Yang, Memon, Shahan Ali, Raj, Bhiksha, Singh, Rita

In regression tasks the distribution of the data is often too complex to be fitted by a single model. In contrast, partition-based models are developed where data is divided and fitted by local models. These models partition the input space and do not leverage the input-output dependency of multimodal-distributed data, and strong local models are needed to make good predictions. Addressing these problems, we propose a binary tree-structured hierarchical routing mixture of experts (HRME) model that has classifiers as non-leaf node experts and simple regression models as leaf node experts. The classifier nodes jointly soft-partition the input-output space based on the natural separateness of multimodal data. This enables simple leaf experts to be effective for prediction. Further, we develop a probabilistic framework for the HRME model, and propose a recursive Expectation-Maximization (EM) based algorithm to learn both the tree structure and the expert models. Experiments on a collection of regression tasks validate the effectiveness of our method compared to a variety of other regression models.

artificial intelligence, machine learning, prediction, (17 more...)

1903.07756

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Ordiano, Jorge Ángel González, Gröll, Lutz, Mikut, Ralf, Hagenmeyer, Veit

Probabilistic Energy Forecasting using Quantile Regressions based on a new Nearest Neighbors Quantile Filter

Parametric quantile regressions are a useful tool for creating probabilistic energy forecasts. Nonetheless, since classical quantile regressions are trained using a non-differentiable cost function, their creation using complex data mining techniques (e.g., artificial neural networks) may be complicated. This article presents a method that uses a new nearest neighbors quantile filter to obtain quantile regressions independently of the utilized data mining technique and without the non-differentiable cost function. Thereafter, a validation of the presented method using the dataset of the Global Energy Forecasting Competition of 2014 is undertaken. The results show that the presented method is able to solve the competition's task with a similar accuracy and in a similar time as the competition's winner, but requiring a much less powerful computer. This property may be relevant in an online forecasting service for which the fast computation of probabilistic forecasts using not so powerful machines is required.
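The non-differentiable cost function mentioned in this abstract is commonly the pinball (quantile) loss. The sketch below shows that standard loss only; it does not implement the nearest neighbors quantile filter, which the abstract does not specify. The function names are illustrative.

```python
def pinball_loss(y, q, tau):
    """Pinball loss of predicting quantile value q for observation y.

    tau in (0, 1) is the target quantile level. The loss has a kink at
    y == q, which is where it fails to be differentiable.
    """
    diff = y - q
    return max(tau * diff, (tau - 1.0) * diff)


def mean_pinball_loss(ys, q, tau):
    """Average pinball loss of a constant prediction q over a sample."""
    return sum(pinball_loss(y, q, tau) for y in ys) / len(ys)


# For tau = 0.5 the loss is half the absolute error, so the sample
# median scores better than the outlier-sensitive mean.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
loss_at_median = mean_pinball_loss(sample, 3.0, 0.5)
loss_at_mean = mean_pinball_loss(sample, 22.0, 0.5)
```

Under-prediction is weighted by `tau` and over-prediction by `1 - tau`, which is why minimizing the average loss targets the `tau`-quantile rather than the mean.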

artificial intelligence, machine learning, modeling & simulation, (15 more...)

1903.0739

Country: Europe (0.28)

Genre: Research Report (0.84)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)
Energy > Renewable > Wind (0.69)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)